Model load events are already emitted by the backend but were only
logged in the frontend. Track a loading counter via
model_load_started / model_load_complete and flip the existing
ProgressBar into indeterminate mode while any model is loading, so
users get visual feedback that something is happening.
Infer encoder and decoder block_out_channels independently from the
state dict and rebuild the decoder submodule when its channel widths
differ from the encoder, so the asymmetric full_encoder_small_decoder
checkpoint from black-forest-labs/FLUX.2-small-decoder loads correctly.
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Add support for Kohya-format Z-Image LoRAs (lora_unet__ prefix) by
adding key detection and conversion to dot-notation module paths.
Fix ComfyUI-format Z-Image LoRAs being misidentified as main models
by ensuring LoRA-specific suffixes (including .alpha) are checked
before Z-Image key matching in _has_z_image_keys().
* feat(model-manager): add comprehensive sorting capabilities for models
dded the ability to sort models in the Model Manager by various attributes
including Name, Base, Type, Format, Size, Date Added, and Date Modified.
Supports both ascending and descending order.
- Backend: Added `order_by` and `direction` query parameters to the
``/api/v1/models`/` listing endpoint. Implemented case-insensitive
sorting in the SQLite model records service.
- Frontend: Introduced `<ModelSortControl />` UI, updated Redux slices
to manage sort state, removed client-side entity adapter sorting to
respect server-side ordering, and added i18n localization keys.
- Tests: Added test coverage for SQL-based sorting on size and name.
* feat(model-manager): add comprehensive sorting capabilities for models
dded the ability to sort models in the Model Manager by various attributes
including Name, Base, Type, Format, Size, Date Added, and Date Modified.
Supports both ascending and descending order.
- Backend: Added `order_by` and `direction` query parameters to the
``/api/v1/models`/` listing endpoint. Implemented case-insensitive
sorting in the SQLite model records service.
- Frontend: Introduced `<ModelSortControl />` UI, updated Redux slices
to manage sort state, removed client-side entity adapter sorting to
respect server-side ordering, and added i18n localization keys.
- Tests: Added test coverage for SQL-based sorting on size and name.
* ruff fix
* typegen fix
* typegen fix - this time without my custom nodes.
* another typegen fix
* refactor(ui): consolidate model filter and sort controls into a unified menu
- Replaced separate `ModelSortControl` and `ModelTypeFilter` components with a single, unified "Filtering" dropdown menu.
- Organised filtering options into categorised submenus in the following order: Direction, Sort By, and Model Type.
- Enhanced submenu labels to display the currently active selection inline for quick reference.
- Improved visual alignment within menus by using hidden checkmarks on unselected items, ensuring consistent indentation across all options.
- Resolved styling and linting issues (unused variables, JSX bind warnings) within the new component.
* Lint fix
* Addresses PR feedback to use translation strings directly within `ORDER_BY_OPTIONS`.
Previously, sort keys and their translated labels were maintained in separate constructs (`ORDER_BY_OPTIONS` array and `ORDER_BY_LABELS` map). This refactor converts `ORDER_BY_OPTIONS` into an array of objects containing both the `key` and its corresponding `i18nKey`, creating a single source of truth.
This change:
- Simplifies the `SortBySubMenu` component by removing the redundant `ORDER_BY_LABELS` lookup map.
- Improves maintainability by ensuring developers only need to update one place when adding or modifying sort options.
- Reduces the risk of mismatched keys and labels.
---------
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Port SqliteSessionQueue to a SQLAlchemy Core / SQLModel hybrid that keeps the
existing public API and DB schema (migrations and triggers untouched). Hot
paths (enqueue bulk insert, dequeue, bulk cancel/delete, list with cursor
pagination, status aggregations) use Core to avoid ORM hydration overhead;
single-row reads stay ORM-style for clarity.
- Add SqlModelSessionQueue alongside the legacy SqliteSessionQueue
- Add the missing `workflow` column to SessionQueueTable (was added by
migration_2 but never declared on the SQLModel)
- Wire dependencies.py to the new implementation
- Add 36 unit tests covering enqueue/dequeue, status mutations, bulk
cancel/delete, prune-to-limit, retry, pagination and aggregations
- Avoid nested write sessions on the single StaticPool connection by reading
the current item before opening the outer write session
* fix(ui): FLUX.2 Klein VAE/Qwen3 readiness checks and diffusers source auto-detection
Fix several issues with FLUX.2 Klein model handling:
1. Add readiness validation for non-diffusers Klein models so the invoke
button is disabled when required VAE/Qwen3 submodels are missing.
2. Auto-detect installed diffusers flux2 models and pass them as
qwen3_source_model in the graph builder, so GGUF/safetensors models
can extract VAE and encoder from an available diffusers model.
3. Use variant-aware matching so Klein 9B models pick a 9B diffusers
source (not 4B), preventing Qwen3 encoder dimension mismatches.
4. Change placeholder text from "From main model" to "From diffusers
model" or "No diffusers model available" depending on availability.
5. Export readiness check functions and add comprehensive tests for both
the graph builder and readiness logic.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Chore Fix merge
* fix(ui): unify FLUX.2 Klein Qwen3 variant matching
Extract KLEIN_TO_QWEN3_VARIANT_MAP and isFlux2KleinQwen3Compatible into
features/parameters/util/flux2Klein so UI placeholder, readiness check, and
graph builder share one rule. Accepts klein_9b and klein_9b_base as mutual
Qwen3 sources (both use qwen3_8b) and guards against undefined === undefined
false positives.
Use zModelIdentifierField.parse for qwen3_source_model construction in
buildFLUXGraph, matching the pattern used for Z-Image.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Add a "Lock Transparency" toggle to raster layers that preserves the
alpha channel when painting. When enabled, brush strokes use
'source-atop' compositing to only paint on existing non-transparent
pixels, similar to Photoshop's "Lock Transparent Pixels" feature.
Co-authored-by: dunkeroni <dunkeroni@gmail.com>
- setVisibility now uses layer.remove()/stage.add() to detach/reattach
hidden canvas elements from the DOM, freeing browser compositing resources
- Wrap syncIsOnscreen and syncIntersectsBbox with rafThrottle to reduce
redundant calculations during pan/zoom
- Increase canvasElementCacheSize 32→128 and imageDataCacheSize 32→64
- Add 3-Layer-Flattening design document for future architectural optimization
Co-authored-by: dunkeroni <dunkeroni@gmail.com>
* feat: initial external model support
* feat: support reference images for external models
* fix: sorting lint error
* chore: hide Reidentify button for external models
* review: enable auto-install/remove fro external models
* feat: show external mode name during install
* review: model descriptions
* review: implemented review comments
* review: added optional seed control for external models
* chore: fix linter warning
* review: save api keys to a seperate file
* docs: updated external model docs
* chore: fix linter errors
* fix: sync configured external starter models on startup
* feat(ui): add provider-specific external generation nodes
* feat: expose external panel schemas in model configs
* feat(ui): drive external panels from panel schema
* docs: sync app config docstring order
* feat: add gemini 3.1 flash image preview starter model
* feat: update gemini image model limits
* fix: resolve TypeScript errors and move external provider config to api_keys.yaml
Add 'external', 'external_image_generator', and 'external_api' to Zod
enum schemas (zBaseModelType, zModelType, zModelFormat) to match the
generated OpenAPI types. Remove redundant union workarounds from
component prop types and Record definitions.
Fix type errors in ModelEdit (react-hook-form Control invariance),
parsing.tsx (model identifier narrowing), buildExternalGraph (edge
typing), and ModelSettings import/export buttons.
Move external_gemini_base_url and external_openai_base_url into
api_keys.yaml alongside the API keys so all external provider config
lives in one dedicated file, separate from invokeai.yaml.
* feat: add resolution presets and imageConfig support for Gemini 3 models
Add combined resolution preset selector for external models that maps
aspect ratio + image size to fixed dimensions. Gemini 3 Pro and 3.1 Flash
now send imageConfig (aspectRatio + imageSize) via generationConfig instead
of text-based aspect ratio hints used by Gemini 2.5 Flash.
Backend: ExternalResolutionPreset model, resolution_presets capability field,
image_size on ExternalGenerationRequest, and Gemini provider imageConfig logic.
Frontend: ExternalSettingsAccordion with combo resolution select, dimension
slider disabling for fixed-size models, and panel schema constraint wiring
for Steps/Guidance/Seed controls.
* Remove unused external model fields and add provider-specific parameters
- Remove negative_prompt, steps, guidance, reference_image_weights,
reference_image_modes from external model nodes (unused by any provider)
- Remove supports_negative_prompt, supports_steps, supports_guidance
from ExternalModelCapabilities
- Add provider_options dict to ExternalGenerationRequest for
provider-specific parameters
- Add OpenAI-specific fields: quality, background, input_fidelity
- Add Gemini-specific fields: temperature, thinking_level
- Add new OpenAI starter models: GPT Image 1.5, GPT Image 1 Mini,
DALL-E 3, DALL-E 2
- Fix OpenAI provider to use output_format (GPT Image) vs
response_format (DALL-E) and send model ID in requests
- Add fixed aspect ratio sizes for OpenAI models (bucketing)
- Add ExternalProviderRateLimitError with retry logic for 429 responses
- Add provider-specific UI components in ExternalSettingsAccordion
- Simplify ParamSteps/ParamGuidance by removing dead external overrides
- Update all backend and frontend tests
* Chore Ruff check & format
* Chore typegen
* feat: full canvas workflow integration for external models
- Add missing aspect ratios (4:5, 5:4, 8:1, 4:1, 1:4, 1:8) to type
system for external model support
- Sync canvas bbox when external model resolution preset is selected
- Use params preset dimensions in buildExternalGraph to prevent
"unsupported aspect ratio" errors
- Lock all bbox controls (resize handles, aspect ratio select,
width/height sliders, swap/optimal buttons) for external models
with fixed dimension presets
- Disable denoise strength slider for external models (not applicable)
- Sync bbox aspect ratio changes back to paramsSlice for external models
- Initialize bbox dimensions when switching to an external model
* Chore typegen Linux seperator
* feat: full canvas workflow integration for external models
- Update buildExternalGraph test to include dimensions in mock params
* Merge remote-tracking branch 'upstream/main' into external-models
* Chore pnpm fix
* add missing parameter
* docs: add External Models guide with Gemini and OpenAI provider pages
* fix(external-models): address PR review feedback
- Gemini recall: write temperature, thinking_level, image_size to image metadata;
wire external graph as metadata receiver; add recall handlers.
- Canvas: gate regional guidance, inpaint mask, and control layer for external models.
- Canvas: throw a clear error on outpainting for external models (was falling back to
inpaint and hitting an API-side mask/image size mismatch).
- Workflow editor: add ui_model_provider_id filter so OpenAI and Gemini nodes only
list their own provider's models.
- Workflow editor: silently drop seed when the selected model does not support it
instead of raising a capability error.
- Remove the legacy external_image_generation invocation and the graph-builder
fallback; providers must register a dedicated node.
- Regenerate schema.ts.
- remove Gemini debug dumps to outputs/external_debug
* fix(external-models): resolve TSC errors in metadata parsing and external graph
- Export imageSizeChanged from paramsSlice (required by the new ImageSize
recall handler).
- Emit the external graph's metadata model entry via zModelIdentifierField
since ExternalApiModelConfig is not part of the AnyModelConfig union.
* chore: prettier format ModelIdentifierFieldInputComponent
* fix: remove unsupported thinkingConfig from Gemini image models and restrict GPT Image models to txt2img
* chore typegen
* chore(docs): regenerate settings.json for external provider fields
* fix(external): fix mask handling and mode support for external providers
- Remove img2img and inpaint modes from Gemini models (Gemini has no
bitmap mask or dedicated edit API; image editing works via reference
images in the UI)
- Fix DALL-E 2 inpainting: convert grayscale mask to RGBA with alpha
channel transparency (OpenAI expects transparent=edit area) and
convert init image to RGBA when mask is present
* fix(external): update mode support and UI for external providers
- Remove DALL-E 2 from starter models (deprecated, shutdown May 12 2026)
- Enable img2img for GPT Image 1/1.5/1-mini (supports edits endpoint)
- Set Gemini models to txt2img only (no mask/edit API; editing via
ref images)
- Hide mode/init_image/mask_image fields on Gemini node (not usable)
- Hide mask_image field on OpenAI node (no model supports inpaint)
* Chore typegen
* fix(external): improve OpenAI node UX and disable cache by default
- Hide OpenAI node's mode and init_image fields: OpenAI's API has no
img2img/inpaint distinction (the edits endpoint is invoked
automatically when reference images are provided). init_image is
functionally identical to a reference image and was misleading users.
- Default use_cache to False for external image generation nodes:
external API calls are non-deterministic and incur usage costs.
Cache hits returned stale image references that did not produce new
gallery entries on repeat invokes.
* fix(external): duplicate cached images on cache hit instead of skipping
External image generation nodes use the standard invocation cache, but
returning the cached output (with stale image_name references) on cache
hits resulted in no new gallery entries — the Invoke button would spin
indefinitely on repeat invokes with identical parameters.
Override invoke_internal so that on cache hit, the cached images are
loaded and re-saved as new gallery entries. The expensive API call is
still skipped (cost saving), but the user sees a new image as expected.
* Chore typegen + ruff
* CHore ruff format
* fix(external): restore OpenAI advanced settings on Remix recall
Remix recall iterates through ImageMetadataHandlers but only Gemini's
temperature handler was wired up — OpenAI's quality, background, and
input_fidelity were stored in image metadata but never parsed back into
the params slice. Add the three missing handlers so Remix restores
these settings as expected.
---------
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Alexander Eichhorn <alex@code-with.us>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Introduces SQLModel (SQLAlchemy + Pydantic) as an ORM layer to enable
future database backend switching (PostgreSQL, MySQL). All services
except session_queue have been migrated to SQLModel-based implementations
while keeping the existing migration system and raw SQLite connection
intact for backwards compatibility.
Key changes:
- Add sqlmodel dependency
- Define SQLModel table models for all 14 database tables
- Extend SqliteDatabase with SQLAlchemy Engine and Session management
- Create SQLModel implementations for 10 services (boards, images,
workflows, models, users, style presets, app settings, etc.)
- Session queue remains on raw SQLite (Phase 3)
- Add 95 unit tests and 12 performance benchmarks
- Optimize with StaticPool, expire_on_commit=False, and read-only sessions
* feat(ui): add canvas project save/load (.invk format)
Add ZIP-based .invk file format to save and restore the entire canvas
state including all layers, masks, reference images, generation
parameters, LoRAs, and embedded image files. Images are deduplicated
on load - only missing images are re-uploaded from the project file.
- Always clear LoRAs on project load, even when project has none
- Fix jszip dependency ordering in package.json
- Add useAssertSingleton to SaveCanvasProjectDialog for consistency
- Add concurrency limit (max 5) for image fetch/upload requests
- Remove redundant deep-clone in remapCroppableImage (mutate in-place)
- Default project name to "Canvas Project" instead of empty string
* Chore pnpm fix
* feat(ui): group nodes by category in add-node dialog
Add collapsible category grouping to the node picker command palette.
Categories are parsed from the backend schema and displayed as
expandable sections with caret icons. All categories auto-expand
when searching.
* feat(ui): add toggle for category grouping in add-node dialog and prioritize exact matches
Add a persistent "Group Nodes by Category" setting to workflow editor settings,
allowing users to switch between grouped and flat node list views. Also sort
exact title matches to the top when searching.
* fix: update test schema categories to match expected templates
* feat: add expand/collapse all buttons to node picker and fix node categories
Add "Expand All" and "Collapse All" link-buttons above the grouped
category list in the add-node dialog so users can quickly open or
close all categories at once. Buttons are hidden during search since
categories auto-expand while searching.
Fix two miscategorized nodes: Z-Image ControlNet was in "Control"
instead of "Controlnet", and Upscale (RealESRGAN) was in "Esrgan"
instead of "Upscale".
* refactor(nodes): clean up node category taxonomy
Reorganize all built-in invocation categories into a consistent set of
18 groups (model, prompt, conditioning, controlnet_preprocessors,
latents, image, mask, inpaint, tiles, upscale, segmentation, math,
strings, primitives, batch, metadata, multimodal, canvas).
- Move denoise/i2l/l2i nodes consistently into "latents"
- Move all mask creation/manipulation nodes into "mask"
- Split ControlNet preprocessors out of "controlnet" into their own group
- Fold "unet", "vllm", "string", "ip_adapter", "t2i_adapter" into larger
groups
- Move metadata_linked denoise wrappers from "latents" to "metadata"
- Add missing category to ideal_size
- Introduce dedicated "canvas" group for canvas/output/panel nodes
Also adds the now-required `category` field to invocation template
fixtures in validateConnection.test.ts.
* Chore Ruff Format
---------
Co-authored-by: dunkeroni <dunkeroni@gmail.com>
* Feat(Canvas): Add Lasso tool with Freehand and Polygon modes
* Refine Lasso modes behavior and optimisation.
* Fix: Pettier
* added docs/features/Lasso_tool.md
* Fix: Removed restrictions mentioned in PR's conversation:
1. Disabled when there is no visible raster content
2. Lasso is blocked when all inpaint masks are globally hidden.
---------
Co-authored-by: dunkeroni <dunkeroni@gmail.com>
* Add more settings to invokeai.yaml for improved queue management.
* Adjusted description
* More logic tweaking
* chore(api): update generated schema types
* chore(api): update generated schema types
* Add: UI element for max_queue_history to 'Settings' modal.
Now it is possible to set Max queue history in both places: .yaml and UI.
* chore(api): regenerate schema types
* chore(api): normalize generated schema path defaults
---------
Co-authored-by: dunkeroni <dunkeroni@gmail.com>
* feat: Per-user workflow libraries in multiuser mode (#114)
* Add per-user workflow isolation: migration 28, service updates, router ownership checks, is_public endpoint, schema regeneration, frontend UI
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* feat: add shared workflow checkbox to Details panel, auto-tag, gate edit/delete, fix tests
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Restrict model sync to admin users only (#118)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* feat: distinct splash screens for admin/non-admin users in multiuser mode (#116)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Disable Save when editing another user's shared workflow in multiuser mode (#120)
* Disable Save when editing another user's shared workflow in multiuser mode
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(app): ruff
* Add board visibility (private/shared/public) feature with tests and UI
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Enforce read-only access for non-owners of shared/public boards in UI
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix remaining board access enforcement: invoke icon, drag-out, change-board filter, archive
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* fix: allow drag from shared boards to non-board targets (viewer, ref image, etc.)
Previously, images in shared boards owned by another user could not be
dragged at all — the draggable setup was completely skipped in
GalleryImage.tsx when canWriteImages was false. This blocked ALL drop
targets including the viewer, reference image pane, and canvas.
Now images are always draggable. The board-move restriction is enforced
in the dnd target isValid functions instead:
- addImageToBoardDndTarget: rejects moves from shared boards the user
doesn't own (unless admin or board is public)
- removeImageFromBoardDndTarget: same check
Other drop targets (viewer, reference images, canvas, comparison, etc.)
remain fully functional for shared board images.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(security): add auth requirement to all sensitive routes in multimodal mode
* chore(backend): ruff
* fix (backend): improve user isolation for session queue and recall parameters
- Sanitize session queue information of all cross-user fields except for the timestamps and status.
- Recall parameters are now user-scoped.
- Queue status endpoints now report user-scoped activity rather than global activity
- Tests added:
TestSessionQueueSanitization (4 tests):
1. test_owner_sees_all_fields - Owner sees complete queue item data
2. test_admin_sees_all_fields - Admin sees complete queue item data
3. test_non_owner_sees_only_status_timestamps_errors -
Non-owner sees only item_id, queue_id, status, and timestamps; everything else is redacted
4. test_sanitization_does_not_mutate_original - Sanitization doesn't modify the original object
TestRecallParametersIsolation (2 tests):
5. test_user1_write_does_not_leak_to_user2 - User1's recall params are not visible in user2's client state
6. test_two_users_independent_state - Both users can write recall params independently without overwriting each other
fix(backend): queue status endpoints report user-scoped stats rather than global stats
* fix(workflow): do not filter default workflows in multiuser mode
Problem: When categories=['user', 'default'] (or no category filter)
and user_id was set for multiuser scoping, the SQL query became
WHERE category IN ('user', 'default') AND user_id = ?,
which excluded default workflows (owned by "system").
Fix: Changed user_id = ? to (user_id = ? OR category = 'default') in
all 6 occurrences across workflow_records_sqlite.py — in get_many,
counts_by_category, counts_by_tag, and get_all_tags. Default
workflows are now always visible regardless of user scoping.
Tests added (2):
- test_default_workflows_visible_when_listing_user_and_default — categories=['user','default'] includes both
- test_default_workflows_visible_when_no_category_filter — no filter still shows defaults
* fix(multiuser): scope queue/recall/intermediates endpoints to current user
Several read-only and event-emitting endpoints were leaking aggregate
cross-user activity in multiuser mode:
- recall_parameters_updated event was broadcast to every queue
subscriber. Added user_id to the event and routed it to the owner +
admin rooms only.
- get_queue_status, get_batch_status, counts_by_destination and
get_intermediates_count now scope counts to the calling user
(admins still see global state). Removed the now-redundant
user_pending/user_in_progress fields and simplified QueueCountBadge.
- get_queue_status hides current item_id/session_id/batch_id when the
current item belongs to another user.
Also fixes test_session_queue_sanitization assertions that lagged
behind the recently expanded redaction set.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore(backend): ruff
* fix(multiuser): reject anonymous websockets and scope queue item events
Close three cross-user leaks in the websocket layer:
- _handle_connect() now rejects connections without a valid JWT in
multiuser mode (previously fell through to user_id="system"), so
anonymous clients can no longer subscribe to queue rooms and observe
other users' activity. In single-user mode it still accepts as system
admin.
- _handle_sub_queue() no longer silently falls back to the system user
for an unknown sid in multiuser mode; it refuses the subscription.
- QueueItemStatusChangedEvent and BatchEnqueuedEvent are now routed to
user:{user_id} + admin rooms instead of the full queue room. Both
events carry unsanitized user_id, batch_id, origin, destination,
session_id, and error metadata and must not be broadcast.
- BatchEnqueuedEvent gains a user_id field; emit_batch_enqueued and
enqueue_batch thread it through.
New TestWebSocketAuth suite covers connect accept/reject for both
modes, sub_queue refusal, and private routing of the queue item and
batch events (plus a QueueClearedEvent sanity check).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(multiuser): verify user record on websocket connect
A deleted or deactivated user with an unexpired JWT could still open a
websocket and subscribe to queue rooms. Now _handle_connect() checks the
backing user record (exists + is_active) in multiuser mode, mirroring
the REST auth path in auth_dependencies.py. Fails closed if the user
service is unavailable.
Tests: added deleted-user and inactive-user rejection tests; updated
valid-token test to create the user in the database first.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(multiuser): close bulk download cross-user exfiltration path
Backend:
- POST /download now validates image read access (per-image) and board
read access (per-board) before queuing the download.
- GET /download/{name} is intentionally unauthenticated because the
browser triggers it via <a download> which cannot carry Authorization
headers. Access control relies on POST-time checks, UUID filename
unguessability, private socket event routing, and single-fetch deletion.
- Added _assert_board_read_access() helper to images router.
- Threaded user_id through bulk download handler, base class, event
emission, and BulkDownloadEventBase so events carry the initiator.
- Bulk download service now tracks download ownership via _download_owners
dict (cleaned up on delete).
- Socket bulk_download room subscription restricted to authenticated
sockets in multiuser mode.
- Added error-catching in FastAPIEventService._dispatch_from_queue to
prevent silent event dispatch failures.
Frontend:
- Fixed pre-existing race condition where the "Preparing Download" toast
from the POST response overwrote the "Ready to Download" toast from the
socket event (background task completes in ~17ms, so the socket event
can arrive before Redux processes the HTTP response). Toast IDs are now
distinct: "preparing:{name}" vs "{name}".
- bulk_download_complete/error handlers now dismiss the preparing toast.
Tests (8 new):
- Bulk download by image names rejected for non-owner (403)
- Bulk download by image names allowed for owner (202)
- Bulk download from private board rejected (403)
- Bulk download from shared board allowed (202)
- Admin can bulk download any images (202)
- Bulk download events carry user_id
- Bulk download event emitted to download room
- GET /download unauthenticated returns 404 for unknown files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(multiuser): enforce board visibility on image listing endpoints
GET /api/v1/images?board_id=... and GET /api/v1/images/names?board_id=...
passed board_id directly to the SQL layer without checking board
visibility. The SQL only applied user_id filtering for board_id="none"
(uncategorized images), so any authenticated user who knew a private
board ID could enumerate its images.
Both endpoints now call _assert_board_read_access() before querying,
returning 403 unless the caller is the board owner, an admin, or the
board is Shared/Public.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore(backend): ruff
* fix(multiuser): require image ownership when adding images to boards
add_image_to_board and add_images_to_board only checked write access to
the destination board, never verifying that the caller owned the source
image. An attacker could add a victim's image to their own board, then
exploit the board-ownership fallback in _assert_image_owner to gain
delete/patch/star/unstar rights on the image.
Both endpoints now call _assert_image_direct_owner which requires direct
image ownership (image_records.user_id) or admin — board ownership is
intentionally not sufficient, preventing the escalation chain.
Also fixed a pre-existing bug where HTTPException from the inner loop in
add_images_to_board was caught by the outer except-Exception and returned
as 500 instead of propagating the correct status code.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore(backend): ruff
* fix(multiuser): validate image access in recall parameter resolution
The recall endpoint loaded image files and ran ControlNet preprocessors
on any image_name supplied in control_layers or ip_adapters without
checking that the caller could read the image. An attacker who knew
another user's image UUID could extract dimensions and, for supported
preprocessors, mint a derived processed image they could then fetch.
Added _assert_recall_image_access() which validates read access for every
image referenced in the request before any resolution or processing
occurs. Access is granted to the image owner, admins, or when the image
sits on a Shared/Public board.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(multiuser): require admin auth on model install job endpoints
list_model_installs, get_model_install_job, pause, resume,
restart_failed, and restart_file were unauthenticated — any caller who
could reach the API could view sensitive install job fields (source,
local_path, error_traceback) and interfere with installation state.
All six endpoints now require AdminUserOrDefault, consistent with the
neighboring cancel and prune routes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(multiuser): close bulk download exfiltration and additional review findings
Bulk download capability token exfiltration:
- Socket events now route to user:{user_id} + admin rooms instead of the
shared 'default' room (the earlier toast race that blocked this approach
was fixed in a prior commit).
- GET /download/{name} re-requires CurrentUserOrDefault and enforces
ownership via get_owner().
- Frontend download handler replaced <a download> (which cannot carry auth
headers) with fetch() + Authorization header + programmatic blob download.
Additional fixes from reviewer tests:
- Public boards now grant write access in _assert_board_write_access and
mutation rights in _assert_image_owner (BoardVisibility.Public).
- Uncategorized image listing (GET /boards/none/image_names) now filters
to the caller's images only, preventing cross-user enumeration.
- board_images router uses board_image_records.get_board_for_image()
instead of images.get_dto() to avoid dependency on image_files service.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(multiuser): add user_id scoping to workflow SQL mutations
Defense-in-depth: the route layer already checks ownership before
calling update/delete/update_is_public/update_opened_at, but the SQL
statements did not include AND user_id = ?, so a bypass of the route
check would allow cross-user mutations.
All four methods now accept an optional user_id parameter. When
provided, the SQL WHERE clause is scoped to that user. The route layer
passes current_user.user_id for non-admin callers and None for admins.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(multiuser): allow non-owner uploads to public boards
upload_image() blocked non-owner uploads even to public boards. The
board write check now allows uploads when board_visibility is Public,
consistent with the public-board semantics in _assert_board_write_access
and _assert_image_owner.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
* feat: add configurable shift parameter for Z-Image sigma schedule
Add a shift (mu) override to the Z-Image denoise invocation and expose
it in the UI. When left blank, shift is auto-calculated from image
dimensions (existing behavior). Users can override to fine-tune the
timestep schedule, with an inline X button to reset back to auto.
* refactor: switch Z-Image sigma schedule from exponential to linear time shift
Use shift directly as a linear multiplier instead of exp(mu), giving
more predictable and uniform control over the timestep schedule.
Auto-calculated values are converted via exp(mu) to preserve identical
default behavior.
* feat: recall Z-Image shift parameter from metadata
Write z_image_shift into graph metadata and add a ZImageShift recall
handler so the shift override can be restored from previously generated
images. Auto-mode (null) is omitted from metadata to avoid persisting a
stale value.
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
openapi-typescript computes enum types from `const` usage in
discriminated unions rather than from the enum definition itself,
dropping values that only appear in some union members (e.g. "anima"
from BaseModelType). Add a post-processing step that patches generated
string enum types to match the actual OpenAPI schema definitions.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
* fix(ui): replace all hardcoded frontend strings with i18n translation keys
Remove fallback/defaultValue strings from t() calls, replace hardcoded
English text in labels, tooltips, aria-labels, placeholders and JSX content
with proper t() calls, and add ~50 missing keys to en.json. Fix incorrect
i18n key paths in CanvasObjectImage.ts and a Zoom button aria-label bug
in CanvasToolbarScale.tsx.
* chore pnpm run fix
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* feat(frontend): suppress tooltips on touch devices
* fix(frontend): change selector to role="tooltip" because .chakra-tooltip does not match
* chore(frontend): lint:prettier
* feat: add Anima model support
* schema
* image to image
* regional guidance
* loras
* last fixes
* tests
* fix attributions
* fix attributions
* refactor to use diffusers reference
* fix an additional lora type
* some adjustments to follow flux 2 paper implementation
* use t5 from model manager instead of downloading
* make lora identification more reliable
* fix: resolve lint errors in anima module
Remove unused variable, fix import ordering, inline dict() call,
and address minor lint issues across anima-related files.
* Chore Ruff format again
* fix regional guidance error
* fix(anima): validate unexpected keys after strict=False checkpoint loading
Capture the load_state_dict result and raise RuntimeError on unexpected
keys (indicating a corrupted or incompatible checkpoint), while logging
a warning for missing keys (expected for inv_freq buffers regenerated
at runtime).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(anima): make model loader submodel fields required instead of Optional
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(anima): add Classification.Prototype to LoRA loaders, fix exception types
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(anima): fix replace-all in key conversion, warn on DoRA+LoKR, unify grouping functions
- Use key.replace(old, new, 1) in _convert_kohya_unet_key and _convert_kohya_te_key to avoid replacing multiple occurrences
- Upgrade DoRA+LoKR dora_scale strip from logger.debug to logger.warning since it represents data loss
- Replace _group_kohya_keys and _group_by_layer with a single _group_keys_by_layer function parameterized by extra_suffixes, with _KOHYA_KNOWN_SUFFIXES and _PEFT_EXTRA_SUFFIXES constants
- Add test_empty_state_dict_returns_empty_model to verify empty input produces a model with no layers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(anima): add safety cap for Qwen3 sequence length to prevent OOM
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(anima): add denoising range validation, fix closure capture, add edge case tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(anima): add T5 to metadata, fix dead code, decouple scheduler type guard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(anima): update VAE field description for required field
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: regenerate frontend types after upstream merge
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: ruff format anima_denoise.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(anima): add T5 encoder metadata recall handler
The T5 encoder was added to generation metadata but had no recall
handler, so it wasn't restored when recalling from metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore(frontend): add regression test for buildAnimaGraph
Add tests for CFG gating (negative conditioning omitted when cfgScale <= 1)
and basic graph structure (model loader, text encoder, denoise nodes).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* only show 0.6b for anima
* dont show 0.6b for other models
* schema
* Anima preview 3
* fix ci
---------
Co-authored-by: Your Name <you@example.com>
Co-authored-by: kappacommit <samwolfe40@gmail.com>
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* feat: Add canvas-workflow integration feature
This commit implements a new feature that allows users to run workflows
directly from the unified canvas. Users can now:
- Access a "Run Workflow" option from the canvas layer context menu
- Select a workflow with image parameters from a modal dialog
- Customize workflow parameters (non-image fields)
- Execute the workflow with the current canvas layer as input
- Have the result automatically added back to the canvas
Key changes:
- Added canvasWorkflowIntegrationSlice for state management
- Created CanvasWorkflowIntegrationModal and related UI components
- Added context menu item to raster layers
- Integrated workflow execution with canvas image extraction
- Added modal to global modal isolator
This integration enhances the canvas by allowing users to leverage
custom workflows for advanced image processing directly within the
canvas workspace.
Implements feature request for deeper workflow-canvas integration.
* refactor(ui): simplify canvas workflow integration field rendering
- Extract WorkflowFieldRenderer component for individual field rendering
- Add WorkflowFormPreview component to handle workflow parameter display
- Remove workflow compatibility filtering - allow all workflows
- Simplify workflow selector to use flattened workflow list
- Add comprehensive field type support (String, Integer, Float, Boolean, Enum, Scheduler, Board, Model, Image, Color)
- Implement image field selection UI with radio
* feat(ui): add canvas-workflow-integration logging namespace
* feat(ui): add workflow filtering for canvas-workflow integration
- Add useFilteredWorkflows hook to filter workflows with ImageField inputs
- Add workflowHasImageField utility to check for ImageField in Form Builder
- Only show workflows that have Form Builder with at least one ImageField
- Add loading state while filtering workflows
- Improve error messages to clarify Form Builder requirement
- Update modal description to mention Form Builder and parameter adjustment
- Add fallback error message for workflows without Form Builder
* feat(ui): add persistence and migration for canvas workflow integration state
- Add _version field (v1) to canvasWorkflowIntegrationState for future migrations
- Add persistConfig with migration function to handle version upgrades
- Add persistDenylist to exclude transient state (isOpen, isProcessing, sourceEntityIdentifier)
- Use es-toolkit isPlainObject and tsafe assert for type-safe migration
- Persist selectedWorkflowId and fieldValues across sessions
* pnpm fix imports
* fix(ui): handle workflow errors in canvas staging area and improve form UX
- Clear processing state when workflow execution fails at enqueue time
or during invocation, so the modal doesn't get stuck
- Optimistically update listAllQueueItems cache on queue item status
changes so the staging area immediately exits on failure
- Clear processing state on invocation_error for canvas workflow origin
- Auto-select the only unfilled ImageField in workflow form
- Fix image field overflow and thumbnail sizing in workflow form
* feat(ui): add canvas_output node and entry-based staging area
Add a dedicated `canvas_output` backend invocation node that explicitly
marks which images go to the canvas staging area, replacing the fragile
board-based heuristic. Each `canvas_output` node produces a separate
navigable entry in the staging area, allowing workflows with multiple
outputs to be individually previewed and accepted.
Key changes:
- New `CanvasOutputInvocation` backend node (canvas.py)
- Entry-based staging area model where each output image is a separate
navigable entry with flat next/prev cycling across all items
- Frontend execute hook uses `canvas_output` type detection instead of
board field heuristic, with proper board field value translation
- Workflow filtering requires both Form Builder and canvas_output node
- Updated QueueItemPreviewMini and StagingAreaItemsList for entries
- Tests for entry-based navigation, multi-output, and race conditions
* Chore pnp run fix
* Chore eslint fix
* Remove unused useOutputImageDTO export to fix knip lint
* Update invokeai/frontend/web/src/features/controlLayers/components/CanvasWorkflowIntegration/useCanvasWorkflowIntegrationExecute.tsx
Co-authored-by: dunkeroni <dunkeroni@gmail.com>
* move UI text to en.json
* fix conflicts merge with main
* generate schema
* Chore typegen
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Co-authored-by: dunkeroni <dunkeroni@gmail.com>
Klein 9B Base (undistilled) and Klein 9B (distilled) have identical
architectures and cannot be distinguished from the state dict alone.
Use a filename heuristic ("base" in the name) to detect the Base
variant for checkpoint, GGUF, and diffusers format models.
Also fixes the incorrect guidance_embeds-based detection for diffusers
format, since both variants have guidance_embeds=False.
* feat: add support for OneTrainer BFL Flux LoRA format
Newer versions of OneTrainer export Flux LoRAs using BFL internal key
names (double_blocks, single_blocks, img_attn, etc.) with a
'transformer.' prefix and split QKV projections (qkv.0/1/2, linear1.0/1/2/3).
This format was not recognized by any existing detector.
Add detection and conversion for this format, merging split QKV and
linear1 layers into MergedLayerPatch instances for the fused BFL model.
* chore ruff
OneTrainer exports Z-Image LoRAs with 'transformer.layers.' key prefix
instead of 'diffusion_model.layers.'. Add this prefix (and the
PEFT-wrapped 'base_model.model.transformer.layers.' variant) to the
Z-Image LoRA probe so these models are correctly identified and loaded.
* Added If node
* Added stricter type checking on inputs
* feat(nodes): make if-node type checks cardinality-aware without loosening global AnyField
* chore: typegen
* Initial plan
* Warn user when credentials have expired in multiuser mode
Agent-Logs-Url: https://github.com/lstein/InvokeAI/sessions/f0947cda-b15c-475d-b7f4-2d553bdf2cd6
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Address code review: avoid multiple localStorage reads in base query
Agent-Logs-Url: https://github.com/lstein/InvokeAI/sessions/f0947cda-b15c-475d-b7f4-2d553bdf2cd6
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* bugfix(multiuser): ask user to log back in when authentication token expires
* feat: sliding window session expiry with token refresh
Backend:
- SlidingWindowTokenMiddleware refreshes JWT on each mutating request
(POST/PUT/PATCH/DELETE), returning a new token in X-Refreshed-Token
response header. GET requests don't refresh (they're often background
fetches that shouldn't reset the inactivity timer).
- CORS expose_headers updated to allow X-Refreshed-Token.
Frontend:
- dynamicBaseQuery picks up X-Refreshed-Token from responses and
updates localStorage so subsequent requests use the fresh expiry.
- 401 handler only triggers sessionExpiredLogout when a token was
actually sent (not for unauthenticated background requests).
- ProtectedRoute polls localStorage every 5s and listens for storage
events to detect token removal (e.g. manual deletion, other tabs).
Result: session expires after TOKEN_EXPIRATION_NORMAL (1 day) of
inactivity, not a fixed time after login. Any user-initiated action
resets the clock.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore(backend): ruff
* fix: address review feedback on auth token handling
Bug fixes:
- ProtectedRoute: only treat 401 errors as session expiry, not
transient 500/network errors that should not force logout
- Token refresh: use explicit remember_me claim in JWT instead of
inferring from remaining lifetime, preventing silent downgrade of
7-day tokens to 1-day when <24h remains
- TokenData: add remember_me field, set during login
Tests (6 new):
- Mutating requests (POST/PUT/DELETE) return X-Refreshed-Token
- GET requests do not return X-Refreshed-Token
- Unauthenticated requests do not return X-Refreshed-Token
- Remember-me token refreshes to 7-day duration even near expiry
- Normal token refreshes to 1-day duration
- remember_me claim preserved through refresh cycle
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore(backend): ruff
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
* feat: add bulk reidentify action for models (#8951)
Add a "Reidentify Models" bulk action to the model manager, allowing
users to re-probe multiple models at once instead of one by one.
- Backend: POST /api/v2/models/i/bulk_reidentify endpoint with partial
failure handling (returns succeeded/failed lists)
- Frontend: bulk reidentify mutation, confirmation modal with warning
about custom settings reset, toast notifications for all outcomes
- i18n: new translation keys for bulk reidentify UI strings
* fix typgen
* Fix bulk reidentify failing for models without trigger_phrases
The bulk reidentify endpoint was directly assigning trigger_phrases
without checking if the config type supports it, causing an
AttributeError for ControlNet models. Added the same hasattr guard
used by the individual reidentify endpoint. Also restored the
missing path preservation that the individual endpoint has.
* Repair partially loaded Qwen models after cancel to avoid device mismatches
* ruff
* Repair CogView4 text encoder after canceled partial loads
* Avoid MPS CI crash in repair regression test
* Fix MPS device assertion in repair test
* fix(ui): resolve models by name+base+type when recalling metadata for reinstalled models
When a model (IP Adapter, ControlNet, etc.) is deleted and reinstalled,
it gets a new UUID key. Previously, metadata recall would fail because
it only looked up models by their stored UUID key. Now the recall falls
back to searching by name+base+type, allowing reinstalled models with
the same name to be correctly resolved.
https://claude.ai/code/session_01XYubzMK363BXGTvfJJqFnX
* Add hash-based model recall fallback for reinstalled models
When a model is deleted and reinstalled, it gets a new UUID key but
retains the same BLAKE3 content hash. This adds hash as a middle
fallback stage in model resolution (key → hash → name+base+type),
making recall more robust.
Changes:
- Add /api/v2/models/get_by_hash backend endpoint (uses existing
search_by_hash from model records store)
- Add getModelConfigByHash RTK Query endpoint in frontend
- Add hash fallback to both resolveModel and parseModelIdentifier
https://claude.ai/code/session_01XYubzMK363BXGTvfJJqFnX
* Chore pnpm fix
* Chore typegen
---------
Co-authored-by: Claude <noreply@anthropic.com>
When deleting a file-based model (e.g. LoRA), the previous logic used
rmtree on the parent directory, which would delete all files in that
folder — even unrelated ones. Now only the specific model file is
removed, and the parent directory is cleaned up only if empty afterward.
* feat: add strict_password_checking config option to relax password requirements
- Add `strict_password_checking: bool = Field(default=False)` to InvokeAIAppConfig
- Add `get_password_strength()` function to password_utils.py (returns weak/moderate/strong)
- Add `strict_password_checking` field to SetupStatusResponse API endpoint
- Update users_base.py and users_default.py to accept `strict_password_checking` param
- Update auth.py router to pass config.strict_password_checking to all user service calls
- Create shared frontend utility passwordUtils.ts for password strength validation
- Update AdministratorSetup, UserProfile, UserManagement components to:
- Fetch strict_password_checking from setup status endpoint
- Show colored strength indicators (red/yellow/blue) in non-strict mode
- Allow any non-empty password in non-strict mode
- Maintain strict validation behavior when strict_password_checking=True
- Update SetupStatusResponse type in auth.ts endpoint
- Add passwordStrength and passwordHelperRelaxed translation keys to en.json
- Add tests for new get_password_strength() function
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Changes before error encountered
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(backend): docstrings
* chore(frontend): typegen
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
* fix(gallery): restore arrow-key browsing and extract shared prev/next navigation
* Added same behavior to Upscale mode and autofocus to gallery after using hotkeys Ctrl+Enter and Ctrl+Shift+Enter
* restore arrow navigation focus flow across viewer states
* fix(gallery): stabilize arrow-key browsing, remove viewer UI flicker, and optimize code
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
LoRAs trained with musubi-tuner (and potentially other trainers) that
only target transformer blocks (double_blocks/single_blocks) without
embedding layers (txt_in/vector_in/context_embedder) were incorrectly
classified as Flux 1. Add fallback detection using attention projection
hidden_size and MLP ratio from transformer block tensors
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* perf(flux2): optimize model loading order to prevent cache eviction (fixes#7513)
* Update flux2_klein_text_encoder.py
* Update flux2_klein_text_encoder.py version
---------
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
The reidentify endpoint overwrote the model's relative path with an
absolute path from the prober, and unconditionally accessed
trigger_phrases which doesn't exist on all config types (e.g. IP
Adapters), causing an AttributeError.
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Fix: Kill the server with one keyboard interrupt (#94)
* Initial plan
* Handle KeyboardInterrupt in run_app to allow single Ctrl+C shutdown
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Force os._exit(0) on KeyboardInterrupt to avoid hanging on background threads
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Fix graceful shutdown to wait for download/install worker threads (#102)
* Initial plan
* Replace os._exit(0) with ApiDependencies.shutdown() on KeyboardInterrupt
Instead of immediately force-exiting the process on CTRL+C, call
ApiDependencies.shutdown() to gracefully stop the download and install
manager services, allowing active work to complete or cancel cleanly
before the process exits.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Make stop() idempotent in download and model install services
When CTRL+C is pressed, uvicorn's graceful shutdown triggers the FastAPI
lifespan which calls ApiDependencies.shutdown(), then a KeyboardInterrupt
propagates from run_until_complete() hitting the except block which tries
to call ApiDependencies.shutdown() a second time.
Change both stop() methods to return silently (instead of raising) when
the service is not running. This handles:
- Double-shutdown: lifespan already stopped the services
- Early interrupt: services were never fully started
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Fix shutdown hang on session processor thread lock (#108)
* Initial plan
* Fix shutdown hang: wake session processor thread on stop() and mark daemon
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix: shut down asyncio executor on KeyboardInterrupt to prevent post-generation hang (#112)
Fix: cancel pending asyncio tasks before loop.close() to suppress destroyed-task warnings
Fix: suppress stack trace when dispatching events after event loop is closed on shutdown
Fix: cancel in-progress generation on stop() to prevent core dump during mid-flight Ctrl+C
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* feat(model_manager): add export/import for model settings
Add the ability to export model settings (default_settings, trigger_phrases,
cpu_only) as JSON and import them back. The model name is used as the
filename for exports.
https://claude.ai/code/session_01LXKjbRjfzcG3d3vzk3xRCh
* fix(ui): reset settings forms after import so updated values display immediately
The useForm defaultValues only apply on mount, so importing model settings
updated the backend but the forms kept showing stale values. Added useEffect
to reset forms when the underlying model config changes. Also fixed lint
errors (strict equality, missing React import).
* fix(ui): harden model settings export/import
Prevent cross-model-type import errors by filtering imported fields
against the target model's supported fields, showing clear warnings
for incompatible or partially compatible settings instead of raw
pydantic validation errors. Also fix falsy checks for empty arrays
and objects in export, disable export button when nothing to export,
add client-side validation and FileReader error handling on import.
* Chore pnpm fix
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Merged Z-Image checkpoints (e.g. models with LoRAs baked in) may bundle
text encoder weights (text_encoders.*) or other non-transformer keys
alongside the transformer weights. These cause load_state_dict() to fail
with strict=True. Instead of disabling strict mode, explicitly whitelist
valid ZImageTransformer2DModel key prefixes and discard everything else.
Also moves RAM allocation after filtering so it doesn't over-allocate
for discarded keys.
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
`get_token_permission` is deprecated and will be removed in huggingface_hub 1.0.
Use `whoami()` to validate the token instead, as recommended by the deprecation warning.
* Add comprehensive multi-user support specification and implementation plan
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Clarify Python tooling transition state
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add executive summary for multi-user support specification
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Reorganize multiuser docs into subfolder and update with approved design decisions
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* docs: fix mkdocs syntax issues
* Fix Z-Image VAE operations not reserving working memory for OOM prevention (#8)
* Initial plan
* Fix Z-Image VAE encode/decode to request working memory
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add test for Z-Image working memory estimation
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix unit test: only set config attribute for AutoencoderKL, not FluxAutoEncoder
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* style: fix ruff errors
* Fix test: use model_construct to bypass Pydantic validation for mock fields
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(ruff): fix ruff errors
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Phase 1: Add multi-user database schema and authentication foundation (#6)
* Initial plan
* Add Phase 1: Database schema, authentication utilities, and user service
- Added dependencies: passlib[bcrypt], python-jose[cryptography], email-validator
- Created migration_25 for multi-user support database schema
- users table with authentication fields
- user_sessions, user_invitations, shared_boards tables
- Added user_id columns to boards, images, workflows, session_queue, style_presets
- Created system user for backward compatibility
- Implemented authentication utilities
- Password hashing and validation (password_utils.py)
- JWT token generation and verification (token_service.py)
- Implemented user service
- Abstract base class (users_base.py)
- SQLite implementation (users_default.py)
- Full CRUD operations for users
- Authentication and admin management
- Added comprehensive tests for all components
- All code passes ruff linting and mypy type checking
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Refactor user service to use transaction pattern and improve security documentation
- Changed all database operations to use self._db.transaction() context manager
- Follows the same pattern as other services like board_records
- Improves thread safety and transaction management
- Added detailed security warnings for the hardcoded JWT secret key
- Documented that it must be moved to config before production
- Fixed whitespace linting issues
- All tests still pass with the refactored code
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix system user email to use valid domain
- Changed system user email from system@invokeai.local to system@system.invokeai
- .local domains are rejected by email-validator as special-use/reserved
- All end-to-end tests now pass successfully
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix test failures: password length, migration chain, and table existence checks
- Fixed bcrypt 72-byte password limit by truncating long passwords in hash_password and verify_password
- Fixed test_migration_25 to use callback directly instead of migrator to avoid fragmented chain error
- Updated system user email assertion in test from @invokeai.local to @system.invokeai
- Added table existence checks for boards, images, workflows, and session_queue before altering them
- This prevents "no such table" errors when migration runs in environments without all tables
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* (chore) fix ruff errors
* (chore) fix ruff errors (2)
* Fix test failures: use _conn instead of conn and improve password truncation
- Fixed test_user_service.py to use db._conn instead of db.conn (SqliteDatabase uses private _conn)
- Simplified password truncation logic in password_utils.py for better clarity
- Ensured UTF-8 decoding handles truncated bytes properly with errors="ignore"
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(uv): updated uv lockfile
* Fix bcrypt password length errors by configuring passlib properly
- Added bcrypt__relax_truncate_checks=True to CryptContext to prevent errors on long passwords
- Removed min_length=8 constraint from pydantic models to allow service-level validation
- Service-level validation provides better error messages and more control
- Manual truncation code kept as safety net for passwords >72 bytes
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix bcrypt parameter name: use truncate_error instead of relax_truncate_checks
- Changed bcrypt__relax_truncate_checks=True to bcrypt__truncate_error=False
- The correct passlib parameter is truncate_error (not relax_truncate_checks)
- Setting it to False allows passwords >72 bytes without raising an error
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* fix(passwords) downgrade bcrypt to work with current passlib
* chore(uv): update locks file
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Implement Phase 2: Authentication Service with JWT and FastAPI Integration (#11)
* Initial plan
* Add Phase 2 authentication service - auth dependencies, router, and service integration
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Address code review feedback - add token expiration constants and improve documentation
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix email validation to allow special-use domains like .local for testing
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix system user to not be admin - allows /auth/setup to work on fresh database
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore: typegen
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Add Phase 3 integration tests and documentation for authentication middleware (#15)
* Initial plan
* Add Phase 3 integration tests for auth router endpoints
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add Phase 3 completion documentation and testing guide
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore: ruff
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* chore: fix unresolved merged conflict markers
* Phase 4: Add multi-tenancy to boards and session queue services (#17)
* Initial plan
* Update boards service for multi-user support
- Add user_id parameter to board creation and filtering
- Update board_records base and SQLite implementation to filter by user
- Include shared boards and public boards in user queries
- Update boards router to require authentication
- Update invocation context to pass user_id from queue item
- Add user_id field to SessionQueueItem model
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Update session_queue service for multi-user support
- Add user_id parameter to enqueue_batch method
- Update ValueToInsertTuple to include user_id
- Update prepare_values_to_insert to accept and use user_id
- Update SQL INSERT statements to include user_id column
- Update session_queue router to require authentication
- Update retry_items_by_id to preserve user_id when retrying
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add authentication imports to images, workflows, and style_presets routers
- Import CurrentUser dependency in all three routers
- Add CurrentUser parameter to upload_image endpoint
- Prepare for further authentication requirements in these services
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add multiuser tests and Phase 4 documentation
- Create test_boards_multiuser.py with authentication tests
- Verify board operations require authentication
- Test board creation and listing with auth tokens
- Add comprehensive Phase 4 implementation summary
- Document all changes, data flows, and security considerations
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add authentication to remaining board endpoints
- Add CurrentUser to get_board endpoint
- Add CurrentUser to update_board endpoint
- Add CurrentUser to delete_board endpoint
- Ensures all board operations require authentication
- Addresses code review feedback
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Feature(image boards): Implement per-user board isolation
- Complete verification report with all checks passed
- Document code quality, security, and testing results
- List all achievements and sign-off criteria
- Mark phase as READY FOR MERGE
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore: ruff
* chore: resolve conflicts in z_image_working_memory test
* chore: ruff
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* feat: Implement Phase 5 of multiuser plan - Frontend authentication (#19)
* Phase 5: Implement frontend authentication infrastructure
- Created auth slice with Redux state management for authentication
- Created auth API endpoints (login, logout, setup, me)
- Created LoginPage component for user authentication
- Created AdministratorSetup component for initial admin setup
- Created ProtectedRoute wrapper for route authentication checking
- Updated API configuration to include Authorization headers
- Installed and configured react-router-dom for routing
- Updated App component with authentication routes
- All TypeScript checks passing
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(style): prettier, typegen and add convenience targets to makefile
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* feat: Implement Phase 6 frontend UI updates - UserMenu and admin restrictions
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
docs: Add comprehensive testing and verification documentation for Phase 6
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
docs: Add Phase 6 summary document
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* feat: Add user management script for testing multiuser features
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* feat: Implement read-only model manager access for non-admin users
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
feat: Add admin authorization to model management API endpoints
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
docs: Update specification and implementation plan for read-only model manager
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Phase 7: Comprehensive testing and security validation for multiuser authentication (#23)
* Initial plan
* Phase 7: Complete test suite with 88 comprehensive tests
- Add password utils tests (31 tests): hashing, verification, validation
- Add token service tests (20 tests): JWT creation, verification, security
- Add security tests (13 tests): SQL injection, XSS, auth bypass prevention
- Add data isolation tests (11 tests): multi-user data separation
- Add performance tests (13 tests): benchmarks and scalability
- Add comprehensive testing documentation
- Add phase 7 verification report
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* bugfix(backend): Fix issues with authentication token expiration handling
- Remove time.sleep from token uniqueness test (use different expiration instead)
- Increase token expiration test time from 1 microsecond to 10 milliseconds
- More reliable test timing to prevent flakiness
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Add Phase 7 summary documentation
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Fix test_performance.py missing logger fixture
Add missing logger fixture to test_performance.py that was causing test failures.
The fixture creates a Logger instance needed by the user_service fixture.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Add board isolation issue specification document
Document the three board isolation issues that need to be addressed:
1. Board list not updating when switching users
2. "Uncategorized" board shared among users
3. Admin cannot access all users' boards
Includes technical details, implementation plan, and acceptance criteria.
This document will be used to create a separate GitHub issue and PR.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Fix test failures in Phase 7 tests
- Fix board service API calls to use enum values (BoardRecordOrderBy, SQLiteDirection) instead of strings
- Fix board deletion test to use get_dto() instead of non-existent get() method
- Add exception handling to verify_password() for invalid hash formats
- Update SQL injection test to accept both 401 and 422 status codes (Pydantic validation)
All fixes ensure tests match actual API signatures and handle edge cases properly.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Fix token forgery test to properly decode and modify JWT payload
The test was attempting to modify the JWT payload by string replacement on the
base64-encoded data, which doesn't work since "false" doesn't appear literally
in the base64 encoding. Fixed to:
- Properly decode the base64 payload
- Parse the JSON
- Modify the is_admin field
- Re-encode the payload
- Create a forged token with the modified payload and original signature
- Verify it's rejected with 401 status
This properly tests that JWT signature verification prevents token forgery.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Fix test_verify_expired_token timing issue
Increased token expiration from 10ms to 100ms and wait time from 20ms to 200ms
to make the test more reliable. The previous timing was too tight and could
cause race conditions where the token hadn't actually expired yet when verified.
This ensures the test reliably validates that expired tokens are properly rejected.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Fix token verification to properly validate signatures and expiration
The verify_token() function was not explicitly enabling JWT verification options,
causing it to accept expired tokens and tokens with modified signatures/payloads.
Fixed by explicitly setting verification options:
- verify_signature: True (reject modified tokens)
- verify_exp: True (reject expired tokens)
- verify_nbf: True (verify "not before" claim)
- verify_iat: True (verify "issued at" claim)
- verify_aud: False (we don't use audience claims)
Also added catch-all exception handler for other validation errors.
This ensures all token security tests pass properly.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Revert to default python-jose verification behavior
Removed explicit options parameter from jwt.decode() call since python-jose
verifies signatures and expiration by default. The explicit options may have
been causing issues or conflicts with the library's default behavior.
Added comments clarifying that python-jose defaults are secure and handle
verification properly. Need user to test and provide debugging output if
issues persist.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Fix JWT expiration verification for python-jose 3.5.0
python-jose 3.5.0 has a bug where expiration verification doesn't work properly
by default. The jwt.decode() function is not rejecting expired tokens even when
they should be rejected.
Workaround implemented:
1. First, get unverified claims to extract the 'exp' timestamp
2. Manually check if current time >= exp time (token is expired)
3. Return None immediately if expired
4. Then verify signature with jwt.decode() for tokens that aren't expired
This ensures:
- Expired tokens are properly rejected
- Signature verification still happens for non-expired tokens
- Modified tokens are rejected due to signature mismatch
All three failing tests should now pass:
- test_verify_expired_token
- test_verify_token_with_modified_payload
- test_token_signature_verification
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix race condition in token verification - verify signature before expiration
Changed the order of verification in verify_token():
1. First verify signature with jwt.decode() - rejects modified/forged tokens
2. Then manually check expiration timestamp
Previous implementation checked expiration first using get_unverified_claims(),
which could cause a race condition where:
- Token with valid payload but INVALID signature would pass expiration check
- If expiration check happened to return None due to timing, signature was never verified
- Modified tokens could be accepted intermittently
New implementation ensures signature is ALWAYS verified first, preventing any
modified tokens from being accepted, while still working around the python-jose
3.5.0 expiration bug by manually checking expiration after signature verification.
This eliminates the non-deterministic test failures in test_verify_token_with_modified_payload.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(app): ruff
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Backend: Add admin board filtering and uncategorized board isolation
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix intermittent token service test failures caused by Base64 padding (#32)
* Initial plan
* Fix intermittent token service test failures due to Base64 padding
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Address code review: add constants for magic numbers in tests
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(tests): ruff
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Implement user isolation for session queue and socket events (WIP - debugging queue visibility) (#30)
* Add user isolation for queue events and field values filtering
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add user column to queue list UI
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add field values privacy indicator and implementation documentation
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Allow all users to see queue item status events while keeping invocation events private
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(backend): ruff
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Fix Queue tab not updating for other users in real-time (#34)
* Initial plan
* Add SessionQueueItemIdList invalidation to queue socket events
This ensures the queue item list updates in real-time for all users when
queue events occur (status changes, batch enqueued, queue cleared).
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add SessionQueueItemIdList invalidation to queue_items_retried event
Ensures queue list updates when items are retried.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Improve queue_items_retried event and mutation invalidation
- Add individual item invalidation to queue_items_retried event handler
- Add SessionQueueStatus and BatchStatus tags to retryItemsById mutation
- Ensure consistency between event handler and mutation invalidation patterns
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add privacy check for batch field values in Queue tab
Displays "Hidden for privacy" message for non-admin users viewing
queue items they don't own, instead of showing the actual field values.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* i18n(frontend): change wording of queue values suppressed message
* Add SessionQueueItemIdList cache invalidation to queue events
Ensures real-time queue updates for all users by invalidating the
SessionQueueItemIdList cache tag when queue events occur.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Fix multiuser information leakage in Queue panel detail view (#38)
* Initial plan
* Implement multiuser queue information leakage fix
- Backend: Update sanitize_queue_item_for_user to clear session graph and workflow
- Frontend: Add permission check to disable detail view for unauthorized users
- Add test for sanitization logic
- Add translation key for permission denied message
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix prettier formatting for QueueItemComponent
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Address code review feedback
- Move Graph and GraphExecutionState imports to top of file
- Remove dependency on test_nodes in sanitization test
- Create minimal test invocation directly in test file
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Address additional code review feedback
- Create shallow copy to avoid mutating original queue_item
- Extract 'system' user_id to constant (SYSTEM_USER_ID)
- Add constant to both backend and frontend for consistency
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix pydantic validation error in test fixture
Add required timestamp fields (created_at, updated_at, started_at, completed_at) to SessionQueueItem in test fixture
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* fix(queue): Enforce user permissions for queue operations in multiuser mode (#36)
* Initial plan
* Add backend authorization checks for queue operations
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix linting issues in authorization changes
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add frontend authorization checks for queue operations
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add access denied messages for cancel and clear operations
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix access denied messages for all cancel/delete operations
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix merge conflict duplicates in QueueItemComponent
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(frontend): typegen
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* fix(multiuser): Isolate client state per user to prevent data leakage (#40)
* Implement per-user client state storage to fix multiuser leakage
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix: Make authentication optional for client_state endpoints to support single-user mode
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Clear params state on logout/login to prevent user data leakage
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* feat(queue): show user/total pending jobs in multiuser mode badge (#43)
* Initial plan
* Add multiuser queue badge support - show X/Y format in multiuser mode
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Format openapi.json with Prettier
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Address code review feedback - optimize DB queries and improve code clarity
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* translationBot(ui): update translation files (#8767)
Updated by "Cleanup translation files" hook in Weblate.
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
* Limit automated issue closure to bug issues only (#8776)
* Initial plan
* Add only-labels parameter to limit automated issue closure to bugs only
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* fix(multiuser): Isolate client state per user to prevent data leakage (#40)
* Implement per-user client state storage to fix multiuser leakage
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix: Make authentication optional for client_state endpoints to support single-user mode
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Clear params state on logout/login to prevent user data leakage
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Initial plan
* chore(backend) ruff & typegen
* Fix real-time badge updates by invalidating SessionQueueStatus on queue events
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Weblate (bot) <hosted@weblate.org>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Convert session queue isolation logs from info to debug level
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add JWT secret storage in database and app_settings service
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add multiuser configuration option with default false
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Update token service tests to initialize JWT secret
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix app_settings_service to use proper database transaction pattern
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(backend): typegen and ruff
* chore(docs): update docstrings
* Fix frontend to bypass authentication in single-user mode
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix auth tests to enable multiuser mode
Auth tests were failing because the login and setup endpoints now return 403 when multiuser mode is disabled (the default). Updated test fixtures to enable multiuser mode for all auth-related tests.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix model manager UI visibility in single-user mode
Model manager UI for adding, deleting and modifying models is now:
- Visible in single-user mode (multiuser: false, the default)
- Hidden in multiuser mode for non-admin users
- Visible in multiuser mode for admin users
Created useIsModelManagerEnabled hook that checks multiuser_enabled status
and returns true when multiuser is disabled OR when user is admin.
Updated all model manager components to use this hook instead of direct
is_admin checks.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(backend): ruff
* chore(frontend): typegen
* Fix TypeScript lint errors
- Added multiuser_enabled field to SetupStatusResponse type in auth.ts
- Removed unused user variable reference in MainModelDefaultSettings.tsx
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix test_data_isolation to enable multiuser mode
Added fixture to enable multiuser mode for data isolation tests, similar to other auth tests.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Redirect login and setup pages to app in single-user mode
When multiuser mode is disabled, the LoginPage and AdministratorSetup components now redirect to /app instead of showing the login/setup forms. This prevents users from being stuck on the login page after browser refresh in single-user mode.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix test_auth.py to initialize JWT secret
Added setup_jwt_secret fixture to test_auth.py to initialize the JWT secret before running auth tests. This fixture was missing, causing token creation/verification to fail in auth router tests.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Prevent login form flash in single-user mode
Show loading spinner instead of login/setup forms when multiuser mode is disabled or when redirecting is about to happen. This prevents the unattractive flash of the login dialog when refreshing the page in single-user mode.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix board and queue operations in single-user mode
Changed boards, session_queue, and images routers to use CurrentUserOrDefault instead of CurrentUser. This allows these endpoints to work without authentication when multiuser mode is disabled (default), fixing the issue where users couldn't create boards or add jobs to the queue in single-user mode.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add user management utilities and rename add_user.py
Created three user management scripts in the scripts/ directory:
- useradd.py (renamed from add_user.py) - add users with admin privileges
- userdel.py - delete users by email address with confirmation
- usermod.py - modify user details (name, password, admin status)
All scripts support both CLI and interactive modes for flexibility.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix ESLint errors in frontend code
- Fixed brace-style issue in App.tsx (else-if on same line)
- Removed unused useAppSelector imports from model manager components
- Fixed import sorting in ControlAdapterModelDefaultSettings.tsx
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add userlist.py script for viewing database users
Created scripts/userlist.py to display all users in the database. Supports:
- Table format (default): Shows ID, email, display name, admin status, and active status
- JSON format (--json flag): Outputs user data as JSON for scripting/automation
Example usage:
python scripts/userlist.py # Table view
python scripts/userlist.py --json # JSON output
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix test_boards_multiuser.py test failures
Fixed test failures caused by ApiDependencies.invoker not being set properly:
- Added setup_jwt_secret fixture to initialize JWT secret for token generation
- Added enable_multiuser_for_tests fixture that sets ApiDependencies.invoker as a class attribute
- Updated tests to use enable_multiuser_for_tests fixture to ensure ApiDependencies is properly configured
- Removed MockApiDependencies class approach in favor of directly setting the class attribute
This fixes the AttributeError and ensures all tests have the proper setup.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(backend): ruff
* Fix userlist.py SqliteDatabase initialization
Fixed AttributeError in userlist.py where SqliteDatabase was being passed the config object instead of config.db_path. The constructor expects a Path object (db_path) as the first argument, not the entire config object.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix test_boards_multiuser.py by adding app_settings service to mock
Added AppSettingsService initialization to the mock_services fixture in tests/conftest.py. The test was failing because setup_jwt_secret fixture expected mock_invoker.services.app_settings to exist, but it wasn't being initialized in the mock services.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* bugfix(scripts): fix crash in userlist.py script
* Fix test_boards_multiuser.py JWT secret initialization
Fixed setup_jwt_secret fixture to call set_jwt_secret() directly instead of trying to access non-existent app_settings service. Removed incorrect app_settings parameter from InvocationServices initialization in tests/conftest.py since app_settings is not an attribute of InvocationServices.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix CurrentUserOrDefault to require auth in multiuser mode
Changed get_current_user_or_default to raise HTTP 401 when multiuser mode is enabled and credentials are missing, invalid, or the user is inactive. This ensures that board/queue/image operations require authentication in multiuser mode while still working without authentication in single-user mode (default).
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(front & backend): ruff and lint
* Add AdminUserOrDefault and fix model settings in single-user mode
Created AdminUserOrDefault dependency that allows admin operations to work without authentication in single-user mode while requiring admin privileges in multiuser mode. Updated model_manager router to use AdminUserOrDefault for update_model_record, update_model_image, and reidentify_model endpoints. This fixes the "Missing authentication credentials" error when saving model default settings in single-user mode.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix model manager operations in single-user mode
Changed all model manager endpoints from AdminUser to AdminUserOrDefault to allow model installation, deletion, conversion, and cache management operations to work without authentication in single-user mode. This fixes the issue where users couldn't add or delete models in single-user mode.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix syntax error in model_manager.py
Added Depends(AdminUserOrDefault) to all AdminUserOrDefault dependency parameters to fix Python syntax error where parameters without defaults were following parameters with defaults. Imported Depends from fastapi.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix FastAPI dependency injection syntax error
Removed type annotations from AdminUserOrDefault dependency parameters. FastAPI doesn't allow both Annotated type hints and = Depends() default values together. Changed from `_: AdminUserOrDefault = Depends(AdminUserOrDefault)` to `_ = Depends(AdminUserOrDefault)` throughout model_manager.py.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix delete_model endpoint parameter annotation
Changed delete_model endpoint to use Annotated[str, Path(...)] instead of str = Path(...) to match FastAPI's preferred syntax and fix the 422 Unprocessable Entity error when deleting models in single-user mode.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix parameter annotations for all AdminUserOrDefault endpoints
Changed all endpoints using AdminUserOrDefault from old syntax (key: str = Path(...)) to FastAPI's preferred Annotated syntax (key: Annotated[str, Path(...)]). This fixes 422 Unprocessable Entity errors when updating model settings and deleting models in single-user mode. Updated endpoints: delete_model_image, install_model, install_hugging_face_model, and convert_model.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Revert to correct AdminUserOrDefault usage pattern
Reverted model_manager.py to state before commit c47af8f and reapplied AdminUserOrDefault correctly. Changed from `_: AdminUser` to `current_admin: AdminUserOrDefault` using the same pattern as boards.py (`current_user: CurrentUserOrDefault`). This fixes all 422 errors in single-user mode while maintaining proper admin authentication in multiuser mode.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix parameter order for AdminUserOrDefault in model manager
Moved current_admin: AdminUserOrDefault parameter before all parameters with default values in model_manager.py endpoints. Python requires parameters without defaults to come before parameters with defaults. Fixed 8 endpoints: delete_model, bulk_delete_models, delete_model_image, install_model, install_hugging_face_model, cancel_model_install_job, convert_model, and do_hf_login.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(frontend): typegen
* chore(frontend): typegen again
* Docs(app): Comprehensive Documentation of Multiuser Features (#50)
* Implement Phase 8: Complete multiuser documentation (user, admin, and API guides)
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Update multiuser documentation for single-user mode and CLI scripts
- Document multiuser config option (true/false/absent)
- Explain single-user mode behavior (no login required)
- Document mode switching and legacy "system" user
- Update user management to reference CLI scripts (useradd, userdel, usermod, userlist)
- Note that web UI for user management is coming in future release
- Add adaptive API client example for both modes
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* docs(multiuser): bring user guide documentation up to date
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* docs(app): update multiuser documentation
* bugfix(app): fix misaligned database migration calls
* chore(tests): update migration test to accommodate resequencing of migrations
* fix(frontend): prevent caching of static pages
* chore(backend): ruff
* fix(backend): fix incorrect migration import
* Fix: Admin users can see image previews from other users' generations (#61)
* Initial plan
* Fix: strip image preview from InvocationProgressEvent sent to admin room
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore: ruff
* fix(backend): add migration_29 file
* chore(tests): fix migration_29 test
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* fix(queue): System user queue items show blank instead of `<hidden>` for non-admin users (#63)
* Initial plan
* fix(queue): System user queue items show blank instead of `<hidden>` for non-admin users
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(backend): ruff
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Hide "Use Cache" checkbox in node editor for non-admin users in multiuser mode (#65)
* Initial plan
* Hide use cache checkbox for non-admin users in multiuser mode
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix node loading hang when invoke URL ends with /app (#67)
* Initial plan
* Fix node loading hang when URL ends with /app
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Move user management scripts to installable module with CLI entry points (#69)
* Initial plan
* Add user management module with invoke-useradd/userdel/userlist/usermod entry points
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* chore(util): remove superceded user administration scripts
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* chore(backend): reorganized migrations, but something still broken
* Fix migration 28 crash when `client_state.data` column is absent (#70)
* Initial plan
* Fix migration 28 to handle missing data column in client_state table
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Consolidate multiuser DB migrations 27–29 into a single migration step (#71)
* Initial plan
* Consolidate migrations 27, 28, and 29 into a single migration step
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add `--root` option to user management CLI utilities (#81)
* Initial plan
* Add --root option to user management CLI utilities
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix queue clear() endpoint to respect user_id for multi-tenancy (#75)
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Add tests for session queue clear() user_id scoping
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
chore(frontend): rebuild typegen
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
* fix: use AdminUserOrDefault for pause and resume queue endpoints (#77)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* fix: queue pause/resume buttons disabled in single-user mode (#83)
In single-user mode, currentUser is never populated (no auth), so
`currentUser?.is_admin ?? false` always returns false, disabling the buttons.
Follow the same pattern as useIsModelManagerEnabled: treat as admin
when multiuser mode is disabled, and check is_admin flag when enabled.
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* fix: enforce board ownership checks in multiuser mode (#84)
- get_board: verify current user owns the board (or is admin), return 403 otherwise
- update_board: verify ownership before updating, 404 if not found, 403 if unauthorized
- delete_board: verify ownership before deleting, 404 if not found, 403 if unauthorized
- list_all_board_image_names: add CurrentUserOrDefault auth and ownership check for non-'none' board IDs
test: add ownership enforcement tests for board endpoints in multiuser mode
- Auth requirement tests for get, update, delete, and list_image_names
- Cross-user 403 forbidden tests (non-owner cannot access/modify/delete)
- Admin bypass tests (admin can access/update/delete any user's board)
- Board listing isolation test (users only see their own boards)
- Refactored fixtures to use monkeypatch (consistent with other test files)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix: Clear auth state when switching from multiuser to single-user mode (#86)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix race conditions in download queue and model install service (#98)
* Initial plan
* Fix race conditions in download queue and model install service
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Weblate (bot) <hosted@weblate.org>
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
* Add FLUX.2 LOKR model support (detection and loading) (#88)
Fix BFL LOKR models being misidentified as AIToolkit format
Fix alpha key warning in LOKR QKV split layers
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix BFL→diffusers key mapping for non-block layers in FLUX.2 LoRA/LoKR
BFL's FLUX.2 model uses different names than diffusers' Flux2Transformer2DModel
for top-level modules (embedders, modulations, output layers). The existing
conversion only handled block-level renames (double_blocks→transformer_blocks),
causing "Failed to find module" warnings for non-block LoRA keys like img_in,
txt_in, modulation.lin, time_in, and final_layer.
---------
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
* WIP: Add FLUX.2 Klein LoRA support (BFL PEFT format)
Initial implementation for loading and applying LoRA models trained
with BFL's PEFT format for FLUX.2 Klein transformers.
Changes:
- Add LoRA_Diffusers_Flux2_Config and LoRA_LyCORIS_Flux2_Config
- Add BflPeft format to FluxLoRAFormat taxonomy
- Add flux_bfl_peft_lora_conversion_utils for weight conversion
- Add Flux2KleinLoraLoaderInvocation node
Status: Work in progress - not yet fully tested
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat(flux2): add LoRA support for FLUX.2 Klein models
Add BFL PEFT LoRA support for FLUX.2 Klein, including runtime conversion
of BFL-format keys to diffusers format with fused QKV splitting, improved
detection of Klein 4B LoRAs via MLP ratio check, and frontend graph wiring.
* feat(flux2): detect Klein LoRA variant (4B/9B) and filter by compatibility
Auto-detect FLUX.2 Klein LoRA variant from tensor dimensions during model
probe, warn on variant mismatch at load time, and filter the LoRA picker
to only show variant-compatible LoRAs.
* Chore Ruff
* Chore pnpm
* Fix detection and loading of 3 unrecognized Flux.2 Klein LoRA formats
Three Flux.2 Klein LoRAs were either unrecognized or misclassified due to
format detection gaps:
1. PEFT-wrapped BFL format (base_model.model.* prefix) was not recognized
because the detector only accepted the diffusion_model.* prefix.
2. Klein 4B LoRAs with hidden_size=3072 were misidentified as Flux.1 due to
a break statement exiting the detection loop before txt_in/vector_in
dimensions could be checked.
3. Flux2 native diffusers format (to_qkv_mlp_proj, ff.linear_in) was not
detected because the detector only checked for Flux.1 diffusers keys.
Also handles mixed PEFT/standard LoRA suffix formats within the same file.
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Fix bare except clauses and mutable default arguments
Replace bare `except:` with `except Exception:` in sqlite_database.py
and mlsd/utils.py to avoid catching KeyboardInterrupt and SystemExit,
which can prevent graceful shutdowns and mask critical errors (PEP 8
E722).
Replace mutable default arguments (lists) with None in
imwatermark/vendor.py to prevent shared state between calls, which
is a known Python gotcha that can cause subtle bugs when default
mutable objects are modified in place.
* add tests for mutable defaults and bare except fixes
* Simplify exception propagation tests
* Remove unused db initialization in error propagation tests
Removed unused database initialization in tests for KeyboardInterrupt and SystemExit.
---------
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Initial mashup of mentioned feature. Still need to resolve some quirks and kinks.
* Clean text tool integration
* Fixed text tool opions bar jumping and added more fonts
* Touch up for cursor styling
* Minor addition to doc file
* Appeasing frontend checks
* Prettier fix
* knip fixes
* Added safe zones to font selection and color picker to be clickable without commiting text.
* Removed color probing on cursor and added dynamic font display for fallback, minor tweaks
* Finally fixed the text shifting on commit
* Cursor now represent actual input field size. Tidy up options UI
* Some strikethrough and underline line tweaks
* Replaced the focus retry loop with a callback‑ref based approach in in CanvasTextOverlay.tsx
Renamed containerMetrics to textContainerData in CanvasTextOverlay.tsx
Fixed mouse cursor disapearing during typing.
* Added missing localistaion string
* Moved canvas-text-tool.md to docs/contributing/frontend
* ui: Improve functionality of the text toolbar
Few things done with this commit.
- The varying size of the font selector box has been fixed. The UI no longer shifts and moves with font change.
- We no longer format the font size input to add px each time. Instead now just have a permanent px indicator.
- The bug with the random text inputs on the slider value has also been fixed.
- The font size value is only committed on blur keeping it consistent with other editing apps.
- Fixed the spacing of the toolbar to make it look cleaner.
- Font size now permits increments of 1.
* Added autoselect text in font size on click allowing immediate imput
* Improvement: Added uncommited layer state with CTRL-move and options to select line spacing.
* Added rotation handle to rotate uncommiitted text layer.
* Fix: Redirect user facing labels to use localization file + Add tool discription to docs
* Fixed box padding. Disable tool swich when text input is active, added message on canvas for better UX.
* Updated Text tool description
* Updated Text tool description
* Typo
* Add draggable text-box border with improved cursor feedback and larger hit targets. Supress hotkeys on uncommitted text.
* Lint
* Fix(bug): text commit to link uploaded image assets instead of embedding full base64
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Co-authored-by: blessedcoolant <54517381+blessedcoolant@users.noreply.github.com>
* feat(z-image): add Z-Image Base (undistilled) model variant support
- Add ZImageVariantType enum with 'turbo' and 'zbase' variants
- Auto-detect variant on import via scheduler_config.json shift value (3.0=turbo, 6.0=zbase)
- Add database migration to populate variant field for existing Z-Image models
- Re-add LCM scheduler with variant-aware filtering (LCM hidden for zbase)
- Auto-reset scheduler to Euler when switching to zbase model if LCM selected
- Update frontend to show/hide LCM option based on model variant
- Add toast notification when scheduler is auto-reset
Z-Image Base models are undistilled and require more steps (28-50) with higher
guidance (3.0-5.0), while Z-Image Turbo is distilled for ~8 steps with CFG 1.0.
LCM scheduler only works with distilled (Turbo) models.
* Chore ruff format
* Chore fix windows path
* feat(z-image): filter LoRAs by variant compatibility and warn on mismatch
LoRA picker now hides Z-Image LoRAs with incompatible variants (e.g. ZBase
LoRAs when using Turbo model). LoRAs without a variant are always shown.
Backend loaders warn at runtime if a LoRA variant doesn't match the
transformer variant.
* Chore typegen
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
The FLUX.2 Klein transformer operates in BN-normalized latent space,
but init_latents from VAE encode were not being normalized before
being passed to the InpaintExtension. This caused a scale mismatch
when merging intermediate_latents (normalized) with noised_init_latents
(unnormalized), resulting in visible artifacts at mask blur boundaries.
Now normalize:
- init_latents_packed before passing to InpaintExtension
- noise_packed for correct interpolation in normalized space
- x (starting latents) for img2img/inpainting workflows
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
* feat(canvas): add raster layer blend modes and boolean operations submenu; support per-layer globalCompositeOperation in compositor; UI to toggle and select color blend modes (multiply, screen, darken, lighten, color-dodge, color-burn, hard-light, soft-light, difference, hue, saturation, color, luminosity).
* feat(canvas): boolean ops submenu and UI polish
* (chore): prettier lint
* add icons to boolean submenu items
* add delete button for color blend operations
* move composite operation type and imports
* chore: pnpm eslint
* update blend modes order
* update default blend mode to 'color'
* add i18n for blend modes
* actually use translations for blend modes now
* move composite options into types.ts
* cleanup and comments
* update names
* move constant mapping out of function
* feat(ui): Refactor Blend Mode Implementation
- Blend Modes are not right click menu options anymore. Instead they rest above the layer panel as they do in other art programs readily available for each layer.
- Blend Modes have been resorted to match the listings of other art programs so users can avail their muscle memory.
- Blend Mode now defaults to `Normal` for each layer as it should.
- The extra layer operations have now been moved down to the `Operations Bar` at the bottom of the layer stack. This is to increase familiarity again with other art programs and also to make space for us in the top action bar.
- The Operations Bars operations have been resorted in order of usage that makes sense.
* fix: use source-over instead of normal
* fix: pixel fix for slightly offset action bar labels.
* feat(canvas): boolean raster merge creates new layer and disables sources
* (fix) lint errors
* remove extra typecast
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Co-authored-by: blessedcoolant <54517381+blessedcoolant@users.noreply.github.com>
* Add script and UI to remove orphaned model files
- This commit adds command-line and Web GUI functionality for
identifying and optionally removing models in the models directory
that are not referenced in the database.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Add backend service and API routes for orphaned models sync
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Add expandable file list to orphaned models dialog
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* Fix cache invalidation after deleting orphaned models
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* (bugfix) improve status messages
* docs(backend): add info on the orphaned model detection/removal feature
* Update docs/features/orphaned_model_removal.md
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: dunkeroni <dunkeroni@gmail.com>
* fix(flux2): Fix image quality degradation at resolutions > 1024x1024
This commit addresses severe quality degradation and artifacts when
generating images larger than 1024x1024 with FLUX.2 Klein models.
Root causes fixed:
1. Dynamic max_image_seq_len in scheduler (flux2_denoise.py)
- Previously hardcoded to 4096 (1024x1024 only)
- Now dynamically calculated based on actual resolution
- Allows proper schedule shifting at all resolutions
2. Smoothed mu calculation discontinuity (sampling_utils.py)
- Eliminated 40-50% mu value drop at seq_len 4300 threshold
- Implemented smooth cosine interpolation (4096-4500 transition zone)
- Gradual blend between low-res and high-res formulas
Impact:
- FLUX.2 Klein 9B: Major quality improvement at high resolutions
- FLUX.2 Klein 4B: Improved quality at high resolutions
- Baseline 1024x1024: Unchanged (no regression)
- All generation modes: T2I and Kontext (reference images)
Fixes: Community-reported quality degradation issue
See: Discord discussions in #garbage-bin and #devchat
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix(flux2): Fix high-resolution quality degradation for FLUX.2 Klein
Fixes grid/diamond artifacts and color loss at resolutions > 1024x1024.
Root causes identified and fixed:
- BN normalization was incorrectly applied to random noise input
(diffusers only normalizes image latents from VAE.encode)
- BN denormalization must be applied to output before VAE decode
- mu parameter was resolution-dependent causing over-shifted schedules
at high resolutions (now fixed to 2.02, matching ComfyUI)
Changes:
- Remove BN normalization on noise input (not needed for N(0,1) noise)
- Preserve BN denormalization on denoised output (required for VAE)
- Fix mu to constant 2.02 for all resolutions (matches ComfyUI)
Tested at 2048x2048 with FLUX.2 Klein 4B
* Chore Ruff
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
The ParamFluxDypePreset component was rendered twice in the FLUX
generation settings accordion, causing the DyPE dropdown to appear
both after the scheduler and after the guidance slider.
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
setting with hardcoded full denoising (start=0, end=1) in addOutpaint.
This caused denoising strength to be completely ignored whenever the
canvas bbox extended beyond the raster layer content, triggering outpaint
mode. The issue affected all model types (SDXL, SD1.5, FLUX, etc.).
Restore the original behavior by reading denoising_start/end from the
user's img2imgStrength setting via getDenoisingStartAndEnd().
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
When recalling an image that lacks `z_image_seed_variance_enabled` metadata
(e.g. older images), the toggle now defaults to off instead of retaining the
previous state.
* Switched to use v5.x gallery pagination design.
* Improved pagination UX and gallery grid calculation
* Minor bug fix
* Formatting...
* Fixed Jump to page input behavior and "Locate in gallery" logic.
* Changed Jump input field to select text on click for better UX.
Use useFlux1VAEModels() instead of useFluxVAEModels() in the FLUX VAE
selector, which was incorrectly returning both FLUX.1 and FLUX.2 VAEs.
Remove the now-unused useFluxVAEModels hook.
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* fix(flux2): support Heun scheduler for FLUX.2 Klein models
FlowMatchHeunDiscreteScheduler does not support dynamic shifting parameters
(use_dynamic_shifting, base_shift, max_shift, etc.) or sigmas/mu in set_timesteps.
This caused FLUX.2 Klein to fail when using Heun scheduler.
- Create Heun scheduler with only num_train_timesteps and shift parameters
- Use num_inference_steps instead of sigmas for Heun's set_timesteps call
- Euler and LCM schedulers continue to use full dynamic shifting support
* fix(flux2): fix Heun scheduler detection using inspect.signature
The previous hasattr check for state_in_first_order failed because
the attribute doesn't exist before set_timesteps() is called. Now
using inspect.signature to check for sigmas parameter support,
matching the FLUX1 implementation.
---------
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
* Implemented ordering for expanded iterators
* Update test_graph_execution_state.py
Added a test for nested iterator execution ordering. (Failing at commit time!)
* Filter invalid nested-iterator parent mappings in _prepare()
When a graph has nested iterators, some "ready to run" node combinations do not actually belong together. Previously, the scheduler would still try to build nodes for those mismatched combinations, which could cause the same work to run more than once. This change skips any combination that is missing a valid iterator parent, so nested iterator expansions run once per intended item.
* Fixed Collect node ordering
* ruff
* Removed ordering guarantees from test_node_graph.py
* Fix iterator prep and type compatibility in graph execution
Include iterator nodes in nx_graph_flat so iterators are prepared/expanded correctly. Fix connection type checks to allow subclass-to-base via issubclass. Harden iterator/collector validation to fail cleanly instead of crashing on missing edges. Remove unused nx_graph_with_data(). Added tests to verify proper functionality.
* feat(model_manager): add missing models filter to Model Manager
Adds the ability to view and manage orphaned model database entries
where the underlying files have been deleted externally.
Changes:
- Add GET /v2/models/missing API endpoint to list models with missing files
- Add "Missing Files" filter option to Model Manager type filter dropdown
- Display "Missing Files" badge on models with missing files in the list
- Automatically exclude missing models from model selection dropdowns
to prevent users from selecting unavailable models for generation
* fix(ui): enable Select All checkbox for missing models filter
The Select All checkbox was disabled when the missing models filter was
active because the bulk actions component didn't use the missing models
query data. Now it correctly uses useGetMissingModelsQuery when the
filter is set to 'missing'.
* test(model_manager): add tests for missing model detection and bulk delete
Tests _scan_for_missing_models and the unregister/delete workflow for
models whose files have been removed externally.
* Chore Ruff check
When switching between FLUX.2 (model-less reference images) and other
models that require IP adapter/Redux models, the reference image configs
were not being converted, leaving stale config types that hid or showed
the wrong UI controls.
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
The scheduler dropdown is no longer shown for FLUX.2 Klein models.
The backend default (Euler) is used instead.
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* fix(ui): improve DyPE field ordering and add 'On' preset option
- Add ui_order to DyPE fields (100, 101, 102) to group them at bottom of node
- Change DyPEPreset from Enum to Literal type for proper frontend dropdown support
- Add ui_choice_labels for human-readable dropdown options
- Add new 'On' preset to enable DyPE regardless of resolution
- Fix frontend input field sorting to respect ui_order (unordered first, then ordered)
- Bump flux_denoise node version to 4.4.0
* Chore Ruff check fix
* fix(flux): remove .value from dype_preset logging
DyPEPreset is now a Literal type (string) instead of an Enum,
so .value is no longer needed.
* fix(tests): update DyPE tests for Literal type change
Update test imports and assertions to use string constants
instead of Enum attributes since DyPEPreset is now a Literal type.
* feat(flux): add DyPE scale and exponent controls to Linear UI
- Add dype_scale (λs) and dype_exponent (λt) sliders to generation settings
- Add Zod schemas and parameter types for DyPE scale/exponent
- Pass custom values from Linear UI to flux_denoise node
- Fix bug where DyPE was enabled even when preset was "off"
- Add enhanced logging showing all DyPE parameters when enabled
* fix(flux): apply DyPE scale/exponent and add metadata recall
- Fix DyPE scale and exponent parameters not being applied in frequency
computation (compute_vision_yarn_freqs, compute_yarn_freqs now call
get_timestep_mscale)
- Add metadata handlers for dype_scale and dype_exponent to enable
recall from generated images
- Add i18n translations referencing existing parameter labels
* fix(flux): apply DyPE scale/exponent and add metadata recall
- Fix DyPE scale and exponent parameters not being applied in frequency
computation (compute_vision_yarn_freqs, compute_yarn_freqs now call
get_timestep_mscale)
- Add metadata handlers for dype_scale and dype_exponent to enable
recall from generated images
- Add i18n translations referencing existing parameter labels
* feat(ui): show DyPE scale/exponent only when preset is "on"
- Hide scale/exponent controls in UI when preset is not "on"
- Only parse/recall scale/exponent from metadata when preset is "on"
- Prevents confusion where custom values override preset behavior
* fix(dype): only allow custom scale/exponent with 'on' preset
Presets (auto, 4k) now use their predefined values and ignore
any custom_scale/custom_exponent parameters. Only the 'on' preset
allows manual override of these values.
This matches the frontend UI behavior where the scale/exponent
fields are only shown when 'On' is selected.
* refactor(dype): rename 'on' preset to 'manual'
Rename the 'on' DyPE preset to 'manual' to better reflect its purpose:
allowing users to manually configure scale and exponent values.
Updated in:
- Backend presets (DYPE_PRESET_ON -> DYPE_PRESET_MANUAL)
- Frontend UI labels and options
- Redux slice type definitions
- Zod schema validation
- Tests
* refactor(dype): rename 'on' preset to 'manual'
Rename the 'on' DyPE preset to 'manual' to better reflect its purpose:
allowing users to manually configure scale and exponent values.
Updated in:
- Backend presets (DYPE_PRESET_ON -> DYPE_PRESET_MANUAL)
- Frontend UI labels and options
- Redux slice type definitions
- Zod schema validation
- Tests
* fix(dype): update remaining 'on' references to 'manual'
- Update docstrings, comments, and error messages to use 'manual' preset name
- Simplify FLUX graph builder to always send dype_scale/dype_exponent
- Fix UI condition to show DyPE controls for 'manual' preset
---------
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* release(docker): fix workflow edge case that prevented CUDA build from completing
* bugfix(release): fix yaml syntax error
* bugfix(CI/CD): fix similar problem in typegen check
* Add new model type integration guide
Comprehensive documentation covering all steps required to integrate
a new model type into InvokeAI, including:
- Backend: Model manager, configs, loaders, invocations, sampling
- Frontend: Graph building, state management, parameter recall
- Metadata, starter models, and optional features (ControlNet, LoRA, IP-Adapter)
Uses FLUX.1, FLUX.2 Klein, SD3, SDXL, and Z-Image as reference implementations.
* docs: improve new model integration guide
- Move document to docs/contributing/ directory
- Fix broken TOC links by replacing '&' with 'and' in headings
- Add code example for text encoder config (section 2.4)
- Add text encoder loader example (new section 3.3)
- Expand text encoder invocation to show full conditioning flow (section 4.2)
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Update flux_model_loader.py
Added nodal points for inputs to the model loader since we should be able to use a model selection node and pass in for Flux models.
* typegen
* Fixed existing ruff error
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Remove extra array wrapper when saving ref_images metadata for FLUX.2 Klein
and FLUX.1 Kontext reference images. The double-nested array [[...]] was
preventing recall from parsing the metadata correctly.
* chore(release): add flux.2-klein to whats new items & bump version
* doc(release): update the WhatsNew text
* chore(frontend): run lint:prettier and frontend-typegen
* fix(model_manager): detect Flux VAE by latent space dimensions instead of filename
VAE detection previously relied solely on filename pattern matching, which failed
for Flux VAE files with generic names like "ae.safetensors". Now probes the model's
decoder.conv_in weight shape to determine the latent space dimensions:
- 16 channels -> Flux VAE
- 4 channels -> SD/SDXL VAE (with filename fallback for SD1/SD2/SDXL distinction)
* fix(model_manager): add latent space probing for Flux2 VAE detection
Extend Flux2 VAE detection to also check for 32-dimensional latent space
(decoder.conv_in with 32 input channels) in addition to BatchNorm layers.
This provides more robust detection for Flux2 VAE files regardless of filename.
* Chore Ruff format
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* docs: add DyPE implementation plan for FLUX high-resolution generation
Add detailed plan for porting ComfyUI-DyPE (Dynamic Position Extrapolation)
to InvokeAI, enabling 4K+ image generation with FLUX models without
training. Estimated effort: 5-7 developer days.
* docs: update DyPE plan with design decisions
- Integrate DyPE directly into FluxDenoise (no separate node)
- Add 4K preset and "auto" mode for automatic activation
- Confirm FLUX Schnell support (same base resolution as Dev)
* docs: add activation threshold for DyPE auto mode
FLUX can handle resolutions up to ~1.5x natively without artifacts.
Set activation_threshold=1536 so DyPE only kicks in above that.
* feat(flux): implement DyPE for high-resolution generation
Add Dynamic Position Extrapolation (DyPE) support to FLUX models,
enabling artifact-free generation at 4K+ resolutions.
New files:
- invokeai/backend/flux/dype/base.py: DyPEConfig and scaling calculations
- invokeai/backend/flux/dype/rope.py: DyPE-enhanced RoPE functions
- invokeai/backend/flux/dype/embed.py: DyPEEmbedND position embedder
- invokeai/backend/flux/dype/presets.py: Presets (off, auto, 4k)
- invokeai/backend/flux/extensions/dype_extension.py: Pipeline integration
Modified files:
- invokeai/backend/flux/denoise.py: Add dype_extension parameter
- invokeai/app/invocations/flux_denoise.py: Add UI parameters
UI parameters:
- dype_preset: off | auto | 4k
- dype_scale: Custom magnitude override (0-8)
- dype_exponent: Custom decay speed override (0-1000)
Auto mode activates DyPE for resolutions > 1536px.
Based on: https://github.com/wildminder/ComfyUI-DyPE
* feat(flux): add DyPE preset selector to Linear UI
Add Linear UI integration for FLUX DyPE (Dynamic Position Extrapolation):
- Add ParamFluxDypePreset component with Off/Auto/4K options
- Integrate preset selector in GenerationSettingsAccordion for FLUX models
- Add state management (paramsSlice, types) for fluxDypePreset
- Add dype_preset to FLUX denoise graph builder and metadata
- Add translations for DyPE preset label and popover
- Add zFluxDypePresetField schema definition
Fix DyPE frequency computation:
- Remove incorrect mscale multiplication on frequencies
- Use only NTK-aware theta scaling for position extrapolation
* feat(flux): add DyPE preset to metadata recall
- Add FluxDypePreset handler to ImageMetadataHandlers
- Parse dype_preset from metadata and dispatch setFluxDypePreset on recall
- Add translation key metadata.dypePreset
* chore: remove dype-implementation-plan.md
Remove internal planning document from the branch.
* chore(flux): bump flux_denoise version to 4.3.0
Version bump for dype_preset field addition.
* chore: ruff check fix
* chore: ruff format
* Fix truncated DyPE label in advanced options UI
Shorten the label from "DyPE (High-Res)" to "DyPE" to prevent text truncation in the sidebar. The high-resolution context is preserved in the informational popover tooltip.
* Add DyPE preset to recall parameters in image viewer
The dype_preset metadata was being saved but not displayed in the Recall Parameters tab. Add FluxDypePreset handler to ImageMetadataActions so users can see and recall this parameter.
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
* WIP: feat(flux2): add FLUX 2 Kontext model support
- Add new invocation nodes for FLUX 2:
- flux2_denoise: Denoising invocation for FLUX 2
- flux2_klein_model_loader: Model loader for Klein architecture
- flux2_klein_text_encoder: Text encoder for Qwen3-based encoding
- flux2_vae_decode: VAE decoder for FLUX 2
- Add backend support:
- New flux2 module with denoise and sampling utilities
- Extended model manager configs for FLUX 2 models
- Updated model loaders for Klein architecture
- Update frontend:
- Extended graph builder for FLUX 2 support
- Added FLUX 2 model types and configurations
- Updated readiness checks and UI components
* fix(flux2): correct VAE decode with proper BN denormalization
FLUX.2 VAE uses Batch Normalization in the patchified latent space
(128 channels). The decode must:
1. Patchify latents from (B, 32, H, W) to (B, 128, H/2, W/2)
2. Apply BN denormalization using running_mean/running_var
3. Unpatchify back to (B, 32, H, W) for VAE decode
Also fixed image normalization from [-1, 1] to [0, 255].
This fixes washed-out colors in generated FLUX.2 Klein images.
* feat(flux2): add FLUX.2 Klein model support with ComfyUI checkpoint compatibility
- Add FLUX.2 transformer loader with BFL-to-diffusers weight conversion
- Fix AdaLayerNorm scale-shift swap for final_layer.adaLN_modulation weights
- Add VAE batch normalization handling for FLUX.2 latent normalization
- Add Qwen3 text encoder loader with ComfyUI FP8 quantization support
- Add frontend components for FLUX.2 Klein model selection
- Update configs and schema for FLUX.2 model types
* Chore Ruff
* Fix Flux1 vae probing
* Fix Windows Paths schema.ts
* Add 4B und 9B klein to Starter Models.
* feat(flux2): add non-commercial license indicator for FLUX.2 Klein 9B
- Add isFlux2Klein9BMainModelConfig and isNonCommercialMainModelConfig functions
- Update MainModelPicker and InitialStateMainModelPicker to show license icon
- Update license tooltip text to include FLUX.2 Klein 9B
* feat(flux2): add Klein/Qwen3 variant support and encoder filtering
Backend:
- Add klein_4b/klein_9b variants for FLUX.2 Klein models
- Add qwen3_4b/qwen3_8b variants for Qwen3 encoder models
- Validate encoder variant matches Klein model (4B↔4B, 9B↔8B)
- Auto-detect Qwen3 variant from hidden_size during probing
Frontend:
- Show variant field for all model types in ModelView
- Filter Qwen3 encoder dropdown to only show compatible variants
- Update variant type definitions (zFlux2VariantType, zQwen3VariantType)
- Remove unused exports (isFluxDevMainModelConfig, isFlux2Klein9BMainModelConfig)
* Chore Ruff
* feat(flux2): add Klein 9B Base (undistilled) variant support
Distinguish between FLUX.2 Klein 9B (distilled) and Klein 9B Base (undistilled)
models by checking guidance_embeds in diffusers config or guidance_in keys in
safetensors. Klein 9B Base requires more steps but offers higher quality.
* feat(flux2): improve diffusers compatibility and distilled model support
Backend changes:
- Update text encoder layers from [9,18,27] to (10,20,30) matching diffusers
- Use apply_chat_template with system message instead of manual formatting
- Change position IDs from ones to zeros to match diffusers implementation
- Add get_schedule_flux2() with empirical mu computation for proper schedule shifting
- Add txt_embed_scale parameter for Qwen3 embedding magnitude control
- Add shift_schedule toggle for base (28+ steps) vs distilled (4 steps) models
- Zero out guidance_embedder weights for Klein models without guidance_embeds
UI changes:
- Clear Klein VAE and Qwen3 encoder when switching away from flux2 base
- Clear Qwen3 encoder when switching between different Klein model variants
- Add toast notification informing user to select compatible encoder
* feat(flux2): fix distilled model scheduling with proper dynamic shifting
- Configure scheduler with FLUX.2 Klein parameters from scheduler_config.json
(use_dynamic_shifting=True, shift=3.0, time_shift_type="exponential")
- Pass mu parameter to scheduler.set_timesteps() for resolution-aware shifting
- Remove manual shift_schedule parameter (scheduler handles this automatically)
- Simplify get_schedule_flux2() to return linear sigmas only
- Remove txt_embed_scale parameter (no longer needed)
This matches the diffusers Flux2KleinPipeline behavior where the
FlowMatchEulerDiscreteScheduler applies dynamic timestep shifting
based on image resolution via the mu parameter.
Fixes 4-step distilled Klein 9B model quality issues.
* fix(ui): fix FLUX.1 graph building with posCondCollect node lookup
The posCondCollect node was created with getPrefixedId() which generates
a random suffix (e.g., 'pos_cond_collect:abc123'), but g.getNode() was
called with the plain string 'pos_cond_collect', causing a node lookup
failure.
Fix by declaring posCondCollect as a module-scoped variable and
referencing it directly instead of using g.getNode().
* Remove Flux2 Klein Base from Starter Models
* Remove Logging
* Add Default Values for Flux2 Klein and add variant as additional info to from_base
* Add migrations for the z-image qwen3 encoder without a variant value
* Add img2img, inpainting and outpainting support for FLUX.2 Klein
- Add flux2_vae_encode invocation for encoding images to FLUX.2 latents
- Integrate inpaint_extension into FLUX.2 denoise loop for proper mask handling
- Apply BN normalization to init_latents and noise for consistency in inpainting
- Use manual Euler stepping for img2img/inpaint to preserve exact timestep schedule
- Add flux2_img2img, flux2_inpaint, flux2_outpaint generation modes
- Expand starter models with FP8 variants, standalone transformers, and separate VAE/encoders
- Fix outpainting to always use full denoising (0-1) since strength doesn't apply
- Improve error messages in model loader with clear guidance for standalone models
* Add GGUF quantized model support and Diffusers VAE loader for FLUX.2 Klein
- Add Main_GGUF_Flux2_Config for GGUF-quantized FLUX.2 transformer models
- Add VAE_Diffusers_Flux2_Config for FLUX.2 VAE in diffusers format
- Add Flux2GGUFCheckpointModel loader with BFL-to-diffusers conversion
- Add Flux2VAEDiffusersLoader for AutoencoderKLFlux2
- Add FLUX.2 Klein 4B/9B hardware requirements to documentation
- Update starter model descriptions to clarify dependencies install together
- Update frontend schema for new model configs
* Fix FLUX.2 model detection and add FP8 weight dequantization support
- Improve FLUX.2 variant detection for GGUF/checkpoint models (BFL format keys)
- Fix guidance_embeds logic: distilled=False, undistilled=True
- Add FP8 weight dequantization for ComfyUI-style quantized models
- Prevent FLUX.2 models from being misidentified as FLUX.1
- Preserve user-editable fields (name, description, etc.) on model reidentify
- Improve Qwen3Encoder detection by variant in starter models
- Add defensive checks for tensor operations
* Chore ruff format
* Chore Typegen
* Fix FLUX.2 Klein 9B model loading by detecting hidden_size from weights
Previously num_attention_heads was hardcoded to 24, which is correct for
Klein 4B but causes size mismatches when loading Klein 9B checkpoints.
Now dynamically calculates num_attention_heads from the hidden_size
dimension of context_embedder weights:
- Klein 4B: hidden_size=3072 → num_attention_heads=24
- Klein 9B: hidden_size=4096 → num_attention_heads=32
Fixes both Checkpoint and GGUF loaders for FLUX.2 models.
* Only clear Qwen3 encoder when FLUX.2 Klein variant changes
Previously the encoder was cleared whenever switching between any Klein
models, even if they had the same variant. Now compares the variant of
the old and new model and only clears the encoder when switching between
different variants (e.g., klein_4b to klein_9b).
This allows users to switch between different Klein 9B models without
having to re-select the Qwen3 encoder each time.
* Add metadata recall support for FLUX.2 Klein parameters
The scheduler, VAE model, and Qwen3 encoder model were not being
recalled correctly for FLUX.2 Klein images. This adds dedicated
metadata handlers for the Klein-specific parameters.
* Fix FLUX.2 Klein denoising scaling and Z-Image VAE compatibility
- Apply exponential denoising scaling (exponent 0.2) to FLUX.2 Klein,
matching FLUX.1 behavior for more intuitive inpainting strength
- Add isFlux1VAEModelConfig type guard to filter FLUX 1.0 VAEs only
- Restrict Z-Image VAE selection to FLUX 1.0 VAEs, excluding FLUX.2
Klein 32-channel VAEs which are incompatible
* chore pnpm fix
* Add FLUX.2 Klein to starter bundles and documentation
- Add FLUX.2 Klein hardware requirements to quick start guide
- Create flux2_klein_bundle with GGUF Q4 model, VAE, and Qwen3 encoder
- Add "What's New" entry announcing FLUX.2 Klein support
* Add FLUX.2 Klein built-in reference image editing support
FLUX.2 Klein has native multi-reference image editing without requiring
a separate model (unlike FLUX.1 which needs a Kontext model).
Backend changes:
- Add Flux2RefImageExtension for encoding reference images with FLUX.2 VAE
- Apply BN normalization to reference image latents for correct scaling
- Use T-coordinate offset scale=10 like diffusers (T=10, 20, 30...)
- Concatenate reference latents with generated image during denoising
- Extract only generated portion in step callback for correct preview
Frontend changes:
- Add flux2_reference_image config type without model field
- Hide model selector for FLUX.2 reference images (built-in support)
- Add type guards to handle configs without model property
- Update validators to skip model validation for FLUX.2
- Add 'flux2' to SUPPORTS_REF_IMAGES_BASE_MODELS
* Chore windows path fix
* Add reference image resizing for FLUX.2 Klein
Resize large reference images to match BFL FLUX.2 sampling.py limits:
- Single reference: max 2024² pixels (~4.1M)
- Multiple references: max 1024² pixels (~1M)
Uses same scaling approach as BFL's cap_pixels() function.
* Add user survey section to README
Added a section for new and returning users to take a survey.
* docs: add user survey link to WhatsNew
* Fix formatting issues in WhatsNew.tsx
---------
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
* fix(model_manager): prevent Z-Image LoRAs from being misclassified as main models
Z-Image LoRAs containing keys like `diffusion_model.context_refiner.*` were being
incorrectly classified as main checkpoint models instead of LoRAs. This happened
because the `_has_z_image_keys()` function checked for Z-Image specific keys
(like `context_refiner`) without verifying if the file was actually a LoRA.
Since main models have higher priority than LoRAs in the classification sort order,
the incorrect main model classification would win.
The fix adds detection of LoRA-specific weight suffixes (`.lora_down.weight`,
`.lora_up.weight`, `.lora_A.weight`, `.lora_B.weight`, `.dora_scale`) and returns
False if any are found, ensuring LoRAs are correctly classified.
* refactor(mm): simplify _has_z_image_keys with early return
Return True directly when a Z-Image key is found instead of using an
intermediate variable.
* feat(z-image): add Seed Variance Enhancer node and Linear UI integration
Add a new conditioning node for Z-Image models that injects seed-based
noise into text embeddings to increase visual variation between seeds.
Backend:
- New invocation: z_image_seed_variance_enhancer.py
- Parameters: strength (0-2), randomize_percent (1-100%), seed
Frontend:
- State management in paramsSlice with selectors and reducers
- UI components in SeedVariance/ folder with toggle and sliders
- Integration in GenerationSettingsAccordion (Advanced Options)
- Graph builder integration in buildZImageGraph.ts
- Metadata recall handlers for remix functionality
- Translations and tooltip descriptions
Based on: github.com/Pfannkuchensack/invokeai-z-image-seed-variance-enhancer
* chore: ruff and typegen fix
* chore: ruff and typegen fix
* Revise seedVarianceStrength explanation
Updated description for seedVarianceStrength.
* Update description for seedVarianceStrength
* fix(z-image): correct noise range comment from [-1, 1] to [-1, 1)
torch.rand() generates [0, 1), so the scaled range excludes 1.
## Summary
This PR removes codeowners from the `/docs` directory, allowing any team
member with repo write permissions to review and approve PRs involving
documentation.
## Related Issues / Discussions
Documentation review is a shared responsibility.
## QA Instructions
None needed.
## Merge Plan
Simple merge.
## Checklist
- [X] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
* WIP transform smoothing controls
* Fix transform smoothing control typings
* High level resize algo for transformation
* ESLint fix
* format with prettier
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Fix for brush/eraser size not updating on up/down arrow click
* Made further improvements on brush size selection behavior
---------
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
## Summary
This PR fixes misleading popup message "Canvas is empty" when attempting
to extract region with empty mask layer.
Replaced with correct message "Mask layer is empty". Also redirected few
other popups to use translation file.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
* feat(z-image): add `add_noise` option to Z-Image Denoise
Add the same `add_noise` option that exists in FLUX Denoise to Z-Image Denoise.
When set to false, no noise is added to the input latents during image-to-image,
allowing for more controlled transformations.
## Summary
Add a new "Denoise - Z-Image + Metadata" node
(`ZImageDenoiseMetaInvocation`) that extends the Z-Image denoise node
with metadata output for image recall functionality.
This follows the same pattern as existing `denoise_latents_meta`
(SD1.5/SDXL) and `flux_denoise_meta` (FLUX) nodes.
**Captured metadata:**
- `width` / `height`
- `steps`
- `guidance` (guidance_scale)
- `denoising_start` / `denoising_end`
- `scheduler`
- `model` (transformer)
- `seed`
- `loras` (if applied)
## Related Issues / Discussions
Enables metadata recall for Z-Image generated images, similar to
existing support for SD1.5, SDXL, and FLUX models.
## QA Instructions
1. Create a workflow using the new "Denoise - Z-Image + Metadata" node
2. Connect the metadata output to a "Save Image" node
3. Generate an image
4. Check that metadata is saved with the image (visible in image info
panel)
5. Verify all generation parameters are captured correctly
## Merge Plan
Requires `feature/zimage-scheduler-support` #8705 branch to be merged
first (base branch).
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
Adds `model_cache_keep_alive_min` config option (minutes, default 5) to
automatically clear model cache after inactivity. Addresses memory
contention when running InvokeAI alongside other GPU applications like
Ollama.
**Implementation:**
- **Config**: New `model_cache_keep_alive_min` field in
`InvokeAIAppConfig` with 5-minute default
- **ModelCache**: Activity tracking on get/lock/unlock/put operations,
threading.Timer for scheduled clearing
- **Thread safety**: Double-check pattern handles race conditions,
daemon threads for clean shutdown
- **Integration**: ModelManagerService passes config to cache, calls
shutdown() on stop
- **Logging**: Smart timeout logging that only shows messages when
unlocked models are actually cleared
- **Tests**: Comprehensive unit tests with properly configured mock
logger
**Usage:**
```yaml
# invokeai.yaml
model_cache_keep_alive_min: 10 # Clear after 10 minutes idle
model_cache_keep_alive_min: 0 # Set to 0 for indefinite caching (old behavior)
```
**Key Behavior:**
- **Default timeout**: 5 minutes - models are automatically cleared
after 5 minutes of inactivity
- Clearing uses same logic as "Clear Model Cache" button (make_room with
1000GB)
- Only clears **unlocked** models (respects models actively in use
during generation)
- Timeout message only appears when models are actually cleared
- Debug logging available for timeout events when no action is taken
- Prevents misleading log entries during active generation
- Users can set to 0 to restore indefinite caching behavior
## Related Issues / Discussions
Addresses enhancement request for automatic model unloading from memory
after inactivity period.
## QA Instructions
1. **Test default behavior (5-minute timeout)**:
- Start InvokeAI without explicit config
- Run a generation
- Wait 6 minutes with no activity
- Check logs for "Clearing X unlocked model(s) from cache" message
- Verify cache is empty
2. **Test custom timeout**:
- Set `model_cache_keep_alive_min: 0.1` (6 seconds) in config
- Load a model (run generation)
- Wait 7+ seconds with no activity
- Check logs for "Clearing X unlocked model(s) from cache" message
- Verify cache is empty
3. **Test no timeout (old behavior)**:
- Set `model_cache_keep_alive_min: 0` in config
- Run generations and wait extended periods
- Verify models remain cached indefinitely
4. **Test during active use**:
- Run continuous generations with any timeout setting
- Verify no timeout messages appear during active use (models are
locked)
- After generation completes, wait for timeout and verify unlocked
models are cleared
## Merge Plan
N/A - Additive change with sensible defaults. The 5-minute default
enables automatic memory management while remaining practical for
typical workflows.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
<!-- START COPILOT ORIGINAL PROMPT -->
<details>
<summary>Original prompt</summary>
>
> ----
>
> *This section details on the original issue you should resolve*
>
> <issue_title>[enhancement]: option to unload from memory
</issue_title>
> <issue_description>### Is there an existing issue for this?
>
> - [X] I have searched the existing issues
>
> ### Contact Details
>
> ### What should this feature add?
>
> a command line option to unload model from RAM after a defined period
of time
>
> ### Alternatives
>
> running as a container and using Sablier to shutdown the container
after some time, this has the downside of if traffic isn't see through
the web interface it will be shut even if jobs are running.
>
> ### Additional Content
>
> _No response_</issue_description>
>
> ## Comments on the Issue (you are @copilot in this section)
>
> <comments>
> <comment_new><author>@lstein</author><body>
> I am reopening this issue. I'm running ollama and invoke on the same
server and I find their memory requirements are frequently clashing. It
would be helpful to offer users the option to have the model cache
automatically cleared after a fixed amount of inactivity. I would
suggest the following:
>
> 1. Introduce a new config file option `model_cache_keep_alive` which
specifies, in minutes, how long to keep a model in cache between
generations. The default is 0, which means to keep the model in cache
indefinitely, as is currently the case.
> 2. If no model generations occur within the timeout period, the model
cache is cleared using the same backend code as the "Clear Model Cache"
button in the queue tab.
>
> I'm going to assign this to GitHub copilot, partly to test how well it
can manage the Invoke code base. </body></comment_new>
> </comments>
>
</details>
<!-- START COPILOT CODING AGENT SUFFIX -->
- Fixesinvoke-ai/InvokeAI#6856
<!-- START COPILOT CODING AGENT TIPS -->
---
✨ Let Copilot coding agent [set things up for
you](https://github.com/invoke-ai/InvokeAI/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot)
— coding agent works faster and does higher quality work when set up for
your repo.
Instead of disabling mutually exclusive model selectors, automatically
clear conflicting models when a new selection is made. This applies to
VAE, Qwen3 Encoder, and Qwen3 Source selectors - selecting one now
clears the others. Also applies same logic during metadata recall.
Move Scheduler handler after MainModel in ImageMetadataHandlers so that
base-dependent recall logic (z-image scheduler) works correctly. The
Scheduler handler checks `base === 'z-image'` before dispatching the
z-image scheduler action, but this check failed when Scheduler ran
before MainModel was recalled.
* feat(flux): add scheduler selection for Flux models
Add support for alternative diffusers Flow Matching schedulers:
- Euler (default, 1st order)
- Heun (2nd order, better quality, 2x slower)
- LCM (optimized for few steps)
Backend:
- Add schedulers.py with scheduler type definitions and class mapping
- Modify denoise.py to accept optional scheduler parameter
- Add scheduler InputField to flux_denoise invocation (v4.2.0)
Frontend:
- Add fluxScheduler to Redux state and paramsSlice
- Create ParamFluxScheduler component for Linear UI
- Add scheduler to buildFLUXGraph for generation
* fix(flux): prevent progress percentage overflow with LCM scheduler
LCM scheduler may have more internal timesteps than user-facing steps,
causing user_step to exceed total_steps. This resulted in progress
percentage > 1.0, which caused a pydantic validation error.
Fix: Only call step_callback when user_step <= total_steps.
* Ruff format
* fix(flux): remove initial step-0 callback for consistent step count
Remove the initial step_callback at step=0 to match SD/SDXL behavior.
Previously Flux showed N+1 steps (step 0 + N denoising steps), while
SD/SDXL showed only N steps. Now all models display N steps consistently.
* feat(flux): add scheduler support with metadata recall
- Handle LCM scheduler by using num_inference_steps instead of custom sigmas
- Fix progress bar to show user-facing steps instead of internal scheduler steps
- Pass scheduler parameter to Flux denoise node in graph builder
- Add model-aware metadata recall for Flux scheduler
---------
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* feat(flux): add scheduler selection for Flux models
Add support for alternative diffusers Flow Matching schedulers:
- Euler (default, 1st order)
- Heun (2nd order, better quality, 2x slower)
- LCM (optimized for few steps)
Backend:
- Add schedulers.py with scheduler type definitions and class mapping
- Modify denoise.py to accept optional scheduler parameter
- Add scheduler InputField to flux_denoise invocation (v4.2.0)
Frontend:
- Add fluxScheduler to Redux state and paramsSlice
- Create ParamFluxScheduler component for Linear UI
- Add scheduler to buildFLUXGraph for generation
* feat(z-image): add scheduler selection for Z-Image models
Add support for alternative diffusers Flow Matching schedulers for Z-Image:
- Euler (default) - 1st order, optimized for Z-Image-Turbo (8 steps)
- Heun (2nd order) - Better quality, 2x slower
- LCM - Optimized for few-step generation
Backend:
- Extend schedulers.py with Z-Image scheduler types and mapping
- Add scheduler InputField to z_image_denoise invocation (v1.3.0)
- Refactor denoising loop to support diffusers schedulers
Frontend:
- Add zImageScheduler to Redux state in paramsSlice
- Create ParamZImageScheduler component for Linear UI
- Add scheduler to buildZImageGraph for generation
* fix ruff check
* fix(schedulers): prevent progress percentage overflow with LCM scheduler
LCM scheduler may have more internal timesteps than user-facing steps,
causing user_step to exceed total_steps. This resulted in progress
percentage > 1.0, which caused a pydantic validation error.
Fix: Only call step_callback when user_step <= total_steps.
* Ruff format
* fix(schedulers): remove initial step-0 callback for consistent step count
Remove the initial step_callback at step=0 to match SD/SDXL behavior.
Previously Flux/Z-Image showed N+1 steps (step 0 + N denoising steps),
while SD/SDXL showed only N steps. Now all models display N steps
consistently in the server log.
* feat(z-image): add scheduler support with metadata recall
- Handle LCM scheduler by using num_inference_steps instead of custom sigmas
- Fix progress bar to show user-facing steps instead of internal scheduler steps
- Pass scheduler parameter to Z-Image denoise node in graph builder
- Add model-aware metadata recall for Flux and Z-Image schedulers
---------
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Add ZImageDenoiseMetaInvocation that extends ZImageDenoiseInvocation
with metadata output for image recall. Captures generation parameters
including steps, guidance, scheduler, seed, model, and LoRAs.
- Handle LCM scheduler by using num_inference_steps instead of custom sigmas
- Fix progress bar to show user-facing steps instead of internal scheduler steps
- Pass scheduler parameter to Z-Image denoise node in graph builder
- Add model-aware metadata recall for Flux and Z-Image schedulers
When using GGUF-quantized models on MPS (Apple Silicon), the
dequantized tensors could end up on a different device than the
other operands in math operations, causing "Expected all tensors
to be on the same device" errors.
This fix ensures that after dequantization, tensors are moved to
the same device as the other tensors in the operation.
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Add local_files_only fallback for Qwen3 tokenizer loading in both
Checkpoint and GGUF loaders. This ensures Z-Image models can generate
images offline after the initial tokenizer download.
The tokenizer is now loaded with local_files_only=True first, falling
back to network download only if files aren't cached yet.
Fixes#8716
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Remove the initial step_callback at step=0 to match SD/SDXL behavior.
Previously Flux/Z-Image showed N+1 steps (step 0 + N denoising steps),
while SD/SDXL showed only N steps. Now all models display N steps
consistently in the server log.
* feat: Implement PBR Maps Generation Node
* feat(ui): Add PBR Maps Generation to UI
* chore: fix typegen checks
* chore: possible fix for nvidia 5000 series cards
* fix: Use safetensor models for PBR maps instead of pickles.
* fix: incorrect naming of upconv_block for PBR network
* fix: incorrect naming of displacement map variable
* chore: add relevant docs to the PBR generate function
* fix: clear cuda cache after loading state_dict for PBR maps
* fix: load torch_device only once as multiple models are loaded
* chore(ui): update the filter icon for PBR to CubeBold
More relevant
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Fix an issue with multiple quick-queued generations after moving bbox
After moving the canvas bbox we still handed out the previous regional-guidance mask because only two parts of the system knew anything had changed. The adapter’s
cache key doesn’t include the bbox, so the next few graph builds reused the stale mask from before the move; if the user queued several runs back‑to‑back, every
background enqueue except the last skipped rerasterizing altogether because another raster job was still in flight. The fix makes the canvas manager invalidate each
region adapter’s cached mask whenever the bbox (or a related setting) changes, and—if a reraster is already running—queues up and waits instead of bailing. Now the
first run after a bbox edit forces a new mask, and rapid-fire enqueues just wait their turn, so every queued generation gets the correct regional prompt.
* (fix) Update invokeai/frontend/web/src/features/controlLayers/konva/CanvasStateApiModule.ts
Fixes race condition identified during copilot review.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update invokeai/frontend/web/src/features/controlLayers/konva/CanvasStateApiModule.ts
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix(ui): make Z-Image model selects mutually exclusive
VAE and Qwen3 Encoder selects are disabled when Qwen3 Source is selected,
and vice versa. This prevents invalid model combinations.
* feat(ui): auto-select Z-Image component models on model change
When switching to a Z-Image model, automatically set valid defaults
if no configuration exists:
- Prefers Qwen3 Source (Diffusers model) if available
- Falls back to Qwen3 Encoder + FLUX VAE combination
This ensures the generate button is enabled immediately after selecting
a Z-Image model, without requiring manual configuration.
* fix(ui): save and restore Qwen3 Source model in metadata
Qwen3 Source (Diffusers Z-Image) model was not being saved to image
metadata or restored during Remix. This adds:
- Saving qwen3_source to metadata in buildZImageGraph
- ZImageQwen3SourceModel metadata handler for parsing and recall
- i18n translation for qwen3Source
Changes image self-attention from restricted (region-isolated) to unrestricted
(all image tokens can attend to each other), similar to the FLUX approach.
This fixes the issue where ZImage-Turbo with multiple regional guidance layers
would generate two separate/disconnected images instead of compositing them
into a single unified image.
The regional text-image attention remains restricted so that each region still
responds to its corresponding prompt.
Fixes#8715
LCM scheduler may have more internal timesteps than user-facing steps,
causing user_step to exceed total_steps. This resulted in progress
percentage > 1.0, which caused a pydantic validation error.
Fix: Only call step_callback when user_step <= total_steps.
Changed the default value of model_cache_keep_alive from 0 (indefinite)
to 5 minutes as requested. This means models will now be automatically
cleared from cache after 5 minutes of inactivity by default, unless
users explicitly configure a different value.
Users can still set it to 0 in their config to get the old behavior
of keeping models indefinitely.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
## Summary
Fix Z-Image LoRA/DoRA model detection failing during installation.
Z-Image LoRAs use different key patterns than SD/SDXL LoRAs. The base
`LoRA_LyCORIS_Config_Base` class only checked for key suffixes like
`lora_A.weight` and `lora_B.weight`, but Z-Image LoRAs (especially those
in DoRA format) use:
- `lora_down.weight` / `lora_up.weight` (standard LoRA format)
- `dora_scale` (DoRA weight decomposition)
This PR overrides `_validate_looks_like_lora` in
`LoRA_LyCORIS_ZImage_Config` to recognize Z-Image specific patterns:
- Keys starting with `diffusion_model.layers.` (Z-Image S3-DiT
architecture)
- Keys ending with `lora_down.weight`, `lora_up.weight`,
`lora_A.weight`, `lora_B.weight`, or `dora_scale`
## Related Issues / Discussions
Fixes installation of Z-Image LoRAs trained with DoRA (Weight-Decomposed
Low-Rank Adaptation).
## QA Instructions
1. Download a Z-Image LoRA in DoRA format (e.g., from CivitAI with keys
like `diffusion_model.layers.X.attention.to_k.lora_down.weight`)
2. Try to install the LoRA via Model Manager
3. Verify the model is recognized as a Z-Image LoRA and installs
successfully
4. Verify the LoRA can be applied when generating with Z-Image
## Merge Plan
Standard merge, no special considerations.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Two fixes for Z-Image LoRA support:
1. Override _validate_looks_like_lora in LoRA_LyCORIS_ZImage_Config to
recognize Z-Image specific LoRA formats that use different key patterns
than SD/SDXL LoRAs. Z-Image LoRAs use lora_down.weight/lora_up.weight
and dora_scale suffixes instead of lora_A.weight/lora_B.weight.
2. Fix _group_by_layer in z_image_lora_conversion_utils.py to correctly
group LoRA keys by layer name. The previous logic used rsplit with
maxsplit=2 which incorrectly grouped keys like:
- "to_k.alpha" -> layer "diffusion_model.layers.17.attention"
- "lora_down.weight" -> layer "diffusion_model.layers.17.attention.to_k"
Now uses suffix matching to ensure all keys for a layer are grouped
together (alpha, dora_scale, lora_down.weight, lora_up.weight).
Override _validate_looks_like_lora in LoRA_LyCORIS_ZImage_Config to
recognize Z-Image specific LoRA formats that use different key patterns
than SD/SDXL LoRAs.
Z-Image LoRAs (including DoRA format) use keys like:
- diffusion_model.layers.X.attention.to_k.lora_down.weight
- diffusion_model.layers.X.attention.to_k.dora_scale
The base LyCORIS config only checked for lora_A.weight/lora_B.weight
suffixes, missing the lora_down.weight/lora_up.weight and dora_scale
patterns used by Z-Image LoRAs.
Add support for alternative diffusers Flow Matching schedulers for Z-Image:
- Euler (default) - 1st order, optimized for Z-Image-Turbo (8 steps)
- Heun (2nd order) - Better quality, 2x slower
- LCM - Optimized for few-step generation
Backend:
- Extend schedulers.py with Z-Image scheduler types and mapping
- Add scheduler InputField to z_image_denoise invocation (v1.3.0)
- Refactor denoising loop to support diffusers schedulers
Frontend:
- Add zImageScheduler to Redux state in paramsSlice
- Create ParamZImageScheduler component for Linear UI
- Add scheduler to buildZImageGraph for generation
Add support for alternative diffusers Flow Matching schedulers:
- Euler (default, 1st order)
- Heun (2nd order, better quality, 2x slower)
- LCM (optimized for few steps)
Backend:
- Add schedulers.py with scheduler type definitions and class mapping
- Modify denoise.py to accept optional scheduler parameter
- Add scheduler InputField to flux_denoise invocation (v4.2.0)
Frontend:
- Add fluxScheduler to Redux state and paramsSlice
- Create ParamFluxScheduler component for Linear UI
- Add scheduler to buildFLUXGraph for generation
* feat: Add Regional Guidance support for Z-Image model
Implements regional prompting for Z-Image (S3-DiT Transformer) allowing
different prompts to affect different image regions using attention masks.
Backend changes:
- Add ZImageRegionalPromptingExtension for mask preparation
- Add ZImageTextConditioning and ZImageRegionalTextConditioning data classes
- Patch transformer forward to inject 4D regional attention masks
- Use additive float mask (0.0 attend, -inf block) in bfloat16 for compatibility
- Alternate regional/full attention layers for global coherence
Frontend changes:
- Update buildZImageGraph to support regional conditioning collectors
- Update addRegions to create z_image_text_encoder nodes for regions
- Update addZImageLoRAs to handle optional negCond when guidance_scale=0
- Add Z-Image validation (no IP adapters, no autoNegative)
* @Pfannkuchensack
Fix windows path again
* ruff check fix
* ruff formating
* fix(ui): Z-Image CFG guidance_scale check uses > 1 instead of > 0
Changed the guidance_scale check from > 0 to > 1 for Z-Image models.
Since Z-Image uses guidance_scale=1.0 as "no CFG" (matching FLUX convention),
negative conditioning should only be created when guidance_scale > 1.
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* (bugfix)(mm) work around Windows being unable to rmtree tmp directories after GGUF install
* (style) fix ruff error
* (fix) add workaround for Windows Permission Denied on GGUF file move() call
* (fix) perform torch copy() in GGUF reader to avoid deletion failures on Windows
* (style) fix ruff formatting issues
Add support for loading Flux LoRA models in the xlabs format, which uses
keys like `double_blocks.X.processor.{qkv|proj}_lora{1|2}.{down|up}.weight`.
The xlabs format maps:
- lora1 -> img_attn (image attention stream)
- lora2 -> txt_attn (text attention stream)
- qkv -> query/key/value projection
- proj -> output projection
Changes:
- Add FluxLoRAFormat.XLabs enum value
- Add flux_xlabs_lora_conversion_utils.py with detection and conversion
- Update formats.py to detect xlabs format
- Update lora.py loader to handle xlabs format
- Update model probe to accept recognized Flux LoRA formats
- Add unit tests for xlabs format detection and conversion
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Feature: Add Tag System for user made Workflows
* feat(ui): display tags on workflow library tiles
Show workflow tags at the bottom of each tile in the workflow browser,
making it easier to identify workflow categories at a glance.
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* feat(nodes): add Prompt Template node
Add a new node that applies Style Preset templates to prompts in workflows.
The node takes a style preset ID and positive/negative prompts as inputs,
then replaces {prompt} placeholders in the template with the provided prompts.
This makes Style Preset templates accessible in Workflow mode, enabling
users to apply consistent styling across their workflow-based generations.
* feat(nodes): add StylePresetField for database-driven preset selection
Adds a new StylePresetField type that enables dropdown selection of
style presets from the database in the workflow editor.
Changes:
- Add StylePresetField to backend (fields.py)
- Update Prompt Template node to use StylePresetField instead of string ID
- Add frontend field type definitions (zod schemas, type guards)
- Create StylePresetFieldInputComponent with Combobox
- Register field in InputFieldRenderer and nodesSlice
- Add translations for preset selection
* fix schema.ts on windows.
* chore(api): regenerate schema.ts after merge
---------
Co-authored-by: Claude <noreply@anthropic.com>
Configure mock logger to return a valid log level for getEffectiveLevel()
to prevent TypeError when comparing with logging.DEBUG constant.
The issue was that ModelCache._log_cache_state() checks
self._logger.getEffectiveLevel() > logging.DEBUG, and when the logger
is a MagicMock without configuration, getEffectiveLevel() returns another
MagicMock, causing a TypeError when compared with an int.
Fixes all 4 test failures in test_model_cache_timeout.py
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Only log "Clearing model cache" message when there are actually unlocked
models to clear. This prevents the misleading message from appearing during
active generation when all models are locked.
Changes:
- Check for unlocked models before logging clear message
- Add count of unlocked models in log message
- Add debug log when all models are locked
- Improves user experience by avoiding confusing messages
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* fix(model-install): support multi-subfolder downloads for Z-Image Qwen3 encoder
The Z-Image Qwen3 text encoder requires both text_encoder and tokenizer
subfolders from the HuggingFace repo, but the previous implementation
only downloaded the text_encoder subfolder, causing model identification
to fail.
Changes:
- Add subfolders property to HFModelSource supporting '+' separated paths
- Extend filter_files() and download_urls() to handle multiple subfolders
- Update _multifile_download() to preserve subfolder structure
- Make Qwen3Encoder probe check both nested and direct config.json paths
- Update Qwen3EncoderLoader to handle both directory structures
- Change starter model source to text_encoder+tokenizer
* ruff format
* fix schema description
* fix schema description
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
- Remove all trailing whitespace (W293 errors)
- Add debug logging when timeout fires but activity detected
- Add debug logging when timeout fires but cache is empty
- Only log "Clearing model cache" message when actually clearing
- Prevents misleading timeout messages during active generation
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
* feat(ui): add model path update for external models
Add ability to update file paths for externally managed models (models with
absolute paths). Invoke-controlled models (with relative paths in the models
directory) are excluded from this feature to prevent breaking internal
model management.
- Add ModelUpdatePathButton component with modal dialog
- Only show button for external models (absolute path check)
- Add translations for path update UI elements
* Added support for Windows UNC paths in ModelView.tsx:38-41. The isExternalModel function now detects:
Unix absolute paths: /home/user/models/...
Windows drive paths: C:\Models\... or D:/Models/...
Windows UNC paths: \\ServerName\ShareName\... or //ServerName/ShareName/...
* fix(ui): validate path format in Update Path modal to prevent invalid paths
When updating an external model's path, the new path is now validated to ensure
it follows an absolute path format (Unix, Windows drive, or UNC). This prevents
users from accidentally entering invalid paths that would cause the Update Path
button to disappear, leaving them unable to correct the mistake.
* fix(ui): extract isExternalModel to separate file to fix circular dependency
Moves the isExternalModel utility function to its own file to break the
circular dependency between ModelView.tsx and ModelUpdatePathButton.tsx.
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
- Added clarifying comment that _record_activity is called with lock held
- Enhanced double-check in _on_timeout for thread safety
- Added lock protection to shutdown method
- Improved handling of edge cases where timer fires during activity
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
## Summary
Add Z-Image Turbo and related models to the starter models list for easy
installation via the Model Manager:
- **Z-Image Turbo** - Full precision Diffusers format (~13GB)
- **Z-Image Turbo (quantized)** - GGUF Q4_K format (~4GB)
- **Z-Image Qwen3 Text Encoder** - Full precision (~8GB)
- **Z-Image Qwen3 Text Encoder (quantized)** - GGUF Q6_K format (~3.3GB)
- **Z-Image ControlNet Union** - Unified ControlNet supporting Canny,
HED, Depth, Pose, MLSD, and Inpainting modes
The quantized Turbo model includes the quantized Qwen3 encoder as a
dependency for automatic installation.
## Related Issues / Discussions
Builds on the Z-Image Turbo support added in main.
## QA Instructions
1. Open Model Manager → Starter Models
2. Search for "Z-Image"
3. Verify all 5 models appear with correct descriptions
4. Install the quantized version and confirm the Qwen3 encoder
dependency is also installed
## Merge Plan
Standard merge, no special considerations.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Add higher quality Q8_0 quantization option for Z-Image Turbo (~6.6GB)
to complement existing Q4_K variant, providing better quality for users
with more VRAM.
Add dedicated Z-Image ControlNet Tile model (~6.7GB) for upscaling and
detail enhancement workflows.
## Summary
Fix shape mismatch when loading GGUF-quantized Z-Image transformer
models.
GGUF Z-Image models store `x_pad_token` and `cap_pad_token` with shape
`[3840]`, but diffusers `ZImageTransformer2DModel` expects `[1, 3840]`
(with batch dimension). This caused a `RuntimeError` on Linux systems
when loading models like `z_image_turbo-Q4_K.gguf`.
The fix:
- Dequantizes GGMLTensors first (since they don't support `unsqueeze`)
- Reshapes the tensors to add the missing batch dimension
## Related Issues / Discussions
Reported by Linux user using:
-
https://huggingface.co/leejet/Z-Image-Turbo-GGUF/resolve/main/z_image_turbo-Q4_K.gguf
-
https://huggingface.co/worstplayer/Z-Image_Qwen_3_4b_text_encoder_GGUF/resolve/main/Qwen_3_4b-Q6_K.gguf
## QA Instructions
1. Install a GGUF-quantized Z-Image model (e.g.,
`z_image_turbo-Q4_K.gguf`)
2. Install a Qwen3 GGUF encoder
3. Run a Z-Image generation
4. Verify no `RuntimeError: size mismatch for x_pad_token` error occurs
## Merge Plan
None, straightforward fix.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
Add support for Z-Image ControlNet V2.0 alongside the existing V1
support.
**Key changes:**
- Auto-detect `control_in_dim` from adapter weights (16 for V1, 33 for
V2.0)
- Auto-detect `n_refiner_layers` from state dict
- Add zero-padding for V2.0's additional control channels (diffusers
approach)
- Use `accelerate.init_empty_weights()` for more efficient model
creation
- Add `ControlNet_Checkpoint_ZImage_Config` to frontend schema
## Related Issues / Discussions
Part of Z-Image feature implementation.
## QA Instructions
1. Load a Z-Image ControlNet V1 model (control_in_dim=16) and verify it
works
2. Load a Z-Image ControlNet V2.0 model (control_in_dim=33) and verify
it works
3. Test with different control types: Canny, Depth, Pose
4. Recommended `control_context_scale`: 0.65-0.80
## Merge Plan
Can be merged after review. No special considerations needed.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
* chore: localize extraction errors
* chore: rename extract masked area menu item
* chore: rename inpaint mask extract component
* fix: use mask bounds for extraction region
* Prettier format applied to InpaintMaskMenuItemsExtractMaskedArea.tsx
* Fix base64 image import bug in extracted area in InpaintMaskMenuItemsExtractMaskedArea.tsx and removed unused locales entries in en.json
* Fix formatting issue in InpaintMaskMenuItemsExtractMaskedArea.tsx
* Minor comment fix
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
GGUF Z-Image models store x_pad_token and cap_pad_token with shape [dim],
but diffusers ZImageTransformer2DModel expects [1, dim]. This caused a
RuntimeError when loading GGUF-quantized Z-Image models.
The fix dequantizes GGMLTensors first (since they don't support unsqueeze),
then reshapes to add the batch dimension.
* fix(ui): 🐛 `HotkeysModal` and `SettingsModal` initial focus
instead of using the `initialFocusRef` prop, the `Modal` component was focusing on the last available Button. This is a workaround that uses `tabIndex` instead which seems to be working.
Closes#8685
* style: 🚨 satisfy linter
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Add Z-Image Turbo and related models to the starter models list:
- Z-Image Turbo (full precision, ~13GB)
- Z-Image Turbo quantized (GGUF Q4_K, ~4GB)
- Z-Image Qwen3 Text Encoder (full precision, ~8GB)
- Z-Image Qwen3 Text Encoder quantized (GGUF Q6_K, ~3.3GB)
- Z-Image ControlNet Union (Canny, HED, Depth, Pose, MLSD, Inpainting)
The quantized Turbo model includes the quantized Qwen3 encoder as a
dependency for automatic installation.
Implement Z-Image ControlNet as an Extension pattern (similar to FLUX ControlNet)
instead of merging control weights into the base transformer. This provides:
- Lower memory usage (no weight duplication)
- Flexibility to enable/disable control per step
- Cleaner architecture with separate control adapter
Key implementation details:
- ZImageControlNetExtension: computes control hints per denoising step
- z_image_forward_with_control: custom forward pass with hint injection
- patchify_control_context: utility for control image patchification
- ZImageControlAdapter: standalone adapter with control_layers and noise_refiner
Architecture matches original VideoX-Fun implementation:
- Hints computed ONCE using INITIAL unified state (before main layers)
- Hints injected at every other main transformer layer (15 control blocks)
- Control signal added after each designated layer's forward pass
V2.0 ControlNet support (control_in_dim=33):
- Channels 0-15: control image latents
- Channels 16-31: reference image (zeros for pure control)
- Channel 32: inpaint mask (1.0 = don't inpaint, use control signal)
VRAM usage is high.
- Auto-detect control_in_dim from adapter weights (16 for V1, 33 for V2.0)
- Auto-detect n_refiner_layers from state dict
- Add zero-padding for V2.0's additional channels
- Use accelerate.init_empty_weights() for efficient model creation
- Add ControlNet_Checkpoint_ZImage_Config to frontend schema
feat: Add Z-Image ControlNet support with spatial conditioning
Add comprehensive ControlNet support for Z-Image models including:
Backend:
- New ControlNet_Checkpoint_ZImage_Config for Z-Image control adapter models
- Z-Image control key detection (_has_z_image_control_keys) to identify control layers
- ZImageControlAdapter loader for standalone control models
- ZImageControlTransformer2DModel combining base transformer with control layers
- Memory-efficient model loading by building combined state dict
Add comprehensive support for Z-Image-Turbo (S3-DiT) models including:
Backend:
- New BaseModelType.ZImage in taxonomy
- Z-Image model config classes (ZImageTransformerConfig,
Qwen3TextEncoderConfig)
- Model loader for Z-Image transformer and Qwen3 text encoder
- Z-Image conditioning data structures
- Step callback support for Z-Image with FLUX latent RGB factors
Invocations:
- z_image_model_loader: Load Z-Image transformer and Qwen3 encoder
- z_image_text_encoder: Encode prompts using Qwen3 with chat template
- z_image_denoise: Flow matching denoising with time-shifted sigmas
- z_image_image_to_latents: Encode images to 16-channel latents
- z_image_latents_to_image: Decode latents using FLUX VAE
Frontend:
- Z-Image graph builder for text-to-image generation
- Model picker and validation updates for z-image base type
- CFG scale now allows 0 (required for Z-Image-Turbo)
- Clip skip disabled for Z-Image (uses Qwen3, not CLIP)
- Optimal dimension settings for Z-Image (1024x1024)
Technical details:
- Uses Qwen3 text encoder (not CLIP/T5)
- 16 latent channels with FLUX-compatible VAE
- Flow matching scheduler with dynamic time shift
- 8 inference steps recommended for Turbo variant
- bfloat16 inference dtype
## Summary
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
- Install a Z-Image-Turbo model (e.g., from HuggingFace)
- Select the model in the Model Picker
- Generate a text-to-image with:
- CFG Scale: 0
- Steps: 8
- Resolution: 1024x1024
- Verify the generated image is coherent (not noise)
## Merge Plan
Standard merge, no special considerations needed.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
The previous mixed-precision optimization for FP32 mode only converted
some VAE decoder layers (post_quant_conv, conv_in, mid_block) to the
latents dtype while leaving others (up_blocks, conv_norm_out) in float32.
This caused "expected scalar type Half but found Float" errors after
recent diffusers updates.
Simplify FP32 mode to consistently use float32 for both VAE and latents,
removing the incomplete mixed-precision logic. This trades some VRAM
usage for stability and correctness.
Also removes now-unused attention processor imports.
The Z-Image denoise node outputs latents, not images, so these mixins
were unnecessary. Metadata and board handling is correctly done in the
L2I (latents-to-image) node. This aligns with how FLUX denoise works.
- Add CustomDiffusersRMSNorm for diffusers.models.normalization.RMSNorm
- Add CustomLayerNorm for torch.nn.LayerNorm
- Register both in AUTOCAST_MODULE_TYPE_MAPPING
Enables partial loading (enable_partial_loading: true) for Z-Image models
by wrapping their normalization layers with device autocast support
The FLUX Dev license warning in model pickers used isCheckpointMainModelConfig
incorrectly:
```
isCheckpointMainModelConfig(config) && config.variant === 'dev'
```
This caused a TypeScript error because CheckpointModelConfig type doesn't
include the 'variant' property (it's extracted as `{ type: 'main'; format:
'checkpoint' }` which doesn't narrow to include variant).
Changes:
- Add isFluxDevMainModelConfig type guard that properly checks
base='flux' AND variant='dev', returning MainModelConfig
- Update MainModelPicker and InitialStateMainModelPicker to use new guard
- Remove isCheckpointMainModelConfig as it had no other usages
The function was removed because:
1. It was only used for detecting FLUX Dev models (incorrect use case)
2. No other code needs a generic "is checkpoint format" check
3. The pattern in this codebase is specific type guards per model variant
(isFluxFillMainModelModelConfig, isRefinerMainModelModelConfig, etc.)
Add robust device capability detection for bfloat16, replacing hardcoded
dtype with runtime checks that fallback to float16/float32 on unsupported
hardware. This prevents runtime failures on GPUs and CPUs without bfloat16.
Key changes:
- Add TorchDevice.choose_bfloat16_safe_dtype() helper for safe dtype selection
- Fix LoRA device mismatch in layer_patcher.py (add device= to .to() call)
- Replace all assert statements with descriptive exceptions (TypeError/ValueError)
- Add hidden_states bounds check and apply_chat_template fallback in text encoder
- Add GGUF QKV tensor validation (divisible by 3 check)
- Fix CPU noise generation to use float32 for compatibility
- Remove verbose debug logging from LoRA conversion utils
Add support for saving and recalling Z-Image component models (VAE and
Qwen3 Encoder) in image metadata.
Backend:
- Add qwen3_encoder field to CoreMetadataInvocation (version 2.1.0)
Frontend:
- Add vae and qwen3_encoder to Z-Image graph metadata
- Add Qwen3EncoderModel metadata handler for recall
- Add ZImageVAEModel metadata handler (uses zImageVaeModelSelected
instead of vaeSelected to set Z-Image-specific VAE state)
- Add qwen3Encoder translation key
This enables "Recall Parameters" / "Remix Image" to restore the VAE
and Qwen3 Encoder settings used for Z-Image generations.
Add support for loading Z-Image transformer and Qwen3 encoder models
from single-file safetensors format (in addition to existing diffusers
directory format).
Changes:
- Add Main_Checkpoint_ZImage_Config and Main_GGUF_ZImage_Config for
single-file Z-Image transformer models
- Add Qwen3Encoder_Checkpoint_Config for single-file Qwen3 text encoder
- Add ZImageCheckpointModel and ZImageGGUFCheckpointModel loaders with
automatic key conversion from original to diffusers format
- Add Qwen3EncoderCheckpointLoader using Qwen3ForCausalLM with fast
loading via init_empty_weights and proper weight tying for lm_head
- Update z_image_denoise to accept Checkpoint format models
Add comprehensive support for GGUF quantized Z-Image models and improve component flexibility:
Backend:
- New Main_GGUF_ZImage_Config for GGUF quantized Z-Image transformers
- Z-Image key detection (_has_z_image_keys) to identify S3-DiT models
- GGUF quantization detection and sidecar LoRA patching for quantized models
- Qwen3Encoder_Qwen3Encoder_Config for standalone Qwen3 encoder models
Model Loader:
- Split Z-Image model
Move Flux layer structure check before metadata check to prevent misidentifying Z-Image LoRAs (which use `diffusion_model.layers.X`) as Flux AI Toolkit format. Flux models use `double_blocks` and `single_blocks` patterns which are now checked first regardless of metadata presence.
* feat: Add bulk delete functionality for models, LoRAs, and embeddings
Implements a comprehensive bulk deletion feature for the model manager that allows users to select and delete multiple models, LoRAs, and embeddings at once.
Key changes:
Frontend:
- Add multi-selection state management to modelManagerV2 slice
- Update ModelListItem to support Ctrl/Cmd+Click multi-selection with checkboxes
- Create ModelListHeader component showing selection count and bulk actions
- Create BulkDeleteModelsModal for confirming bulk deletions
- Integrate bulk delete UI into ModelList with proper error handling
- Add API mutation for bulk delete operations
Backend:
- Add POST /api/v2/models/i/bulk_delete endpoint
- Implement BulkDeleteModelsRequest and BulkDeleteModelsResponse schemas
- Handle partial failures with detailed error reporting
- Return lists of successfully deleted and failed models
This feature significantly improves user experience when managing large model libraries, especially when restructuring model storage locations.
Fixes issue where users had to delete models individually after moving model files to new storage locations.
* fix: prevent model list header from scrolling with content
* fix: improve error handling in bulk model deletion
- Added proper error serialization using serialize-error for better error logging
- Explicitly defined BulkDeleteModelsResponse type instead of relying on generated schema reference
* refactor: improve code organization in ModelList components
- Reordered imports to follow conventional grouping (external, internal, then third-party utilities)
- Added type assertion for error serialization to satisfy TypeScript
- Extracted inline event handler into named callback function for better readability
* refactor: consolidate Button component props to single line
* feat(ui): enhance model manager bulk selection with select-all and actions menu
- Added select-all checkbox in navigation header with indeterminate state support
- Replaced single delete button with actions dropdown menu for future extensibility
- Made checkboxes always visible instead of conditionally showing on selection
- Moved model filtering logic to ModelListNavigation for select-all functionality
- Improved UX by showing selection state for filtered models only
* fix the wrong path seperater from my windows system
---------
Co-authored-by: Claude <noreply@anthropic.com>
Add comprehensive support for Z-Image-Turbo (S3-DiT) models including:
Backend:
- New BaseModelType.ZImage in taxonomy
- Z-Image model config classes (ZImageTransformerConfig, Qwen3TextEncoderConfig)
- Model loader for Z-Image transformer and Qwen3 text encoder
- Z-Image conditioning data structures
- Step callback support for Z-Image with FLUX latent RGB factors
Invocations:
- z_image_model_loader: Load Z-Image transformer and Qwen3 encoder
- z_image_text_encoder: Encode prompts using Qwen3 with chat template
- z_image_denoise: Flow matching denoising with time-shifted sigmas
- z_image_image_to_latents: Encode images to 16-channel latents
- z_image_latents_to_image: Decode latents using FLUX VAE
Frontend:
- Z-Image graph builder for text-to-image generation
- Model picker and validation updates for z-image base type
- CFG scale now allows 0 (required for Z-Image-Turbo)
- Clip skip disabled for Z-Image (uses Qwen3, not CLIP)
- Optimal dimension settings for Z-Image (1024x1024)
Technical details:
- Uses Qwen3 text encoder (not CLIP/T5)
- 16 latent channels with FLUX-compatible VAE
- Flow matching scheduler with dynamic time shift
- 8 inference steps recommended for Turbo variant
- bfloat16 inference dtype
* feat: remove the ModelFooter in the ModelView and add the Delete Model Button from the Footer into the View
* forget to run pnpm fix
* chore(ui): reorder the model view buttons
* Initial plan
* Add customizable hotkeys infrastructure with UI
Co-authored-by: dunkeroni <3298737+dunkeroni@users.noreply.github.com>
* Fix ESLint issues in HotkeyEditor component
Co-authored-by: dunkeroni <3298737+dunkeroni@users.noreply.github.com>
* Fix knip unused export warning
Co-authored-by: dunkeroni <3298737+dunkeroni@users.noreply.github.com>
* Add tests for hotkeys slice
Co-authored-by: dunkeroni <3298737+dunkeroni@users.noreply.github.com>
* Fix tests to actually call reducer and add documentation
Co-authored-by: dunkeroni <3298737+dunkeroni@users.noreply.github.com>
* docs: add comprehensive hotkeys system documentation
- Created new HOTKEYS.md technical documentation for developers explaining architecture, data flow, and implementation details
- Added user-facing hotkeys.md guide with features overview and usage instructions
- Removed old CUSTOMIZABLE_HOTKEYS.md in favor of new split documentation
- Expanded documentation with detailed sections on:
- State management and persistence
- Component architecture and responsibilities
- Developer integration
* Behavior changed to hotkey press instead of input + checking for allready used hotkeys
---------
Co-authored-by: blessedcoolant <54517381+blessedcoolant@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: dunkeroni <3298737+dunkeroni@users.noreply.github.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* feat(nodes/UI): add SDXL color compensation option
* adjust value
* Better warnings on wrong VAE base model
* Restrict XL compensation to XL models
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* fix: BaseModelType missing import
* (chore): appease the ruff
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Wrap GGUF loader for context managed close()
Wrap gguf.GGUFReader and then use a context manager to load memory-mapped GGUF files, so that they will automatically close properly when no longer needed. Should prevent the 'file in use in another process' errors on Windows.
* Additional check for cached state_dict
Additional check for cached state_dict as path is now optional - should solve model manager 'missing' this and the resultant memory errors.
* Appease ruff
* Further ruff appeasement
* ruff
* loaders.py fix for linux
No longer attempting to delete internal object.
* loaders.py - one more _mmap ref removed
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Rework graph, add documentation
* Minor fixes to README.md
* Updated schema
* Fixed test to match behavior - all nodes executed, parents before children
* Update invokeai/app/services/shared/graph.py
Cleaned up code
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Change silent corrections to enforcing invariants
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
## Summary
This fixes a bug in which private directory paths on the host could be
leaked to the user interface. The error occurs during the `scan_folders`
operation when a subdirectory is not accessible. The UI shows a
permission denied error message, followed by the path of the offending
directory. This patch limits the error message to the error type only
and does not give further details.
## Related Issues / Discussions
This bug was reported in a private DM on the Discord server.
## QA Instructions
Before applying this PR, go to ***Model Manager -> Add Model -> Scan
Folder*** and enter the path of a directory that has subdirectories that
the backend should not have access to, for example `/etc`. Press the
***Scan Folder*** button. You will see a Permission Denied error message
that gives away the path of the first inaccesislbe subdirectory.
After applying this PR, you will see just the Permission Denied error
without details.
## Merge Plan
Merge when approved.
## Checklist
- [X] _The PR has a short but descriptive title, suitable for a
changelog_
- [X] _Tests added / updated (if applicable)_
- [X] _❗Changes to a redux slice have a corresponding migration_
- [X] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Add route and model record service method to reidentify a model. This
re-probes the model files and replaces the model's config with the new
one if it does not error.
We had an "infill methods" route that long ago told the frontend infill
method, upscale method (model), NSFW checker, and watermark feature
availability.
None of these were used except for the patchmatch check. Removed them,
made the check exclusively for patchmatch, updated related code in redux
app startup listeners and settings modal.
* feat(mm): add UnknownModelConfig
* refactor(ui): move model categorisation-ish logic to central location, simplify model manager models list
* refactor(ui)refactor(ui): more cleanup of model categories
* refactor(ui): remove unused excludeSubmodels
I can't remember what this was for and don't see any reference to it.
Maybe it's just remnants from a previous implementation?
* feat(nodes): add unknown as model base
* chore(ui): typegen
* feat(ui): add unknown model base support in ui
* feat(ui): allow changing model type in MM, fix up base and variant selects
* feat(mm): omit model description instead of making it "base type filename model"
* feat(app): add setting to allow unknown models
* feat(ui): allow changing model format in MM
* feat(app): add the installed model config to install complete events
* chore(ui): typegen
* feat(ui): toast warning when installed model is unidentified
* docs: update config docstrings
* chore(ui): typegen
* tests(mm): fix test for MM, leave the UnknownModelConfig class in the list of configs
* tidy(ui): prefer types from zod schemas for model attrs
* chore(ui): lint
* fix(ui): wrong translation string
* feat(mm): normalized model storage
Store models in a flat directory structure. Each model is in a dir named
its unique key (a UUID). Inside that dir is either the model file or the
model dir.
* feat(mm): add migration to flat model storage
* fix(mm): normalized multi-file/diffusers model installation no worky
now worky
* refactor: port MM probes to new api
- Add concept of match certainty to new probe
- Port CLIP Embed models to new API
- Fiddle with stuff
* feat(mm): port TIs to new API
* tidy(mm): remove unused probes
* feat(mm): port spandrel to new API
* fix(mm): parsing for spandrel
* fix(mm): loader for clip embed
* fix(mm): tis use existing weight_files method
* feat(mm): port vae to new API
* fix(mm): vae class inheritance and config_path
* tidy(mm): patcher types and import paths
* feat(mm): better errors when invalid model config found in db
* feat(mm): port t5 to new API
* feat(mm): make config_path optional
* refactor(mm): simplify model classification process
Previously, we had a multi-phase strategy to identify models from their
files on disk:
1. Run each model config classes' `matches()` method on the files. It
checks if the model could possibly be an identified as the candidate
model type. This was intended to be a quick check. Break on the first
match.
2. If we have a match, run the config class's `parse()` method. It
derive some additional model config attrs from the model files. This was
intended to encapsulate heavier operations that may require loading the
model into memory.
3. Derive the common model config attrs, like name, description,
calculate the hash, etc. Some of these are also heavier operations.
This strategy has some issues:
- It is not clear how the pieces fit together. There is some
back-and-forth between different methods and the config base class. It
is hard to trace the flow of logic until you fully wrap your head around
the system and therefore difficult to add a model architecture to the
probe.
- The assumption that we could do quick, lightweight checks before
heavier checks is incorrect. We often _must_ load the model state dict
in the `matches()` method. So there is no practical perf benefit to
splitting up the responsibility of `matches()` and `parse()`.
- Sometimes we need to do the same checks in `matches()` and `parse()`.
In these cases, splitting the logic is has a negative perf impact
because we are doing the same work twice.
- As we introduce the concept of an "unknown" model config (i.e. a model
that we cannot identify, but still record in the db; see #8582), we will
_always_ run _all_ the checks for every model. Therefore we need not try
to defer heavier checks or resource-intensive ops like hashing. We are
going to do them anyways.
- There are situations where a model may match multiple configs. One
known case are SD pipeline models with merged LoRAs. In the old probe
API, we relied on the implicit order of checks to know that if a model
matched for pipeline _and_ LoRA, we prefer the pipeline match. But, in
the new API, we do not have this implicit ordering of checks. To resolve
this in a resilient way, we need to get all matches up front, then use
tie-breaker logic to figure out which should win (or add "differential
diagnosis" logic to the matchers).
- Field overrides weren't handled well by this strategy. They were only
applied at the very end, if a model matched successfully. This means we
cannot tell the system "Hey, this model is type X with base Y. Trust me
bro.". We cannot override the match logic. As we move towards letting
users correct mis-identified models (see #8582), this is a requirement.
We can simplify the process significantly and better support "unknown"
models.
Firstly, model config classes now have a single `from_model_on_disk()`
method that attempts to construct an instance of the class from the
model files. This replaces the `matches()` and `parse()` methods.
If we fail to create the config instance, a special exception is raised
that indicates why we think the files cannot be identified as the given
model config class.
Next, the flow for model identification is a bit simpler:
- Derive all the common fields up-front (name, desc, hash, etc).
- Merge in overrides.
- Call `from_model_on_disk()` for every config class, passing in the
fields. Overrides are handled in this method.
- Record the results for each config class and choose the best one.
The identification logic is a bit more verbose, with the special
exceptions and handling of overrides, but it is very clear what is
happening.
The one downside I can think of for this strategy is we do need to check
every model type, instead of stopping at the first match. It's a bit
less efficient. In practice, however, this isn't a hot code path, and
the improved clarity is worth far more than perf optimizations that the
end user will likely never notice.
* refactor(mm): remove unused methods in config.py
* refactor(mm): add model config parsing utils
* fix(mm): abstractmethod bork
* tidy(mm): clarify that model id utils are private
* fix(mm): fall back to UnknownModelConfig correctly
* feat(mm): port CLIPVisionDiffusersConfig to new api
* feat(mm): port SigLIPDiffusersConfig to new api
* feat(mm): make match helpers more succint
* feat(mm): port flux redux to new api
* feat(mm): port ip adapter to new api
* tidy(mm): skip optimistic override handling for now
* refactor(mm): continue iterating on config
* feat(mm): port flux "control lora" and t2i adapter to new api
* tidy(ui): use Extract to get model config types
* fix(mm): t2i base determination
* feat(mm): port cnet to new api
* refactor(mm): add config validation utils, make it all consistent and clean
* feat(mm): wip port of main models to new api
* feat(mm): wip port of main models to new api
* feat(mm): wip port of main models to new api
* docs(mm): add todos
* tidy(mm): removed unused model merge class
* feat(mm): wip port main models to new api
* tidy(mm): clean up model heuristic utils
* tidy(mm): clean up ModelOnDisk caching
* tidy(mm): flux lora format util
* refactor(mm): make config classes narrow
Simpler logic to identify, less complexity to add new model, fewer
useless attrs that do not relate to the model arch, etc
* refactor(mm): diffusers loras
w
* feat(mm): consistent naming for all model config classes
* fix(mm): tag generation & scattered probe fixes
* tidy(mm): consistent class names
* refactor(mm): split configs into separate files
* docs(mm): add comments for identification utils
* chore(ui): typegen
* refactor(mm): remove legacy probe, new configs dir structure, update imports
* fix(mm): inverted condition
* docs(mm): update docsstrings in factory.py
* docs(mm): document flux variant attr
* feat(mm): add helper method for legacy configs
* feat(mm): satisfy type checker in flux denoise
* docs(mm): remove extraneous comment
* fix(mm): ensure unknown model configs get unknown attrs
* fix(mm): t5 identification
* fix(mm): sdxl ip adapter identification
* feat(mm): more flexible config matching utils
* fix(mm): clip vision identification
* feat(mm): add sanity checks before probing paths
* docs(mm): add reminder for self for field migrations
* feat(mm): clearer naming for main config class hierarchy
* feat(mm): fix clip vision starter model bases, add ref to actual models
* feat(mm): add model config schema migration logic
* fix(mm): duplicate import
* refactor(mm): split big migration into 3
Split the big migration that did all of these things into 3:
- Migration 22: Remove unique contraint on base/name/type in models
table
- Migration 23: Migrate configs to v6.8.0 schemas
- Migration 24: Normalize file storage
* fix(mm): pop base/type/format when creating unknown model config
* fix(db): migration 22 insert only real cols
* fix(db): migration 23 fall back to unknown model when config change fails
* feat(db): run migrations 23 and 24
* fix(mm): false negative on flux lora
* fix(mm): vae checkpoint probe checking for dir instead of file
* fix(mm): ModelOnDisk skips dirs when looking for weights
Previously a path w/ any of the known weights suffixes would be seen as
a weights file, even if it was a directory. We now check to ensure the
candidate path is actually a file before adding it to the list of
weights.
* feat(mm): add method to get main model defaults from a base
* feat(mm): do not log when multiple non-unknown model matches
* refactor(mm): continued iteration on model identifcation
* tests(mm): refactor model identification tests
Overhaul of model identification (probing) tests. Previously we didn't
test the correctness of probing except in a few narrow cases - now we
do.
See tests/model_identification/README.md for a detailed overview of the
new test setup. It includes instructions for adding a new test case. In
brief:
- Download the model you want to add as a test case
- Run a script against it to generate the test model files
- Fill in the expected model type/format/base/etc in the generated test
metadata JSON file
Included test cases:
- All starter models
- A handful of other models that I had installed
- Models present in the previous test cases as smoke tests, now also
tested for correctness
* fix(mm): omit type/format/base when creating unknown config instance
* feat(mm): use ValueError for model id sanity checks
* feat(mm): add flag for updating models to allow class changes
* tests(mm): fix remaining MM tests
* feat: allow users to edit models freely
* feat(ui): add warning for model settings edit
* tests(mm): flux state dict tests
* tidy: remove unused file
* fix(mm): lora state dict loading in model id
* feat(ui): use translation string for model edit warning
* docs(db): update version numbers in migration comments
* chore: bump version to v6.9.0a1
* docs: update model id readme
* tests(mm): attempt to fix windows model id tests
* fix(mm): issue with deleting single file models
* feat(mm): just delete the dir w/ rmtree when deleting model
* tests(mm): windows CI issue
* fix(ui): typegen schema sync
* fix(mm): fixes for migration 23
- Handle CLIP Embed and Main SD models missing variant field
- Handle errors when calling the discriminator function, previously only
handled ValidationError but it could be a ValueError or something else
- Better logging for config migration
* chore: bump version to v6.9.0a2
* chore: bump version to v6.9.0a3
Fixes a test failure introduced by
https://github.com/pydantic/pydantic/pull/11957
TL;DR: "after" model validators should be instance methods, not class
methods. Batch model updated to use an instance method, which fixes the
failing test.
- Move migration of model-specific ui_types into BaseInvocation. This
gives us access to the node and field names, so the warnings are more
useful to the end user.
- Ensure we serialize the fields' json_schema_extra with enum values.
This wasn't a problem until now, when it interferes with migrating
ui_type cleanly. It's a transparent change.
- Improve warnings when validating fields (which includes the ui_type
migration logic)
Do not use whole layer as trigger for histo recalc; use the canvas cache
of the layer - it more reliably indicates when the layer pixel data has
changed, and fixes an issue where we can miss the first histo calc due
to race conditiong with async layer bbox calculation.
Added button checks to bbox rect and transformer mousedown/touchstart handlers to only process left clicks. Also added stage dragging check in onBboxDragMove to clear bbox drag state when middle mouse panning is active.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
When middle mouse button is used for canvas panning, the pointerup event was still creating points in the segmentation module. Added button check to onBboxDragEnd handler to only process left clicks.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fixed an issue where bounding boxes could grow exponentially when created at small sizes. The problem occurred because Konva Transformer modifies scaleX/scaleY rather than width/height directly, and the scale values weren't consistently reset after being applied to dimensions.
Changes:
- Ensure scale values are always reset to 1 after applying to dimensions
- Add minimum size constraints to prevent zero/negative dimensions
- Fix scale handling in transformend, dragend, and initial bbox creation
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Revised the Select Object feature to support two input modes:
- Visual mode: Combined points and bounding box input for paired SAM inputs
- Prompt mode: Text-based object selection (unchanged)
Key changes:
- Replaced three input types (points, prompt, bbox) with two (visual, prompt)
- Visual mode supports both point and bbox inputs simultaneously
- Click to add include points, Shift+click for exclude points
- Click and drag to draw bounding box
- Fixed bbox visibility issues when adding points
- Fixed coordinate system issues for proper bbox positioning
- Added proper event handling and interaction controls
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
There was a really confusing aspect of the SAM pipeline classes where
they accepted deeply nested lists of different dimensions (bbox, points,
and labels).
The lengths of the lists are related; each point must have a
corresponding label, and if bboxes are provided with points, they must
be same length.
I've refactored the backend API to take a single list of SAMInput
objects. This class has a bbox and/or a list of points, making it much
simpler to provide the right shape of inputs.
Internally, the pipeline classes take rejigger these input classes to
have the correct nesting.
The Nodes still have an awkward API where you can provide both bboxes
and points of different lengths, so I added a pydantic validator that
enforces correct lenghts.
Certain items in redux are ephemeral and omitted from persisted slices.
On rehydration, we need to inject these values back into the slice.
But there was an issue taht could prevent slice migrations from running
during rehydration.
The migrations look for the `_version` key in state and migrate the
slice accordingly.
The logic that merged in the ephemeral values accidentally _also_ merged
in the `_version` key if it didn't already exist. This happened _before_
migrations are run.
This causes problems for slices that didn't have a `_version` key and
then have one added via migration.
For example, the params slice didn't have a `_version` key until the
previous commit, which added `_version` and changed some other parts of
state in a migration.
On first load of the updated code, we have a catch-22 kinda situation:
- The persisted params slice is the old version. It needs to have both
`_version` and some other data added to it.
- We deserialize the state and then merge in ephemeral values. This
inadvertnetly also merged in the `_version` key.
- We run the slice migration. It sees there is a `_version` key and
thinks it doesn't need to run. The extra data isn't added to the slice.
The slice is parsed against its zod schema and fails because the new
data is missing.
- Because the parse failed, we treat the user's persisted data as
invalid and overwrite it with initial state, potentially causing data
loss.
The fix is to be more selective when merging in the ephemeral state
before migration - this is now done by checking which keys are on the
persist denylist and only adding those key.
This tells react that the component is a new instance each time we
change the image. Which, in turn, prevents a flash of the
previously-selected image during image switching and
progress-image-to-output-image-ing.
This has been an issue for a long time. I suspect it wasn't noticed
until now because it's finicky to trigger - you have to click and
release very quickly, without moving the mouse at all.
Must set cross origin whenever we load an image from a URL to prevent
race conditions where browser caches an image with no CORS, then canvas
attempts to load it with CORS, resulting in browser rejecting the
request before it is made
If incompatible LoRAs are added, prevent Invoking.
The logic to prevent adding incompatible LoRAs to graphs already
existed. This does not fix any generation bugs; just a visual
inconsistency where it looks like Invoke would use an incompatible LoRA.
Gemini 2.5 Flash makes no guarantees about output image sizes. Our
existing logic always rendered staged images on Canvas at the bbox dims
- not the image's physical dimensions. When Gemini returns an image that
doesn't match the bbox, it would get squished.
To rectify this, the canvas staging area renderer is updated to render
its images using their physical dimensions, as opposed to their
configured dimensions (i.e. bbox).
A flag on CanvasObjectImage enables this rendering behaviour.
Then, when saving the image as a layer from staging area, we use the
physical dimensions.
When the bbox and physical dimensions do not match, the bbox is not
touched, so it won't exactly encompass the staged image. No point in
resizing the bbox if the dimensions don't match - the next image could
be a different size, and the sizes might not be valid (it's an external
resource, after all).
- Disable LoRAs instead of deleting them when base model changes
- Update toast message to indicate that we may have _updated_ a model
(prev just sayed cleared or disabled)
- Do not change ref image models if the new base model doesn't support
them. For example, changing from SDXL to Imagen does not update the ref
image model or alert the user, because Imagen does not support ref
images. Switching from Imagen to FLUX does update the ref image model
and alert the user. Just a bit less noisy.
## Summary
Bump version
## Related Issues / Discussions
n/a
## QA Instructions
n/a
## Merge Plan
This is already released.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Fixes errors like `AttributeError: module 'cv2.ximgproc' has no
attribute 'thinning'` which occur because there is a conflict between
our own `opencv-contrib-python` dependency and the `invisible-watermark`
library's `opencv-python`.
Determine the "base" step for floats. If no `multipleOf` is provided,
the "base" step is `undefined`, meaning the float can have any number of
decimal places.
The UI library does its own step constrains though and is rounding to 3
decimal places. Probably need to update the logic in the UI library to
have truly arbitrary precision for float fields.
I ran into a race condition where I set a HF token and it was valid, but
somehow this error toast still appeared. The conditional feel through to
an assertion that we never expected to get to, which crashed the UI.
Handled the unexpected case gracefully now.
- Move the estimation logic to utility functions
- Estimate memory _within_ the encode and decode methods, ensuring we
_always_ estimate working memory when running a VAE
Three changes needed to make scrollIntoView and "Locate in Gallery" work
reliably.
1. Use setTimeout to work around race condition with scrollIntoView in
gallery.
It was possible to call scrollIntoView before react-virtuoso was ready.
I think react-virtuoso was initialized but hadn't rendered/measured its
items yet, so when we scroll to e.g. index 742, the items have a zero
height, so it doesn't actually scroll down. Then the items render.
Setting a timeout here defers the scroll until after the next event loop
cycle, by which time we expect react-virutoso to be ready.
2. Ensure the scollIntoView effect in gallery triggers any time the
selection is touched by making its dependency the array of selected
images, not just the last selected image name.
The "locate in gallery" functionality works by selecting an image.
There's a reactive effect in the gallery that runs when the last
selected image changes and scrolls it into view.
But if you already have an image selected, selecting it again will not
change the image name bc it is a string primitive. The useEffect ignores
the selection.
So, if you clicked "locate in gallery" on an image that was already
selected, it wouldn't be scrolled into view - even if you had already
scrolled away from it.
To work around this, the effect now uses the whole selection array as
its dependency. Whenever the selection changes, we get a new array,
which triggers the effect.
3. Gallery slice had some checks to avoid creating a new array of
selected image names in state when the selected images didn't change.
For example, if image "abc" was selected, and we selected "abc" again,
instead of creating a new array with the same "abc" image, we bailed
early. IIRC this optimization addressed a rerender issue long ago.
This optimization needs to be removed in order for fix#2 above to work.
We now _want_ a new array whenever selection is set - even if it didn't
actually change.
This feature added a lot of unexpected complexity in graph building /
metadata recall and is unintuitive user experience. 99% of the time, the
style prompt should be exactly the main prompt.
You can still use style prompts in workflows, but in an effort to reduce
complexity in the linear UI, we are removing this rarely-used feature.
When installing a model, the previous, graceful logic would increment a
suffix on the destination path until found a free path for the model.
But because model file installation and record creation are not in a
transaction, we could end up moving the file successfully and fail to
create the record:
- User attempts to install an already-installed model
- Attempt to move the downloaded model from download tempdir to
destination path
- The path already exists
- Add `_1` or similar to the path until we find a path that is free
- Move the model
- Create the model record
- FK constraint violation bc we already have a model w/ that name, but
the model file has already been moved into the invokeai dir.
Closes#8416
Prevents a large spike in VRAM when preparing to denoise w/ multiple ref
images.
There doesn't appear to be any different in image quality / ref
adherence when concatenating in latent space vs image space, though
images _are_ different.
If the transformer fills up VRAM, then when we VAE encode kontext
latents, we'll need to first offload the transformer (partially, if
partial loading is enabled).
No need to do this - we can encode kontext latents before loading the
transformer to reduce model thrashing.
Tell the model manager that we need some extra working memory for VAE
encoding operations to prevent OOMs.
See previous commit for investigation and determination of the magic
numbers used.
This safety measure is especially relevant now that we have FLUX Kontext
and may be encoding rather large ref images. Without the working memory
estimation we can OOM as we prepare for denoising.
See #8405 for an example of this issue on a very low VRAM system. It's
possible we can have the same issue on any GPU, though - just a matter
of hitting the right combination of models loaded.
This commit includes a task delegated to Claude to investigate our VAE
working memory calculations and investigation results.
See VAE_INVESTIGATION.md for motivation and detail. Everything else is
its output.
Result data includes empirical measurements for all supported model
architectures at a variety of resolutions and fp16/fp32 precision.
Testing conducted on a 4090.
The summarized conclusion is that our working memory estimations for
decoding are spot-on, but decoding also needs some extra working memory.
Empirical measurements suggest ~45% the amount needed for encoding.
A followup commit will implement working memory estimations for VAE
encoding with the goal of preventing unexpected OOMs during encode.
Currently translated at 98.6% (2037 of 2065 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (2037 of 2065 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (2036 of 2065 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (2014 of 2042 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
- Do not reset dimensions when resetting generation settings (they are
model-dependent, and we don't change model-dependent settings w/ that
butotn)
- Do not reset bbox when resetting canvas layers
- Show reset canvas layers button only on canvas tab
- Show reset generation settings button only on canvas or generate tab
Disable these items while staging:
- New Canvas From Image context menu
- Edit image hook & launchpad button
- Generate from Text launchpad button (only while on canvas tab)
- Use a Layout Image launchpad button
When unsafe_disable_picklescan is enabled, instead of erroring on
detections or scan failures, a warning is logged.
A warning is also logged on app startup when this setting is enabled.
The setting is disabled by default and there is no change in behaviour
when disabled.
Implements intelligent spatial tiling that arranges multiple reference
images in a virtual canvas, choosing between horizontal and vertical
placement to maintain a square-like aspect ratio
This fixes an issue where gallery's auto-scroll-into-view for selected
images didn't work, and users instead saw a "Unable to find image..."
debug log message in JS console.
1. Fix the run script to properly read the GPU_DRIVER
2. Cloned and adjusted the ROCM dockerbuild for docker
3. Adjust the docker-compose.yml to use the cloned dockerbuild
It's not clear why we were copying downloaded models to the destination
dir instead of moving them. I cannot find a reason for it, and I am able
to install single-file and diffusers models just fine with the change.
This fixes an issue where model installation requires 2x the model's
size (bc we were copying the model over).
Previously, we used pathlib's `with_suffix()` method to change add a
suffix (e.g. ".safetensors") to a model when installing it.
The intention is to add a suffix to the model's name - but that method
actually replaces everything after the first period.
This can cause different models to be installed under the same name!
For example, the FLUX models all end up with the same name:
- "FLUX.1 schnell.safetensors" -> "FLUX.safetensors"
- "FLUX.1 dev.safetensors" -> "FLUX.safetensors"
The fix is easy - append the suffix using string formatting instead of
using pathlib.
This issue has existed for a long time, but was exacerbated in
075345bffd in which I updated the names of
our starter models, adding ".1" to the FLUX model names. Whoops!
## Summary
Move client state persistence from browser to server.
- Add new client state persistence service to handle reading and writing
client state to db & associated router. The API mirrors that of
LocalStorage/IndexedDB where the set/get methods both operate on _keys_.
For example, when we persist the canvas state, we send only the new
canvas state to the backend - not the whole app state.
- The data is very flexibly-typed as a pydantic `JsonValue`. The client
is expected to handle all data parsing/validation (it must do this
anyways, and does this today).
- Change persistence from debounced to throttled at 2 seconds. Maybe
less is OK? Trying to not hammer the server.
- Add new persistence storage driver in client and use it in
redux-remember. It does its best to avoid extraneous persist requests,
caching the last data it persisted and noop-ing if there are no changes.
- Storage driver tracks pending persist actions using ref counts (bc
each slice is persisted independently). If there user navigates away
from the page during a persist request, it will give them the "you may
lose something if you navigate away" alert.
- This "lose something" alert message is not customizable (browser
security reasons).
- The alert is triggered only when the user closes the tape while a
persist network request is mid-flight. It's possible that the user makes
a change and closes the page before we start persisting. In this case,
they will lose the last 2 seconds of data.
- I tried making triggering the alert when a persist was waiting to
start, and it felt off.
- Maybe the alert isn't even necessary. Again you'd lose 2s of data at
most, probably a non issue. IMO after trying it, a subtle indicator
somewhere on the page is probably less confusing/intrusive.
- Fix an issue where the `redux-remember` enhancer was added _last_ in
the enhancer chain, which prevented us detecting when a persist has
succeeded. This required a small change to the `unserialze` utility
(used during rehydration) to ensure slices enhanced with `redux-undo`
are set up correctly as they are rehydrated.
- Restructure the redux store code to avoid circular dependencies. I
couldn't figure out how to do this without just smooshing it all into
the main `store.ts` file. Oh well.
Implications:
- Because client state is now on the server, different browsers will
have the same studio state. For example, if I start working on something
in Firefox, if I switch to Chrome, I have the same client state.
- Incognito windows won't do anything bc client state is server-side.
- It takes a bit longer for persistence to happen thanks to the
debounce, but there's now an indicator that tells you your stuff isn't
saved yet.
- Resetting the browser won't fix an issue with your studio state. You
must use `Reset Web UI` to fix it (or otherwise hit the appropriate
endpoint). It may be possible to end up in a Catch-22 where you can't
click the button and get stuck w/ a borked studio - I think to think
through this a bit more, might not be an issue.
- It probably takes a bit longer to start up, since we need to retrieve
client state over network instead of directly with browser APIs.
Other notes:
- We could explore adding an "incognito" mode, enabled via
`invokeai.yaml` setting or maybe in the UI. This would temporarily
disable persistence. Actually, I don't think this really makes sense, bc
all the images would be saved to disk.
- The studio state is stored in a single row in the DB. Currently, a
static row ID is used to force the studio state to be a singleton. It is
_possible_ to support multiple saved states. Might be a solve for app
workspaces.
## Related Issues / Discussions
n/a
## QA Instructions
Try it out. It's pretty straightforward. Error states are the main
things to test - for example, network blips. The new server-side
persistence driver is the only real functional change - everything else
is just kinda shuffling things around to support it.
## Merge Plan
n/a
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
It is accessible in two places:
- The queue actions hamburger menu.
- On the queue tab.
If the clear queue app feature is disabled, it is not shown in either of
those places.
Currently translated at 98.7% (1978 of 2003 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1978 of 2003 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1968 of 1994 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 92.0% (1851 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 92.0% (1851 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 92.0% (1851 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 87.4% (1744 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 87.4% (1744 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Currently translated at 97.9% (1953 of 1994 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1986 of 2011 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1970 of 1995 strings)
translationBot(ui): update translation (Italian)
Currently translated at 97.8% (1910 of 1952 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (2012 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (2012 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.7% (2006 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.7% (2006 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.5% (2002 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.5% (2002 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 97.8% (1968 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 97.8% (1968 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 97.8% (1968 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 97.8% (1968 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 96.4% (1940 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 96.4% (1940 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1921 of 1921 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1917 of 1917 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Fix nodes ui: Make nodes dot background to be the same as the snap to grid size and position
Update to Flow.tsx
Changes the size and offset of the dots background to be the same size as the snap to grid, and also fix the background dot pattern alignment.
Currently, the snapGrid is 25x25, and the default background dot gap is 20x20, these do not align. This is fixed by making the gap property of the background the same as the snapGrid.
Additionally, there is a bug in the rectFlow background code that incorrectly sets the offset to be the centre of the dot pattern with the default offset of 0. To work around this issue, setting the background offset property to the snapGrid size will realign the dot pattern correctly.
I have logged a bug for the rectFlow background issue in its repo.
https://github.com/xyflow/xyflow/issues/5405
Update workflowSettingsSlice.ts
Change the default settings for auto layout nodeSpacing and layerSpacing to 30 instead of 32. This will make the x position of auto layed nodes land on the snap to grid positions.
Because the node width (320) + 30 = 350 which is divisible by the snap to grid size of 25.
We intermittently get an error like this:
```
TypeError: Cannot read properties of undefined (reading 'length')
```
This error is caused by a `redux-undo`-enhanced slice being rehydrated
without the extra stuff it adds to the slice to make it undoable (e.g.
an array of `past` states, the `present` state, array of `future`
states, and some other metadata).
`redux-undo` may need to check the length of the past/future arrays as
part of its internal functionality. These keys don't exist so we get the
error. I'm not sure _why_ they don't exist - my understanding of
`redux-undo` is that it should be checking and wrapping the state w/ the
history stuff automatically. Seems to be related to `redux-remember` -
may be a race condition.
The solution is to ensure we wrap rehydrated state for undoable slices
as we rehydrate them. I discovered the solution while troubleshooting
#8314 when the changes therein somehow triggered the issue to start
occuring every time instead of rarely.
* Add auto layout controls using elkjs to node editor
Introduces auto layout functionality for the node editor using elkjs, including a new UI popover for layout options (placement strategy, layering, spacing, direction). Adds related state and actions to workflowSettingsSlice, updates translations, and ensures elkjs is included in optimized dependencies.
* feat(nodes): Improve workflow auto-layout controls and accuracy
- The auto-layout settings panel is updated to use `Select` dropdowns and `NumberInput`
- The layout algorithm now uses the actual rendered dimensions of nodes from the DOM, falling back to estimates only when necessary. This results in a much more accurate and predictable layout.
- The ELKjs library integration is refactored to fix some warnings
* Update useAutoLayout.ts
prettier
* feat(nodes): Improve workflow auto-layout controls and accuracy
- The auto-layout settings panel is updated to use `Select` dropdowns and `NumberInput`
- The layout algorithm now uses the actual rendered dimensions of nodes from the DOM, falling back to estimates only when necessary. This results in a much more accurate and predictable layout.
- The ELKjs library integration is refactored to fix some warnings
* Update useAutoLayout.ts
prettier
* build(ui): import elkjs directly
* updated to use dagrejs for autolayout
updated to use dagrejs - it has less layout options but is already included
but this is still WIP as some nodes don't report the height correctly. I am still investigating this...
* Update useAutoLayout.ts
update to fix layout issues
* minor updates
- pretty useAutoLayout.ts
- add missing type import in ViewportControls.tsx
- update pnpm-lock.yaml with elkjs removed
* Update ViewportControls.tsx
pnpm fix
* Fix Frontend check + single node selection fix
Fix Frontend check - remove unused export from workflowSettingsSlice.ts
Update so that if you have a single node selected, it will auto layout all nodes, as this is a common thing to have a single node selected and means that you don't have to unselect it.
* feat(ui): misc improvements for autolayout
- Split popover into own component
- Add util functions to get node w/h
- Use magic wand icon for button
- Fix sizing of input components
- Use CompositeNumberInput instead of base chakra number input
- Add zod schemas for string values and use them in the component to
ensure state integrity
* chore(ui): lint
---------
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
- Name it `pickerCompactViewStates` bc its not exclusive to model
picker, it is used for all pickers
- Rename redux action to model an event
- Move selector to right file
- Use selector to derive state for individual picker
There was a subtle issue where the progress image wasn't ever cleared,
preventing the context menu from working on staging area preview images.
The staging area preview images were displaying the last progress image
_on top of_ the result image. Because the image elements were so small,
you wouldn't notice that you were looking at a low-res progress image.
Right clicking a progress image gets you no menu.
If you refresh the page or switch tabs, this would fix itself, because
those actions clear out the progress images. The result image would then
be the topmost element, and the context menu works.
Fixing this without introducing a flash of empty space as the progress
image was hidden required a bit of refactoring. We have to wait for the
result image element to load before clearing out the progress.
Result - progress images appear to "resolve" to result images in the
staging area without any blips or jank, and the context menu works after
that happens.
Was running into difficultlies reasoning about the logic and couldn't
write tests because it was all in react.
Moved logic outside react, updated context, make it testable.
Simplify the canvas auto-switch logic to not rely on the preview images
loading. This fixes an issue where offscreen preview images didn't get
auto-switched to. Images are now loaded directly.
Fix an issue in certain browsers/builds causing a runtime error.
A zod enum has a .options property, which is an array of all the options
for the enum. This is handy for when you need to derive something from a
zod schema.
In this case, we represented the possible focus regions in the zod enum,
then derived a mapping of region names to set of target HTML elements.
Why isn't important, but suffice to say, we were using the .options
property for this.
But actually, we were using .options.values(), then calling .reduce() on
that. An array's .values() method returns an _array iterator_. Array
iterators do not have .reduce() methods!
Except, apparently in some environments they do - it depends on the JS
engine and whether or not polyfills for iterator helpers were included
in the build.
Turns out my dev environment - and most user browsers - do provide
.reduce(), so we didn't catch this error. It took a large deployment and
error monitoring to catch it.
I've refactored the code to totally avoid deriving data from zod in this
way.
- Add a context manager to the SqliteDatabase class which abstracts away
creating a transaction, committing it on success and rolling back on
error.
- Use it everywhere. The context manager should be exited before
returning results. No business logic changes should be present.
- Apparently locales must use hyphens instead of underscores. This must
have been a fairly recent change that we didn't catch. It caused i18n to
throw for Brasilian Portuguese and both Simplified and Traditional
Mandarin. Change the locales to use the right strings.
- Move the theme + locale provider inside of the error boundary. This
allows errors with locals to be caught by the error boundary instead of
hard-crashing the app. The error screen is unstyled if this happens but
at least it has the reset button.
- Add a migration for the system slice to fix existing users' language
selections. For example, if the user had an incorrect language setting
of `zh_CN`, it will be changed to the correct `zh-CN`.
The range-based fetching logic had a subtle bug - it didn't keep track
of what the _current_ visible range is - only the ranges that the user
last scrolled to.
When an image was added to the gallery, the logic saw that the images
had changed, but thought it had already loaded everything it needed to,
so it didn't load the new image.
The updated logic tracks the current visible range separately from the
accumulated scroll ranges to address this issue.
When the user scrolls in the gallery, we are alerted of the new range of
visible images. Then we fetch those specific images.
Previously, each change of range triggered a throttled function to fetch
that range. The throttle timeout was 100ms.
Now, each change of range appends that range to a list of ranges and
triggers the throttled fetch. The timeout is increased to 500ms, but to
compensate, each fetch handles all ranges that had been accumulated
since the last fetch.
The result is far fewer network requests, but each of them gets more
images.
- Smaller staged image previews.
- Move autoswitch buttons to staging area toolbar, remove from settings
popover and the little three-dots menu. Use persisted autoswitch
setting, which is renamed from `defaultAutoSwitch` to
`stagingAreaAutoSwitch`.
- Fix issue with misaligned border radii in staging area preview images.
Required small changes to DndImage and its usage elsewhere.
- Fix issue where staging area toolbar could show up without any
previews in the list.
- Migrate canvas settings slice to use zod schema and inferred types for
its state.
* dont show option to add new layer from if on generate tab
* only disable width/height recall is staging AND canvas tab
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-Air.lan>
Reverted incomplete change to how queue items are listed. In the future
I think we should redo it to work like the gallery. For now, it is back
the way it was in v5.
When percentage is zero, the progress bar looks the same as it does when
no generation is in progress. Render it as indeterminate (pulsing) when
percentage is zero to indicate that somethign is happenign.
* initializing prompt expansion and putting response in prompt box working for all methods
* properly disable UI and show loading state on prompt box when there is a pending prompt expansion item
* misc wrapup: disable apploying prompt templates, dont block textarea resize handle
* update progress to differentiate between prompt expansion and non
* cleanup
* lint
* more cleanup
* add image to background of loading state
* add allowPromptExpansion for front-end gating
* updated readiness text for needing to accept or discard
* fix tsc
* lint
* lint
* refactor(ui): prompt expansion logic
* tidy(ui): remove unnecessary changes
* revert(ui): unused arg on useImageUploadButton
* feat(ui): simplify prompt expansion state
* set pending for dragndrop and context menu
* add readiness logic for generate tab
* missing translation
* update error handling for prompt expansion
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-Air.lan>
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
Ensure disabled tabs are never mounted:
- Add didLoad flag to configSlice, default false
- Always merge in config - even it is is empty
- On first merge, set didLoad to true
- Until didLoad is true, mark _all_ tabs as disabled
This gets around an issue where tabs are all enabled for a brief moment
before the config is loaded.
A bit hacky but it works.
Co-authored-by: kent <kent@invoke.ai>
Revert unnecessary validation changes in multi-diffusion
Fix in python instead of graphbuilder
tidy(ui): remove extraneous comment
The previous logic had a subtle python bug related the scope and nested
generators.
Python generators are lazily evaluated - the expressions are stored and
only evaluated when needed (e.g. calling next() or list() on them)
The old logic used a variable `s`, which was continually overwritten as
the generator expressions were created. As a result, the final mappings
all use the _final_ value for `s`.
Following the consequences of this down the line, we find that collect
nodes can end up with multiple edges from exactly one of their ancestor
nodes, instead of one edge from each ancestor. Notably, it's only the
source _node_id_ that is affected - the source _fields_ have the correct
values.
So the invalid edges will point to a real node and a real field, but the
field exists on a different node.
---
This can result in a number of cryptic problems - include an error about
incompatible field types:
```
InvalidEdgeError: Field types are incompatible
(31758fd5-14a8-4de7-a840-b73ec1a1b94f.value ->
3459c793-41a2-4d82-9204-7df2d6d099ba.item)
```
Here are the conditions that lead to this error:
- The collect node has at least two incoming connections.
- The two incoming connections come from nodes of different types.
- The nodes both output a value of the same type, but the name of the
output field differs between them.
---
This commit uses non-generator logic to build up the mappings, avoiding
the issue entirely. As a bonus, it is much easier to read.
Previously we used python's own type introspection utilties to determine
input and output field types. We can use pydantic to get the field types
in a clearer, more direct way.
This improvement also exposed an awkward behaviour in this utility,
where it would return None when a field doesn't exist. I've added a
comment in the code describing the issue, but changing it would require
some significant changes and I don't want to risk breaking anything.
* Add Rule of 4 composition guide to canvas settings and rendering
Co-authored-by: kent <kent@invoke.ai>
* Rename Rule of 4 Guide to Rule of Thirds in canvas composition guide
Co-authored-by: kent <kent@invoke.ai>
* Updates to comp guide and naming
* Fix reference
* Update translation keys and organize settings.
* revert to previous canvas manager for conflict
* Re-add composition guide.
* Fix lint
* prettier
* feat(ui): improve markup in canvas settings popover
* feat(ui): use brand colors for canvas rule of thirds guide
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
Enhance LoRA picker to default filter by current base model architecture
## Summary
Fixes new LoRA picker to auto select the architecture filter for the
current model group
## Related Issues / Discussions
N/A
## QA Instructions
Open LoRA menu with any model group selected. The right models should be
filtered.
## Merge Plan
Merge when ready.
## Checklist
- [X] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
When we delete images, boards, or do any other board mutation, we need
to invalidate numerous query caches and related internal frontend state.
This gets complicated very quickly.
We can drastically reduce the complexity by having the backend return
some more information when we make these mutations.
For example, when deleting a list of images by name, we can return a
list of deleted image name and affected boards. The frontend can use
this information to determine which queries to invalidate with far less
tedium.
This will also enable the more efficient storage of images (e.g. in the
gallery selection). Previously, we had to store the entire image DTO
object, else we wouldn't be able to figure out which queries to
invalidate. But now that the backend tells us exactly what images/boards
have changed, we can just store image names in frontend state. This
amounts to a substantial improvement in DX and reduction in frontend
complexity.
When the invocation cache is used, we might skip all progress images. This can prevent auto-switch-on-first-progress from working, as we don't get any of those events.
It's much easier to only support auto-switch on complete.
This appears to be a bug in Chakra UI v2 - use of a fallback component makes the ref passed to an image end up undefined. Had to remove the skeleton loader fallback component.
* add support for flux-kontext models in nodes
* flux kontext in canvas
* add aspect ratio support
* lint
* restore aspect ratio logic
* more linting
* typegen
* fix typegen
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-Air.lan>
## Summary
Support for
[OMI](https://github.com/Open-Model-Initiative/OMI-Model-Standards/tree/main)
LoRAs that use Flux and SDXL as the base model. Automated tests for
config classification. Manually tested (visual inspection) for LoRA
loading and execution.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
In #7724 we made a number of perf optimisations related to enqueuing. One of these optimisations included moving the enqueue logic - including expensive prep work and db writes - to a separate thread.
At the same time manual DB locking was abandoned in favor of WAL mode.
Finally, we set `check_same_thread=False` to allow multiple threads to access the connection at a given time.
I think this may be the cause of #7950:
- We start an enqueue in a thread (running in bg)
- We dequeue
- Dequeue pulls a partially-written queue item from DB and we get the errors in the linked issue
To be honest, I don't understand enough about SQLite to confidently say that this kind of race condition is actually possible. But:
- The error started popping up around the time we made this change.
- I have reviewed the logic from enqueue to dequeue very carefully _many_ times over the past month or so, and I am confident that the error is only possible if we are getting unexpectedly `NULL` values from the DB.
- The DB schema includes `NOT NULL` constraints for the column that is apparently returning `NULL`.
- Therefore, without some kind of race condition or schema issue, the error should not be possible.
- The `enqueue_batch` call is the only place I can find where we have the possibility of a race condition due to async logic. Everywhere else, all DB interaction for the queue is synchronous, as far as I can tell.
This change retains the perf benefits by running the heavy enqueue prep logic in a separate thread, but moves back to the main thread for the DB write. It also uses an explicit transaction for the write.
Will just have to wait and see if this fixes the issue.
This reduces peak memory usage at a negligible cost. Queue items typically take on the order of seconds, making the time cost of a GC essentially free.
Not a great idea on a hotter code path though.
We've long suspected there is a memory leak in Invoke, but that may not be true. What looks like a memory leak may in fact be the expected behaviour for our allocation patterns.
We observe ~20 to ~30 MB increase in memory usage per session executed. I did some prolonged tests, where I measured the process's RSS in bytes while doing 200 SDXL generations. I found that it eventually leveled off at around 100 generations, at which point memory usage had climbed by ~900MB from its starting point.
I used tracemalloc to diff the allocations of single session executions and found that we are allocating ~20MB or so per session in `ModelPatcher.apply_ti()`.
In `ModelPatcher.apply_ti()` we add tokens to the tokenizer when handling TIs. The added tokens should be scoped to only the current invocation, but there is no simple way to remove the tokens afterwards.
As a workaround for this, we clone the tokenizer, add the TI tokens to the clone, and use the clone to when running compel. Afterwards, this cloned tokenizer is discarded.
The tokenizer uses ~20MB of memory, and it has referrers/referents to other compel stuff. This is what is causing the observed increases in memory per session!
We'd expect these objects to be GC'd but python doesn't do it immediately. After creating the cond tensors, we quickly move on to denoising. So there isn't any time for the GC to happen to free up its existing memory arenas/blocks to reuse them. Instead, python needs to request more memory from the OS.
We can improve the situation by immediately calling `del` on the tokenizer clone and related objects. In fact, we already had some code in the compel nodes to `del` some of these objects, but not all.
Adding the `del`s vastly improves things. We hit peak RSS in half the sessions (~50 or less) and it's now ~100MB more than starting value. There is still a gradual increase in memory usage until we level off.
* build: prevent `opencv-python` from being installed
Fixes this error: `AttributeError: module 'cv2.ximgproc' has no attribute 'thinning'`
`opencv-contrib-python` supersedes `opencv-python`, providing the same API + additional features. The two packages should not be installed at the same time to avoid conflicts and/or errors.
The `invisible-watermark` package requires `opencv-python`, but we require the contrib variant.
This change updates `pyproject.toml` to prevent `opencv-python` from ever being installed using a `uv` features called dependency overrides.
* feat(ui): data viewer supports disabling wrap
* feat(api): list _all_ pkgs in app deps endpoint
* chore(ui): typegen
* feat(ui): update about modal to display new full deps list
* chore: uv lock
When a layer is initialized, we do not yet know its bbox, so we cannot fit the stage view to the layer. We have to wait for the bbox calculation to finish. Previously, we had no way to wait unti lthat bbox calculation was complete to take an action.
For example, this means we could not fit the layers to the stage immediately after creating a new layer, bc we don't know the dimensions of the layer yet.
This callback lets us do that. When creating a new canvas from an image, we now...
- Register a bbox update callback to fit the layers to stage
- Layer is created
- Canvas initializes the layer's entity adapter module (layer's width and height are set to zero at this point)
- Canvas calculates the bbox
- Bbox is updated (width and height are now correct)
- Callback is ran, fitting layer to stage
Also change import order to ensure CLI args are handled correctly. Had to do this bc importing `InvocationRegistry` before parsing args resulted in the `--root` CLI arg being ignored.
Add `heuristic_resize_fast`, which does the same thing as `heuristic_resize`, except it's about 20x faster.
This is achieved by using opencv for the binary edge handling isntead of python, and checking only 100k pixels to determine what kind of image we are working with.
Besides being much faster, it results in cleaner lines for resized binary canny edge maps, and has results in fewer misidentified segmentation maps.
Tested against normal images, binary canny edge maps, grayscale HED edge maps, segmentation maps, and normal images.
Tested resizing up and down for each.
Besides the new utility function, I needed to swap the `opencv-python` dep for `opencv-contrib-python`, which includes `cv2.ximgproc.thinning`. This function accounts for a good chunk of the perf improvement.
Upstream bug in `transformers` breaks use of `AutoModelForMaskGeneration` class to load SAM models
Simple fix - directly load the model with `SamModel` class instead.
See upstream issue https://github.com/huggingface/transformers/issues/38228
## Summary
- Fallback to new classification API if legacy probe fails
- Method to read model metadata
- Created `StrippedModelOnDisk` class for testing
- Test to verify only a single config `matches` with a model
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
For example:
```py
my_field: Literal["foo", "bar"] | None = InputField(default=None)
```
Previously, this would cause a field parsing error and prevent the app from loading.
Two fixes:
- This type annotation and resultant schema are now parsed correctly
- Error handling added to template building logic to prevent the hang at startup when an error does occur
Major cleanup of RelatedModels.tsx for improved readability, structure, and maintainability.
Dried out repetitive logic
Consolidated model type sorting into reusable helpers
Added disallowed model type relationships to prevent broken connections (e.g. VAE ↔ LoRA)
- Aware this introduces a new constraint—open to feedback (see PR comment)
Some naming and types may still need refinement; happy to revisit
Adds full support for managing model-to-model relationships in the UI and backend.
Introduces RelatedModels subpanel for linking and unlinking models in model management.
- Adds REST API routes for adding, removing, and retrieving model relationships.
- New database migration: creates model_relationships table for bidirectional links.
- New service layer (model_relationships) for relationship management.
- Updated frontend: Related models float to top of LoRA/Main grouped model comboboxes for quick access.
- Added 'Show Only Related' toggle badge to MainModelPicker filter bar
**Amended commit to remove changes to ParamMainModelSelect.tsx and MainModelPicker.tsx to avoid conflict with upstream deletion/ rewrite**
## Summary
- Modify stats reset to be on a per session basis, rather than a "full
reset", to allow for parallel session execution
- Add "aider" to gitignore
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Currently translated at 67.1% (1279 of 1904 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 64.9% (1231 of 1895 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 60.2% (1141 of 1895 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 56.7% (1075 of 1895 strings)
Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1896 of 1896 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1895 of 1895 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1886 of 1886 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Currently translated at 98.8% (1883 of 1904 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1882 of 1903 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1881 of 1902 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1878 of 1899 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1874 of 1895 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1873 of 1895 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1864 of 1886 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
When we do our field type overrides to allow invocations to be instantiated without all required fields, we were not modifying the annotation of the field but did set the default value of the field to `None`.
This results in an error when doing a ser/de round trip. Here's what we end up doing:
```py
from pydantic import BaseModel, Field
class MyModel(BaseModel):
foo: str = Field(default=None)
```
And here is a simple round-trip, which should not error but which does:
```py
MyModel(**MyModel().model_dump())
# ValidationError: 1 validation error for MyModel
# foo
# Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
# For further information visit https://errors.pydantic.dev/2.11/v/string_type
```
To fix this, we now check every incoming field and update its annotation to match its default value. In other words, when we override the default field value to `None`, we make its type annotation `<original type> | None`.
This prevents the error during deserialization.
This slightly alters the schema for all invocations and outputs - the values of all fields without default values are now typed as `<original type> | None`, reflecting the overrides.
This means the autogenerated types for fields have also changed for fields without defaults:
```ts
// Old
image?: components["schemas"]["ImageField"];
// New
image?: components["schemas"]["ImageField"] | null;
```
This does not break anything on the frontend.
* support for custom error toast components, starting with usage limit
* add support for all usage limits
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
* display credit column in queue list if shouldShowCredits is true
* change apiModels feature to chatGPT4oModels feature
* empty
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
When I followed the Contribute Node documentation, I encountered an import error.
This commit fixes the error, which will help reduce debugging time for all future contributors.
* add GPTimage1 as allowed base model
* fix for non-disabled inpaint layers
* lots of boilerplate for adding gpt-image base model and disabling things along with imagen
* handle gpt-image dimensions
* build graph for gpt-image
* lint
* feat(ui): make chatgpt model naming consistent
* feat(ui): graph builder naming
* feat(ui): disable img2img for imagen3
* feat(ui): more naming
* feat(ui): support presigned url prefetch
* feat(ui): disable neg prompt for chatgpt
* docs(ui): update docstring
* feat(ui): fix graph building issues for chatgpt
* fix(ui): node ids for chatgpt/imagen
* chore(ui): typegen
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
If provided, `<NavigateToModelManagerButton />` will render, even if `disabledTabs` includes "models". If provided, `<NavigateToModelManagerButton />` will run the callback instead of switching tabs within the studio.
The button's tooltip is now just "Manage Models" and its icon is the same as the model manager tab's icon ([CUBE!](https://www.youtube.com/watch?v=4aGDCE6Nrz0)).
There is a subtle change in behaviour with the new model probe API.
Previously, checks for model types was done in a specific order. For example, we did all main model checks before LoRA checks.
With the new API, the order of checks has changed. Check ordering is as follows:
- New API checks are run first, then legacy API checks.
- New API checks categorized by their speed. When we run new API checks, we sort them from fastest to slowest, and run them in that order. This is a performance optimization.
Currently, LoRA and LLaVA models are the only model types with the new API. Checks for them are thus run first.
LoRA checks involve checking the state dict for presence of keys with specific prefixes. We expect these keys to only exist in LoRAs.
It turns out that main models may have some of these keys.
For example, this model has keys that match the LoRA prefix `lora_te_`: https://civitai.com/models/134442/helloyoung25d
Under the old probe, we'd do the main model checks first and correctly identify this as a main model. But with the new setup, we do the LoRA check first, and those pass. So we import this model as a LoRA.
Thankfully, the old probe still exists. For now, the new probe is fully disabled. It was only called in one spot.
I've also added the example affected model as a test case for the model probe. Right now, this causes the test to fail, and I've marked the test as xfail. CI will pass.
Once we enable the new API again, the xfail will pass, and CI will fail, and we'll be reminded to update the test.
In the previous commit, the LLaVA model was updated to support partial loading.
In this commit, the SigLIP model is updated in the same way.
This model is used for FLUX Redux. It's <4GB and only ever run in isolation, so it won't benefit from partial loading for the vast majority of users. Regardless, I think it is best if we make _all_ models work with partial loading.
PS: I also fixed the initial load dtype issue, described in the prev commit. It's probably a non-issue for this model, but we may as well fix it.
The model manager has two types of model cache entries:
- `CachedModelOnlyFullLoad`: The model may only ever be loaded and unloaded as a single object.
- `CachedModelWithPartialLoad`: The model may be partially loaded and unloaded.
Partial loaded is enabled by overwriting certain torch layer classes, adding the ability to autocast the layer to a device on-the-fly. See `CustomLinear` for an example.
So, to take advantage of partial loading and be cached as a `CachedModelWithPartialLoad`, the model must inherit from `torch.nn.Module`.
The LLaVA classes provided by `transformers` do inherit from `torch.nn.Module`, but we wrap those classes in a separate class called `LlavaOnevisionModel`. The wrapper encapsulate both the LLaVA model and its "processor" - a lightweight class that prepares model inputs like text and images.
While it is more elegant to encapsulate both model and processor classes in a single entity, this prevents the model cache from enabling partial loading for the chunky vLLM model.
Fixing this involved a few changes.
- Update the `LlavaOnevisionModelLoader` class to operate on the vLLM model directly, instead the `LlavaOnevisionModel` wrapper class.
- Instantiate the processor directly in the node. The processor is lightweight and does its business on the CPU. We don't need to worry about caching in the model manager.
- Remove caching support code from the `LlavaOnevisionModel` wrapper class. It's not needed, because we do not cache this class. The class now only handles running the models provided to it.
- Rename `LlavaOnevisionModel` to `LlavaOnevisionPipeline` to better represent its purpose.
These changes have a bonus effect of fixing an OOM crash when initially loading the models. This was most apparent when loading LLaVA 7B, which is pretty chunky.
The initial load is onto CPU RAM. In the old version of the loaders, we ignored the loader's target dtype for the initial load. Instead, we loaded the model at `transformers`'s "default" dtype of fp32.
LLaVA 7B is fp16 and weighs ~17GB. Loading as fp32 means we need double that amount (~34GB) of CPU RAM. Many users only have 32GB RAM, so this causes a _CPU_ OOM - which is a hard crash of the whole process.
With the updated loaders, the initial load logic now uses the target dtype for the initial load. LLaVA now needs the expected ~17GB RAM for its initial load.
PS: If we didn't make the accompanying partial loading changes, we still could have solved this OOM. We'd just need to pass the initial load dtype to the wrapper class and have it load on that dtype. But we may as well fix both issues.
PPS: There are other models whose model classes are wrappers around a torch module class, and thus cannot be partially loaded. However, these models are typically fairly small and/or are run only on their own, so they don't benefit as much from partial loading. It's the really big models (like LLaVA 7B) that benefit most from the partial loading.
Currently translated at 56.6% (1069 of 1887 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 50.8% (960 of 1887 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 48.4% (912 of 1882 strings)
Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
I am at loss as the to cause of this bug. The styles that I needed to change to fix it haven't been changed in a couple months. But these do seem to fix it.
Closes#7910
This query can have potentially large responses. Keeping them around for 24 hours essentially a hardcoded memory leak. Use the default for RTKQ of 60 seconds.
When users generate on the canvas or upscaling tabs, we parse prompts through dynamic prompts before invoking. Whenever the prompt or other settings change, we run dynamic prompts.
Previously, we used a redux listener to react to changes to dynamic prompts' dependent state, keeping the processed dynamic prompts synced. For example, when the user changed the prompt field, we re-processed the dynamic prompts.
This requires that all redux actions that change the dependent state be added to the listener matcher. It's easy to forget actions, though, which can result in the dynamic prompts state being stale.
For example, when resetting canvas state, we dispatch an action that resets the whole params slice, but this wasn't in the matcher. As a result, when resetting canvas, the dynamic prompts aren't updated. If the user then clicks Invoke (with an empty prompt), the last dynamic prompts state will be used.
For example:
- Generate w/ prompt "frog", get frog
- Click new canvas session
- Generate without any prompt, still get frog
To resolve this, the logic that keeps the dynamic prompts synced is moved from the listener to a hook. The way the logic is triggered is improved - it's now triggered in a useEffect, which is run when the dependent state changes. This way, it doesn't matter _how_ the dependent state changes - the changes will always be "seen", and the dynamic prompts will update.
Add `useCanvasIsBusySafe()` hook. This is like `useCanvasIsBusy()`, but when the canvas is not initialized, it gracefully falls back to false instead of raising.
Because app tabs are lazy-loaded, the canvas is not initialized until the user visits that tab. If the page loads up on the workflows tab, the canvas will be uninitialized until the user clicks on it.
This graceful fallback behaviour allows actions like sending an image to canvas to work even when the canvas is not yet initialized. These actions are exposed in the image context menu, and previously were hidden when the canvas was not initialized. We can now show these actions and use them even when the canvas is uninitialized.
- Add `useCanvasIsBusySafe()` hook
- Use the new hook in the image context menu for send to canvas actions
- Do not use `<CanvasManagerProviderGate />` in the image context menu (this was hiding the actions when canvas was uninitialized)
When calling `ctx.drawImage()`, if the image to be drawn has a width of height of 0, the call will raise.
In this change, I have carefully reviewed the call hierarchy for all of our own code that calls this method and ensured that each call has error handling.
Well, with one exception - I'm not sure how to handle errors in `invokeai/frontend/web/src/common/hooks/useClientSideUpload.ts`. But this should never be an issue in that hook - it's a Canvas problem.
Currently translated at 100.0% (1873 of 1873 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1871 of 1871 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.2% (1857 of 1871 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1840 of 1840 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Whether a workflow is published or not shouldn't be something stored on the client. It's properly server-side state.
This change removes the `is_published` flag from redux and updates all references to the flag to use the getWorkflow query.
It also updates the socket event listener that handles session complete events. When a validation run completes, we invalidate the tags for the getWorkflow query. We need to do a bit of juggling to avoid a race condition (documented in the code). Works well though.
Previously, we maintained an `isTouched` flag in redux state to indicate if a workflow had unsaved changes. We manually updated this whenever we changed something on the workflow.
This was tedious and error-prone. It also didn't handle undo/redo, so if you made a change to a node and undid it, we'd still think the workflow had unsaved changes.
Moving forward, we use a simpler and more robust strategy by hashing the server's version of the workflow and comparing it to the client's version of the workflow.
The hashing uses `stable-hash`, which is both fast and, well, stable. Most importantly, the ordering of keys in hashed objects does not change the resultant hash.
- Remove `isTouched` state entirely.
- Extract the logic that builds the "preview" workflow object from redux state into its own hook. This "preview" workflow is what we send to the server when saving a workflow. This "preview" workflow is effectively the client version of the workflow.
- Add `useDoesWorkflowHaveUnsavedChanges()` hook, which compares the hash of the client workflow and server workflow (if it exists).
- Add `useIsWorkflowUntouched()` hook, which compares the hash of the client workflow and the initial workflow that you get when you click new workflow.
- Remove `reactflow` workaround in the nodes slice undo/redo filter. When we set the nodes state while loading a workflow, `reactflow` emits a nodes size/placement change event. This triggered up our `isTouched` flag logic and marked the workflow as unsaved right from the get-go. With the new strategy to track touched status, this workaround can be removed.
- Update all logic that tracked the old `isTouched` flag to use the new hooks.
Previously, the workflow form's root element id was random. Every time we reset the workflow editor, the root id changed. This makes it difficult to check if the workflow editor is untouched (in its default state).
Now that root element's id is simply "root". I can't imagine any way that this would break anything.
This allows it to pull in sentencepiece on its own. In 0.10.0, it didn't have this package listed as a dependency, but in recent releases it does. So we are able to remove sentencepiece as an explicit dep.
The fixes in this module monkeypatched `torch` to resolve some issues with FP16 on macOS. These issues have long since been resolved.
Included in the now-removed fixes is `CustomSlicedAttentionProcessor`, which is intended to reduce memory requirements for MPS. This overrides `diffusers`' own `SlicedAttentionProcessor`.
Unfortunately, `attention_type: sliced` produces hot garbage with the fixes and black images without the fixes. So this class appears to now be a moot point.
Regardless, SDPA is supported on MPS and very efficient, so sliced attention is largely obsolete.
In https://github.com/pydantic/pydantic/pull/10029, pydantic made an improvement to its generated JSON schemas (OpenAPI schemas). The previous and new generated schemas both meet the schema spec.
When we parse the OpenAPI schema to generate node templates, we use some typeguard to narrow schema components from generic OpenAPI schema objects to a node field schema objects. The narrower node field schema objects contain extra data.
For example, they contain a `field_kind` attribute that indicates it the field is an input field or output field. These extra attributes are not part of the OpenAPI spec (but the spec allows does allow for this extra data).
This typeguard relied on a pydantic implementation detail. This was changed in the linked pydantic PR, which released with v2.9.0. With the change, our typeguard rejects input field schema objects, causing parsing to fail with errors/warnings like `Unhandled input property` in the JS console.
In the UI, this causes many fields - mostly model fields - to not show up in the workflow editor.
The fix for this is very simple - instead of relying on an implementation detail for the typeguard, we can check if the incoming schema object has any of our invoke-specific extra attributes. Specifically, we now look for the presence of the `field_kind` attribute on the incoming schema object. If it is present, we know we are dealing with an invocation input field and can parse it appropriately.
In `ObjectSerializerDisk`, we use `torch.load` to load serialized objects from disk. With torch 2.6.0, torch defaults to `weights_only=True`. As a result, torch will raise when attempting to deserialize anything with an unrecognized class.
For example, our `ConditioningFieldData` class is untrusted. When we load conditioning from disk, we will get a runtime error.
Torch provides a method to add trusted classes to an allowlist. This change adds an arg to `ObjectSerializerDisk` to add a list of safe globals to the allowlist and uses it for both `ObjectSerializerDisk` instances.
Note: My first attempt inferred the class from the generic type arg that `ObjectSerializerDisk` accepts, and added that to the allowlist. Unfortunately, this doesn't work.
For example, `ConditioningFieldData` has a `conditionings` attribute that may be one some other untrusted classes representing model-specific conditioning data. So, even if we allowlist `ConditioningFieldData`, loading will fail when torch deserializes the `conditionings` attribute.
This is a squash of a lot of scattered commits that became very difficult to clean up and make individually. Sorry.
Besides the new UI, there are a number of notable changes:
- Publishing logic is disabled in OSS by default. To enable it, provided a `disabledFeatures` prop _without_ "publishWorkflow".
- Enqueuing a workflow is no longer handled in a redux listener. It was hard to track the state of the enqueue logic in the listener. It is now in a hook. I did not migrate the canvas and upscaling tabs - their enqueue logic is still in the listener.
- When queueing a validation run, the new `useEnqueueWorkflows()` hook will update the payload with the required data for the run.
- Some logic is added to the socket event listeners to handle workflow publish runs completing.
- The workflow library side nav has a new "published" view. It is hidden when the "publishWorkflow" feature is disabled.
- I've added `Safe` and `OrThrow` versions of some workflows hooks. These hooks typically retrieve some data from redux. For example, a node. The `Safe` hooks return the node or null if it cannot be found, while the `OrThrow` hooks return the node or raise if it cannot be found. The `OrThrow` hooks should be used within one of the gate components. These components use the `Safe` hooks and render a fallback if e.g. the node isn't found. This change is required for some of the publish flow UI.
- Add support for locking the workflow editor. When locked, you can pan and zoom but that's it. Currently, it is only locked during publish flow and if a published workflow is opened.
This message is logged _every_ time we retrieve a list of models if there is an invalid model. Previously it logged the _whole_ row which can be a lot of data. Truncate the row to 64 characters to reduce log pollution.
Currently translated at 98.8% (1818 of 1840 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1816 of 1840 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1816 of 1839 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Previously, reactflow appears to have handled an edge case when using its `applyChanges` utility. If a change was provided without an item, it would skip that change. For example, an "add edge" change that somehow passed `null` as the edge, instead of a valid edge.
In our workflow loading and validation logic, invalid edges were removed from the array using `delete edges[i]`. This left "holes" in the array of edges. We then asked `reactflow` to add these edges to state. When it encountered one of the "holes", it skipped over it.
In a recent release (unsure which, somewhere between the latest v11 and ~v12.4) this seems to have changed. It no longer skips over the "holes" and instead trusts the data. This can cause a couple issues:
- Error when loading the workflow if `reactflow` attempt to do anything with the nonexistent edge.
- If somehow the workflow makes it into state with "holes" in the array of edges, all sorts of other stuff breaks when our code does anything with the nonexistent edge.
Two-part fix:
- Update the invalid edge handling to not use `delete edges[i]`. Instead, as we check each edge, we add invalid ones to a set. Then, after all the checks are finished, filter out the invalid edges. The resultant edges array has no holes.
- Simplify the logic around setting nodes and edges in redux. Previously we were using `reactflow`'s `applyChanges` utils, but this does literally nothing except take extra CPU cycles. We can simply set the loaded nodes and edges directly in redux. Perhaps we were using `applyChanges` because it addressed the "holes" issue? Not sure. But we don't need it now.
Closes#7868
## Summary
`timm` below 1.0.0 prevents llava models from working (broken in
transformers). but `controlnet-aux` pins `timm` to an earlier version
because otherwise it was breaking the ZoeDepth controlnet.
we don't use ZoeDepth (replaced by depthAnything), and downgrading
controlnet-aux seems to be acceptable.
more context here:
https://github.com/huggingface/controlnet_aux/issues/106https://github.com/huggingface/controlnet_aux/pull/101
Note that this results in some warnings on startup, stemming from
controlnet-aux:

we can probably silence the warnings as a separate enhancement
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
- Port LoRA to new classification API
- Add 2 additional tests cases (ControlLora and Flux Diffusers LoRA)
- Moved `ModelOnDisk` to its own module
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Before FLUX Fill was merged, we didn't do any checks for the model variant. We always returned "normal".
To determine if a model is a FLUX Fill model, we need to check the state dict for a specific key. Initially, this logic was too strict and rejected quantized FLUX models. This issue was resolved, but it turns out there is another failure mode - some fine-tunes use a different key.
This change further reduces the strictness, handling the alternate key and also falling back to "normal" if we don't see either key. This effectively restores the previous probing behaviour for all FLUX models.
Closes#7856Closes#7859
The polynomial fit isn't perfect and we end up with alpha values of 1 instead of 0 when applying the mask. This in turn causes issues on canvas where outputs aren't 100% transparent and individual layer bbox calculations are incorrect.
Lots of squashed experimentation heh:
ci: manually specify python version in tests
ci: whoops typo in ruff cmds
ci: specify python versions for uv python install
ci: install python verbosely
ci: try forcing python preference?
ci: try forcing python preference a different way?
ci: try in a venv?
ci: it works, but try without venv
ci: oh maybe we need --preview?
ci: poking it with a stick
ci: it works, add summary to pytest output
ci: fix pytest output
experiment: simulate test failure
Revert "experiment: simulate test failure"
This reverts commit b99ca512f6e61a2a04a1c0636d44018c11019954.
ci: just use default pytest output
cI: attempt again to use uv to install python
cI: attempt again again to use uv to install python
Revert "cI: attempt again again to use uv to install python"
This reverts commit 3cba861c90738081caeeb3eca97b60656ab63929.
Revert "cI: attempt again to use uv to install python"
This reverts commit b30f2277041dc999ed514f6c594c6d6a78f5c810.
## Summary
- Extend `ModelOnDisk` with caching, type hints, default args
- Fail early if there is an error classifying a config
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR moves type definitions out of `config.py` into a new
`taxonomy.py` module.
The goal is to reduce clutter in `config.py`, and to resolve circular
import issues by isolating these types in a dedicated module with
(almost) no internal dependencies.
Because so many places import these definitions, these changes touch 73
files.
Additional changes:
- Removed star imports using "removestar" tool
- Added the commit to `.git-blame-ignore-revs` to avoid noise in git
blame history
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
The top-level `invokeai` package may have an obscured origin due to the way editible installs work, but it's much more likely that this module is from a specific file.
## Summary
This test imports all modules in the invokeai package and fails if there
are any exceptions.
Existing issues are excluded to avoid blocking main.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
- Port LLaVA model config to new classification API
- Add 2 test cases (stripped LLaVA models variants to git-lfs)
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
In #7780 we added FLUX Fill support, and needed the probe to be able to distinguish between "normal" FLUX models and FLUX Fill models.
Logic was added to the probe to check a particular state dict key (input channels), which should be 384 for FLUX Fill and 64 for other FLUX models.
The new logic was stricter and instead of falling back on the "normal" variant, it raised when an unexpected value for input channels was detected.
This caused failures to probe for BNB-NF4 quantized FLUX Dev/Schnell, which apparently only have 1 input channel.
After checking a variety of FLUX models, I loosened the strictness of the variant probing logic to only special-case the new FLUX Fill model, and otherwise fall back to returning the "normal" variant. This better matches the old behaviour and fixes the import errors.
Closes#7822
Currently translated at 100.0% (1827 of 1827 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1826 of 1826 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1825 of 1825 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Previously we used erode/dilate and a Gaussian blur to expand and fade the edges of Canvas masks. The implementation a number of problems:
- Erode/dilate kernel sizes were not calculated correctly, and extra iterations were run to compensate. The result is the blur size, which should have been pixels, was very inaccurate and unreliable.
- What we want is to add a "soft bleed" - like a drop shadow with no offset - starting from the edge of the mask, extending out by however many pixels. But Gaussian blur does not do this. The blurred area starts _inside_ the mask and extends outside it. So it kinda blurs inwards and outwards. We compensated for this by expanding the mask.
- Using a Gaussian blur can cause banding artifacts. Gaussian blur doesn't have a "size" or "radius" parameter in the sense that you think it should. It's a convolution matrix and there are _no non-zero values in the result_. This means that, far away from the mask, once compositing completes, we have some values that are very close to zero but not quite zero. These values are quantized by HTML Canvas, resulting in banding artifacts where you'd expect the blur to have faded to 0% alpha. At least, that is my understanding of why the banding artifacts occur.
The new node uses a better strategy to expand the mask and add the fade out effect:
- Calculate the distance from each white pixel to the nearest black pixel.
- Normalize this distance by dividing by the fade size in px, then clip the values to 0 - 1. The result represents the distance of each white pixel to its nearest black pixel as a percentage of the fade size. At this point, it is a linear distribution.
- Create a polynomial to describe the fade's intensity so that we can have a smooth transition from the masked region (black) to unmasked (white). There are some magic numbers here, deterined experimentally.
- Evaluate the polynomial over the normalized distances, so we now have a matrix representing the fade intensity for every pixel
- Convert this matrix back to uint8 and apply it to the mask
This works soooo much better than the previous method. Not only does it fix the banding issues, but when we enable "output only generated regions", we get a much smaller image. Will add images to the PR to clarify.
## Summary
- Integrate Git LFS to our automated Python tests in CI
- Add stripped model files with git-lfs
- `README.md` instructions to install and configure git-lfs
- Unrelated change (skip hashing to make unit test run faster)
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
**Problem**
We want to have automated tests for model classification/probing, but
model files are too large to include in the source.
**Proposed Solution**
Classification/probing only requires metadata (key names, tensor
shapes), not weights.
This PR introduces "stripped" models - lightweight versions that retains
only essential metadata.
- Added script to strip models
- Added stripped models to automated tests
**Model size before and after "stripping":**
```
LLaVA Onevision Qwen2 0.5b-ov-hf before: 1.8 GB, after: 11.6 MB
text_encoder before: 246.1 MB, after: 35.6 kB
llava-onevision-qwen2-7b-si-hf before: 16.1 GB, after: 11.7 MB
RealESRGAN_x2plus.pth before: 67.1 MB, after: 143.0 kB
IP Adapter SD1 before: 2.5 GB, after: 94.9 kB
Hard Edge Detection (canny) before: 722.6 MB, after: 63.6 kB
Lineart before: 722.6 MB, after: 63.6 kB
Segmentation Map before: 722.6 MB, after: 63.6 kB
EasyNegative before: 24.7 kB, after: 151 Bytes
Face Reference (IP Adapter Plus Face) before: 98.2 MB, after: 13.7 kB
Standard Reference (IP Adapter) before: 44.6 MB, after: 6.0 kB
shinkai_makoto_offset before: 151.1 MB, after: 160.0 kB
thickline_fp16 before: 151.1 MB, after: 160.0 kB
Alien Style before: 228.5 MB, after: 582.6 kB
Noodles Style before: 228.5 MB, after: 582.6 kB
Juggernaut XL v9 before: 6.9 GB, after: 3.7 MB
dreamshaper-8 before: 168.9 MB, after: 1.6 MB
```
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
The _goal_ of this PR is to make it easier to add an new config type.
This _scope_ of this PR is to integrate the API and does not include
adding new configs (outside tests) or porting existing ones.
One of the glaring issues of the existing *legacy probe* is that the
logic for each type is spread across multiple classes and intertwined
with the other configs. This means that adding a new config type (or
modifying an existing one) is complex and error prone.
This PR attempts to remedy this by providing a new API for adding
configs that:
- Is backwards compatible with the existing probe.
- Encapsulates fields and logic in a single class, keeping things
self-contained and easy to modify safely.
Below is a minimal toy example illustrating the proposed new structure:
```python
class MinimalConfigExample(ModelConfigBase):
type: ModelType = ModelType.Main
format: ModelFormat = ModelFormat.Checkpoint
fun_quote: str
@classmethod
def matches(cls, mod: ModelOnDisk) -> bool:
return mod.path.suffix == ".json"
@classmethod
def parse(cls, mod: ModelOnDisk) -> dict[str, Any]:
with open(mod.path, "r") as f:
contents = json.load(f)
return {
"fun_quote": contents["quote"],
"base": BaseModelType.Any,
}
```
To create a new config type, one needs to inherit from `ModelConfigBase`
and implement its interface.
The code falls back to the legacy model probe for existing models using
the old API.
This allows us to incrementally port the configs one by one.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
In #7688 we optimized queuing preparation logic. This inadvertently broke retrying queue items.
Previously, a `NamedTuple` was used to store the values to insert in the DB when enqueuing. This handy class provides an API similar to a dataclass, where you can instantiate it with kwargs in any order. The resultant tuple re-orders the kwargs to match the order in the class definition.
For example, consider this `NamedTuple`:
```py
class SessionQueueValueToInsert(NamedTuple):
foo: str
bar: str
```
When instantiating it, no matter the order of the kwargs, if you make a normal tuple out of it, the tuple values are in the same order as in the class definition:
```
t1 = SessionQueueValueToInsert(foo="foo", bar="bar")
print(tuple(t1)) # -> ('foo', 'bar')
t2 = SessionQueueValueToInsert(bar="bar", foo="foo")
print(tuple(t2)) # -> ('foo', 'bar')
```
So, in the old code, when we used the `NamedTuple`, it implicitly normalized the order of the values we insert into the DB.
In the retry logic, the values of the tuple were not ordered correctly, but the use of `NamedTuple` had secretly fixed the order for us.
In the linked PR, `NamedTuple` was dropped for a normal tuple, after profiling showed `NamedTuple` to be meaningfully slower than a normal tuple.
The implicit order normalization behaviour wasn't understood, and the order wasn't fixed when changin the retry logic to use a normal tuple instead of `NamedTuple`. This results in a bug where we incorrectly create queue items in the DB. For example, we stored the `destination` in the `field_values` column.
When such an incorrectly-created queue item is dequeued, it fails pydantic validation and causes what appears to be an endless loop of errors.
The only user-facing solution is to add this line to `invokeai.yaml` and restart the app:
```yaml
clear_queue_on_startup: true
```
On next startup, the queue is forcibly cleared before the error loop is triggered. Then the user should remove this line so their queue is persisted across app launches per usual.
The solution is simple - fix the ordering of the tuple. I also added a type annotation and comment to the tuple type alias definition.
Note: The endless error loop, as a general problem, will take some thinking to fix. The queue service methods to cancel and fail a queue item still retrieve it and parse it. And the list queue items methods parse the queue items. Bit of a catch 22, maybe the solution is to simply delete totally borked queue items and log an error.
Currently translated at 98.7% (1800 of 1822 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1798 of 1820 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1796 of 1818 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
There is now a single entrypoint for loading a workflow - `useLoadWorkflowWithDialog`.
The hook:
Handles loading workflows from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow.
It returns a function that:
Loads a workflow from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow. The workflow will be loaded immediately if there are no unsaved changes. On success, error or completion, the corresponding callback will be called.
WHEW
- Replace `get_counts` method with `get_tag_counts_with_filter` which gets the counts for a list of tags, filtering by a list of selected tags
- Update `get_many` logic to apply tag filtering with AND logic, to match the new `get_tag_counts_with_filter` method
- Update workflow library router
User facing:
When a FLUX main model is selected, users may now add Regional Reference Image layers.
When switching between FLUX Redux and FLUX IP Adapter, the settings will change to match the model type. (IP Adapter has weight, begin/end step, but Redux does not.) The image will be retained when switching between the two.
Otherwise it works the same way as IP Adapter - both in Global and Regional Reference Image layers.
---
Internal state handling:
Slightly awkward, but it was easiest to make FLUX Redux a second type of IP Adapter in redux state.
Global and regional reference images still have a single `ipAdapter` field, but it can have a type of `ip_adapter` or `flux_redux`.
Ideally, this field is called `config` or `settings` or something, but we are past that point. We _could_ do a migration to rename it, but I don't think it's worth the effort.
---
Other changes:
- Updated canvas layer validators to handle FLUX Redux.
- Updated model list loading logic to un-set FLUX Redux models in Canvas if they are not in the list (e.g. if the user deletes the model in the main app).
- Updated graph builders - new `addFLUXRedux` util & updated `addRegions` util.
- Updated the `buildModelsHook` util to return a hook that accepts a filter callback. This handles a discrepancy: FLUX IP Adapter does not support regional guidance, but FLUX Redux does. The Regional Guidance settings provide the filter to filter out FLUX IP Adapter models from the combined list of IP Adapter ahd Redux models.
This follows the same pattern for IP Adapter w/ its CLIP Vision model. The SigLIP model is unlikely to ever change and we don't want to force the user to select it anywhere. Hardcoding it is safe and makes the UX much nicer.
The alternative is a model dropdown that will likely only ever have one valid choice in it.
- We don't need to copy the init file. Just crawl the custom nodes dir for modules and import them all. Dunno why I didn't do this initially.
- Pass the logger in as an arg. There was a race condition where if we got the logger directly in the load_custom_nodes function, the config would not have been loaded fully yet and we'd end up with the wrong custom nodes path!
- Remove permissions-setting logic, I do not believe it is relevant for custom nodes
- Minor cleanup of the utility
There's a pydantic thing that causes the graphs to fail validation erroneously. Details in the comments - not a high priority to fix but we should figure it out someday.
This method simply sets the `opened_at` attribute to the current time.
Previously `opened_at` was set when calling `get`, but that is not correct. We `get` workflows often, even when not opening them. So this needs to be a separate thing
Get the counts of workflows for the given tags and/or categories. Made a separate method bc get_many will deserialize all matching workflows, which is unnecessary for this use case.
This big chungus reworks and simplifies much of the logic around loading and saving workflows. It also makes some minor changes to how store the current workflow and determine if it is a draft, user workflow or default workflow.
---
The lower-level hooks to save a workflow have been revised:
- `useSaveLibraryWorkflow`: Saves a user or project workflow that has had changes made to it.
- `useCreateNewWorkflow`: Saves a workflow as a new entity.
A new higher-level hook `useSaveOrSaveAsWorkflow` is intended to be used by components. It returns a single function that:
- Constructs the workflow payload to be sent to the server
- Checks if the workflow is an existing user workflow. If so, it immediately saves (updates) that workflow.
- If it's not an existing user workflow, it opens the save as dialog so the user can choose a name for it and create a new workflow. This occurs for both draft workflows and loaded default workflows.
---
The logic to build the current redux state into a workflow - either to be saved as JSON, to update an existing user workflow, or save as - was a bit convoluted.
Changes to redux state triggered a debounced function to build the workflow, setting it in a global nanostores atom. Then, all of the functions that consumed the "built workflow" referenced this atom.
Now, this logic is strictly imperative. When a consumer wants to save a workflow, we build it on the spot. This removes a layer of indirection.
The logic is in the `useBuildWorkflowFast` hook.
---
The logic for loading a workflow is also revised. Previously, it happened in an RTK listener. You'd need to dispatch an action to load a workflow, and wouldn't know if it succeeded or not (though the listener would make a toast if the load failed).
This is now done in a callback, outside redux middleware. The callback is returned from the `useLoadWorkflow` hook.
---
Previously, we stripped the id from default workflows when loading them. Then, when saving the workflow, we built a workflow object from redux state and hit the API with it.
This has two issues:
- It relies on redux state never having an ID set when a default workflow is loaded. If we somehow ended up with a default workflow's ID in redux, when we go to save the workflow, we'd get and error or it wouldn't work, because you cannot save a default workflow. You can only save-as it.
- We do not know the default workflow from which the current workflow was loaded. And be cause we don't know the default workflow, we cannot show a thumbnail image.
The responsibilities have been shifted around a bit.
Now, when we load a workflow, we load it as-is. The default workflow IDs are saved in redux state. We can render the thumbnail, and if the user goes to save the workflow, we detect that it is a default workflow and save-as it.
---
In `App.tsx`, the long list of modals are moved into their own "isolator" component to ensure any re-renders there do not affect the rest of the app.
---
The save-workflow-as modal is restructured to be a bit simpler. Still works the same. On commercial, "save to project" will be enabled by default.
---
The workflow JSON tab uses a debounced version of "buildWorkflow" to build the workflow as JSON.
---
`buildWorkflowFast` is updated to deep-copy its _whole_ output, preventing issues where field types could accidentally get mutated. I don't think this has ever happened but we may as well be safe.
---
Fixed an issue where the edit button in the workflow list didn't open the workflow in edit mode.
It's only by misunderstanding the pydantic API that this field was is typed as optional. Workflows must _always_ have a category, and indeed they do.
Fixing this allows the generated types in the frontend to be easier to work with..
There was a bit of wonk with default workflows. On every app startup, we wiped them all out and recreated them with new IDs. This is a quick-and-dirty way to ensure default workflows are always in sync.
Unfortunately, it also means default workflows are newly-created entities on every app load. Any thumbnails associated to them will be lost (bc they have new IDs), and `updated_at` doesn't work.
This changes makes default workflows stable entities.
The workflows we bundle in the python package in JSON format are still the source of truth for default workflows, but the startup logic that syncs them to the user DB is a bit smarter.
- All bundled workflows have an ID. It is prefixed with "default_" for clarity.
- Any default workflows in the user's DB that are not in the bundled default workflows are deleted from the DB.
- Any bundled default workflows that are not in the user's DB are added to the DB.
- If a default workflow in the user's DB does not match the content of its corresponding bundled workflow, it is updated in the DB.
The end result is that default workflows are still kept in sync for the user, but they don't change their identity.
We may now add thumbnails to default workflows, and sorting by `updated_at` is now meaningful.
## Summary
Upgrade ruff version to 0.9.9 and format existing code.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
This allows our tests to run in an isolated environment. For tests taht implicitly depend on import behaviour, this can prevent side-effects.
The function should only be used for tests.
by adding a layer with all the pytorch dependencies that don't change
most of the time.
## Summary
Every time the [`main` docker
images](https://github.com/invoke-ai/InvokeAI/pkgs/container/invokeai)
rebuild and I pull `main-cuda`, it gets another 3+ GB, which seems like
about a zillion times too much since most things don't change from one
commit on `main` to the next.
This is an attempt to follow the guidance in [Using uv in Docker:
Intermediate
Layers](https://docs.astral.sh/uv/guides/integration/docker/#intermediate-layers)
so there's one layer that installs all the dependencies—including
PyTorch with its bundled nvidia libraries—_before_ the project's own
frequently-changing files are copied in to the image.
## Related Issues / Discussions
- [Improved docker layer cache with
uv](https://discord.com/channels/1020123559063990373/1329975172022927370)
- [astral: Can `uv pip install` torch, but not `uv sync`
it](https://discord.com/channels/1039017663004942429/1329986610770612347)
## QA Instructions
Hopefully the CI system building the docker images is sufficient.
But there is one change to `pyproject.toml` related to xformers, so it'd
be worth checking that `python -m xformers.info` still says it has
triton on the platforms that expect it.
## Merge Plan
I don't expect this to be a disruptive merge.
(An earlier revision of this PR moved the venv, but I've reverted that
change at ebr's recommendation.)
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
SQLite cursors are meant to be lightweight and not reused. For whatever reason, we reuse one per service for the entire app lifecycle.
This can cause issues where a cursor is used twice at the same time in different transactions.
This experiment makes the session queue use a fresh cursor for each method, hopefully fixing the issue.
This allows tags to be invalidated while mutations are executing, resolving an issue in this situation:
- A long-running mutation starts.
- A tag is invalidated; for example, user edits a board name, and the boards list query tag is invalidated.
- The boards list query isn't fired, and the board name isn't updated.
- The long-running mutation finishes.
- Finally, the boards list query fires and the board name is updated.
This is the "delayed" behaviour. The "immediately" behaviour has the fires requests from tag invalidation immediately, without waiting for all mutations to finish.
It may cause extra network requests and stale data if we are mutating a lot of things very quickly. I don't think it will be an issue in practice and the improved responsiveness will be a net benefit.
Rely on WAL mode and the busy timeout.
Also changed:
- Remove extraneous rollbacks when we were only doing a `SELECT`
- Remove try/catch blocks that were made extraneous when removing the extraneous rollbacks
This allows for read and write concurrency without using a global mutex. Operations may still fail they take longer than the busy timeout (5s).
If we get a database lock error after waiting 5s for an operation, we have a problem. So, I think it's actually better to use a busy timeout instead of a global mutex.
Alternatively, we could add a timeout to the global mutex.
Fixes an issue where fields like control weight on ControlNet nodes and image on IP Adapter nodes didn't render.
These are "single or collection" fields. They accept a single input object, or collection. They are supposed to render the UI input for a single object.
In a7a71ca935 a performance optimisation for a hot code-path inadvertently broke this.
The determination of which UI component to render for a given field was done using a type guard function for the field's template. Previously, this used a zod schema to parse the template. This is very slow, especially when the template was not the expected type.
The optimization changed the type guards to check the field name (aka its type, integer, image, etc) and cardinality directly, without any zod parsing.
It's much faster, but subtly changed the behaviour because it was a bit stricter. For some fields, it rejected "single or collection" cardinalities when it should have accepted them.
When these fields - like the aforementioned Control Weight and Image - were being rendered, none of the type guards passed and they rendered nothing.
The fix here updates the type guard functions to support multiple cardinalities. So now, when we go to render a "single or collection" field, we will render the "single" input component as it should be.
## Summary
This PR adds a `pytorch_cuda_alloc_conf` config flag to control the
torch memory allocator behavior.
- `pytorch_cuda_alloc_conf` defaults to `None`, preserving the current
behavior.
- The configuration options are explained here:
https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf.
Tuning this configuration can reduce peak reserved VRAM and improve
performance.
- Setting `pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"` in
`invokeai.yaml` is expected to work well on many systems. This is a good
first step for those looking to tune this config. (We may make this the
default in the future.)
- The optimal configuration seems to be dependent on a number of factors
such as device version, VRAM, CUDA kernel version, etc. For now, users
will have to experiment with this config to see if it hurts or helps on
their systems. In most cases, I expect it to help.
### Memory Tests
```
VAE decode memory usage comparison:
- SDXL, fp16, 1024x1024:
- `cudaMallocAsync`: allocated=2593 MB, reserved=3200 MB
- `native`: allocated=2595 MB, reserved=4418 MB
- SDXL, fp32, 1024x1024:
- `cudaMallocAsync`: allocated=3982 MB, reserved=5536 MB
- `native`: allocated=3982 MB, reserved=7276 MB
- SDXL, fp32, 1536x1536:
- `cudaMallocAsync`: allocated=8643 MB, reserved=12032 MB
- `native`: allocated=8643 MB, reserved=15900 MB
```
## Related Issues / Discussions
N/A
## QA Instructions
- [x] Performance tests with `pytorch_cuda_alloc_conf` unset.
- [x] Performance tests with `pytorch_cuda_alloc_conf:
"backend:cudaMallocAsync"`.
## Merge Plan
- [x] Merge #7668 first and change target branch to `main`
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
Prior to this PR, most of the app setup was being done in `api_app.py`
at import time. This PR cleans this up, by:
- Splitting app setup into more modular functions
- Narrower responsibility for the `api_app.py` file - it just
initializes the `FastAPI` app
The main motivation for this changes is to make it easier to support an
upcoming torch configuration feature that requires more careful ordering
of app initialization steps.
## Related Issues / Discussions
N/A
## QA Instructions
- [x] Launch the app via invokeai-web.py and smoke test it.
- [ ] Launch the app via the installer and smoke test it.
- [x] Test that generate_openapi_schema.py produces the same result
before and after the change.
- [x] No regression in unit tests that directly interact with the app.
(test_images.py)
## Merge Plan
- [x] Check to see if there are any commercial implications to modifying
the app entrypoint.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
On the Canvas tab, when we made the network request to enqueue a batch, we were immediately resetting the request. This effectively disabled RTKQ's tracking of the request - including the loading state.
As a result, when you click the Invoke button on the Canvas tab, it didn't show a spinner, and it was not clear that anything was happening.
The solution is simple - just await the enqueue request before resetting the tracking, same as we already did on the workflows and upscaling tabs.
I also added some extra logging messages for enqueuing, so we get the same JS console logs for each tab on success or failure.
Currently translated at 40.3% (727 of 1801 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 37.7% (680 of 1801 strings)
Co-authored-by: Hiroto N <hironow365@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Previously, custom node loading occurred _during module imports_. A consequence of this is that when a custom node import fails (e.g. its type clobbers an existing node), the app fails to start up.
In fact, any time we import basically anything from the app, we trigger custom node imports! Not good.
This logic is now in its own function, called as the API app starts up.
If a custom node load fails for any reason, it no longer prevents the app from starting up.
One other bonus we get from this is that we can now ensure custom nodes are loaded _after_ core nodes.
Any clobbering that may occur while loading custom nodes is now guaranteed to be a custom node clobbering a core node's type - and not the other way round.
When deleting a board w/ images, the image usage checking logic was not checking image collection fields. This could result in a nonexistent image lingering in a node.
We already handle single image fields correctly, it's only the image collection fields taht were affected.
Found another place where we deepcopy a dict, but it is safe to mutate.
Restructured the prep logic a bit to support this. Updated tests to use the new structure.
- Avoid pydantic models when dict manipulation works
- Avoid extraneous deep copies when we can safely mutate
- Avoid NamedTuple construct and its overhead
- Fix tests to use altered function signatures
- Remove extraneous populate_graph function
The method and route now supports:
- "none" as a board ID, sentinel value for uncategorized
- Optionally specify image categories
- Optionally specify is_intermediate
This fixes the broken readiness checks introduced in the previous commit.
To support async batch generators, all of the validation of the generators needs to be async. This is problematic because a lot of the validation logic was in redux selectors, which are necessarily synchronous.
To resolve this, the readiness checks and related logic are restructured to be run async in response to redux state changes via `useEffect` (another option is to directly subscribe to redux store). These async functions then set some react state. The checks are debounced to prevent thrashing the UI.
See #7580 for more context about this issue.
Other changes:
- Fix a minor issue where empty collections were also checked against their min and max sizes, and errors were shown for all the checks. If a collection is empty, we don't need to do the min/max checks. If a collection is empty, we skip the other min/max checks and do not report those errors to the user.
- When a field is connected, do not attempt to check its value. This fixes an issue where collection fields with a connection could erroneously appear to be invalid.
- Improved error messages for batch nodes.
Board fields in the workflow editor now default to using the auto-add board by default.
**This is a change in behaviour - previously, we defaulted to no board (i.e. Uncategorized).**
There is some translation needed between the UI field values for a board and what the graph expects.
A "BoardField" is an object in the shape of `{board_id: string}`.
Valid board field values in the graph:
- undefined
- a BoardField
Value UI values and their mapping to the graph values:
- 'none' -> undefined
- 'auto' -> BoardField for the auto-add board, or if the auto-add board is Uncategorized, undefined
- undefined -> undefined (this is a fallback case with the new logic)
- a BoardField -> the same BoardField
Currently translated at 98.9% (1737 of 1755 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1735 of 1753 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1726 of 1749 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
All but the core `vitest` package were updated recently. Tests still ran but the test UI dashboard didn't. After updating, all tests still run, seems fine.
Also tested building in app and package mode.
- Support transparency w/ color picker. To do this, we need to hide the bg layer before sampling. In testing, this has a negligible performance impact.
- Add an RGBA value readout next to the color picker ring.
Unfortunately I couldn't reliably reproduce the issue, so I'm not 100% sure this fixes it. But I think there is a race condition that results in `updateCompositingRectSize` erroneously seeing the layer has no objects and skipping the update.
To address this, the compositing rect fill/size/pos are all now force-updated when the fill/objects are changed. Theoretically it should be impossible for the issue to occur now.
- Fix an issue where the cursor disappeared when selecting a non-renderable entity. For example, when selecting a reference image layer and certain tools, the cursor would disappear.
- Ensure color picker works no matter what layer types are selected.
The logic for showing/hiding the cursor needed to be rearranged a bit for this fix.
Retrying a queue item means cloning it, resetting all execution-related state. Retried queue items reference the item they were retried from by id. This relationship is not enforced by any DB constraints.
- Add `retried_from_item_id` to `session_queue` table in DB in a migration.
- Add `retry_items_by_id` method to session queue service. Accepts a list of queue item IDs and clones them (minus execution state). Returns a list of retried items. Items that are not in a canceled or failed state are skipped.
- Add `retry_items_by_id` HTTP endpoint that maps 1-to-1 to the queue service method.
- Add `queue_items_retried` event, which includes the list of retried items.
- Optimize component and hook structure for input fields to reduce rerenders of component tree
- Remove memoization on some selectors where it serves no purpose (bc the object will have a stable identity until it changes, at which point we need to re-render anyways)
- Shift the connection error selector logic around to rely more on the stable identity of pending connection objects
including just invokeai/version seems sufficient to appease uv sync here. including everything else would invalidate the cache we're trying to establish.
- Simplify and de-insane-ify component structure, hooks, selectors, etc.
- Some perf improvements by using data attributes for styling instead of dynamic CSS-in-JS.
- Add field notes and start of linear view config, got blocked when I ran into deeper layout issues that made it very difficult to handle field configs. So those are WIP in this commit.
Currently translated at 98.9% (1735 of 1753 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1726 of 1749 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 99.2% (1695 of 1708 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.2% (1692 of 1705 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.2% (1691 of 1704 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
## Summary
This PR adds support for the FLUX LoRA model format produced by
OneTrainer.
Specifically, this PR adds:
- Support for DoRA patches
- Support for patch models that modify the FLUX T5 encoder
- Probing / loading support for OneTrainer models
## Known limitations
- DoRA patches cannot currently be applied to base weights that are
quantized with `bitsandbytes`. The DoRA algorithm requires accessing the
original model weight in order to compute the patch diff, and the
bitsandbytes quantization layers make this difficult. DoRA patches can
be applied to non-quantized and GGUF-quantized layers without issue.
- This PR results in a slight speed regression for a very particular
inference combination: quantized base model + LoRA with diffusers keys
(i.e. uses the `MergedLayerPatch`). Now that more LoRA formats are using
the `MergedLayerPatch`, it was becoming too much work to maintain this
optimization. Regression from ~1.7 it/s to ~1.4 it/s.
## Future Notes
- We may want to consider dropping support for bitsandbytes
quantization. It is very difficult to maintain compatibility for across
features like partial-loading and LoRA patching.
- At a future time, we should refactor the LoRA parsing logic to be more
generalized rather than handling each format independently.
- There are some redundant device casts and dequantizations in
`autocast_linear_forward_sidecar_patches(...)` (and its sub-calls).
Optimizing this is left for future work.
## Related Issues / Discussions
- This PR should address a handful of the LoRAs reported in
https://github.com/invoke-ai/InvokeAI/issues/7131 (specifically, most of
the `envy*` LoRAs).
- This PR should address the example in
https://github.com/invoke-ai/InvokeAI/issues/6912 (though the intended
effect of that LoRA is not totally clear, so its hard to verify with
full confidence).
## QA Instructions
OneTrainer test models:
-
https://civitai.com/models/844821/envy-flux-dark-watercolor-01?modelVersionId=945159
(DoRA, transformer only)
-
https://civitai.com/models/836757/envy-flux-digital-brush-01?modelVersionId=936167
(hada, transformer only)
- ball_flux from https://github.com/invoke-ai/InvokeAI/issues/6912
(DoRA, transformer/clip/t5)
The following tests were repeated with each of the OneTrainer test
models:
- [x] Test with non-quantized base model
- [x] Test with GGUF-quantized base model
- [x] Test with BnB-quantized base model
- [x] Test with non-quantized base model that is partially-loaded onto
the GPU
Other regression test:
- [x] Test some SD1 LoRAs
- [x] Test some SDXL LoRAs
- [x] Test a variety of existing FLUX LoRA formats
- [x] Test a FLUX Control LoRA on all base model quantization formats.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR fixes an issue with mask dimension consistency. Prior to this
change, the following workflow would fail with `tuple out of range`
error:
<img width="1072" alt="image"
src="https://github.com/user-attachments/assets/d0a9e658-1d64-4db4-adee-973bbdaca745"
/>
### Before this PR
Dimension compatibility for invocations that take a mask input:
- `ApplyMaskTensorToImageInvocation`: 2 or 3
- `MaskTensorToImageInvocation`: 2 or 3
- `InvertTensorMaskInvocation`: 3
Mask dimension for invocations that produce a MaskOutput:
- `RectangleMaskInvocation`: 3
- `AlphaMaskToTensorInvocation`: 3
- `InvertTensorMaskInvocation`: 3
- `ImageMaskToTensorInvocation`: 3
- `SegmentAnythingInvocation`: 2
### After this PR (changes in bold)
Dimension compatibility for invocations that take a mask input:
- `ApplyMaskTensorToImageInvocation`: 2 or 3
- `MaskTensorToImageInvocation`: 2 or 3
- `InvertTensorMaskInvocation`: **2 or 3** <----------------
Mask dimension for invocations that produce a MaskOutput:
- `RectangleMaskInvocation`: 3
- `AlphaMaskToTensorInvocation`: 3
- `InvertTensorMaskInvocation`: 3
- `ImageMaskToTensorInvocation`: 3
- `SegmentAnythingInvocation`: **3** <-------------------
## QA Instructions
I tested the workflow in the PR description and this workflow:
<img width="872" alt="image"
src="https://github.com/user-attachments/assets/20496860-ce81-47c0-a46a-a611b73faa22"
/>
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Currently translated at 100.0% (1697 of 1697 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.2% (1684 of 1697 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.7% (1676 of 1681 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.3% (1670 of 1681 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.5% (1658 of 1666 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1652 of 1652 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Dynamic prompts string generators can cause an infinite feedback loop when added to the linear view.
The root cause is how these generators handle "resolving" their collections. They hit the dynamic prompts HTTP API within the view component to get the prompts, then set the batch node's internal state with those values.
When the same generator is rendered in both the node editor view and linear view and the timing is just right, that state update causes an infinite feedback loop between the two components as they respond to the state updates from the other component.
The other generators never store the generated values in the batch node's internal state. The values are "resolved" just-in-time as they are needed.
To fix this, the batch value "resolver" utilities could be made async and hit the API. But there's a problem - the resolver utilities are used within the "are we ready to invoke? are there any problems with the current settings?" redux selectors, which are strictly synchronous. To fix that, we can refactor that "are we ready to invoke?" logic to not use redux selectors, so the whole thing could be async.
It's not a big change but I'm not going to spend time on it at the moment.
So, until I address this, the dynamic prompts generators are disabled.
- Add JS Mersenne Twister implementation dependency to use as seeded PRNG. This is not a cryptographically secure algorithm.
- Add nullish seed field to float and integer random generators.
- Add UI to control the seed.
- When seed is not set, behaviour is unchanged - the values are randomized when you Invoke. When seed is set, the random distribution is deterministic depending on the seed. In this case, we can display the values to the user.
Unfortunately we cannot do strict floats or ints.
The batch data models don't specify the value types, it instead relies on pydantic parsing. JSON doesn't differentiate between float and int, so a float `1.0` gets parsed as `1` in python.
As a result, we _must_ accept mixed floats and ints for BatchDatum.items.
Tests and validation updated to handle this.
Maybe we should update the BatchDatum model to have a `type` field? Then we could parse as float or int, depending on the inputs...
## Summary
This PR revises the logic for calculating the model cache RAM limit. See
the code for thorough documentation of the change.
The updated logic is more conservative in the amount of RAM that it will
use. This will likely be a better default for more users. Of course,
users can still choose to set a more aggressive limit by overriding the
logic with `max_cache_ram_gb`.
## Related Issues / Discussions
- Should help with https://github.com/invoke-ai/InvokeAI/issues/7563
## QA Instructions
Exercise all heuristics:
- [x] Heuristic 1
- [x] Heuristic 2
- [x] Heuristic 3
- [x] Heuristic 4
## Merge Plan
- [x] Merge https://github.com/invoke-ai/InvokeAI/pull/7565 first and
update the target branch
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR adds a `keep_ram_copy_of_weights` config option the default (and
legacy) behavior is `true`. The tradeoffs for this setting are as
follows:
- `keep_ram_copy_of_weights: true`: Faster model switching and LoRA
patching.
- `keep_ram_copy_of_weights: false`: Lower average RAM load (may not
help significantly with peak RAM).
## Related Issues / Discussions
- Helps with https://github.com/invoke-ai/InvokeAI/issues/7563
- The Low-VRAM docs are updated to include this feature in
https://github.com/invoke-ai/InvokeAI/pull/7566
## QA Instructions
- Test with `enable_partial_load: false` and `keep_ram_copy_of_weights:
false`.
- [x] RAM usage when model is loaded is reduced.
- [x] Model loading / unloading works as expected.
- [x] LoRA patching still works.
- Test with `enable_partial_load: false` and `keep_ram_copy_of_weights:
true`.
- [x] Behavior should be unchanged.
- Test with `enable_partial_load: true` and `keep_ram_copy_of_weights:
false`.
- [x] RAM usage when model is loaded is reduced.
- [x] Model loading / unloading works as expected.
- [x] LoRA patching still works.
- Test with `enable_partial_load: true` and `keep_ram_copy_of_weights:
true`.
- [x] Behavior should be unchanged.
- [x] Smoke test CPU-only and MPS with default configs.
## Merge Plan
- [x] Merge https://github.com/invoke-ai/InvokeAI/pull/7564 first and
change target branch.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
Prior to this change, there were several cases where we initialized the
weights of a FLUX model before loading its state dict (and, to make
things worse, in some cases the weights were in float32). This PR fixes
a handful of these cases. (I think I found all instances for the FLUX
family of models.)
## Related Issues / Discussions
- Helps with https://github.com/invoke-ai/InvokeAI/issues/7563
## QA Instructions
I tested that that model loading still works and that there is no
virtual memory reservation on model initialization for the following
models:
- [x] FLUX VAE
- [x] Full T5 Encoder
- [x] Full FLUX checkpoint
- [x] GGUF FLUX checkpoint
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Previously, when previewing a filter on a layer with some transparency or a filter that changes the alpha, the preview was rendered on top of the layer. The preview blended with the layer, which isn't right.
In this change, the layer is hidden during the preview, and when the filter finishes (having been applied or canceled - the two possible paths), the layer is shown.
Technically, we are hiding and showing the layer's object renderer's konva group, which contains the layer's "real" data.
Another small change was made to prevent a flash of empty layer, by waiting to destroy a previous filter preview image until the new preview image is ready to display.
Due to the limited floating point precision, and konva's `scale` properties, it is possible for the relative rect of an object to have non-integer coordinates and dimensions.
When we go to rasterize and otherwise export images, the HTML canvas API truncates these numbers.
So, we can end up with situations where the relative width and height of a layer are very close to the "real" value, but slightly off.
For example, width and height might be 512px, but the relative rect is calculated to be something like 512.000000003 or 511.9999999997.
In the first case, the truncation results in 512x512 for the dimensions - which is correct. But in the second case, it results in 511x511!
One place where this causes issues is the image action `New Canvas from image -> As Raster Layer (resize)`. For certain input image sizes, this results in an incorrectly resized image. For example, a 1496x1946 input image is resized to 511x511 pixels when the bbox is 512x512.
To fix this, we can round both coords and dimensions of rects when rasterizing.
I've thought through the implications and done some testing. I believe this change will not cause any regressions and only fix edge cases. But, it's possible that something was inadvertently relying on the old behavior.
There's a bug where preset image tooltips get stuck open in the list.
After much fiddling, debugging, and review of upstream dependencies, I have determined that this is bug in Chakra-UI v2.
Specifically, it appears to be a race condition related to the Tooltip component's internal use of the `useDisclosure` hook to manage tooltip open state, and the react render cycle.
Unfortunately, Chakra v2 is no longer being updated, and it's a pain in the butt to vendor and fix that component given its dependencies. Not 100% sure I could easily fix it, anyways.
Fortunately, there is a workaround - reduce the tooltip openDelay to 0ms. I prefer the current 500ms delay but I think it's preferable to have too-quick tooltips than too-sticky tooltips...
## Summary
Changes:
- Deprecate `ram` and `vram` configs. If these are set in invokeai.yaml,
they will be ignored.
- Create new `max_cache_ram_gb` and `max_cache_vram_gb` configs with the
same definitions as the old configs.
The main motivation of this change is to make the migration path
smoother for users who had previously added `ram` /`vram` to their
config files. Now, these users will be automatically migrated into the
new dynamic limit behavior (which is better in most cases). These users
will have to manually re-add `max_cache_ram_gb` and `max_cache_vram_gb`
to their configs if they wish to go back to specifying manual limits.
## Related Issues / Discussions
See the release notes for RC v5.6.0rc1 for the old migration behavior
that we are trying to improve:
https://github.com/invoke-ai/InvokeAI/releases/tag/v5.6.0rc1
## QA Instructions
- [x] Test that if `ram` or `vram` are present in a user's
`invokeai.yaml`, these values are ignored.
- [x] Test that `max_cache_ram_gb` and `max_cache_vram_gb` are applied,
if set.
## Merge Plan
- Don't forget to update the RC release notes accordingly.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR contains a bugfix for an edge case with model unloading (from
VRAM to RAM). Thanks to @JPPhoto for finding it.
The bug was triggered under the following conditions:
- A GGML-quantized model is loaded in VRAM
- We run a Spandrel image-to-image invocation (which is wrapped in a
`torch.inference_mode()` context manager.
- The model cache attempts to unload the GGML-quantized model from VRAM
to RAM.
- Doing this inside of the `torch.inference_mode()` cm results in the
following error:
```
[2025-01-07 15:48:17,744]::[InvokeAI]::ERROR --> Error while invoking session 98a07259-0c03-4111-a8d8-107041cb86f9, invocation d8daa90b-7e4c-4fc4-807c-50ba9be1a4ed (spandrel_image_to_image): Cannot set version_counter for inference tensor
[2025-01-07 15:48:17,744]::[InvokeAI]::ERROR --> Traceback (most recent call last):
File "/home/ryan/src/InvokeAI/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
output = invocation.invoke_internal(context=context, services=self._services)
File "/home/ryan/src/InvokeAI/invokeai/app/invocations/baseinvocation.py", line 300, in invoke_internal
output = self.invoke(context)
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ryan/src/InvokeAI/invokeai/app/invocations/spandrel_image_to_image.py", line 167, in invoke
with context.models.load(self.image_to_image_model) as spandrel_model:
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/load_base.py", line 60, in __enter__
self._cache.lock(self._cache_record, None)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 224, in lock
self._load_locked_model(cache_entry, working_mem_bytes)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 272, in _load_locked_model
vram_bytes_freed = self._offload_unlocked_models(model_vram_needed, working_mem_bytes)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 458, in _offload_unlocked_models
cache_entry_bytes_freed = self._move_model_to_ram(cache_entry, vram_bytes_to_free)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 330, in _move_model_to_ram
return cache_entry.cached_model.partial_unload_from_vram(
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/cached_model/cached_model_with_partial_load.py", line 182, in partial_unload_from_vram
cur_state_dict = self._model.state_dict()
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1939, in state_dict
module.state_dict(destination=destination, prefix=prefix + name + '.', keep_vars=keep_vars)
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1936, in state_dict
self._save_to_state_dict(destination, prefix, keep_vars)
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1843, in _save_to_state_dict
destination[prefix + name] = param if keep_vars else param.detach()
RuntimeError: Cannot set version_counter for inference tensor
```
### Explanation
From the `torch.inference_mode()` docs:
> Code run under this mode gets better performance by disabling view
tracking and version counter bumps.
Disabling version counter bumps results in the aforementioned error when
saving `GGMLTensor`s to a state_dict.
This incompatibility between `GGMLTensors` and `torch.inference_mode()`
is likely caused by the custom tensor type implementation. There may
very well be a way to get these to cooperate, but for now it is much
simpler to remove the `torch.inference_mode()` contexts.
Note that there are several other uses of `torch.inference_mode()` in
the Invoke codebase, but they are all tight wrappers around the
inference forward pass and do not contain the model load/unload process.
## Related Issues / Discussions
Original discussion:
https://discord.com/channels/1020123559063990373/1149506274971631688/1326180753159094303
## QA Instructions
Find a sequence of operations that triggers the condition. For me, this
was:
- Reserve VRAM in a separate process so that there was ~12GB left.
- Fresh start of Invoke
- Run FLUX inference with a GGML 8K model
- Run Spandrel upscaling
Tests:
- [x] Confirmed that I can reproduce the error and that it is no longer
hit after the change
- [x] Confirm that there is no speed regression from switching from
`torch.inference_mode()` to `torch.no_grad()`.
- Before: `50.354s`, After: `51.536s`
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Currently translated at 16.5% (273 of 1645 strings)
translationBot(ui): update translation (Polish)
Currently translated at 15.4% (254 of 1645 strings)
translationBot(ui): update translation (Polish)
Currently translated at 10.8% (178 of 1645 strings)
Co-authored-by: Nik Nikovsky <zejdzztegomaila@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/pl/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1649 of 1649 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1645 of 1645 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1645 of 1645 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1645 of 1645 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
## Summary
This PR enables RAM/VRAM cache size limits to be determined dynamically
based on availability.
**Config Changes**
This PR modifies the app configs in the following ways:
- A new `device_working_mem_gb` config was added. This is the amount of
non-model working memory to keep available on the execution device (i.e.
GPU) when using dynamic cache limits. It default to 3GB.
- The `ram` and `vram` configs now default to `None`. If these configs
are set, they will take precedence over the dynamic limits. **Note: Some
users may have previously overriden the `ram` and `vram` values in their
`invokeai.yaml`. They will need to remove these configs to enable the
new dynamic limit feature.**
**Working Memory**
In addition to the new `device_working_mem_gb` config described above,
memory-intensive operations can estimate the amount of working memory
that they will need and request it from the model cache. This is
currently applied to the VAE decoding step for all models. In the
future, we may apply this to other operations as we work out which ops
tend to exceed the default working memory reservation.
**Mitigations for https://github.com/invoke-ai/InvokeAI/issues/7513**
This PR includes some mitigations for the issue described in
https://github.com/invoke-ai/InvokeAI/issues/7513. Without these
mitigations, it would occur with higher frequency when dynamic RAM
limits are used and the RAM is close to maxed-out.
## Limitations / Future Work
- Only _models_ can be offloaded to RAM to conserve VRAM. I.e. if VAE
decoding requires more working VRAM than available, the best we can do
is keep the full model on the CPU, but we will still hit an OOM error.
In the future, we could detect this ahead of time and switch to running
inference on the CPU for those ops.
- There is often a non-negligible amount of VRAM 'reserved' by the torch
CUDA allocator, but not used by any allocated tensors. We may be able to
tune the torch CUDA allocator to work better for our use case.
Reference:
https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf
- There may be some ops that require high working memory that haven't
been updated to request extra memory yet. We will update these as we
uncover them.
- If a model is 'locked' in VRAM, it won't be partially unloaded if a
later model load requests extra working memory. This should be uncommon,
but I can think of cases where it would matter.
## Related Issues / Discussions
- #7492
- #7494
- #7500
- #7505
## QA Instructions
Run a variety of models near the cache limits to ensure that model
switching works properly for the following configurations:
- [x] CUDA, `enable_partial_loading=true`, all other configs default
(i.e. dynamic memory limits)
- [x] CUDA, `enable_partial_loading=true`, CPU and CUDA memory reserved
in another process so there is limited RAM/VRAM remaining, all other
configs default (i.e. dynamic memory limits)
- [x] CUDA, `enable_partial_loading=false`, all other configs default
(i.e. dynamic memory limits)
- [x] CUDA, ram/vram limits set (these should take precedence over the
dynamic limits)
- [x] MPS, all other default (i.e. dynamic memory limits)
- [x] CPU, all other default (i.e. dynamic memory limits)
## Merge Plan
- [x] Merge #7505 first and change target branch to main
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR adds support for partial loading of models onto the GPU. This
enables models to run with much lower peak VRAM requirements (e.g. full
FLUX dev with 8GB of VRAM).
The partial loading feature is enabled behind a new config flag:
`enable_partial_loading=True`. This flag defaults to `False`.
**Note about performance:**
The `ram` and `vram` config limits are still applied when
`enable_partial_loading=True` is set. This can result in significant
slowdowns compared to the 'old' behaviour. Consider the case where the
VRAM limit is set to `vram=0.75` (GB) and we are trying to run an 8GB
model. When `enable_partial_loading=False`, we attempt to load the
entire model into VRAM, and if it fits (no OOM error) then it will run
at full speed. When `enable_partial_loading=True`, since we have the
option to partially load the model we will only load 0.75 GB into VRAM
and leave the remaining 7.25 GB in RAM. This will cause inference to be
much slower than before. To workaround this, it is important that your
`ram` and `vram` configs are carefully tuned. In a future PR, we will
add the ability to dynamically set the RAM/VRAM limits based on the
available memory / VRAM.
## Related Issues / Discussions
- #7492
- #7494
- #7500
## QA Instructions
Tests with `enable_partial_loading=True`, `vram=2`, on CUDA device:
For all tests, we expect model memory to stay below 2 GB. Peak working
memory will be higher.
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests
Tests with `enable_partial_loading=True`, and hack to force all models
to load 10%, on CUDA device:
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests
Tests with `enable_partial_loading=False`, `vram=30`:
We expect no change in behaviour when `enable_partial_loading=False`.
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests
Other platforms:
- [x] No change in behavior on MPS, even if
`enable_partial_loading=True`.
- [x] No change in behavior on CPU-only systems, even if
`enable_partial_loading=True`.
## Merge Plan
- [x] Merge #7500 first, and change the target branch to main
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This is an unplanned fix between PR3 and PR4 in the sequence of partial
loading (i.e. low-VRAM) PRs. This PR restores the 'Current Workaround'
documented in https://github.com/invoke-ai/InvokeAI/issues/7513. In
other words, to work around a flaw in the model cache API, this fix
allows models to be loaded into VRAM _even if_ they have been dropped
from the RAM cache.
This PR also adds an info log each time that this workaround is hit. In
a future PR (#7509), we will eliminate the places in the application
code that are capable of triggering this condition.
## Related Issues / Discussions
- #7492
- #7494
- #7500
- https://github.com/invoke-ai/InvokeAI/issues/7513
## QA Instructions
- Set RAM cache limit to a small value. E.g. `ram: 4`
- Run FLUX text-to-image with the full T5 encoder, which exceeds 4GB.
This will trigger the error condition.
- Before the fix, this test configuration would cause a `KeyError`.
After the fix, we should see an info-level log explaining that the
condition was hit, but that generation should continue successfully.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Previously, we didn't differentiate between model install errors for different types of model install sources, resulting in a buggy UX:
- If a HF model install failed, but it was a HF URL install and not a repo id install, the link to the HF model page was incorrect.
- If a non-HF URL install (e.g. civitai) failed, we treated it as a HF URL install. In this case, if the user's HF token was invalid or unset, we directed the user to set it. If the HF token was valid, we displayed an empty red toast. If it's not a HF URL install, then of course neither of these are correct.
Also, the logic for handling the toasts was a bit complicated.
This change does a few things:
- Consolidate the model install error toasts into one place - the socket.io event handler for the model install error event. There is no more global state for the toasts and there are no hooks managing them.
- Handling the different cases for errors, including all combinations of HF/non-HF and unauthorized/forbidden/unknown.
This is required to fix an issue with the MM UI's error handling.
Previously, we only included the model source as a string. That could be an arbitrary URL, file path or HF repo id, but the frontend has no parsing logic to differentiate between these different model sources.
Without access to the type of model source, it is difficult to determine how the user should proceed. For example, if it's HF URL with an HTTP unauthorized error, we should direct the user to log in to HF. But if it's a civitai URL with the same error, we should not direct the user to HF.
There are a variety of related edge cases.
With this change, the full `ModelSource` object is included in each model install event, including error events.
I had to fix some circular import issues, hence the import changes to files other than `events_common.py`.
## Summary
This PR is the third in a sequence of PRs working towards support for
partial loading of models onto the compute device (for low-VRAM
operation). This PR updates the LoRA patching code so that the following
features can cooperate fully:
- Partial loading of weights onto the GPU
- Quantized layers / weights
- Model patches (e.g. LoRA)
Note that this PR does not yet enable partial loading. It adds support
in the model patching code so that partial loading can be enabled in a
future PR.
## Technical Design Decisions
The layer patching logic has been integrated into the custom layers (via
`CustomModuleMixin`) rather than keeping it in a separate set of wrapper
layers, as before. This has the following advantages:
- It makes it easier to calculate the modified weights on the fly and
then reuse the normal forward() logic.
- In the future, it makes it possible to pass original parameters that
have been cast to the device down to the LoRA calculation without having
to re-cast (but the current implementation hasn't fully taken advantage
of this yet).
## Know Limitations
1. I haven't fully solved device management for patch types that require
the original layer value to calculate the patch. These aren't very
common, and are not compatible with some quantized layers, so leaving
this for future if there's demand.
2. There is a small speed regression for models that have CPU
bottlenecks. This seems to be caused by slightly slower method
resolution on the custom layers sub-classes. The regression does not
show up on larger models, like FLUX, that are almost entirely
GPU-limited. I think this small regression is tolerable, but if we
decide that it's not, then the slowdown can easily be reclaimed by
optimizing other CPU operations (e.g. if we only sent every 2nd progress
image, we'd see a much more significant speedup).
## Related Issues / Discussions
- https://github.com/invoke-ai/InvokeAI/pull/7492
- https://github.com/invoke-ai/InvokeAI/pull/7494
## QA Instructions
Speed tests:
- Vanilla SD1 speed regression
- Before: 3.156s (8.78 it/s)
- After: 3.54s (8.35 it/s)
- Vanilla SDXL speed regression
- Before: 6.23s (4.46 it/s)
- After: 6.45s (4.31 it/s)
- Vanilla FLUX speed regression
- Before: 12.02s (2.27 it/s)
- After: 11.91s (2.29 it/s)
LoRA tests with default configuration:
- [x] SD1: A handful of LoRA variants
- [x] SDXL: A handful of LoRA variants
- [x] flux non-quantized: multiple lora variants
- [x] flux bnb-quantized: multiple lora variants
- [x] flux ggml-quantized: muliple lora variants
- [x] flux non-quantized: FLUX control LoRA
- [x] flux bnb-quantized: FLUX control LoRA
- [x] flux ggml-quantized: FLUX control LoRA
LoRA tests with sidecar patching forced:
- [x] SD1: A handful of LoRA variants
- [x] SDXL: A handful of LoRA variants
- [x] flux non-quantized: multiple lora variants
- [x] flux bnb-quantized: multiple lora variants
- [x] flux ggml-quantized: muliple lora variants
- [x] flux non-quantized: FLUX control LoRA
- [x] flux bnb-quantized: FLUX control LoRA
- [x] flux ggml-quantized: FLUX control LoRA
Other:
- [x] Smoke testing of IP-Adapter, ControlNet
All tests repeated on:
- [x] cuda
- [x] cpu (only test SD1, because larger models are prohibitively slow)
- [x] mps (skipped FLUX tests, because my Mac doesn't have enough memory
to run them in a reasonable amount of time)
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR adds utilities to support partial loading of models from CPU to
GPU. The new utilities are not yet being used by the ModelCache, so
there should be no functional behavior changes in this PR.
Detailed changes:
- Add autocast modules that are designed to wrap common
`torch.nn.Module`s and enable them to run with automatic device casting.
E.g. a linear layer on the CPU can be executed with an input tensor on
the GPU by streaming the weights to the GPU at runtime.
- Add unit tests for the aforementioned autocast modules to verify that
they work for all supported quantization formats (GGUF, BnB NF4, BnB
LLM.int8()).
- Add `CachedModelWithPartialLoad` and `CachedModelOnlyFullLoad` classes
to manage partial loading at the model level.
## Alternative Implementations
Several options were explored for supporting inference on
partially-loaded models. The pros/cons of the explored options are
summarized here for reference. In the end, wrapper modules were selected
as the best overall solution for our use case.
Option 1: Re-implement the .forward() methods of modules to add support
for device conversions
- This is the option implemented in this PR.
- This approach is the most manual of the three, but as a result offers
the broadest compatibility with unusual model types. It is manual in
that we have to explicitly add support for all module types that we wish
to support. Fortunately, the list of foundational module types is
relatively small (e.g. the current set of implemented layers covers all
but 0.04 MB of the full FLUX model.).
Option 2: Implement a custom Tensor type that casts tensors to a
`target_device` each time the tensor is used
- This approach has the nice property that it is injected at the tensor
level, and the model does not need to be modified in any way.
- One challenge with this approach is handling interactions with other
custom tensor types (e.g. GGMLTensor). This problem is solvable, but
definitely introduces a layer of complexity. (There are likely to also
be some similar issues with interactions with the BnB quantization, but
I didn't get as far as testing BnB.)
Option 3: Override the `__torch_function__` dispatch calls globally and
cast all params to the execution device.
- This approach is nice and simple: just apply a global context manager
and all operations will happen on the compute device regardless of the
device of the participating tensors.
- Challenges:
- Overriding the `__torch_function__` dispatch calls introduces some
overhead even if the tensors are already on the correct device.
- It is difficult to manage the autocasting context manager. E.g. it is
tempting to apply it to the model's `.forward(...)` method, but we use
some models with non-standard entrypoints. And we don't want to end up
with nested autocasting context managers.
- BnB applies quantization side effects when a param is moved to the GPU
- this interacts in unexpected ways with a global context manager.
## QA Instructions
Most of the changes in this PR should not impact active code, and thus
should not cause any changes to behavior. The main risks come from
bumping the bitsandbytes dependency and some minor modifications to the
bitsandbytes quantization code.
- [x] Regression test bitsandbytes NF4 quantization
- [x] Regression test bitsandbytes LLM.int8() quantization
- [x] Regression test on MacOS (to ensure that there are no lingering
bitsandbytes import errors)
I also tested the new utilities for inference on full models in another
branch to validate that there were not major issues. This functionality
will be tested more thoroughly in a future PR.
## Merge Plan
- [x] #7492 should be merged first so that the target branch can be
updated to main.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR tidies up the model cache code in preparation for further
refactoring to support partial loading of models onto the GPU. **These
code changes should not change the functional behavior in any way.**
Changes:
- Remove the `ModelCacheBase` class. `ModelCache` is the only
implementation, so there is no benefit to the separate abstract class.
- Split `CacheRecord` and `CacheStats` out into their own files.
- Remove the `ModelLocker` class. This extra layer of indirection was
not providing any benefit. Locking is now done directly with the
`ModelCache`.
- Tidy up relative imports that were contributing to circular import
issues.
- Pull the 'submodel' concern out of the `ModelCache`. The `ModelCache`
should not need to be aware of the model manager submodel system.
- Delete unused properties from the `ModelCache` (e.g.
`.lazy_offloading`, `.storage_device`, etc.)
## QA Instructions
I ran smoke tests with a variety of SD1, SDXL and FLUX models. No change
to behavior is expected.
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Uvicorn's logging is rather verbose. This change adds a `log_level_network` config setting to independently control uvicorn's log outputs. The setting defaults to warning.
The change hides the helpful startup message that says the host and port we are running on.
For example: `Uvicorn running on http://0.0.0.0:9090 (Press CTRL+C to quit`
The ASGI lifespan handler is updated to log an equivalent message on startup, regardless of log level settings.
Besides being helpful, the launcher relies on a message like this to launch the app. So, previously, if the user set their log level to anything above info (e.g. warning or error), the launcher would fail to open the app. This change prevents that edge case.
Currently translated at 100.0% (1644 of 1644 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1643 of 1643 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1643 of 1643 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
## Summary
This PR refactors the LoRA handling code to enable the use of FLUX
control LoRAs on top of quantized transformers.
Changes:
- Renamed a bunch of the model patching utilities to reflect that they
are not LoRA-specific
- Improved the unit test coverage.
- Refactored the handling of 'sidecar' patch layers to make them work
with more layer patch types. (This was necessary to get FLUX control
LoRAs working on top of quantized models.)
- Removed `ONNXModelPatcher`. It is out-of-date and hasn't been used in
a while.
## QA Instructions
I completed the following tests.
**These should be repeated after changing the target branch to main.**
**Due to the large surface area of this PR, reviewers should do
regression tests on a range of LoRA formats. There is a risk of
regression on a specific format that was missed during the
refactoring.**
- [x] FLUX Control LoRA + full FLUX transformer
- [x] FLUX Control LoRA + BnB NF4 quantized transformer
- [x] FLUX Control LoRA + GGUF quantized transformer
- [x] FLUX Control LoRA + non-control LoRA + full FLUX transformer
- [x] FLUX Contro LoRA + non-control LoRA + BnB quantized transformer
- [x] FLUX Control LoRA + non-control LoRA + GGUF quantized transformer
- Test the following cases for regression:
- [x] Misc SD1/SDXL LoRA variants (LoRA, LoKr, IA3)
- [x] FLUX, non-quantized, variety of LoRA formats
- [x] FLUX, quantized, variety of LoRA formats
## Merge Plan
**_Don't merge this PR yet._**
Merge plan:
1. First merge brandon/flux-tools-loras into main
2. Change the target branch of this PR to main
3. Review / test / merge this PR
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
- Ensure the currently-rasterizing adapter is reset to `null` on success or failure of a rasterization operation. In case of failure, this prevents the UI from getting stuck with a disabled Invoke button and tooltip message "Canvas is busy (rasterizing)".
- Log the error if there is one.
## Summary
https://github.com/invoke-ai/InvokeAI/issues/7422
As reported in the above ticket, a recent FLUX performance improvement
caused a regression on MacOS. This PR reverts the offending part of the
change.
## Related Issues / Discussions
- Closes#7422
- Original perf improvement:
https://github.com/invoke-ai/InvokeAI/pull/7399
## QA Instructions
I don't have a Mac capable of running this test, so trusting the report
in #7422 that this fixes the problem.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
The `is` operator compares references, not values. Thanks to a wonderfully unintuitive quirk of python, `is` works on integers from `-5` to `256`, inclusive.
Whenever integers in this range are used for a value, internally python returns a reference to a stable object in memory. When integers outside this range are used as a value, python creates a new object in memory for that integer.
See `PyLong_FromLong` documentation here: https://docs.python.org/3/c-api/long.html
Tying this back to our session processor, we were using `is` to compare the queue item ids for equality. Our queue item ids start at 0, and each queue item created increments this by one. So this comparison works only for the first 256 queue items on the machine.
Starting with the 257th queue item, the comparison starts returning `False`, and cancelation gets weird.
Easy fix - use `!=` instead of `is not`.
The "adding to" text indicates if images are going to the gallery or staging area. This info is relevant only to the canvas tab, but was displayed on Upscaling and Workflows tabs. Removed it from those tabs.
A redux selector is used to get the "default" IP Adapter. The selector uses the model list query result to select an IP Adapter model to be preset by default.
The selector is memoized, so if we mutate the returned default IP Adapter state, it mutates the result of the selector for all consumers.
For example, the `image` property of the default IP Adapter selector result is `null`. When we set the `image` property of the selector result while creating an IP Adapter, this does not trigger the selector to recompute its result. We end up setting the image for the selector result directly, and all other consumers now have that same image set.
Solution - we need to clone the selector result everywhere it is used. This was missed in a few spots, causing the issue.
It was easy to misunderstand the empty state for a regional guidance reference image. There was no label, so it seemed like it was the whole region that was empty.
This small change adds the "Reference Image" heading to the empty state, so it's clear that the empty state messaging refers to this reference image, not the whole regional guidance layer.
## Summary
This PR adds support for regional prompting with FLUX.
### Example 1
Global prompt: `An architecture rendering of the reception area of a
corporate office with modern decor.`
<img width="1386" alt="image"
src="https://github.com/user-attachments/assets/c8169bdb-49a9-44bc-bd9e-58d98e09094b">

## QA Instructions
- [x] Test that there is no slowdown in the base case with a single
global prompt.
- [x] Test image fully covered by regional masks.
- [x] Test image covered by region masks with small gaps.
- [x] Test region masks with large unmasked ‘background’ regions
- [x] Test region masks with significant overlap
- [x] Test multiple global prompts.
- [x] Test no global prompt.
- [x] Test regional negative prompts (It runs... but results are not
great. Needs more tuning to be useful.)
- Test compatibility with:
- [x] ControlNet
- [x] LoRA
- [x] IP-Adapter
## Remaining TODO
- [x] Disable the following UI features for FLUX prompt regions:
negative prompts, reference images, auto-negative.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
These helpers consolidate layer validation checks. For example, checking that the layer has content drawn, is compatible with the selected main model, has valid reference images, etc.
There's a technical challenge with outputting these values directly. `ImageField` does not store them, so the batch's `ImageField` collection does not have width and height for each image.
In order to set up the batch and pass along width and height for each image, we'd need to make a network request for each image when the user clicks Invoke. It would often be cached, but this will eventually create a scaling issue and poor user experience.
As a very simple workaround, users can output the batch image output into an `Image Primitive` node to access the width and height.
This change is implemented by adding some simple special handling when parsing the output fields for the `image_batch` node.
I'll keep this situation in mind when extending the batching system to other field types.
- Split up logic to determine reason why the user cannot invoke for each tab.
- Fix issue where the workflows tab would show reasons related to canvas/upscale tab. The tooltip now only shows information relevant to the current tab.
- Add calculation for batch size to the queue count prediction.
- Use a constant for the enqueue mutation's fixed cache key, instead of a string. Just some typo protection.
Currently translated at 42.3% (672 of 1588 strings)
translationBot(ui): update translation (Spanish)
Currently translated at 28.0% (445 of 1588 strings)
Co-authored-by: gallegonovato <fran-carro@hotmail.es>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/es/
Translation: InvokeAI/Web UI
- Add special handling for `ImageBatchInvocation`
- Add input component for image collections, supporting multi-image upload and dnd
- Minor rework of some hooks for accessing node data
The canvas react components pass canvas entity identifiers around, then redux selectors are used to access that entity. This is good for perf - entity states may rapidly change. Passing only the identifiers allows components and other logic to have more granular state updates.
Unfortunately, this design opens the possibility for for an entity identifier to point to an entity that does not exist.
To get around this, I had created a redux selector `selectEntityOrThrow` for canvas entities. As the name implies, it throws if the entity is not found.
While it prevents components/hooks from needing to deal with missing entities, it results in mysterious errors if an entity is missing. Without sourcemaps, it's very difficult to determine what component or hook couldn't find the entity.
Refactoring the app to not depend on this behaviour is tricky. We could pass the entity state around directly as a prop or via context, but as mentioned, this could cause performance issues with rapidly changing entities.
As a workaround, I've made two changes:
- `<CanvasEntityStateGate/>` is a component that takes an entity identifier, returning its children if the entity state exists, or null if not. This component is wraps every usage of `selectEntityOrThrow`. Theoretically, this should prevent the entity not found errors.
- Add a `caller: string` arg to `selectEntityOrThrow`. This string is now added to the error message when the assertion fails, so we can more easily track the source of the errors.
In the future we can work out a way to not use this throwing selector and retain perf. The app has changed quite a bit since that selector was created - so we may not have to worry about perf at all.
When we added more progress events during generation, we indirectly broke the logic that controls when the progress bar throbs.
Co-authored-by: Mary Hipp Rogers <maryhipp@gmail.com>
Currently translated at 33.6% (533 of 1583 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 30.3% (481 of 1583 strings)
Co-authored-by: Gohsuke Shimada <ghoskay@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Currently translated at 17.6% (278 of 1575 strings)
translationBot(ui): update translation (Spanish)
Currently translated at 17.3% (274 of 1575 strings)
Co-authored-by: gallegonovato <fran-carro@hotmail.es>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/es/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1581 of 1581 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1576 of 1576 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1575 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 85.0% (1340 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 78.7% (1240 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 73.1% (1152 of 1575 strings)
translationBot(ui): update translation (English)
Currently translated at 99.9% (1574 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 57.9% (913 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 37.0% (584 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 3.2% (51 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 3.2% (51 of 1575 strings)
Co-authored-by: Linos <tt250208@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/en/
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Currently translated at 79.9% (1266 of 1583 strings)
translationBot(ui): update translation (Chinese (Simplified Han script))
Currently translated at 74.4% (1171 of 1573 strings)
Co-authored-by: aidawanglion <youjayjeel@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/zh_Hans/
Translation: InvokeAI/Web UI
Currently translated at 99.6% (1569 of 1575 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.4% (1567 of 1575 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.4% (1565 of 1573 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Turns out a gallery image's `imageDTO` object can actually be a different object by reference. I thought this was not possible thanks to how we have a quasi-normalized cache.
Need to check against image name instead of reference equality when deciding whether or not to use the single image or the gallery selection for the dnd payload.
Rework uploadImage and uploadImages helpers and the RTK listener, ensuring gallery view isn't changed unexpectedly and preventing extraneous toasts.
Fix staging area save to gallery button to essentially make a copy of the image, instead of changing its intermediate status.
- New name: "Output only Generated Regions"
- New default: true (this was the intention, but at some point the behaviour of the setting was inverted without the default being changed)
The styling in gallery for selected vs hovered was very similar, leading users to think that the hovered image was also selected.
Reducing the borders for hovered images to a single pixel makes it easier to distinguish between selected and hovered.
- Tweak layout/styling of alerts for consistent spacing
- Add percentage to message if it has percentage
- Only show events if the destination is canvas (so workflows events are hidden for example)
- Pass in the `UtilInterface` to the `ModelsInterface` so we can call the simple `signal_progress` method instead of the complicated `emit_invocation_progress` method.
- Only emit load events when starting to load - not after.
- Add more detail to the messages, like submodel type
## Summary
Add support for SD3 image-to-image and inpainting. Similar to FLUX, the
implementation supports fractional denoise_start/denoise_end for more
fine-grained denoise strength control, and a gradient mask adjustment
schedule for smoother inpainting seams.
## Example
Workflow
<img width="1016" alt="image"
src="https://github.com/user-attachments/assets/ee598d77-be80-4ca7-9355-c3cbefa2ef43">
Result

## QA Instructions
- [x] Regression test of text-to-image
- [x] Test image-to-image without mask
- [x] Test that adjusting denoising_start allows fine-grained control of
amount of change in image-to-image
- [x] Test inpainting with mask
- [x] Smoke test SD1, SDXL, FLUX image-to-image to make sure there was
no regression with the frontend changes.
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
The Flux VAE, like many VAEs, is broken if run using float16 inputs
returning black images due to NaNs
This will fix the issue by forcing the VAE to run in bfloat16 or float32
were compatible
## Related Issues / Discussions
Fix for issue https://github.com/invoke-ai/InvokeAI/issues/7208
## QA Instructions
Tested on MacOS, VAE works with float16 in the invoke.yaml and left to
default.
I also briefly forced it down the float32 route to check that to.
Needs testing on CUDA / ROCm
## Merge Plan
It should be a straight forward merge,
When an unsupported model architecture is selected, show that warning only, without the extra warnings (i.e. no "missing tile controlnet" warning)
Update Invoke tooltip warnings accordingly
Closes#7239Closes#7177
- Add `withToast` flag to `uploadImage` util
- Skip the toast if this is not set
- Use the flag to disable toasts when canvas does internal image-uploading stuff that should be invisible to user
We don't need a "dnd" image system. We need a "image action" system. We need to execute specific flows with images from various "origins":
- internal dnd e.g. from gallery
- external dnd e.g. user drags an image file into the browser
- direct file upload e.g. user clicks an upload button
- some other internal app button e.g. a context menu
The actions are now generalized to better support these various use-cases.
## Summary
Nodes to support SD3.5 txt2img generations
* adds SD3.5 to starter models
* adds default workflow for SD3.5 txt2img
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
In a8de6406c5 a change was made to many menus in an effort to improve performance. The menus were made to be lazy, so that they are mounted only while open.
This causes unexpected behaviour when there is some logic in the menu that may need to execute after the user selects a menu item.
In this case, when you click to load a workflow from file, the file picker opens but then the menuitem unmounts, taking the input element and all uploading logic with it. When you select a file, nothing happens because we've nuked the handlers by unmounting everything.
Easy fix - un-lazy-fy the menu.
Closes#7240
The validation on this node causes graph validation to valid. It must be validated _after_ instantiation.
Also, it was a bit too strict. The only case we explicitly do not handle is when both bboxes and points are provided. It's acceptable if neither are provided.
Closes#7248
When filtering, we use a listener to trigger processing the image whenever a filter setting changes. For example, if the user changes from canny to depth, and auto-process is enabled, we re-process the layer with new filter settings.
The filterer has a method to reset its ephemeral state. This includes the filter settings, so resetting the ephemeral state is expected to trigger processing of the filter.
When we exit filtering, we reset the ephemeral state before resetting everything else, like the listeners.
This can cause problem when we exit filtering. The sequence:
- Start filtering a layer.
- Auto-process the filter in response to starting the filter process.
- Change the filter settings.
- Auto-process the filter in response to the changed settings.
- Apply the filter.
- Exit filtering, first by resetting the ephemeral state.
- Auto-process the filter in response to the reset settings.*
- Finish exiting, including unsubscribing from listeners.
*Whoops! That last auto-process has now borked the layer's rendering by processing a filter when we shouldn't be processing a filter.
We need to first unsubscribe from listeners, so we don't react to that change to the filter settings and erroneously process the layer.
Also, add a check to the `processImmediate` method to prevent processing if that method is accidentally called without first starting the filterer.
The same issue could affect the segmenyanything module - same fixes are implemented there.
The root issue is the compositing cache. When we save the canvas to gallery, we need to first composite raster layers together and then upload the image.
The compositor makes extensive use of caching to reduce the number of images created and improve performance. There are two "layers" of caching:
1. Caching the composite canvas element, which is used both for uploading the canvas and for generation mode analysis.
2. Caching the uploaded composite canvas element as an image.
The combination of these caches allows for the various processes that require composite canvases to do minimal work.
But this causes a problem in this situation, because the user expects a new image to be uploaded when they click save to gallery.
For example, suppose we have already composited and uploaded the raster layer state for use in a generation. Then, we ask the compositor to save the canvas to gallery.
The compositor sees that we are requesting an image for the current canvas state, and instead of recompositing and uploading the image again, it just returns the cached image.
In this case, no image is uploaded and it the button does nothing.
We need to be able to opt out of the caching at some level, for certain actions. A `forceUpload` arg is added to the compositor's high-level `getCompositeImageDTO` method to do this.
When true, we ignore the uppermost caching layer (the uploaded image layer), but still use the lower caching layer (the canvas element layer). So we don't recompute the canvas element, but we do upload it as a new image to the server.
Previously, we cleared the canvas progress image when the canvas had no active generations. This allowed for a brief flash of canvas state between the last progress image for a given generation, and when the output image for that generation rendered. Here's the sequence:
- Progress images are received and rendered
- Generation completes - no active canvas generations
- Clear the progress image -> canvas layers visible unexpectedly, creating an awkward jarring change
- Generation output image is rendered -> output image overlaid on canvas layers
In 83538c4b2b I attempted to fix this by only clearing the progress image while we were not staging.
This isn't quite right, though. We are often staging with no active generations - for example, you have a few images completed and are waiting to choose one.
In this situation, if you cancel a pending generation, the logic to clear the progress image doesn't fire because it sees staging is in progress.
What we really need is:
- Staging area module clears the progress image once it has rendered an output image.
- Progress image module clears the progress image when a generation is canceled or failed, in which case there will be no output image.
To do this, we can add an event listener to the progress image module to listen for queue item status changes, and when we get a cancelation or failure, clear the progress image.
pip's dependency resolution doesn't take into account transitive
dependencies when choosing package versions for download.
Even though `torch=~2.4.1` is required by `diffusers`, pip will
download 2.5.0 and higher, but only install 2.4.1.
Pinning torch to <2.5.0 prevents this behaviour.
## Summary
This change mimics the unet padding strategy to align T2I featuremaps
with the latents during denoising. It also slightly adjusts the crop and
scale logic so that the control will match the input image without
shifting when it needs to pad.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
Image generated at 1032x1024

Image generated at 1080x1040 to prove feature alignment.

Edge artifacts on the bottom and right are a result of SDXL's unet
padding, and t2i influence will be cut off in those regions.
## Merge Plan
Contingent on #7205
Currently the Canvas UI prevents users from generating non-64
resolutions while t2i adapter layers are active. Will leave this as a
draft until fixing that.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
Previously we maintained an `isInteractable` flag, which was derived from these layer flags:
- Locked/unlocked
- Enabled/disabled
- Layer's type visible/hidden
When a layer was not interactable, we blocked all layer actions.
After comparing to the behaviour in Affinity and considering user feedback, I've loosened these restrictions while maintaining safety. First, some definitions.
There two kinds of layer actions - mutating actions and non-mutating actions.
- Mutating actions are drawing on the layer, cropping it, filtering it, converting it, etc. Anything that changes the layer.
- Non-mutating actions are copying the layer, saving the layer to gallery, etc. Anything that _uses_ the layer.
Then, there are two broad canvas states - busy and not busy. "Busy" means the canvas is actively filtering, staging, compositing layers together, etc - something that is "single-threaded" by nature.
And here are the revised restrictions:
- When canvas is busy, you cannot initiate any layer actions.
- When the canvas is not busy, and the layer is locked, you initiate any mutating actions.
- When the canvas is not busy and the layer is not locked, you can initiate any layer action.
Besides safely giving users more freedom, it also fixes an issue where the context menu for a layer was disabled if it was not the selected layer.
- Add method to force a rebuild of the pydantic type adapter for the union of invocations, which is used to validate graphs.
- Update the xfail'd test.
Had missed several of these, which means we were invalidating caches far too often. For example, when you changed a RG prompt, we were invalidating the cached canvas for that entity, even though changing the prompt doesn't affect the canvas at all.
Previously, merge visible deleted all other visible layers. This is not how affinity works, I should have confirmed before making it work like this in the first place.Ï
`CanvasCompositorModule` had a fairly inflexible API, only supporting compositing all raster layers or inpaint masks.
The API has been generalized work with a list of canvas entities. This enables `Merge Down` and `Merge Selected` functionality (though `Merge Selected` is not part of this set of changes).
Let the parent module adopt the filtered/segemented image instead of destroying it and making the parent re-create it, which results in a brief flash of the parent layer's original objects before the new image is rendered.
We were scaling the unscaled image and mask down before doing the paste-back, but this adds an extraneous step & image output.
We can do the paste-back first, then scale to output size after. So instead of 2 resizes before the paste-back, we have 1 resize after.
The end result is the same.
- Restore dedicated `Apply` buttons
- Remove icons from the buttons, too much noise when the words are short and clear
- Update loading state to show a spinner next to the `Process` button instead of on _every_ button
A blue button is begging to be clicked, but clicking it will do nothing. Instead, we should communicate that no action is needed by disabling the button when the default settings are already in use.
Using `&&` will result in false negatives for settings where a falsy value might be valid. For example, any setting for which 0 is a valid number. To be on the safe side, just use an explicit null check on all values.
We use an in-memory cache for PIL images to reduce I/O. If a node mutates the image in any way, the cached image object is also updated (but the on-disk image file is not).
We've lucked out that this hasn't caused major issues in the past (well, maybe it has but we didn't understand them?) mainly because of a happy accident. When you call `context.images.get_pil` in a node, if you provide an image mode (e.g. `mode="RGB"`), we call `convert` on the image. This returns a copy. The node can do whatever it wants to that copy and nothing breaks.
However, when mode is not specified, we return the image directly. This is where we get in trouble - nodes that load the image like this, and then mutate the image, update the cache. Other nodes that reference that same image will now get the mutated version of it.
The fix is super simple - we make sure to return only copies from `get_pil`.
- Use a hash of the last processed points instead of a `hasProcessed` flag to determine whether or not we should re-process a given set of points.
- Store point coords in state instead of pulling them out of the konva node positions. This makes moving a point a more explicit action in code.
- Add a `roundCoord` util to round the x and y values of a coordinate.
- Ensure we always re-process when $points changes.
Realized we are doing a lot of event listening even when segmenting is not occuring. I don't think this will have a meaningful performance impact, but it makes sense to remove these listeners when not in use.
Fix an issue where if the input image is transparent in a region to be masked, that transparent region ends up opaque black. Need to respect the input image transparency by applying the mask to the alpha channel only.
Each version of torch is only available for specific versions of CUDA and ROCm.
The Invoke installer and dockerfile try to install torch 2.4.1 with ROCm 5.6
support, which does not exist. As a result, the installation falls back to the
default CUDA version so AMD GPUs aren't detected. This commits fixes that by
bumping the ROCm version to 6.1, as suggested by the PyTorch documentation. [1]
The specified CUDA version of 12.4 is still correct according to [1] so it does
need to be changed.
Closes#7006Closes#7146
[1]: https://pytorch.org/get-started/previous-versions/#v241
## Summary
This PR adds support for the XLabs IP-Adapter
(https://huggingface.co/XLabs-AI/flux-ip-adapter) in workflows. Linear
UI integration is coming in a follow-up PR. The XLabs IP-Adapter can be
installed in the Starter Models tab.
Usage tips:
- Use a `cfg_scale` value of 2.0 to 4.0
- Start with an IP-Adatper weight of ~0.6 and adjust from there.
- Set `cfg_scale_start_step = 1`
- Set `cfg_scale_end_step` to roughly the halfway point (it's
unnecessary to apply CFG to all steps, and this will improve processing
time).
Sample workflow:
<img width="976" alt="image"
src="https://github.com/user-attachments/assets/4627b459-7e5a-4703-80e7-f7575c5fce19">
Result:

## Related Issues / Discussions
Prerequisite: https://github.com/invoke-ai/InvokeAI/pull/7152
## Remaining TODO:
- [ ] Update default workflows.
## QA Instructions
- [x] Test basic happy path
- [x] Test with multiple IP-Adapters (it runs, but results aren't great)
- [ ] ~Test with multiple images to a single IP-Adapter~ (this is not
supported for now)
- [ ] Test automatic runtime installation of CLIP-L, CLIP-H, and CLIP-G
image encoder models if they are not already installed.
- [ ] Test starter model installation of the XLabs FLUX IP-Adapter
- [ ] Test SD and SDXL IP-Adapters for regression.
- [ ] Check peak memory utilization.
## Merge Plan
- [ ] Merge #7152
- [ ] Change target branch to main
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Add support for Classifier-Free Guidance with FLUX.
- Using CFG doubles the time for the denoising process. Running both the
positive and negative conditioning in a single batch is left for future
work, because most users are already VRAM-constrained (this would
probably be faster at the cost of higher peak VRAM).
- Negative text conditioning is optional and only required if `cfg_scale
!= 1.0`
- CFG is skipped if `cfg_scale == 1.0` (i.e. no compute overhead in this
case)
- `cfg_scale_start_step` and `cfg_scale_end_step` can be used to easily
control the range of steps that CFG is applied for.
- CFG is a prerequisite for IP-Adapter support.
## Example
Positive Caption: `Professional photography of a luxury hotel in the
Nevada desert`
CFG: 1.0

Positive Caption: `Professional photography of a luxury hotel in the
Nevada desert`
Negative Caption: `Swimming pool`
CFG: 2.0
Same seed

## QA Instructions
- [ ] Test interactions with ControlNet
- [ ] Verify that peak RAM/VRAM utilization has not increased
significantly
- [ ] Test that CFG is skipped when cfg_scale == 1.0
- [ ] Test that negative text conditioning can be omitted when cfg_scale
== 1.0
- [ ] Test that a clear error message is returned when negative text
conditioning is omitted when cfg_scale != 1.0
- [ ] Test that the negative text prompt gets applied when cfg_scale
>1.0
- [ ] Test that a collection of cfg_scale values can be provided for
per-step control.
- [ ] Test that `cfg_scale_start_step` and `cfg_scale_end_step` control
the range of steps that CFG is applied
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
Introduce two-stage logging configuration and overrides for enabled status, log level and log namespaces.
The first stage in `<InvokeAIUI />`, before we set up redux (and therefore before we have access to the user's configured logging setup). In this stage, we use the overrides or default values.
The second stage is in `<App />`, after we set up redux, via `useSyncLoggingConfig`. In this stage, we use the overrides or the user's configured logging setup. This hook also handles pushing changes made by the user into localstorage.
Other changes:
- Extract logging config to util function
- Remove the `useEffect` from `SettingsModal` that was changing the logging settings
- Remove extraneous log effects from `useLogger`
- Export new `LoggingOverrides` type
While troubleshooting an issue with this middleware, I found the inclusion of the nextState and diff to be very noisy. It's now a function that accepts some options to configure the output, and returns the middleware.
We can use the drop overlay component directly for this, without needing to add it as a `noop` dnd target.
Other changes:
- The `label` prop is now used to conditionally render the label - every drop target provides its own label, so this doesn't break anything.
- Add `withBackdrop` prop to control whether we apply the dimmed drop target effect.
Instead of providing a duration to the upload action, we close the toast imperatively in the `imageUploaded` listener using a timeout. 3s after the last upload toast, we close it.
This handles the case when we are uploading multiple images and don't want the toast to close til it's all finished.
Currently translated at 98.7% (1476 of 1494 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1476 of 1493 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1474 of 1491 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
To trigger the edge case:
- Have an empty layer and non-empty layer
- Select the non-empty layer
- Refresh the page
- Select to the empty layer without doing any other action
- You may be unable to draw on the layer
- Zoom in/out slightly
- You can now draw on it
The problem was not syncing visibility when a layer is selected, leaving the layer hidden. This indirectly disabled interactions.
The fix is to listen for changes to the layer's selected status and sync visibility when that changes.
We were:
- Incrementing `addedControlNets` or `addedT2IAdapters`
- Attempting to add it, but maybe failing and skipping
Need to swap the order of operations to prevent misreporting of added cnet/t2i.
I don't think this would ever actually cause problems.
## Summary
Add support for FLUX ControlNet models (XLabs and InstantX).
## QA Instructions
- [x] SD1 and SDXL ControlNets, since the ModelLoaderRegistry calls were
changed.
- [x] Single Xlabs controlnet
- [x] Single InstantX union controlnet
- [x] Single InstantX controlnet
- [x] Single Shakker Labs Union controlnet
- [x] Multiple controlnets
- [x] Weight, start, end params all work as expected
- [x] Can be used with image-to-image and inpainting.
- [x] Clear error message if no VAE is passed when using InstantX
controlnet.
- [x] Install InstantX ControlNet in diffusers format from HF repo
(`InstantX/FLUX.1-dev-Controlnet-Union`)
- [x] Test all FLUX ControlNet starter models
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
This replicates the img2img flow:
- Reset the canvas
- Resize the bbox to the image's aspect ratio at the optimal size for the selected model
- Add the image as a raster layer
- Resizes the layer to fit the bbox using the 'fill' strategy
After this completes, the user can immediately click Invoke and it will do img2img.
If an entity needs to do something after init, it can use this system. For example, if a layer should be transformed immediately after initializing, it can use an init callback.
This feature involves a certain amount of extra work to ensure stroke and fill with partial opacity render correctly together. However, none of our shapes actually use that combination of attributes, so we can disable this for a minor perf boost.
Instead of pulling the preview canvas from the konva internals, use the canvas created for bbox calculations as the preview canvas.
This doesn't change perf characteristics, because we were already creating this canvas. It just means we don't need to dip into the konva internals.
It fixes an issue where the layer preview didn't update or show when a layer is disabled or otherwise hidden.
- When resetting workflows, retain the current mode state
- Remove the useEffect that reacted to the `isCleanEditor` flag to prevent getting menu getting locked open
This could be triggered by transforming a layer, undoing, then transforming again. The simple fix is to ignore the rasterization cache for all transforms.
There's a Konva bug where `pointerenter` & `pointerleave` events aren't fired correctly on the stage.
In 87fdea4cc6 I made a change that surfaced this bug, breaking touch and Apple Pencil interactions, because the cursor position doesn't get updated.
Simple fix - ensure we update the cursor on `pointerdown` events, even though we shouldn't need to.
Will make a bug report upstream
- Set an empty title to prevent browsers from showing "Please match the requested format." when hovering the number input
- Fix issue w/ `z-index` that prevented the popover button from being clicked while the input was focused
Currently translated at 98.7% (1461 of 1479 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1460 of 1479 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1458 of 1479 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1459 of 1477 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1453 of 1471 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
* add tooltips for images/assets tabs
* add icon by board name that can be used to activate editable
* update getting started text
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Currently translated at 98.7% (1453 of 1471 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1453 of 1471 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1452 of 1471 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
- Reverts the `onClick -> onPointerUp` changes, which fixed Apple Pencil interactions of buttons with tooltips but broke things in other subtle ways.
- Adds a default `openDelay` on tooltips of 500ms. This is another way to fix Apple Pencil interactions, and according to some searching online, is the best practice for tooltips anyways. The default behaviour should be for there to be a delay, and only in specific circumstances should there be no delay. So we'll see how this is received.
The color picker take some time to sample the color from the canvas state. This could cause a race condition where the cursor position changes between the time sampling starts, resulting in the picker showing the wrong color. Sometimes it picks up the color picker tool preview!
To resolve this, the color picker's color syncing is now throttled to once per animation frame. Besides fixing the incorrect color issue, it improves the perf substantially by reducing number of samples we take.
- Record both absolute and relative positions
- Use simpler method to get relative position
- Generalize getColorUnderCursor to be getColorAtCoordinate
We just changed all buttons to use `onPointerUp` events to fix Apple Pencil behaviour. This, plus the specific DOM layout of boards, resulted in the `onPointerUp` being triggered on a board before the drop triggered.
The app saw this as selecting the board, which then reset the gallery selection to the first image in the board. By the time you drop, the gallery selection had reset.
DOM layout slightly altered to work around this.
Currently translated at 45.4% (668 of 1470 strings)
translationBot(ui): update translation (French)
Currently translated at 33.1% (488 of 1470 strings)
translationBot(ui): update translation (French)
Currently translated at 32.5% (479 of 1470 strings)
translationBot(ui): update translation (French)
Currently translated at 30.7% (449 of 1458 strings)
translationBot(ui): update translation (French)
Currently translated at 30.2% (442 of 1460 strings)
Co-authored-by: Thomas Bolteau <thomas.bolteau50@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/fr/
Translation: InvokeAI/Web UI
## Summary
#6890 bumped torch, which caused an incompatibility with xformers when
installing with `pip install ".[xformers]"`. This PR bumps xformers.
## QA Instructions
I ran some smoke tests to confirm that generating with xformers still
works.
In my tests on an A100, there is a performance regression after bumping
xformers (2.7 it/s vs 3.2 it/s). I think it is ok to ignore this for
A100s, since users should be using torch-sdp, which is much faster (4.3
it/s). But, we should test for regression on older cards where xformers
is still recommended.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
A new "session" just means to reset most settings to default values, excluding model.
There are a few things that need to be reset:
- Parameters state, except for models and things dependent on model selection (like VAE precision)
- Canvas state, except for the `modelBase`, which is dependent on the model selection
- Canvas staging area state
- LoRAs state
- HRF state
- Style presets state
We also select the canvas tab.
For new gallery sessions, we:
- Open the image viewer
- Set the right panel tab to `gallery`
And for new canvas sessions, we:
- Close the image viewer
- Set the right panel tab to `layers`
Currently translated at 24.1% (351 of 1452 strings)
translationBot(ui): update translation (French)
Currently translated at 17.9% (261 of 1452 strings)
translationBot(ui): update translation (French)
Currently translated at 17.8% (259 of 1452 strings)
translationBot(ui): update translation (French)
Currently translated at 17.5% (255 of 1452 strings)
translationBot(ui): update translation (French)
Currently translated at 10.3% (150 of 1452 strings)
Co-authored-by: Thomas Bolteau <thomas.bolteau50@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/fr/
Translation: InvokeAI/Web UI
Currently translated at 62.0% (901 of 1452 strings)
translationBot(ui): update translation (German)
Currently translated at 56.4% (819 of 1452 strings)
translationBot(ui): update translation (German)
Currently translated at 53.8% (782 of 1452 strings)
Co-authored-by: Ettore Atalan <atalanttore@googlemail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
Currently translated at 56.4% (819 of 1452 strings)
translationBot(ui): update translation (German)
Currently translated at 53.8% (782 of 1452 strings)
translationBot(ui): update translation (German)
Currently translated at 45.3% (658 of 1451 strings)
Co-authored-by: B N <berndnieschalk@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
## Summary
This PR add support for FLUX LoRA models in kohya format with `lora_te1`
layers (i.e. CLIP LoRA layers). Previously, only transformer LoRA layers
were supported.
Example LoRA model in this format:
https://huggingface.co/cocktailpeanut/optimus
### Example
Prompt: `optimus is playing tennis in a tennis court`
Seed: 0
Without LoRA:

With LoRA:

## QA Instructions
I tested the following:
- [x] The optimus LoRA (with CLIP layers) can be applied.
- [x] FLUX LoRAs without CLIP layers still work
- [x] Loading the optimus LoRA, but applying it to the transformer
_only_ produces a different result. I.e. verified that patching the CLIP
layers is doing _something_. Ironically, the results seem better without
applying the CLIP layers. The CLIP layers seem to pull in more
background concepts. Regardless, it works.
- [x] The optimus LoRA can be applied via the Linear UI, and the output
matches results from manually constructing the workflow graph.
- [x] FLUX LoRAs without CLIP layers still work via the Linear UI.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
This pops up every now and then and I could never figure it out. A user figured it out in #6936. The cause is appending a query string to the app URL.
For example:
```sh
http://127.0.0.1:9090/?__theme=dark
```
The query string breaking the static file serving, which prevents our translations from loading correctly. Instead of the JSON translations, FastAPI sends the index HTML page. The UI then errors when attempting to parse the translation JSON.
The query string ?__theme=dark is used by Gradio to force dark mode. I believe the users with this issue are doing the same thing the user in #6936 did (just change the port number on an existing bookmark) or their browser history/bookmark includes the query string.
Though this is technically a user-caused problem (we cannot prevent the user from using a malformed URL), we can work around it. When query string is used on the root path, we can redirect the browser to the root path without the query string.
This is done via very simple middleware.
Closes#6696Closes#6817Closes#6828Closes#6936Closes#6983
`usePanel` started panels with a `minSize` and `defaultSize` of 0, which means collapsed. This causes panels to load as collapsed on the very first app load. Then, in the layout effect, we see the panel as collapsed and skip setting it to the correct size.
Reviewing the library's API, `minSize` and `defaultSize` should not be lower than 1. Thankfully, setting this to 1 also prevents the issue described above.
- `minSize` and `defaultSize` start at 1
- Return a sentinel value when converting percentages to pixels, if the panel's container has no size. When that happens, we should not update the `minSize` or `defaultSize`.
- Split observer callback into its own function, so that the exact same logic can be used on the first run of hte effect.
- Update prop names and docstrings to accurately reflect that the numerical values are in pixels
* restore send-to functionality
* lint
* feat(ui): add getImageMetadata helper
* feat(ui): updated usePreselectedImage logic
* fix(ui): race condition when creating & initializing canvas entity adapters
There was a race condition when the canvas was reset as it was initializing. This could occur when the "use preselected image" functionality was triggered.
It was possible to get an error (non-app-breaking) when attempting to initialize an entity:
1. Canvas initializes
2. Canvas starts creating and initializing all entities (this happens in `CanvasEntityRendererModule.render`)
3. Canvas is reset before that process finishes, clearing state
4. The method call from 2) attempts to initialize an entity that has been deleted from state and fails
Changes to fix this:
- Split `CanvasEntityRendererModule.render` into individual methods for each entity type, each with their own store subscription
- Do not `await` initialization after creating the entity adapter classes - let them initialize in the background
So the `render` method now completes very fast - quick enough that we don't run into this race condition.
It's possible that something will change in the future, and this race condition will come back. In that case, we could use mutexes in `CanvasEntityRendererModule` to prevent the failure condition. It's a bit more complicated to do that so I'm skipping it for now.
* feat(ui): export workflow library is open atom
* feat(ui): export image viewer atom
* tidy(ui): organise style presets menu state
* feat(ui): consolidate studio init actions
* build(ui): export type StudioInitAction
* feat(ui): add getStylePreset helper
* feat(ui): add toasts to useStudioInitAction
* tidy(ui): comment & minor cleanup for useStudioInitAction
* chore(ui): lint
* only show version when local
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
Simplify the handle component and use the provided data attributes to style the handles correctly.
Fixes a styling issue where you if you hover at the T-junction between two handles, only one brightens up.
This unused logic was unnecessarily complicating the hook. It also inadvertently made the default panel size arg a percentage value even if it was actually a pixel value.
Cleaned up a couple other little bits.
Only change the selection array when its contents have changed. This prevents unnecessary re-renders.
For example, if the selection is currently `[image1]` and we set it again to `[image1]`, while the array contains the same objects, it is a new array. This will trigger unncessary re-renders.
Selecting a board selects the image, and then we were selecting it again afterwards. So we programmatically select the newly generated image twice.
This can cause a race condition if the user changes image selection between when the two programmatic image selection actions. Their selection will be quickly overridden by the second programmatic selection action.
I broke this in dfac0292f4 due to misunderstanding of what the upscale model actually was. I thought it was a main model but actually its a spandrel model.
There's a situation in which the enqueue response comes after the graph actually executes. This was unexpected when I first wrote the logic. I suppose it has to do with the async endpoint handling.
- Update canvas slice's to track the current base model architecture instead of just the optimal dimension. This lets us derive both optimal dimension _and_ grid size for the currently selected model.
- Update all bbox size utilities to use derived grid size instead of hardcoded values of 8 or 64
- Review every damned instance of the number 8 in the whole frontend and update the ones that need to use the grid size
- Update the invoke button blocking logic to check against scaled bbox size, unless scaling is disabled.
- Update the invoke button blocking to say if it's width or height that is invalid and if its bbox or scaled, for both FLUX and the T2I adapter constraints
- Use consistent logic for all model type handlers
- Fix bug where we could select invalid upscaling models (not sure how this hadn't caused problems...)
- Add logging for each action
- Only reset models when there is a change to be made - skip dispatching actions when there would be no change made to state
Previously the setting was `showOnlyRasterLayersWhileStaging`. This has been renamed to `isolatedStagingPreview`. Works the same.
Also added `isolatedFilteringPreview` an `isolatedTransformingPreview`. These work the same way, but they isolate the current selected layer. There are toggles in the canvas settings popover _and_ the filter/transform popups (same setting).
We need to ensure the getQueueCountsByDestination query is sync'd, invalidating its tags as queue items complete. Unfortunately it's 2 extra network requests per queue item.
Also clean up some jank w/ the handling of accepting staging images - there was this no-op action & a listener for it... should just be a simple callback.
Both the vanilla and autoscale invocations report progress while processing each tile.
The autoscale version, which may run the spandrel model multiple times, also includes the current iteration.
Each of these was a bit off:
- The SD callback started at `-1` and ended at `i`. Combined w/ the weird math on the previous `calc_percentage` util, this caused the progress bar to never finish.
- The MultiDiffusion callback had the same problems as SD.
- The FLUX callback didn't emit a pre-denoising step 0 image. It also reported total_steps as 1 higher than the actual step count.
Each of these now emit the expected events to the frontend:
- The initial latents at 0%
- Progress at each step, ending at 100%
- Update the step callback methods in the invocation API to use the new signal_progress API
- Copy and update the `calc_percentage`, reducing special handling for step and total_steps - a followup commit will fix callers of the step callbacks
## Summary
This PR makes some improvements to the FLUX image-to-image and
inpainting behaviours.
Changes:
- Expand inpainting region at a cutoff timestep. This improves seam
coherence around inpainting regions.
- Add Trajectory Guidance to improve the ability to control how much an
image gets modified during image-to-image/inpainting (see the code for a
more technical explanation - it's well-documented).
## `trajectory_guidance_strength` Usage
- The `trajectory_guidance_strength` param has been added to the `FLUX
Denoise` invocation.
- `trajectory_guidance_strength` defaults to `0.0` and should be in the
range [0, 1].
- `trajectory_guidance_strength = 0.0` has no effect on the denoising
process.
- `trajectory_guidance_strength = 1.0` will guide strongly towards the
original image.
## FLUX image-to-image usage tips
- As always, prompt matters a lot.
- If you are trying to making minor perturbations to an image, use
vanilla image-to-image by setting the `denoising_start` param.
- If you are trying to make significant changes to an image, using
trajectory guidance will give more control than using vanilla
image-to-image. Set `denoising_start=0.0` and adjust
`trajectory_guidance_strength` to control the amount of change in the
image.
- The 'transition point' where the image changes the most as you adjust
`trajectory_guidance_strength` or `denoise_start` varies depending on
the noise. So, set a fixed noise seed, then tune those params.
## QA Instructions
- [x] Vanilla image-to-image - No change in output
- [x] Vanilla inpainting - No change in output
- [x] Vanilla outpainting - No change in output
- Trajectory Guidance image-to-image
- [x] TGS = 0.0 is identical to Vanilla case
- [x] TGS = 1.0 guides close to the original image
- Not as close as I'd like, but it's not broken.
- [x] Smooth transition as TGS varies
- [x] Smoke test: TGS with denoise_start > 0.0
- TG inpainting
- [x] TGS = 0.0 is identical to Vanilla case
- [x] TGS = 1.0 guides close to the original image
- Not as close as I'd like, but it's not broken
- [x] Smooth transition as TGS varies
- [x] Smoke test: TGS with denoise_start > 0.0
- TG outpainting
- [x] TGS = 0.0 is identical to Vanilla case
- [x] Smoke test TGS outpainting
- [x] Smoke test FLUX text-to-image
- [x] Preview images look ok for all of above.
## Known issues (will be addressed in follow-up PRs)
- The current TGS scale biases towards creating more change than desired
in the image. More tuning of the TG change schedule is required.
- TGS does not work very well for outpainting right now. This _might_ be
solvable, but more likely we'll just want to discourage it in the Linear
UI.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
If a FLUX dev model is selected, show icon and popover telling user
about its license for commercial use
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
This PR attempts to fix a flaky FLUX LoRA unit test.
Example test failure:
https://github.com/invoke-ai/InvokeAI/actions/runs/10958325913/job/30428299328?pr=6898
The failure _seems_ to be caused by a numerical precision error, but I
haven't been able to reproduce it locally. I have reduced the tolerance
of the offending comparison, and am pretty confident that this will
solve the issue.
## QA Instructions
No QA necessary.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
This can be used for nodes that Invoke uses internally. Internal nodes do not have API stability guarantees. For example, they may change if the needs of the linear UI change.
Two main changes:
- Add `runGraphAndReturnImageOutput` to `CanvasStateApiModule`. This method is a safe and convenient abstraction to execute a graph and retrieve the image output of one of its nodes. It supports cancellation (via an AbortSignal) and timeout.
- Update filters to build whole graphs, as opposed to nodes.
These changes allow:
- Filter execution is resilient, with all error cases handled (afaik)
- `CanvasEntityFilterer` class is much simpler
- Stuck or long-running filters may be canceled
- Filters may be arbitrarily complex - so long as there is one node that outputs an image, the filter will just work
- Rename util to `getImageDTOSafe`
- Update API to accept the same options as RTKQ's `initiate`
- Add `getImageDTO`; while `getImageDTOSafe` returns null if the image is not found, the new util throws
- Update usage of `getImageDTOSafe`
* wip
* more updates for new user experience
* pull whats new out
* use loading state
* lint
* fix(ui): translation missing period
* feat(ui): create icon component for invoke logo
* feat(ui): tweaked invoke logo colors
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
## Summary
There was an issue w/ the calculation causing an infinite loop but the
fixed algorithm from #6887 wasn't correct bc it doesn't take into
account the grid gap correctly. This then breaks arrow key navigation.
- Restore the previous calculation
- Bail out if the gallery elements don't have any width, which causes
the infinite loop - this part was missed when copying the logic from
GalleryImageGrid
## Related Issues / Discussions
n/a
## QA Instructions
shouldn't freeze
## Merge Plan
n/a
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
There was an issue w/ the calculation causing an infinite loop but the fixed algorithm wasn't correct bc it doesn't take into account the grid gap correctly. This then breaks arrow key navigation.
- Restore the previous calculation
- Bail out if the gallery elements don't have any width, which causes the infinite loop - this part was missed when copying the logic from GalleryImageGrid
- Allow `uploadImage` util to accept `metadata` to embed in the image
- Update compositor to support `metadata` field when uploading rasterized composite layer
- Add async zod refiner to `zImageWithDims` which fetches the image as part of validation
- Add `zServerValidatedModelIdentifierField`, a zod-refined version of `zModelIdentifierField` which fetches the model as part of validation
- Add `zCanvasMetadata` zod schema, which contains only canvas entities - no bbox, and no `isHidden` flags
- Renamed "Send to Canvas" -> "New Layer from Image"
- Added "New Canvas from Image"
This clarifies the purpose of the menu items and gives tablet users a way to easily add images tot he canvas.
Also update the verbiage for the alerts:
- "Sending to Canvas" -> "Staging Generations on Canvas"
- "Sending to Gallery" -> "Sending Generations to Gallery"
- Add buttons to zoom in/out
- Update hotkeys for fit & 100% to match affinity (e.g. ctrl+0, ctrl+1)
- Add hotkeys for 200%, 400%, 800%
- Update tooltips
This mirrors affinity/photoshop's default `d` hotkey, which sets the fg/bg to white/black. We don't have a concept of "background color", and white is more useful for control images, so it sets to white.
- Rework hotkey data to include the keys for each hotkey action.
- Add wrapper for `useHotkeys` that accepts a hotkey category and id. Automatically selects the key from the hotkey data.
- Add handling for macOS (cmd vs ctrl, option vs alt).
- Redo all hotkey descriptions, deleting nonexistant ones.
- Some `esc` hotkeys that just close whatever you are currently in are omitted due to their relative simplicity and intuitiveness.
This was caused by allowing the stage to be set to fractional coordinates. For example, the stage might be positioned at `x: 142.22255, y: 488.79`.
When positioned like this, the canvas will be slightly misaligned with its native pixel grid. The browser does its best, but this causes tiny scaling artifacts throughout the image. It's most noticeable where there is a sharp contrast.
This behaviour was introduced while troubleshooting an issue with degraded quality when saving canvas to gallery. Turned out the stage position was unrelated to that issue, but I didn't realize that the change would cause this other type of problem.
The fix is super simple - ensure we floor stage coords when setting the manually. Konva never sets the position to fractional coordinates itself. For example, while dragging the stage, Konva sets the stage coordiantes itself, and they are always integers.
## Summary
This PR adds support for FLUX LoRA models on both quantized and
non-quantized base models.
Supported formats:
- diffusers
- kohya
Full changelist:
- Consolidated LoRA handling code in `invokeai/backend/lora`
- Add support for FLUX kohya and FLUX diffusers LoRA model loading
- Add ability to either patch LoRAs or run as a sidecar model (the
latter enables LoRAs to be applied to a wide range of quantized models).
## QA Instructions
Note to reviewers: I tested everything in this checklist. Feel free to
re-verify any of this, but also test any LoRAs that you have. There are
many small LoRA format variations, and there's a risk of breaking one of
them with this change.
FLUX LoRA
- [x] Import / probe of kohya FLUX LoRA
(https://civitai.com/models/159333/pokemon-trainer-sprite-pixelart?modelVersionId=779247)
- [x] Import / probe of Diffusers FLUX LoRA
(https://civitai.com/models/200255/hands-xl-sd-15-flux1-dev?modelVersionId=781855)
- [x] kohya with non-quantized base model
- [x] kohya with quantized base model (should roughly match the
non-quantized case)
- [x] diffusers with non-quantized base model
- [x] diffusers with quantized base model (should roughly match the
non-quantized case)
- [x] Sidecar LoRA patching speed (<0.1secs after model is loaded)
- [x] Stacking multiple fused LoRA models (i.e. on top on non-quantized
model)
- [x] Stacking multiple sidecar LoRA models (i.e. on top of quantized
model)
Regression Tests
- [x] SD1.5 LoRA (check output, speed and memory)
- [x] SDXL LoRA (check output, speed and memory)
- [x] `USE_MODULAR_DENOISE=1` smoke test with LoRA
Test for output regression with the following LoRA formats:
- [x] LoRA
- [x] LoHA
- [x] LoKr
- [x] IA3
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- Canvas manages its own progress socket event listeners and progress event data.
- Remove cancellations listener jank.
- Dip into low-level redux subscription API to watch for queue status changes, clearing the last "global" progress event when the queue has nothing in progress. Could also do this in a useEffect I guess.
- Had to shuffle some things around to prevent circular imports, so there are a lot of tiny changes here.
- Remove queue front button. Hold shift while clicking `Invoke` button to queue front.
- Restore queue menu actions w/ the reclaimed space.
- Simplify queue interaction hooks.
The lineart model often outputs a lot of almost-black noise. SD1.5 ControlNets seem to be OK with this, but SDXL ControlNets are not - they need a cleaner map. 12 was experimentally determined to be a good threshold, eliminating all the noise while keeping the actual edges. Other approaches to thresholding may be better, for example stretching the contrast or removing noise.
I tried:
- Simple thresholding (as implemented here) - works fine.
- Adaptive thresholding - doesn't work, because the thresholding is done in the context of small blocks, while we want thresholding in the context of the whole image.
- Gamma adjustment - alters the white values too much. Hard to tuen.
- Contrast stretching, with and without pre-simple-thresholding - this allows us to treshold out the noise, then stretch everything above the threshold down to almost-zero. So you have a smoother gradient of lightness near zero. It works but it also stretches contrast near white down a bit, which is probably undesired.
In the end, simple thresholding works fine and is very simple.
The HTML Canvas context has an `imageSmoothingEnabled` property which defaults to `true`. This causes the browser canvas API to, well, apply image smoothing - everything gets antialiased when drawn.
This is, of course, problematic when our goal is to be pixel-perfect. When the same image is drawn multiple times, we get progressive image degradation.
In `CanvasEntityObjectRenderer.cloneObjectGroup()`, where we use Konva's `Node.cache()` method to create a canvas from the entity's objects. Here, we were not setting `imageSmoothingEnabled` to false. This method is used very often by the compositor and we end up feeding back antialiased versions of the image data back into the canvas or generation backend.
Disabling smoothing here appears to fix the issue. I've also disabled image smoothing everywhere else we interact with a canvas rendering context.
The checkerboard background was rendered as a separate DOM element that stretched to fill the canvas container.
While the canvas width and height are always integers, this background element could have non-integer dimensions, depending on panel sizes.As a result, it could be slightly larger than the canvas, introducing a fine border around the canvas.
This is purely a visual issue, but it's very noticeable when you use the bbox overlay. It also can be noticed with masks that extend beyond the edge of the visible canvas.
- Refactor the checkerboard background to be rendered by the canvas instead of as a DOM element, resolving the issue.
- Add a helper method to get the scaled rect of the stage, updating a few places where we need such a rect.
- Rename `CanvasStageModule.getScaledPixels` method to `unscale`, clarifying its purpose.
Track various canvas states:
- Filtering an entity
- Transforming an entity
- Rasterizing an entity
- Compositing
- Busy (derived from all of the above)
Also track individual entity states:
- Locked
- Disabled
- All of type are hidden
- Has objects
- Interactable (derived from all of the above)
These states then gate various actions. For example:
- Cannot invoke while the canvas is busy.
- Cannot transform, filter, duplicate, or delete when the canvas is busy.
Tool interaction restrictions are not yet implemented.
## Summary
This PR splits the lora.py monolith into separate files. The main
motivation for doing this in a standalone PR is to make the diffs more
interpretable in the [upcoming
changes](https://github.com/invoke-ai/InvokeAI/compare/main...ryan/flux-lora-sidecar)
to support LoRAs for FLUX.
This PR does not make any functional changes - it just moves files
around and changes import paths.
## QA Instructions
I smoke tested generation with LoRA, LoHA, LoKr, and IA3.
## Merge Plan
No special instructions. Merge on approval.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- Add backcompat for cnet model default settings
- Default filter selection based on model type
- Updated UI components to use new filter nodes
- Added handling for failed filter executions, preventing filter from getting stuck in case it failed for some reason
- New translations for all filters & fields
Use a generic to narrow the `type` field from `string` to a literal. Now you can do e.g. `adapter.type === 'control_layer_adapter'` and TS narrows the type.
Similar to the existing node, but without any resizing. The backend logic was consolidated and modified so that it the model loading can be managed by the model manager.
The ONNX Runtime `InferenceSession` class was added to the `AnyModel` union to satisfy the type checker.
Similar to the existing node, but without any resizing and with a revised model loading API that uses the model manager.
All code related to the invocation now lives in the Invoke repo.
Similar to the existing node, but without any resizing and with a revised model loading API that uses the model manager.
All code related to the invocation now lives in the Invoke repo. Unfortunately, this includes a whole git repo for EfficientNet. I believe we could use the package `timm` instead of this, but it's beyond me.
Similar to the existing node, but without any resizing and with a revised model loading API that uses the model manager.
All code related to the invocation now lives in the Invoke repo.
Similar to the existing node, but without any resizing and with a revised model loading API that uses the model manager.
All code related to the invocation now lives in the Invoke repo.
So far, this includes:
- Save Canvas to Gallery
- Save Bbox to Gallery
- Send Bbox to Regional IP Adapter
- Send Bbox to Global IP Adapter
- Send Bbox to Control Layer
- Send Bbox to Raster Layer
To prevent losing all ephemeral canvas stage when switching tabs, we will refrain from destroying the canvas manager instance when its tab unmounts, and use the existing canvas manager instance on mount, if there is one.
One small change required in `CanvasStageModule` - a `setContainer` method to update the konva stage DOM element.
- Add `reset` functionality
- Rename badly named `autoPreviewFilter` to `autoProcessFilter`
- Do not process filter when starting, unless `autoProcessFilter` is enabled
This includes some fixes for the composite number input component's local value handling, resolving an infinite recursion problem when an invalid value is set.
Snap can be any of off, 8px or 64px.
The snap is used when moving and transforming entities.
When transforming and locking aspect ratio, the snap is ignored entirely, because we'd change the aspect ratio if we forced the snap.
Otherwise, if we are not locking aspect ratio (e.g. the user is holding shift), we snap the transform anchors to the grid.
Realized we can use listener middleware to respond to _actions_, as opposed to using the redux store subscription to respond to _state changes_... This might simplify some things.
Using this pattern here.
Only hiccup - there's a TS issue preventing this from being added to the state api module. The `addListener` method has an overloaded type signature and TS cannot extract the overloaded arg type using `Parameters<T>`. As a result, if we try to wrap this, we end up with a broken TS signature for the wrapper method.
There's a race condition where we sometimes get progress events from canceled queue items, depending on the timing of the cancellation request and last event or two from the queue item.
I can't imagine how to resolve this except by tracking all cancellations and ignoring events for cancelled items, which is implemented in this change.
- Add selectors to get the default control adapter and ip adapter with model, preferring controlnet over t2i adapter for model
- Add hooks to add each entity type, using the defaults
- Add hooks to add prompts/ip adapters to a regional guidance layer
- Use the defaults in other places where we add control layers or ip adapters (e.g. dnd-triggered entity creation)
- Each entity gets its own `CanvasEntityFilterer`
- Add auto-preview feature to filter, debounced by 1000ms leading + trailing
- Fix flash when preview updates
When resetting the canvas or staging area, we don't want to cancel generations that are going to the gallery - only those going to the canvas.
Thus the method should not cancel by origin, but instead cancel by destination.
Update the queue method and route.
Use the min of each pixel's alpha value and lightness for the output alpha. This prevents artifacts when using the transparency effect, especially with non-black pixels with low alpha.
- Rely on redux + reselect more
- Remove all nanostores that simply "mirrored" redux state in favor of direct subscriptions to redux store
- Add abstractions for creating redux subs and running selectors
- Add `initialize` method to CanvasModuleBase, for post-instantiation tasks
- Reduce local caching of state in modules to a minimum
Big cleanup. Makes these classes easier to implement, lots of comments and docstrings to clarify how it all works.
- Add default implementations for `destroy`, `repr` and `getLoggingContext`
- Tidy individual module configs
- Update `CanvasManager.buildLogger` to accept a canvas module as the arg
- Add `CanvasManager.buildPath`
TBH not sure exactly why this broke. Fixed by rollback back the use of a render prop in favor of global state. Also revised the API of `useBoolean` and `buildUseBoolean`.
- Canvas generation mode is replace with a boolean `sendToCanvas` flag. When off, images generated on the canvas go to the gallery. When on, they get added to the staging area.
- When an image result is received, if its destination is the canvas, staging is automatically started.
- Updated queue list to show the destination column.
- Added `IconSwitch` component to represent binary choices, used for the new `sendToCanvas` flag and image viewer toggle.
- Remove the queue actions menu in `QueueControls`. Move the queue count badge to the cancel button.
- Redo layout of `QueueControls` to prevent duplicate queue count badges.
- Fix issue where gallery and options panels could show thru transparent regions of queue tab.
- Disable panel hotkeys when on mm/queue tabs.
The frontend needs to know where queue items came from (i.e. which tab), and where results are going to (i.e. send images to gallery or canvas). The `origin` column is not quite enough to represent this cleanly.
A `destination` column provides the frontend what it needs to handle incoming generations.
This hook forcibly updates _all_ portals with `data-hidden=true` when the modal opens - then reverts it when the modal closes. It's intended to help screen readers. Unfortunately, this absolutely tanks performance because we have many portals. React needs to do alot of layout calculations (not re-renders).
IMO this behaviour is a bug in chakra. The modals which generated the portals are hidden by default, so this data attr should really be set by default. Dunno why it isn't.
Previously this badge, floating over the queue menu button next to the invoke button, was rendered within the existing layout. When I initially positioned it, the app layout interfered - it would extend into an area reserved for a flex gap, which cut off the badge.
As a (bad) workaround, I had shifted the whole app down a few pixels to make room for it. What I should have done is what I've done in this commit - render the badge in a portal to take it out of the layout so we don't need that extra vertical padding.
Sleekified some styling a bit too.
The canvas size was dynamic based on the container div's size. When the div was hidden (e.g. when selecting another tab), the container's effective size is 0. This resulted in the preview image canvas being drawn at a scale of 0.
Fixed by using an absolute size for the canvas container.
- Add lock toggle
- Tweak lock and enabled styles
- Update entity list action bar w/ delete & delete all
- Move add layer menu to action bar
- Adjust opacity slider style
- Throttle pushing to history for actions of the same type, starting with 1000ms throttle.
- History has a limit of 64 items, same as workflow editor
- Add clear history button
- Fix an issue where entity transformers would reset the entity state when the entity is fully transparent, resetting the redo stack. This could happen when you undo to the starting state of a layer
I learned that the inline selector syntax recreates the selector function on every render:
```ts
const val = useAppSelector((s) => s.slice.val)
```
Not good! Better is to create a selector outside the function and use it. Doing that for all selectors now, most of the way through now. Feels snappier.
Things like `$lastCursorPos` are now created within the canvas drawing classes. Consumers in react access them via `useCanvasManager`.
For example:
```tsx
const canvasManager = useCanvasManager();
const lastCursorPos = useStore(canvasManager.stateApi.$lastCursorPos);
```
Previously, canvas actions specific to an entity type only needed the id of that entity type. This allowed you to pass in the id of an entity of the wrong type.
All actions for a specific entity now take a full entity identifier, and the entity identifier type can be narrowed.
`selectEntity` and `selectEntityOrThrow` now need a full entity identifier, and narrow their return values to a specific entity type _if_ the entity identifier is narrowed.
The types for canvas entities are updated with optional type parameters for this purpose.
All reducers, actions and components have been updated.
While we lose the benefit of the caches persisting across reloads, this is a much simpler way to handle things. If we need a persistent cache, we can explore it in the future.
- use `stable-hash` to generate stable, non-crypto hashes for cache entries, instead of using deep object comparisons
- use an object to store image name caches
Sequence of events causing the race condition:
- Enqueue batch
- Invalidate `SessionQueueStatus` tag
- Request updated queue status via HTTP - batch still processing at this point
- Batch completes
- Event emitted saying so
- Optimistically update the queue status cache, it is correct
- HTTP request makes it back and overwrites the optimistic update, indicating the batch is still in progress
FIxed by not invalidating the cache.
Download events and invocation status events (including progress images) are very frequent. There's no real need for these to pass through redux. Handling them outside redux is a significant performance win - far fewer store subscription calls, far fewer trips through middleware.
All event handling is moved outside middleware. Cleanup of unused actions and listeners to follow.
- create a context for entity identifiers, massively simplifying UI for each entity int he list
- consolidate common redux actions
- remove now-unused code
The origin is an optional field indicating the queue item's origin. For example, "canvas" when the queue item originated from the canvas or "workflows" when the queue item originated from the workflows tab. If omitted, we assume the queue item originated from the API directly.
- Add migration to add the nullable column to the `session_queue` table.
- Update relevant event payloads with the new field.
- Add `cancel_by_origin` method to `session_queue` service and corresponding route. This is required for the canvas to bail out early when staging images.
- Add `origin` to both `SessionQueueItem` and `Batch` - it needs to be provided initially via the batch and then passed onto the queue item.
-
Instead of chaining konva `find` and `findOne` methods, all konva nodes are added to a mapping object. Finding and manipulating them is much simpler.
Done for regions and layers, wip for control adapters.
Subscribe to redux store directly, skipping all the react overhead.
With react in dev mode, a typical frame while using the brush tool on almost-empty canvas is reduced from ~7.5ms to ~3.5ms. All things considered, this still feels slow, but it's a massive improvement.
- Create separate object types for brush and eraser lines, instead of a single type that has a `tool` field.
- Create new object type for rect shapes.
- Add logic to schemas to migrate old object types to new.
- Update renderers & reducers.
Currently translated at 98.2% (1350 of 1374 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.2% (1350 of 1374 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.2% (1350 of 1374 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1349 of 1370 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1348 of 1369 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
* [MM] add API routes for getting & setting MM cache sizes, and retrieving MM stats
* Update invokeai/app/api/routers/model_manager.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* code cleanup after @ryand review
* Update invokeai/app/api/routers/model_manager.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* fix merge conflicts; tested and working
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
## Summary
This PR adds support for Image-to-Image and inpainting workflows with
the FLUX model.
Full changelog:
- Split out `FLUX VAE Encode` and `FLUX VAE Decode` nodes
- Renamed `FLUX Text-to-Image` node to `FLUX Denoise` (since it now
supports image-to-image too). This is a workflow-breaking change.
- Added support for FLUX image-to-image via the `Latents` param on the
FLUX denoising node.
- Added support for FLUX masked inpainting via the `Denoise Mask` param
on the FLUX denoising node.
- Added "Denoise Start" and "Denoise End" params to the "FLUX Denoise"
node.
- Updated the "FLUX Text to Image" default workflow.
- Added a "FLUX Image to Image" default workflow.
### Example
FLUX inpainting workflow
<img width="1282" alt="image"
src="https://github.com/user-attachments/assets/86fc1170-e620-4412-8fd8-e119f875fc2e">
Input image

Mask

Output image

### Callouts for reviewers:
- I renamed FLUXTextToImageInvocation -> FLUXDenoisingInvocation. This
is, of course, a breaking change. It feels like the right move and now
is the right time to do it. Any objection?
- I added new `FLUX VAE Encode` and `FLUX VAE Decode` nodes.
Alternatively, I could have tried to match these names to the
corresponding SD nodes (e.g. `FLUX Image to Latents`, `FLUX Latents to
Image`). Personally, I prefer the current names, but want to hear other
opinions.
### Usage notes:
- With the default dev timestep scheduler, the image structure is
largely determined in the first ~3 steps. A consequence of this is that
the denoise_start parameter provides limited 'granularity' of control.
This will likely be improved in the future as we add more scheduler
options. In the meantime, you will likely want to use small values for
`denoise_start` (e.g. 0.03) to start denoising on step ~1-4 out of ~30.
- Currently, there is no 'noise' parameter on the `FLUX Denoise` node,
so the `denoise_end` parameter has limited utility. This will be added
in the future.
## QA Instructions
Test the following workflows:
- [x] Vanilla FLUX text-to-image behaviour is unchanged
- [x] Image-to-image with FLUX dev, no mask
- [x] Image-to-image with FLUX dev, with mask
- [x] Image-to-image with FLUX schnell, no mask (smoke test, not
expected to work well)
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
Allocates the specified amount of VRAM, or allocates enough VRAM such that you have the specified amount of VRAM free.
Useful to simulate an environment with a specific amount of VRAM.
## Summary
This PR contains several improvements to memory management for FLUX
workflows.
It is now possible to achieve better FLUX model caching performance, but
this still requires users to manually configure their `ram`/`vram`
settings. E.g. a `vram` setting of 16.0 should allow for all quantized
FLUX models to be kept in memory on the GPU.
Changes:
- Check the size of a model on disk and free the requisite space in the
model cache before loading it. (This behaviour existed previously, but
was removed in https://github.com/invoke-ai/InvokeAI/pull/6072/files.
The removal did not seem to be intentional).
- Removed the hack to free 24GB of space in the cache before loading the
FLUX model.
- Split the T5 embedding and CLIP embedding steps into separate
functions so that the two models don't both have to be held in RAM at
the same time.
- Fix a bug in `InvokeLinear8bitLt` that was causing some tensors to be
left on the GPU when the model was offloaded to the CPU. (This class is
getting very messy due to the non-standard state_dict handling in
`bnb.nn.Linear8bitLt`. )
- Tidy up some dtype handling in FluxTextToImageInvocation to avoid
situations where we hold references to two copies of the same tensor
unnecessarily.
- (minor) Misc cleanup of ModelCache: improve docs and remove unused
vars.
Future:
We should revisit our default ram/vram configs. The current defaults are
very conservative, and users could see major performance improvements
from tuning these values.
## QA Instructions
I tested the FLUX workflow with the following configurations and
verified that the cache hit rates and memory usage matched the expected
behaviour:
- `ram = 16` and `vram = 16`
- `ram = 16` and `vram = 1`
- `ram = 1` and `vram = 1`
Note that the changes in this PR are not isolated to FLUX. Since we now
check the size of models on disk, we may see slight changes in model
cache offload patterns for other models as well.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
- Add selectedStylePreset to app parameters
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
These two scripts are broken and can cause data loss. Remove them.
They are not in the launcher script, but _are_ available to users in the terminal/file browser.
Hopefully, when we removing them here, `pip` will delete them on next installation of the package...
The root cause was the active style preset not being reset when it was deleted, or no longer present in the list of style presets.
- Add extra reducer to `stylePresetSlice` to reset the active preset if it is deleted or otherwise unavailable
- Update the dynamic prompts listener to trigger on delete/update/list of style presets
When invoke.sh is executed using a symlink with a working directory outside of InvokeAI's root directory, it will fail.
invoke.sh attempts to cd into the correct directory at the start of the script, but will cd into the directory of the symlink instead. This commit fixes that.
## Summary
Adds option to download all prompt templates to a CSV
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
added a base prop for selectedWorkflow to allow loading a workflow on
launch
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
can test by loading InvokeAIUI with a selectedWorkflow prop of the
workflow ID
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- Enforce name is present and not an empty string
- Provide empty string as default for positive and negative prompt
- Add `positive_prompt` as validation alias for `prompt` field
- Strip whitespace automatically
- Create `TypeAdapter` to validate the whole list in one go
Currently translated at 98.5% (1336 of 1355 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1302 of 1321 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1302 of 1320 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
## Summary
Adds prompt templates to the UI. Demo video is attached.
* added default prompt templates to seed database on startup (these
cannot be edited or deleted by users via the UI)
* can create fresh prompt template, create from an image in gallery that
has prompt metadata, or copy an existing prompt template and modify
* if a template is active, can view what your prompt will be invoked as
by switching to "view mode"
https://github.com/user-attachments/assets/32d84e0c-b04c-48da-bae5-aa6eb685d209
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
Around the time we (I) implemented pydantic events, I noticed a short pause between progress images every 4 or 5 steps when generating with SDXL. It didn't happen with SD1.5, but I did notice that with SD1.5, we'd get 4 or 5 progress events simultaneously. I'd expect one event every ~25ms, matching my it/s with SD1.5. Mysterious!
Digging in, I found an issue is related to our use of a synchronous queue for events. When the event queue is empty, we must call `asyncio.sleep` before checking again. We were sleeping for 100ms.
Said another way, every time we clear the event queue, we have to wait 100ms before another event can be dispatched, even if it is put on the queue immediately after we start waiting. In practice, this means our events get buffered into batches, dispatched once every 100ms.
This explains why I was getting batches of 4 or 5 SD1.5 progress events at once, but not the intermittent SDXL delay.
But this 100ms wait has another effect when the events are put on the queue in intervals that don't perfectly line up with the 100ms wait. This is most noticeable when the time between events is >100ms, and can add up to 100ms delay before the event is dispatched.
For example, say the queue is empty and we start a 100ms wait. Then, immediately after - like 0.01ms later - we push an event on to the queue. We still need to wait another 99.9ms before that event will be dispatched. That's the SDXL delay.
The easy fix is to reduce the sleep to something like 0.01 seconds, but this feels kinda dirty. Can't we just wait on the queue and dispatch every event immediately? Not with the normal synchronous queue - but we can with `asyncio.Queue`.
I switched the events queue to use `asyncio.Queue` (as seen in this commit), which lets us asynchronous wait on the queue in a loop.
Unfortunately, I ran into another issue - events now felt like their timing was inconsistent, but in a different way than with the 100ms sleep. The time between pushing events on the queue and dispatching them was not consistently ~0ms as I'd expect - it was highly variable from ~0ms up to ~100ms.
This is resolved by passing the asyncio loop directly into the events service and using its methods to create the task and interact with the queue. I don't fully understand why this resolved the issue, because either way we are interacting with the same event loop (as shown by `asyncio.get_running_loop()`). I suppose there's some scheduling magic happening.
There's a FastAPI bug that results in the OpenAPI spec outputting the same operation id for each operation when specifying multiple HTTP methods.
- Discussion: https://github.com/tiangolo/fastapi/discussions/8449
- Pending PR to fix: https://github.com/tiangolo/fastapi/pull/10694
In our case, we have a `get_image_full` endpoint that handles GET and HEAD.
This results in an invalid OpenAPI schema. A workaround is to use two route decorators for the operation handler. This works as expected - HEAD requests get the header, and GET requests get the resource. And the OpenAPI schema is valid.
- Updated the previous DepthAnything manual implementation to use the
`transformers` implementation instead. So we can get upstream features.
- Plugged in the DepthAnything models to be handled by Invoke's Model
Manager.
- `small_v2` model will use DepthAnythingV2. This has been added as a
new model option and is now also the default in the Linear UI.

# Merge
Review and merge.
Currently translated at 98.6% (1303 of 1321 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1302 of 1320 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1294 of 1312 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
There was a problem w/ this release on windows and the builds were pulled from pypi. When installing invoke on windows, pip attempts to build from source, but most (all?) systems won't have the prerequisites for this and installs fail.
This also affects GH actions.
The simple fix is to exclude version 3.9.1 from our deps.
For more information, see https://github.com/matplotlib/matplotlib/issues/28551
## Summary
This PR enables Grounded SAM workflows
(https://arxiv.org/pdf/2401.14159) via the following:
- `GroundingDinoInvocation` for running a Grounding DINO model.
- `SegmentAnythingModelInvocation` for running a SAM model.
- `MaskTensorToImageInvocation` for convenient visualization.
Other notes:
- Uses the transformers implementation of Grounding DINO and SAM.
- The new models are treated as 'utility models' meaning that they are
not visible in the Models tab, and are downloaded automatically the
first time that they are used.
<img width="874" alt="image"
src="https://github.com/user-attachments/assets/1cbaa97d-0e27-4943-86b1-dc7327ba8675">
## Example
Input image

Prompt: "wheels", all other configs default
Result:

## Related Issues / Discussions
Thanks to @blessedcoolant for the initial draft here:
https://github.com/invoke-ai/InvokeAI/pull/6678
## QA Instructions
Manual tests:
- [ ] Test that default settings work well.
- [ ] Test with / without apply_polygon_refinement
- [ ] Test mask_filter options
- [ ] Test detection_threshold values
- [ ] Test RGB input image
- [ ] Test RGBA input image
- [ ] Test grayscale input image
- [ ] Smoke test that an empty mask is returned when 0 objects are
detected
- [ ] Test on CPU
- [ ] Test on MPS (Works on Mac OS, but had to force both models to run
on CPU instead of MPS)
Performance:
- Peak GPU memory utilization with both Grounding DINO and SAM models
loaded is ~4.5GB. (The models do not need to be loaded at the same time,
so could be offloaded by the MM if needed.)
- On an RTX4090, with the models already cached, node execution takes
~0.6 secs.
- On my CPU, with the models cached, node execution takes ~10secs.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
- we want a way to load the studio while being directed to a specific
tab, introduced a destination prop to achieve that
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Code for lora patching from #6577.
Additionally made it the way, that lora can patch not only `weight`, but
also `bias`, because saw some loras which doing it.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
## Merge Plan
Replace old lora patcher with new after review done.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Gradient mask node outputs mask tensor with values in range [-1, 1],
which unexpected range for mask.
It handled in denoise node the way it translates to [0, 2] mask, which
looks even more wrongly)
From discussion with @dunkeroni I understand him as he thought that
negative values will be treated same as 0, so clamping values not change
intended node logic.
## Related Issues / Discussions
#6643
## QA Instructions
\-
## Merge Plan
\-
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Add karras variants of `deis`, `unipc`, `kdpm2` and `kdpm_2_a`
schedulers.
Also added `dpmpp_3` schedulers, but `dpmpp_3s` currently bugged, so
added only 3m:
https://github.com/huggingface/diffusers/issues/9007
## Related Issues / Discussions
\-
## QA Instructions
\-
## Merge Plan
~@psychedelicious We need to decide what to do with schedulers order, as
it looks a bit broken:~

## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Code for inpainting and inpaint models handling from
https://github.com/invoke-ai/InvokeAI/pull/6577.
Separated in 2 extensions as discussed briefly before, so wait for
discussion about such implementation.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
Try and compare outputs between backends in cases:
- Normal generation on inpaint model
- Inpainting on inpaint model
- Inpainting on normal model
## Merge Plan
Nope.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
We were checking the selected and auto-add board ids against the query cache to see if they still exist. If not, we reset.
This only works if the query cache is updated by the time we do the check - race condition!
We already have the board id from the query args, so there's no need to check the query cache - just compare the deleted board ID directly.
Previously this file's several listeners were all in a single one and I had adapted/split its logic up a bit wonkily, introducing these problems.
The logic was incorrect in two ways:
1. We only ran the logic if we _enable_ showing archived boards. It should be run we we _disable_ showing archived boards.
2. If we couldn't find the selected board in the query cache, we didn't do the reset. This is wrong - if the board isn't in the query cache, we _should_ do the reset. This inverted logic makes more sense before the fix for issue 1.
## Summary
T2I Adapter code from #6577.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
## Merge Plan
Nope.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Seamless code from #6577.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
## Merge Plan
Nope.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
The model edit UI's composition allows for the model edit form to be instantiated before the model's config has been received. This results in the form having no values - all the fields are blank instead of populated by the model config.
Part of the fix is to pass the model config around directly instead of relying on _all_ components to fetch the model directly.
I also fixed a crapload of performance issues related to improper use of redux selectors.
Problems this was causing:
- Deleting an edge was a copy of another edge deletes both edges
- Deleting a node that was a copy-with-edges of another node deletes its edges and it's original edges, leaving what I will call "ghost noodles" behind
Previously you could spam the next/prev buttons and really thrash the server. Throttled to 500ms, which feels like a happy medium between responsive and not-thrash-y.
- Autofocus on popover open
- Autoselect number on popover open
- Enter works to change page when input is focused
- Esc works to close popover when input is focused
It was possible to clear the search term while a debounced setSearchTerm is still pending. This resulted in the gallery getting out of sync w/ the search term.
To fix this, we need to lift the state up a bit and cancel any pending debounced setSearchTerm calls when closing the search or clearing the search term box.
`spandrel_image_to_image` now just runs the model with no changes.
`spandrel_image_to_image_autoscale` runs the model repeatedly until the desired scale is reached. previously, `spandrel_image_to_image` did this.
* [MM2] replace untyped config dict passed to install_model with typed ModelRecordChanges
- adjusted frontend to work with new schema
- used this facility to assign "starter model" names and descriptions to the installed
models.
* documentation fix
* [MM2] replace untyped config dict passed to install_model with typed ModelRecordChanges
- adjusted frontend to work with new schema
- used this facility to assign "starter model" names and descriptions to the installed
models.
* documentation fix
* remove v9 pnpm lockfile
* [MM2] replace untyped config dict passed to install_model with typed ModelRecordChanges
- adjusted frontend to work with new schema
- used this facility to assign "starter model" names and descriptions to the installed
models.
* [MM2] replace untyped config dict passed to install_model with typed ModelRecordChanges
- adjusted frontend to work with new schema
- used this facility to assign "starter model" names and descriptions to the installed
models.
* remove v9 pnpm lockfile
* regenerate schema.ts
* prettified
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
## Summary
Update Simple Upscale Button to work with spandrel models, add
UpscaleWarning when models aren't available, clean up ESRGAN logic
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
ControlNet code from #6577.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
## Merge Plan
Merge #6641 firstly, to be able see output difference properly.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Rescale CFG code from #6577.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
~~Note: for some reasons there slightly different output from run to
run, but I able sometimes to get same output on main and this branch.~~
Fix presented in #6641.
## Merge Plan
~~Nope.~~ Merge #6641 firstly, to be able see output difference
properly.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
- currently the total for uncategorized images is not updating when
moving and deleting images, this will update that count when making
those actions
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Fix function call that we forgot to update in #6606
## QA Instructions
Run a TiledMultiDiffusionDenoiseLatents invocation and make sure it
doesn't crash.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
Base code of new modular backend from #6577.
Contains normal generation and regional prompts support.
Also preview extension included to test if extensions logic works.
## Related Issues / Discussions
https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
Currently only normal and regional conditionings supported, so just
generate some images and compare with main output.
## Merge Plan
Discuss a bit more about injection point names?
As if for example in future unet will be overridable, current
`pre_unet`/`post_unet` assumes to name override as `unet` what feels a
bit odd.
Also `apply_cfg` - future implementation could ignore/not use cfg, so in
this case `combine_noise_predictions`/`combine_noise` seems more
suitable.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
This PR adds some spandrel upscale models to the starter model list.
In the future we may also want to add:
- Some DAT models
(https://drive.google.com/drive/folders/1iBdf_-LVZuz_PAbFtuxSKd_11RL1YKxM)
## QA Instructions
I installed the starter models via the model manager UI, and tested that
I could use them in a workflow.
## Merge Plan
- [ ] Merge the preceding Spandrel PRs first, then change the target
branch to `main`.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
Add tiling to the `SpandrelImageToImageInvocation` node so that it can
process large images.
Tiling enables this node to run on effectively any input image
dimension. Of course, the computation time increases quadratically with
the image dimension.
Some profiling results on an RTX4090:
- Input 1024x1024, 4x upscale, 4x UltraSharp ESRGAN: `13 secs`, `<4 GB
VRAM`
- Input 4096x4096, 4x upscale, 4x UltraSharop ESRGAN: `46 secs`, `<4 GB
VRAM`
- Input 4096x4096, 2x upscale, SwinIR: `165 secs`, `<5 GB VRAM`
A lot of the time is spent PNG encoding the final image:
- PNG encoding of a 16384x16384 image takes `83secs @
pil_compress_level=7`, `24secs @ pil_compress_level=1`
Callout: If we want to start building workflows that pass large images
between nodes, we are going to have to find a way to avoid the PNG
encode/decode roundtrip that we are currently doing. As is, we will be
incurring a huge penalty for every node that receives/produces a large
image.
## QA Instructions
- [x] Tested with tiling up to 4096x4096 -> 16384x16384.
- [x] Test on images with an alpha channel (the alpha channel is
dropped).
- [x] Test on images with odd dimension.
- [x] Test no tiling (`tile_size=0`)
## Merge Plan
- [x] Merge #6556 first, and change the target branch to `main`.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
- Add support for all
[spandrel](https://github.com/chaiNNer-org/spandrel) image-to-image
models - this is a collection of many popular super-resolution models
(e.g. ESRGAN, Real-ESRGAN, SwinIR, DAT, etc.)
Examples of supported models:
- DAT:
https://drive.google.com/drive/folders/1iBdf_-LVZuz_PAbFtuxSKd_11RL1YKxM
- SwinIR: https://github.com/JingyunLiang/SwinIR/releases
- Any ESRGAN / Real-ESRGAN model
## Related Issues
Closes#6394
## QA Instructions
- [x] Test that unsupported models still fail the probe (i.e. no false
positive spandrel models)
- [x] Test adding a few non-spandrel model types
- [x] Test adding a handful of spandrel model types: ESRGAN,
Real-ESRGAN, SwinIR, DAT
- [x] Verify model size estimation for the model cache
- [x] Test using the spandrel models in a practical image upscaling
workflow
## Merge Plan
- [x] Get approval from @brandonrising and @maryhipp before merging -
this PR has commercial implications.
- [x] Merge #6571 and change the target branch to `main`
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
In #6490 we enabled non-blocking torch device transfers throughout the model manager's memory management code. When using this torch feature, torch attempts to wait until the tensor transfer has completed before allowing any access to the tensor. Theoretically, that should make this a safe feature to use.
This provides a small performance improvement but causes race conditions in some situations. Specific platforms/systems are affected, and complicated data dependencies can make this unsafe.
- Intermittent black images on MPS devices - reported on discord and #6545, fixed with special handling in #6549.
- Intermittent OOMs and black images on a P4000 GPU on Windows - reported in #6613, fixed in this commit.
On my system, I haven't experience any issues with generation, but targeted testing of non-blocking ops did expose a race condition when moving tensors from CUDA to CPU.
One workaround is to use torch streams with manual sync points. Our application logic is complicated enough that this would be a lot of work and feels ripe for edge cases and missed spots.
Much safer is to fully revert non-locking - which is what this change does.
This issue is caused by a race condition. When a large image is served to the client, it is done using a streaming `FileResponse`. This concurrently serves the image straight from disk. The file is kept open by FastAPI until the image is fully served.
When a user deletes an image before the file is done serving, the delete fails because the file is still held by FastAPI.
To reproduce the issue:
- Create a very large image (8k reliably creates the issue).
- Create a smaller image, so that the first image in the gallery is not the large image.
- Refresh the app. The small image should be selected.
- Select the large image and immediately delete it. You have to be fast, to delete it before it finishes loading.
- In the terminal, we expect to see an error saying `Failed to delete image file`, and the image does not disappear from the UI.
- After a short wait, once the image has fully loaded, try deleting it again. We expect this to work.
The workaround is to instead serve the image from memory.
Loading the image to memory is very fast, so there is only a tiny window in which we could create the race condition, but it technically could still occur, because FastAPI is asynchronous and handles requests concurrently.
Once we load the image into memory, deletions of that image will work. Then we return a normal `Response` object with the image bytes. This is essentially what `FileResponse` does - except it uses `anyio.open_file`, which is async.
The tradeoff is that the server thread is blocked while opening the file. I think this is a fair tradeoff.
A future enhancement could be to implement soft deletion of images (db is already set up for this), and then clean up deleted image files on startup/shutdown. We could move back to using the async `FileResponse` for best responsiveness in the server without any risk of race conditions.
For some reason, I started getting this indefinite hang when the app checks if port 9090 is available. After some fiddling around, I found that adding a timeout resolves the issue.
I confirmed that the util still works by starting the app on 9090, then starting a second instance. The second instance correctly saw 9090 in use and moved to 9091.
## Summary
This PR changes the handling of invalid model configs in the DB to log a
warning rather than crashing the app.
This change is being made in preparation for some upcoming new model
additions. Previously, if a user rolled back from an app version that
added a new model type, the app would not launch until the DB was fixed.
This PR changes this behaviour to allow rollbacks of this type (with
warnings).
**Keep in mind that this change is only helpful to users _rolling back
to a version that has this fix_. I.e. it offers no help in the first
version that includes it.**
## QA Instructions
1. Run the Spandrel model branch, which adds a new model type
https://github.com/invoke-ai/InvokeAI/pull/6556.
2. Add a spandrel model via the model manager.
3. Rollback to main. The app will crash on launch due to the invalid
spandrel model config.
4. Checkout this branch. The app should now run with warnings about the
invalid model config.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
Currently translated at 100.0% (1282 of 1282 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1280 of 1280 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1275 of 1275 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1273 of 1273 strings)
Co-authored-by: Васянатор <ilabulanov339@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
Currently translated at 98.2% (1260 of 1282 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1260 of 1280 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1255 of 1275 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1253 of 1273 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1245 of 1265 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
- Refine layout
- Update colors - more minimal, fewer shaded boxes
- Add indicator for search icons showing a search term is entered
- Handle new `projectName` and `projectUrl` ui props
## Summary
Update Boards UI in the gallery and adds support for creating and
displaying private boards
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
Can view private boards by setting config.allowPrivateBoards to true
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Demote error log to warning for models treated as having size 0.
## Related Issues / Discussions
Closes#6587
I looked into handling ESRGAN model sizes properly. They load a
state_dict with a bit of an unusual nested-dict structure. Rather than
figure out how to accurately calculate their size, we can just wait for
https://github.com/invoke-ai/InvokeAI/pull/6556. ESRGAN model size
handling should work properly when loaded through that pathway.
## QA Instructions
Loaded an ESRGAN model, and confirmed that the warning log is at the
warning level.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
This commit corrects a broken link on line 16 that was pointing to the latest release but causing a 404 error (page not found) when clicked. The issue was identified as a trailing dot at the end of the URL, which has now been removed. This ensures users can access the intended latest release page.
## Summary
This PR tweaks the wording of the PR template QA instructions with the
goals of:
1. Make it more clear that PR authors are responsible for testing their
PRs.
2. Encouraging sufficient detail in the test descriptions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
Delete an unused duplicate libc_util.py file. The active version is at
`invokeai/backend/model_manager/libc_util.py`
## QA Instructions
I ran a smoke test to confirm that memory snapshotting still works.
## Merge Plan
- [x] Change target branch to `main` before merging.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
This PR migrates all relative imports to absolute imports, and adds a
ruff check to enforce this going forward.
The justification for this change is here:
https://github.com/invoke-ai/InvokeAI/issues/6575
## QA Instructions
Smoke test all common workflows. Most of the relative -> absolute
conversions could be completed automatically, so the risk is relatively
low.
## Merge Plan
As with any far-reaching change like this, it is likely to cause some
merge conflicts with some in-flight branches. Unfortunately, there's no
way around this, but let me know if you can think of in-flight work that
will be significantly disrupted by this.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_ N/A
- [x] _Documentation added / updated (if applicable)_ N/A
## Summary
This PR fixes a regression that caused the following models to be
treated as having size 0 in the model cache: `(TextualInversionModelRaw,
IPAdapter, LoRAModelRaw)`.
Changes:
- Call the correct model size calculation for all supported model types.
- Log an error message if an unexpected model type is loaded, to prevent
similar regressions in the future.
## QA Instructions
I tested the following features and verified that no models fell back to
using a size of 0 unexpectedly:
- Test-to-image
- Textual Inversion
- LoRA
- IP-Adapter
- ControlNet
(All tested with both SD1.5 and SDXL.)
I compared the model cache switching behavior before and after this
change with a large number of LoRAs (10). Since LoRAs are small compared
to the main models, the changes in behaviour are minimal. Nonetheless,
it makes sense to get this in for correctness. And it might make a
difference for some usage patterns with limited RAM.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- For single image deletion, select the image in the same slot as the deleted image
- For multiple image deletion, empty selection
- On list images, if no images are currently selected, select the first image
## Summary
- This PR exposes a `tile_size` field on `ImageToLatentsInvocation` and
`LatentsToImageInvocation`.
- Setting `tile_size = 0` preserves the default behaviour.
- This feature is primarily intended to support upscaling workflows that
require VAE encoding/decoding high resolution images. In the future, we
may want to expose the tile size as a global application config, but
that's a separate conversation.
- As a general rule, larger tile sizes produce better results at the
cost of higher memory usage.
### Example:
Original (5472x5472)

VAE roundtrip with 512x512 tiles (note the discoloration)

VAE roundtrip with 1024x1024 tiles (some discoloration still present,
but less severe than at 512x512)

## Related Issues / Discussions
Related: #6144
## QA Instructions
- [x] Test image generation via the Linear tab
- [x] Test VAE roundtrip with tiling disabled
- [x] Test VAE roundtrip with tiling and tile_size = 0
- [x] Test VAE roundtrip with tiling and tile_size > 0
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
The selection logic is a bit complicated. We have image selection and pagination, both of which can be triggered using the mouse or hotkeys. We have viewer image selection and comparison image selection, which is determined by the alt key.
This change ties the room together with these behaviours:
- Changing the page using pagination buttons never changes the selection.
- Changing the selected image using arrows may change the page, if the arrow key pressed would select an image off the current page.
- `right` on the last image of the current page goes to the next page
- `down` on the last row of images goes to the next page
- `left` on the first image of the current page goes to the previous page
- `up` on the first row of images goes to the previous page
- If `alt` is held when using arrow keys, we change the page, but we only change the comparison image selection.
- When using arrow keys, if the page has changed since the last image was selected, the selection is reset to the first image on the page.
- The next/previous buttons on the image viewer do the same thing as `left` and `right` without `alt`.
- When clicking an image in the gallery:
- If no modifier keys are held, the image is exclusively selected.
- If `ctrl` or `meta` are held, the image's selection status is toggled.
- If `shift` is held, all images from the last-selected image to the image are selected. If there are no images on the current page, the selection is unchanged.
- If `alt` is held, the image is set as the compare image.
- `ctrl+a` and `meta+a` add the current page to the selection.
The logic for gallery navigation and selection is now pretty hairy. It's spread across 3 hooks, a listener, redux slice, components.
When we next make changes to this part of the app, we should consider consolidating some of the related logic. Probably most of it can go into a single listener and make it much simpler to grok.
Don't like this UI (even though I suggested it). No need to prevent the user from interacting with the search term field during fetching. Let's figure out a nicer way to present this in a followup.
## Summary
Python 3.11 has a wonderfully devious breaking change where _sometimes_
using enum classes that inherit from `str` or `int` do not work the same
way as they do in 3.10 when used within string formatting/interpolation.
This breaks the new gallery sort queries. The fix is to use
`order_dir.value` instead of `order_dir` in the query.
This was not an issue during development because the feature was
developed w/ python 3.10.
## Related Issues / Discussions
Thanks to @JPPhoto for reporting and troubleshooting:
https://discord.com/channels/1020123559063990373/1149513625321603162/1256211815982039173
## QA Instructions
JP's fancy python 3.11 system should work on this PR.
## Merge Plan
n/a
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
If the currently selected or auto-add board is archived or deleted, we should reset them. There are some edge cases taht weren't handled in the previous implementation.
All handling of this logic is moved to the (renamed) listener.
Before this change, if you attempt to create an image that with a nonexistent board, we'd get an unhandled error when adding the image to a board. The record would be created, but file not, due to the structure of the code.
With this change, we now log a warning if we have a problem adding the image to the board, but the record and file are still created.
A future improvement would be to create a transaction for this part of the code, preventing some other situation that could result in only the record or only the file beings saved.
* use model_class.load_singlefile() instead of converting; works, but performance is poor
* adjust the convert api - not right just yet
* working, needs sql migrator update
* rename migration_11 before conflict merge with main
* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* implement lightweight version-by-version config migration
* simplified config schema migration code
* associate sdxl config with sdxl VAEs
* remove use of original_config_file in load_single_file()
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
## Summary
We can get black outputs when moving tensors from CPU to MPS. It appears
MPS to CPU is fine. See:
- https://github.com/pytorch/pytorch/issues/107455
-
https://discuss.pytorch.org/t/should-we-set-non-blocking-to-true/38234/28
Changes:
- Add properties for each device on `TorchDevice` as a convenience.
- Add `get_non_blocking` static method on `TorchDevice`. This utility
takes a torch device and returns the flag to be used for non_blocking
when moving a tensor to the device provided.
- Update model patching and caching APIs to use this new utility.
## Related Issues / Discussions
Fixes: #6545
## QA Instructions
For both MPS and CUDA:
- Generate at least 5 images using LoRAs
- Generate at least 5 images using IP Adapters
## Merge Plan
We have pagination merged into `main` but aren't ready for that to be
released.
Once this fix is tested and merged, we will probably want to create a
`v4.2.5post1` branch off the `v4.2.5` tag, cherry-pick the fix and do a
release from the hotfix branch.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_ @RyanJDick @lstein This
feels testable but I'm not sure how.
- [ ] _Documentation added / updated (if applicable)_
We can get black outputs when moving tensors from CPU to MPS. It appears MPS to CPU is fine. See:
- https://github.com/pytorch/pytorch/issues/107455
- https://discuss.pytorch.org/t/should-we-set-non-blocking-to-true/38234/28
Changes:
- Add properties for each device on `TorchDevice` as a convenience.
- Add `get_non_blocking` static method on `TorchDevice`. This utility takes a torch device and returns the flag to be used for non_blocking when moving a tensor to the device provided.
- Update model patching and caching APIs to use this new utility.
Fixes: #6545
We only need to show the totals in the tooltip. Tooltips accpet a component for the tooltip label. The component isn't rendered until the tooltip is triggered.
Move the board total fetching into a tooltip component for the boards. Now we only fire these requests when the user mouses over the board
- Simplify the gallery layout
- Set an initial gallery limit to load _some_ images immediately.
- Refactor the resize observer to use the actual rendered image component to calculate the number of images per row/col. This prevents inaccuracies caused by image padding that could result in the wrong number of images.
- Debounce the limit update to not thrash teh API
- Use absolute positioning trick to ensure the gallery container is always exactly the right size
- Minimum of `imagesPerRow` images loaded at all times
This is one of those unexpected CSS quirks. Flex containers need min-width or min-height for their children to not overflow. Add `minH={0}` to gallery container.
## Summary
https://github.com/invoke-ai/InvokeAI/pull/6522 introduced a change in
behavior in cases where start/end were set such that there are 0
timesteps. This PR reverts that change.
cc @StAlKeR7779
## QA Instructions
Run with euler, 5 steps, start: 0.0, end: 0.05. I ran this test before
#6522, after #6522, and on this branch. This branch restores the
behavior to pre-#6522 i.e. noise is injected even if no denoising steps
are applied.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
The version of Invoke you have installed. If it is not the latest version, please update and try again to confirm the issue still exists. If you are testing main, please include the commit hash instead.
placeholder:ex. 3.6.1
The version of Invoke you have installed. If it is not the [latest version](https://github.com/invoke-ai/InvokeAI/releases/latest), please update and try again to confirm the issue still exists. If you are testing main, please include the commit hash instead.
placeholder:ex. v6.0.2
validations:
required:true
@@ -85,17 +99,17 @@ body:
id:browser-version
attributes:
label:Browser
description:Your web browser and version.
description:Your web browser and version, if you do not use the Launcher's provided GUI.
placeholder:ex. Firefox 123.0b3
validations:
required:true
required:false
- type:textarea
id:python-deps
attributes:
label:Python dependencies
label:System Information
description:|
If the problem occurred during image generation, click the gear icon at the bottom left corner, click "About", click the copy button and then paste here.
Click the gear icon at the bottom left corner, then click "About". Click the copy button and then paste here.
# Invoke - Professional Creative AI Tools for Visual Media
#### To learn more about Invoke, or implement our Business solutions, visit [invoke.com]
[![discord badge]][discord link] [![latest release badge]][latest release link] [![github stars badge]][github stars link] [![github forks badge]][github forks link] [![CI checks on main badge]][CI checks on main link] [![latest commit to main badge]][latest commit to main link] [![github open issues badge]][github open issues link] [![github open prs badge]][github open prs link] [![translation status badge]][translation status link]
</div>
Invoke is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. Invoke offers an industry leading web-based UI, and serves as the foundation for multiple commercial products.
[Installation and Updates][installation docs] - [Documentation and Tutorials][docs home] - [Bug Reports][github issues] - [Contributing][contributing docs]
<div align="center">
- Free to use under a commercially-friendly license
- Download and install on compatible hardware
- Generate, refine, iterate on images, and build workflows

</div>
---
> ## 📣 Are you a new or returning InvokeAI user?
> Take our first annual [User's Survey](https://forms.gle/rCE5KuQ7Wfrd1UnS7)
## Quick Start
---
1. Download and unzip the installer from the bottom of the [latest release][latest release link].
2. Run the installer script.
# Documentation
- **Windows**: Double-click on the `install.bat` script.
- **macOS**: Open a Terminal window, drag the file `install.sh` from Finder into the Terminal, and press enter.
| [Installation and Updates][installation docs] - [Documentation and Tutorials][docs home] - [Bug Reports][github issues] - [Contributing][contributing docs] |
3. When prompted, enter a location for the install and select your GPU type.
4. Once the install finishes, find the directory you selected during install. The default location is `C:\Users\Username\invokeai` for Windows or `~/invokeai` for Linux/macOS.
5. Run the launcher script (`invoke.bat` for Windows, `invoke.sh` for macOS and Linux) the same way you ran the installer script in step 2.
6. Select option 1 to start the application. Once it starts up, open your browser and go to <http://localhost:9090>.
7. Open the model manager tab to install a starter model and then you'll be ready to generate.
# Installation
More detail, including hardware requirements and manual install instructions, are available in the [installation documentation][installation docs].
To get started with Invoke, [Download the Launcher](https://github.com/invoke-ai/launcher/releases/latest).
## Troubleshooting, FAQ and Support
@@ -57,21 +52,45 @@ The Unified Canvas is a fully integrated canvas implementation with support for
### Workflows & Nodes
Invoke offers a fully featured workflow management solution, enabling users to combine the power of node-based workflows with the easy of a UI. This allows for customizable generation pipelines to be developed and shared by users looking to create specific workflows to support their production use-cases.
Invoke offers a fully featured workflow management solution, enabling users to combine the power of node-based workflows with the ease of a UI. This allows for customizable generation pipelines to be developed and shared by users looking to create specific workflows to support their production use-cases.
### Board & Gallery Management
Invoke features an organized gallery system for easily storing, accessing, and remixing your content in the Invoke workspace. Images can be dragged/dropped onto any Image-base UI element in the application, and rich metadata within the Image allows for easy recall of key prompts or settings used in your workflow.
### Model Support
- SD 1.5
- SD 2.0
- SDXL
- SD 3.5 Medium
- SD 3.5 Large
- CogView 4
- Flux.1 Dev
- Flux.1 Schnell
- Flux.1 Kontext
- Flux.1 Krea
- Flux Redux
- Flux Fill
- Flux.2 Klein 4B
- Flux.2 Klein 9B
- Z-Image Turbo
- Z-Image Base
- Anima
- Qwen Image
- Qwen Image Edit
- Nano Banana (API Only)
- GPT Image (API Only)
- Wan (API Only)
### Other features
- Support for both ckpt and diffusers models
- SD1.5, SD2.0, and SDXL support
- Support for ckpt, diffusers, and some gguf models
This document describes the implementation of user isolation features in the InvokeAI session queue and processing system to address issues identified in the enhancement request.
## Issues Addressed
### 1. Cross-User Image/Preview Visibility
**Problem:** When two users are logged in simultaneously and one initiates a generation, the generation preview shows up in both users' browsers and the generated image gets saved to both users' image boards.
**Solution:** Implemented socket-level event filtering based on user authentication:
**Problem:** When the job queue tab is open in multiple browsers and a generation is begun in one browser window, the queue does not update in the other window.
**Status:** This issue is likely resolved by the socket authentication and event filtering changes. The existing socket subscription mechanism (`subscribe_queue` event) already supports multiple connections per user. Testing is required to confirm this works correctly with the new authentication flow.
### 4. User Information Display
**Problem:** Queue table lacks user identification, making it difficult to know who launched which job.
**Solution:** Added user information to queue items and UI:
All commands should be run within the `docker` directory: `cd docker`
First things first:
## Quickstart :rocket:
- Ensure that Docker can use your [NVIDIA][nvidia docker docs] or [AMD][amd docker docs] GPU.
- This document assumes a Linux system, but should work similarly under Windows with WSL2.
- We don't recommend running Invoke in Docker on macOS at this time. It works, but very slowly.
On a known working Linux+Docker+CUDA (Nvidia) system, execute `./run.sh` in this directory. It will take a few minutes - depending on your internet speed - to install the core models. Once the application starts up, open `http://localhost:9090` in your browser to Invoke!
## Quickstart
For more configuration options (using an AMD GPU, custom root directory location, etc): read on.
No `docker compose`, no persistence, single command, using the official images:
## Detailed setup
**CUDA (NVIDIA GPU):**
```bash
docker run --runtime=nvidia --gpus=all --publish 9090:9090 ghcr.io/invoke-ai/invokeai
```
**ROCm (AMD GPU):**
```bash
docker run --device /dev/kfd --device /dev/dri --publish 9090:9090 ghcr.io/invoke-ai/invokeai:main-rocm
```
Open `http://localhost:9090` in your browser once the container finishes booting, install some models, and generate away!
### Data persistence
To persist your generated images and downloaded models outside of the container, add a `--volume/-v` flag to the above command, e.g.:
```bash
docker run --volume /some/local/path:/invokeai {...etc...}
```
`/some/local/path/invokeai` will contain all your data.
It can *usually* be reused between different installs of Invoke. Tread with caution and read the release notes!
## Customize the container
The included `run.sh` script is a convenience wrapper around `docker compose`. It can be helpful for passing additional build arguments to `docker compose`. Alternatively, the familiar `docker compose` commands work just as well.
```bash
cd docker
cp .env.sample .env
# edit .env to your liking if you need to; it is well commented.
./run.sh
```
It will take a few minutes to build the image the first time. Once the application starts up, open `http://localhost:9090` in your browser to invoke!
>[!TIP]
>When using the `run.sh` script, the container will continue running after Ctrl+C. To shut it down, use the `docker compose down` command.
## Docker setup in detail
#### Linux
1. Ensure builkit is enabled in the Docker daemon settings (`/etc/docker/daemon.json`)
1. Ensure buildkit is enabled in the Docker daemon settings (`/etc/docker/daemon.json`)
2. Install the `docker compose` plugin using your package manager, or follow a [tutorial](https://docs.docker.com/compose/install/linux/#install-using-the-repository).
- The deprecated `docker-compose` (hyphenated) CLI continues to work for now.
- The deprecated `docker-compose` (hyphenated) CLI probably won't work. Update to a recent version.
3. Ensure docker daemon is able to access the GPU.
-You may need to install [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
> You'll be better off installing Invoke directly on your system, because Docker can not use the GPU on macOS.
If you are still reading:
1. Ensure Docker has at least 16GB RAM
2. Enable VirtioFS for file sharing
3. Enable `docker compose` V2 support
This is done via Docker Desktop preferences
This is done via Docker Desktop preferences.
### Configure Invoke environment
### Configure the Invoke Environment
1. Make a copy of `.env.sample` and name it `.env` (`cp .env.sample .env` (Mac/Linux) or `copy example.env .env` (Windows)). Make changes as necessary. Set `INVOKEAI_ROOT` to an absolute path to:
a. the desired location of the InvokeAI runtime directory, or
b. an existing, v3.0.0 compatible runtime directory.
1. Make a copy of `.env.sample` and name it `.env` (`cp .env.sample .env` (Mac/Linux) or `copy example.env .env` (Windows)). Make changes as necessary. Set `INVOKEAI_ROOT` to an absolute path to the desired location of the InvokeAI runtime directory. It may be an existing directory from a previous installation (post 4.0.0).
1. Execute `run.sh`
The image will be built automatically if needed.
The runtime directory (holding models and outputs) will be created in the location specified by `INVOKEAI_ROOT`. The default location is `~/invokeai`. The runtime directory will be populated with the base configs and models necessary to start generating.
The runtime directory (holding models and outputs) will be created in the location specified by `INVOKEAI_ROOT`. The default location is `~/invokeai`. Navigate to the Model Manager tab and install some models before generating.
### Use a GPU
@@ -43,9 +90,9 @@ The runtime directory (holding models and outputs) will be created in the locati
- WSL2 is *required* for Windows.
- only `x86_64` architecture is supported.
The Docker daemon on the system must be already set up to use the GPU. In case of Linux, this involves installing `nvidia-docker-runtime` and configuring the `nvidia` runtime as default. Steps will be different for AMD. Please see Docker documentation for the most up-to-date instructions for using your GPU with Docker.
The Docker daemon on the system must be already set up to use the GPU. In case of Linux, this involves installing `nvidia-docker-runtime` and configuring the `nvidia` runtime as default. Steps will be different for AMD. Please see Docker/NVIDIA/AMD documentation for the most up-to-date instructions for using your GPU with Docker.
To use an AMD GPU, set `GPU_DRIVER=rocm` in your `.env` file.
To use an AMD GPU, set `GPU_DRIVER=rocm` in your `.env` file before running `./run.sh`.
## Customize
@@ -59,30 +106,12 @@ Values are optional, but setting `INVOKEAI_ROOT` is highly recommended. The defa
INVOKEAI_ROOT=/Volumes/WorkDrive/invokeai
HUGGINGFACE_TOKEN=the_actual_token
CONTAINER_UID=1000
GPU_DRIVER=nvidia
GPU_DRIVER=cuda
```
Any environment variables supported by InvokeAI can be set here - please see the [Configuration docs](https://invoke-ai.github.io/InvokeAI/features/CONFIGURATION/) for further detail.
Any environment variables supported by InvokeAI can be set here. See the [Configuration docs](https://invoke.ai/configuration/invokeai-yaml/) for further detail.
## Even More Customizing!
---
See the `docker-compose.yml` file. The `command` instruction can be uncommented and used to run arbitrary startup commands. Some examples below.
### Reconfigure the runtime directory
Can be used to download additional models from the supported model list
In conjunction with `INVOKEAI_ROOT` can be also used to initialize a runtime directory
The Invoke application is published as a python package on [PyPI]. This includes both a source distribution and built distribution (a wheel).
Most users install it with the [Launcher](https://github.com/invoke-ai/launcher/), others with `pip`.
The launcher uses GitHub as the source of truth for available releases.
## Broad Strokes
- Merge all changes and bump the version in the codebase.
- Tag the release commit.
- Wait for the release workflow to complete.
- Approve the PyPI publish jobs.
- Write GH release notes.
## General Prep
Make a developer call-out for PRs to merge. Merge and test things
out. Create a branch with a name like user/chore/vX.X.X-prep and bump the version by editing
`invokeai/version/invokeai_version.py` and commit locally.
## Release Workflow
The `release.yml` workflow runs a number of jobs to handle code checks, tests, build and publish on PyPI.
It is triggered on **tag push**, when the tag matches `v*`.
### Triggering the Workflow
Ensure all commits that should be in the release are merged into this branch, and that you have pulled them locally.
Run `make tag-release` to tag the current commit and kick off the workflow. You will be prompted to provide a message - use the version specifier.
If this version's tag already exists for some reason (maybe you had to make a last minute change), the script will overwrite it.
Push the commit to trigger the workflow.
> In case you cannot use the Make target, the release may also be dispatched [manually] via GH.
### Workflow Jobs and Process
The workflow consists of a number of concurrently-run checks and tests, then two final publish jobs.
The publish jobs require manual approval and are only run if the other jobs succeed.
#### `check-version` Job
This job ensures that the `invokeai` python package version specifier matches the tag for the release. The version specifier is pulled from the `__version__` variable in `invokeai/version/invokeai_version.py`.
This job uses [samuelcolvin/check-python-version].
> Any valid [version specifier] works, so long as the tag matches the version. The release workflow works exactly the same for `RC`, `post`, `dev`, etc.
#### Check and Test Jobs
Next, these jobs run and must pass. They are the same jobs that are run for every PR.
- **`python-tests`**: runs `pytest` on matrix of platforms
- **`python-checks`**: runs `ruff` (format and lint)
- **`frontend-tests`**: runs `vitest`
- **`frontend-checks`**: runs `prettier` (format), `eslint` (lint), `dpdm` (circular refs), `tsc` (static type check) and `knip` (unused imports)
- **`typegen-checks`**: ensures the frontend and backend types are synced
#### `build-wheel` Job
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `./scripts/build_wheel.sh` and uploads `dist.zip`, which contains the wheel and unarchived build.
You don't need to download or test these artifacts.
#### Sanity Check & Smoke Test
At this point, the release workflow pauses as the remaining publish jobs require approval.
It's possible to test the python package before it gets published to PyPI. We've never had problems with it, so it's not necessary to do this.
But, if you want to be extra-super careful, here's how to test it:
- Download the `dist.zip` build artifact from the `build-wheel` job
- Unzip it and find the wheel file
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/) - but instead of installing from PyPI, install from the wheel
- Test the app
##### Something isn't right
If testing reveals any issues, no worries. Cancel the workflow, which will cancel the pending publish jobs (you didn't approve them prematurely, right?) and start over.
#### PyPI Publish Jobs
The publish jobs will not run if any of the previous jobs fail.
They use [GitHub environments], which are configured as [trusted publishers] on PyPI.
Both jobs require a @lstein or @blessedcoolant to approve them from the workflow's **Summary** tab.
- Click the **Review deployments** button
- Select the environment (either `testpypi` or `pypi` - typically you select both)
- Click **Approve and deploy**
> **If the version already exists on PyPI, the publish jobs will fail.** PyPI only allows a given version to be published once - you cannot change it. If version published on PyPI has a problem, you'll need to "fail forward" by bumping the app version and publishing a followup release.
##### Failing PyPI Publish
Check the [python infrastructure status page] for incidents.
If there are no incidents, contact @lstein or @blessedcoolant, who have owner access to GH and PyPI, to see if access has expired or something like that.
#### `publish-testpypi` Job
Publishes the distribution on the [Test PyPI] index, using the `testpypi` GitHub environment.
This job is not required for the production PyPI publish, but included just in case you want to test the PyPI release for some reason:
- Approve this publish job without approving the prod publish
- Let it finish
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/), making sure to use the Test PyPI index URL: `https://test.pypi.org/simple/`
- Test the app
#### `publish-pypi` Job
Publishes the distribution on the production PyPI index, using the `pypi` GitHub environment.
It's a good idea to wait to approve and run this job until you have the release notes ready!
## Prep and publish the GitHub Release
1. [Draft a new release] on GitHub, choosing the tag that triggered the release.
2. The **Generate release notes** button automatically inserts the changelog and new contributors. Make sure to select the correct tags for this release and the last stable release. GH often selects the wrong tags - do this manually.
3. Write the release notes, describing important changes. Contributions from community members should be shouted out. Use the GH-generated changelog to see all contributors. If there are Weblate translation updates, open that PR and shout out every person who contributed a translation.
4. Check **Set as a pre-release** if it's a pre-release.
5. Approve and wait for the `publish-pypi` job to finish if you haven't already.
6. Publish the GH release.
7. Post the release in Discord in the [releases](https://discord.com/channels/1020123559063990373/1149260708098359327) channel with abbreviated notes. For example:
> It's a pretty big one - Form Builder, Metadata Nodes (thanks @SkunkWorxDark!), and much more.
8. Right click the message in releases and copy the link to it. Then, post that link in the [new-release-discussion](https://discord.com/channels/1020123559063990373/1149506274971631688) channel. For example:
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.