InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI.git synced 2026-04-23 03:00:31 -04:00

Author	SHA1	Message	Date
copilot-swe-agent[bot]	323cb2dbd0	Fix dimension handling for Z-Image 2D tensors Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-28 16:53:23 +00:00
copilot-swe-agent[bot]	5f4ef67f92	Address code review feedback - fix logic and add constants Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-28 14:35:32 +00:00
copilot-swe-agent[bot]	7d65cdfc16	Simplify Seed Variance Enhancer implementation Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-28 14:32:57 +00:00
copilot-swe-agent[bot]	6266e0e89d	Add Seed Variance Enhancer invocation for Z-Image Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-28 14:30:41 +00:00
copilot-swe-agent[bot]	5de0574723	Initial plan	2025-12-28 14:23:57 +00:00
blessedcoolant	7182ff26dc	fix(ui): misaligned Color Compensation Option (#8714 )	2025-12-27 23:11:48 -05:00
Josh Corbett	8deafabe6b	feat(prompts): 💄 increase prompt font size (#8712 ) * feat(prompts): 💄 increase prompt font size * style(prompts): 🚨 satisfy linter	2025-12-27 21:18:23 -05:00
blessedcoolant	d42bf9c941	fix(model-manager): add Z-Image LoRA/DoRA detection support (#8709 ) ## Summary Fix Z-Image LoRA/DoRA model detection failing during installation. Z-Image LoRAs use different key patterns than SD/SDXL LoRAs. The base `LoRA_LyCORIS_Config_Base` class only checked for key suffixes like `lora_A.weight` and `lora_B.weight`, but Z-Image LoRAs (especially those in DoRA format) use: - `lora_down.weight` / `lora_up.weight` (standard LoRA format) - `dora_scale` (DoRA weight decomposition) This PR overrides `_validate_looks_like_lora` in `LoRA_LyCORIS_ZImage_Config` to recognize Z-Image specific patterns: - Keys starting with `diffusion_model.layers.` (Z-Image S3-DiT architecture) - Keys ending with `lora_down.weight`, `lora_up.weight`, `lora_A.weight`, `lora_B.weight`, or `dora_scale` ## Related Issues / Discussions Fixes installation of Z-Image LoRAs trained with DoRA (Weight-Decomposed Low-Rank Adaptation). ## QA Instructions 1. Download a Z-Image LoRA in DoRA format (e.g., from CivitAI with keys like `diffusion_model.layers.X.attention.to_k.lora_down.weight`) 2. Try to install the LoRA via Model Manager 3. Verify the model is recognized as a Z-Image LoRA and installs successfully 4. Verify the LoRA can be applied when generating with Z-Image ## Merge Plan Standard merge, no special considerations. ## Checklist - [x] _The PR has a short but descriptive title, suitable for a changelog_ - [ ] _Tests added / updated (if applicable)_ - [ ] _❗Changes to a redux slice have a corresponding migration_ - [ ] _Documentation added / updated (if applicable)_ - [ ] _Updated `What's New` copy (if doing a release after this PR)_	2025-12-27 23:10:06 +05:30
Alexander Eichhorn	d403587c7f	Merge branch 'fix/z-image-lora-dora-detection' of https://github.com/Pfannkuchensack/InvokeAI into fix/z-image-lora-dora-detection	2025-12-27 09:17:33 +01:00
Alexander Eichhorn	355c985cc3	fix(model-manager): add Z-Image LoRA/DoRA detection and loading support Two fixes for Z-Image LoRA support: 1. Override _validate_looks_like_lora in LoRA_LyCORIS_ZImage_Config to recognize Z-Image specific LoRA formats that use different key patterns than SD/SDXL LoRAs. Z-Image LoRAs use lora_down.weight/lora_up.weight and dora_scale suffixes instead of lora_A.weight/lora_B.weight. 2. Fix _group_by_layer in z_image_lora_conversion_utils.py to correctly group LoRA keys by layer name. The previous logic used rsplit with maxsplit=2 which incorrectly grouped keys like: - "to_k.alpha" -> layer "diffusion_model.layers.17.attention" - "lora_down.weight" -> layer "diffusion_model.layers.17.attention.to_k" Now uses suffix matching to ensure all keys for a layer are grouped together (alpha, dora_scale, lora_down.weight, lora_up.weight).	2025-12-27 09:17:29 +01:00
Alexander Eichhorn	41742146e2	fix(model-manager): add Z-Image LoRA/DoRA detection support Override _validate_looks_like_lora in LoRA_LyCORIS_ZImage_Config to recognize Z-Image specific LoRA formats that use different key patterns than SD/SDXL LoRAs. Z-Image LoRAs (including DoRA format) use keys like: - diffusion_model.layers.X.attention.to_k.lora_down.weight - diffusion_model.layers.X.attention.to_k.dora_scale The base LyCORIS config only checked for lora_A.weight/lora_B.weight suffixes, missing the lora_down.weight/lora_up.weight and dora_scale patterns used by Z-Image LoRAs.	2025-12-27 07:06:12 +01:00
Lincoln Stein	0b1befa9ab	(chore) Prep for v6.10.0rc2 (#8701 )	2025-12-26 18:26:04 -05:00
Alexander Eichhorn	65efc3db7d	Feature: Add Z-Image-Turbo regional guidance (#8672 ) * feat: Add Regional Guidance support for Z-Image model Implements regional prompting for Z-Image (S3-DiT Transformer) allowing different prompts to affect different image regions using attention masks. Backend changes: - Add ZImageRegionalPromptingExtension for mask preparation - Add ZImageTextConditioning and ZImageRegionalTextConditioning data classes - Patch transformer forward to inject 4D regional attention masks - Use additive float mask (0.0 attend, -inf block) in bfloat16 for compatibility - Alternate regional/full attention layers for global coherence Frontend changes: - Update buildZImageGraph to support regional conditioning collectors - Update addRegions to create z_image_text_encoder nodes for regions - Update addZImageLoRAs to handle optional negCond when guidance_scale=0 - Add Z-Image validation (no IP adapters, no autoNegative) * @Pfannkuchensack Fix windows path again * ruff check fix * ruff formating * fix(ui): Z-Image CFG guidance_scale check uses > 1 instead of > 0 Changed the guidance_scale check from > 0 to > 1 for Z-Image models. Since Z-Image uses guidance_scale=1.0 as "no CFG" (matching FLUX convention), negative conditioning should only be created when guidance_scale > 1. --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com> v6.10.0rc1	2025-12-26 02:25:38 +00:00
Lincoln Stein	de1aa557b8	chore: bump version to v6.10.0rc1 (#8695 ) * chore: bump version to v6.10.0rc1 * docs: fix names of code owners in release doc	2025-12-26 02:08:14 +00:00
Lincoln Stein	b9493ddce7	Workaround for Windows being unable to remove tmp directories when installing GGUF files (#8699 ) * (bugfix)(mm) work around Windows being unable to rmtree tmp directories after GGUF install * (style) fix ruff error * (fix) add workaround for Windows Permission Denied on GGUF file move() call * (fix) perform torch copy() in GGUF reader to avoid deletion failures on Windows * (style) fix ruff formatting issues	2025-12-26 02:02:39 +00:00
Josh Corbett	ddb85ca669	fix(prompts): 🐛 prompt attention behaviors, add tests (#8683 ) * fix(prompts): 🐛 prompt attention adjust elevation edge cases, added tests * refactor(prompts): ♻️ create attention edit helper for prompt boxes * feat(prompts): ✨ apply attention keybinds to negative prompt * feat(prompts): 🚀 reconsider behaviors, simplify code * fix(prompts): 🐛 keybind attention update not tracked by undo/redo * feat(prompts): ✨ overhaul prompt attention behavior * fix(prompts): 🩹 remove unused type * fix(prompts): 🩹 remove unused `Token` type --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-24 17:38:24 -05:00
Alexander Eichhorn	ac245cbf6c	feat(backend): add support for xlabs Flux LoRA format (#8686 ) Add support for loading Flux LoRA models in the xlabs format, which uses keys like `double_blocks.X.processor.{qkv\|proj}_lora{1\|2}.{down\|up}.weight`. The xlabs format maps: - lora1 -> img_attn (image attention stream) - lora2 -> txt_attn (text attention stream) - qkv -> query/key/value projection - proj -> output projection Changes: - Add FluxLoRAFormat.XLabs enum value - Add flux_xlabs_lora_conversion_utils.py with detection and conversion - Update formats.py to detect xlabs format - Update lora.py loader to handle xlabs format - Update model probe to accept recognized Flux LoRA formats - Add unit tests for xlabs format detection and conversion Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-24 20:18:11 +00:00
Alexander Eichhorn	5be1e03d73	Feature/user workflow tags (#8698 ) * Feature: Add Tag System for user made Workflows * feat(ui): display tags on workflow library tiles Show workflow tags at the bottom of each tile in the workflow browser, making it easier to identify workflow categories at a glance. --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-24 14:54:22 -05:00
Josh Corbett	87314142b5	feat(hotkeys modal): ⚡ loading state + performance improvements (#8694 ) * feat(hotkeys modal): ⚡ loading state + performance improvements * feat(hotkeys modal): add tooltip to edit button and adjust layout spacing * style(hotkeys modal): 🚨 satisfy the linter --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-24 14:39:14 -05:00
Alexander Eichhorn	4cb9b8d97d	Feature: add prompt template node (#8680 ) * feat(nodes): add Prompt Template node Add a new node that applies Style Preset templates to prompts in workflows. The node takes a style preset ID and positive/negative prompts as inputs, then replaces {prompt} placeholders in the template with the provided prompts. This makes Style Preset templates accessible in Workflow mode, enabling users to apply consistent styling across their workflow-based generations. * feat(nodes): add StylePresetField for database-driven preset selection Adds a new StylePresetField type that enables dropdown selection of style presets from the database in the workflow editor. Changes: - Add StylePresetField to backend (fields.py) - Update Prompt Template node to use StylePresetField instead of string ID - Add frontend field type definitions (zod schemas, type guards) - Create StylePresetFieldInputComponent with Combobox - Register field in InputFieldRenderer and nodesSlice - Add translations for preset selection * fix schema.ts on windows. * chore(api): regenerate schema.ts after merge --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-12-24 14:33:16 -05:00
Alexander Eichhorn	21138e5d52	fix support multi-subfolder downloads for Z-Image Qwen3 encoder (#8692 ) * fix(model-install): support multi-subfolder downloads for Z-Image Qwen3 encoder The Z-Image Qwen3 text encoder requires both text_encoder and tokenizer subfolders from the HuggingFace repo, but the previous implementation only downloaded the text_encoder subfolder, causing model identification to fail. Changes: - Add subfolders property to HFModelSource supporting '+' separated paths - Extend filter_files() and download_urls() to handle multiple subfolders - Update _multifile_download() to preserve subfolder structure - Make Qwen3Encoder probe check both nested and direct config.json paths - Update Qwen3EncoderLoader to handle both directory structures - Change starter model source to text_encoder+tokenizer * ruff format * fix schema description * fix schema description --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-23 23:39:43 -05:00
Alexander Eichhorn	39114b0ad0	Feature (UI): add model path update for external models (#8675 ) * feat(ui): add model path update for external models Add ability to update file paths for externally managed models (models with absolute paths). Invoke-controlled models (with relative paths in the models directory) are excluded from this feature to prevent breaking internal model management. - Add ModelUpdatePathButton component with modal dialog - Only show button for external models (absolute path check) - Add translations for path update UI elements * Added support for Windows UNC paths in ModelView.tsx:38-41. The isExternalModel function now detects: Unix absolute paths: /home/user/models/... Windows drive paths: C:\Models\... or D:/Models/... Windows UNC paths: \\ServerName\ShareName\... or //ServerName/ShareName/... * fix(ui): validate path format in Update Path modal to prevent invalid paths When updating an external model's path, the new path is now validated to ensure it follows an absolute path format (Unix, Windows drive, or UNC). This prevents users from accidentally entering invalid paths that would cause the Update Path button to disappear, leaving them unable to correct the mistake. * fix(ui): extract isExternalModel to separate file to fix circular dependency Moves the isExternalModel utility function to its own file to break the circular dependency between ModelView.tsx and ModelUpdatePathButton.tsx. --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-23 22:46:50 -05:00
Josh Corbett	3fe5f62c48	feat(hotkeys): ✨ Overhaul hotkeys modal UI (#8682 ) * feat(hotkeys): ✨ overhaul hotkeys modal UI * fix(model manager): 🩹 improved check for hotkey search clear button * fix(model manager): 🩹 remove unused exports * feat(starter-models): add Z-Image Turbo starter models Add Z-Image Turbo and related models to the starter models list: - Z-Image Turbo (full precision, ~13GB) - Z-Image Turbo quantized (GGUF Q4_K, ~4GB) - Z-Image Qwen3 Text Encoder (full precision, ~8GB) - Z-Image Qwen3 Text Encoder quantized (GGUF Q6_K, ~3.3GB) - Z-Image ControlNet Union (Canny, HED, Depth, Pose, MLSD, Inpainting) The quantized Turbo model includes the quantized Qwen3 encoder as a dependency for automatic installation. * feat(starter-models): add Z-Image Q8 quant and ControlNet Tile Add higher quality Q8_0 quantization option for Z-Image Turbo (~6.6GB) to complement existing Q4_K variant, providing better quality for users with more VRAM. Add dedicated Z-Image ControlNet Tile model (~6.7GB) for upscaling and detail enhancement workflows. * feat(hotkeys): ✨ overhaul hotkeys modal UI * feat(hotkeys modal): 💄 shrink add hotkey button * fix(hotkeys): normalization and detection issues * style: 🚨 satisfy the linter * fix(hotkeys modal): 🩹 remove unused exports --------- Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-23 22:24:00 -05:00
Josh Corbett	73c6b31011	feat(model manager): 💄 refactor model manager bulk actions UI (#8684 ) * feat(model manager): 💄 refactor model manager bulk actions UI * feat(model manager): 💄 tweak model list item ui for checkbox selects * style(model manager): 🚨 satisfy the linter * feat(model manager): 💄 tweak search and actions dropdown placement * refactor(model manager): 🔥 remove unused `ModelListHeader` component * fix(model manager): 🐛 list items overlapping sticky headers --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-23 22:17:07 -05:00
blessedcoolant	f82bcd40fc	fix: CFG Scale min value reset to zero (#8691 ) No longer needed coz Z Image works at 1.0	2025-12-23 19:27:30 -05:00
blessedcoolant	5a0b227256	feat(starter-models): add Z-Image Turbo starter models (#8689 ) ## Summary Add Z-Image Turbo and related models to the starter models list for easy installation via the Model Manager: - Z-Image Turbo - Full precision Diffusers format (~13GB) - Z-Image Turbo (quantized) - GGUF Q4_K format (~4GB) - Z-Image Qwen3 Text Encoder - Full precision (~8GB) - Z-Image Qwen3 Text Encoder (quantized) - GGUF Q6_K format (~3.3GB) - Z-Image ControlNet Union - Unified ControlNet supporting Canny, HED, Depth, Pose, MLSD, and Inpainting modes The quantized Turbo model includes the quantized Qwen3 encoder as a dependency for automatic installation. ## Related Issues / Discussions Builds on the Z-Image Turbo support added in main. ## QA Instructions 1. Open Model Manager → Starter Models 2. Search for "Z-Image" 3. Verify all 5 models appear with correct descriptions 4. Install the quantized version and confirm the Qwen3 encoder dependency is also installed ## Merge Plan Standard merge, no special considerations. ## Checklist - [x] _The PR has a short but descriptive title, suitable for a changelog_ - [ ] _Tests added / updated (if applicable)_ - [ ] _❗Changes to a redux slice have a corresponding migration_ - [ ] _Documentation added / updated (if applicable)_ - [ ] _Updated `What's New` copy (if doing a release after this PR)_	2025-12-23 08:31:34 +05:30
blessedcoolant	1b5d91d1cf	Merge branch 'main' into feat/z-image-starter-models	2025-12-23 08:27:25 +05:30
Alexander Eichhorn	a748519e92	feat(starter-models): add Z-Image Q8 quant and ControlNet Tile Add higher quality Q8_0 quantization option for Z-Image Turbo (~6.6GB) to complement existing Q4_K variant, providing better quality for users with more VRAM. Add dedicated Z-Image ControlNet Tile model (~6.7GB) for upscaling and detail enhancement workflows.	2025-12-23 03:27:09 +01:00
blessedcoolant	90e34002f0	fix(z-image): Fix padding token shape mismatch for GGUF models (#8690 ) ## Summary Fix shape mismatch when loading GGUF-quantized Z-Image transformer models. GGUF Z-Image models store `x_pad_token` and `cap_pad_token` with shape `[3840]`, but diffusers `ZImageTransformer2DModel` expects `[1, 3840]` (with batch dimension). This caused a `RuntimeError` on Linux systems when loading models like `z_image_turbo-Q4_K.gguf`. The fix: - Dequantizes GGMLTensors first (since they don't support `unsqueeze`) - Reshapes the tensors to add the missing batch dimension ## Related Issues / Discussions Reported by Linux user using: - https://huggingface.co/leejet/Z-Image-Turbo-GGUF/resolve/main/z_image_turbo-Q4_K.gguf - https://huggingface.co/worstplayer/Z-Image_Qwen_3_4b_text_encoder_GGUF/resolve/main/Qwen_3_4b-Q6_K.gguf ## QA Instructions 1. Install a GGUF-quantized Z-Image model (e.g., `z_image_turbo-Q4_K.gguf`) 2. Install a Qwen3 GGUF encoder 3. Run a Z-Image generation 4. Verify no `RuntimeError: size mismatch for x_pad_token` error occurs ## Merge Plan None, straightforward fix. ## Checklist - [x] _The PR has a short but descriptive title, suitable for a changelog_ - [ ] _Tests added / updated (if applicable)_ - [ ] _❗Changes to a redux slice have a corresponding migration_ - [ ] _Documentation added / updated (if applicable)_ - [ ] _Updated `What's New` copy (if doing a release after this PR)_	2025-12-23 06:04:40 +05:30
blessedcoolant	7068cf956a	Merge branch 'main' into pr/8690	2025-12-23 05:59:49 +05:30
blessedcoolant	aa764f8bf4	Feature: z-image Turbo Control Net (#8679 ) ## Summary Add support for Z-Image ControlNet V2.0 alongside the existing V1 support. Key changes: - Auto-detect `control_in_dim` from adapter weights (16 for V1, 33 for V2.0) - Auto-detect `n_refiner_layers` from state dict - Add zero-padding for V2.0's additional control channels (diffusers approach) - Use `accelerate.init_empty_weights()` for more efficient model creation - Add `ControlNet_Checkpoint_ZImage_Config` to frontend schema ## Related Issues / Discussions Part of Z-Image feature implementation. ## QA Instructions 1. Load a Z-Image ControlNet V1 model (control_in_dim=16) and verify it works 2. Load a Z-Image ControlNet V2.0 model (control_in_dim=33) and verify it works 3. Test with different control types: Canny, Depth, Pose 4. Recommended `control_context_scale`: 0.65-0.80 ## Merge Plan Can be merged after review. No special considerations needed. ## Checklist - [x] _The PR has a short but descriptive title, suitable for a changelog_ - [ ] _Tests added / updated (if applicable)_ - [ ] _❗Changes to a redux slice have a corresponding migration_ - [ ] _Documentation added / updated (if applicable)_ - [ ] _Updated `What's New` copy (if doing a release after this PR)_	2025-12-23 05:58:58 +05:30
Alexander Eichhorn	73be5e5d35	Merge branch 'main' into feature/z-image-control	2025-12-22 22:56:30 +01:00
DustyShoe	259304bac5	Feature(UI): add extract masked area from raster layers (#8667 ) * chore: localize extraction errors * chore: rename extract masked area menu item * chore: rename inpaint mask extract component * fix: use mask bounds for extraction region * Prettier format applied to InpaintMaskMenuItemsExtractMaskedArea.tsx * Fix base64 image import bug in extracted area in InpaintMaskMenuItemsExtractMaskedArea.tsx and removed unused locales entries in en.json * Fix formatting issue in InpaintMaskMenuItemsExtractMaskedArea.tsx * Minor comment fix --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-22 15:57:27 -05:00
Alexander Eichhorn	2be701cfe3	Feature: Add Tag System for user made Workflows (#8673 ) Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-22 15:41:48 -05:00
blessedcoolant	874b547598	chore: format code for ruff checks	2025-12-23 01:04:22 +05:30
blessedcoolant	7b9ce35806	Merge branch 'main' into pr/8679	2025-12-23 01:03:43 +05:30
Alexander Eichhorn	84f3e44a5d	Merge branch 'main' into feat/z-image-starter-models	2025-12-22 20:16:05 +01:00
Alexander Eichhorn	5264b7511c	Merge branch 'main' into fix/z-image-gguf-padding-token-shape	2025-12-22 20:15:18 +01:00
Alexander Eichhorn	f8b1f42f6d	fix(z-image): Fix padding token shape mismatch for GGUF models GGUF Z-Image models store x_pad_token and cap_pad_token with shape [dim], but diffusers ZImageTransformer2DModel expects [1, dim]. This caused a RuntimeError when loading GGUF-quantized Z-Image models. The fix dequantizes GGMLTensors first (since they don't support unsqueeze), then reshapes to add the batch dimension.	2025-12-22 18:31:57 +01:00
Josh Corbett	e1acb636d8	fix(ui): 🐛 `HotkeysModal` and `SettingsModal` initial focus (#8687 ) * fix(ui): 🐛 `HotkeysModal` and `SettingsModal` initial focus instead of using the `initialFocusRef` prop, the `Modal` component was focusing on the last available Button. This is a workaround that uses `tabIndex` instead which seems to be working. Closes #8685 * style: 🚨 satisfy linter --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-22 11:20:44 -05:00
Alexander Eichhorn	b08accd4be	feat(starter-models): add Z-Image Turbo starter models Add Z-Image Turbo and related models to the starter models list: - Z-Image Turbo (full precision, ~13GB) - Z-Image Turbo quantized (GGUF Q4_K, ~4GB) - Z-Image Qwen3 Text Encoder (full precision, ~8GB) - Z-Image Qwen3 Text Encoder quantized (GGUF Q6_K, ~3.3GB) - Z-Image ControlNet Union (Canny, HED, Depth, Pose, MLSD, Inpainting) The quantized Turbo model includes the quantized Qwen3 encoder as a dependency for automatic installation.	2025-12-22 15:04:27 +01:00
Alexander Eichhorn	3668d5b83b	feat(z-image): add Extension-based Z-Image ControlNet support Implement Z-Image ControlNet as an Extension pattern (similar to FLUX ControlNet) instead of merging control weights into the base transformer. This provides: - Lower memory usage (no weight duplication) - Flexibility to enable/disable control per step - Cleaner architecture with separate control adapter Key implementation details: - ZImageControlNetExtension: computes control hints per denoising step - z_image_forward_with_control: custom forward pass with hint injection - patchify_control_context: utility for control image patchification - ZImageControlAdapter: standalone adapter with control_layers and noise_refiner Architecture matches original VideoX-Fun implementation: - Hints computed ONCE using INITIAL unified state (before main layers) - Hints injected at every other main transformer layer (15 control blocks) - Control signal added after each designated layer's forward pass V2.0 ControlNet support (control_in_dim=33): - Channels 0-15: control image latents - Channels 16-31: reference image (zeros for pure control) - Channel 32: inpaint mask (1.0 = don't inpaint, use control signal)	2025-12-21 22:30:28 +01:00
Alexander Eichhorn	1c13ca8159	style: apply ruff formatting	2025-12-21 18:52:12 +01:00
Alexander Eichhorn	3ed0e55d9d	fix: resolve linting errors in Z-Image ControlNet support - Add missing ControlNet_Checkpoint_ZImage_Config import - Remove unused imports (Any, Dict, ADALN_EMBED_DIM, is_torch_version) - Add strict=True to zip() calls - Replace mutable list defaults with immutable tuples - Replace dict() calls with literal syntax - Sort imports in z_image_denoise.py	2025-12-21 18:50:43 +01:00
Alexander Eichhorn	8db8aa8594	Add Z-Image ControlNet V2.0 support VRAM usage is high. - Auto-detect control_in_dim from adapter weights (16 for V1, 33 for V2.0) - Auto-detect n_refiner_layers from state dict - Add zero-padding for V2.0's additional channels - Use accelerate.init_empty_weights() for efficient model creation - Add ControlNet_Checkpoint_ZImage_Config to frontend schema	2025-12-21 18:43:02 +01:00
Alexander Eichhorn	456d578f20	WIP not working. feat: Add Z-Image ControlNet support with spatial conditioning Add comprehensive ControlNet support for Z-Image models including: Backend: - New ControlNet_Checkpoint_ZImage_Config for Z-Image control adapter models - Z-Image control key detection (_has_z_image_control_keys) to identify control layers - ZImageControlAdapter loader for standalone control models - ZImageControlTransformer2DModel combining base transformer with control layers - Memory-efficient model loading by building combined state dict	2025-12-21 18:43:02 +01:00
blessedcoolant	ab6b6721dc	Feature: Add Z-Image-Turbo model support (#8671 ) Add comprehensive support for Z-Image-Turbo (S3-DiT) models including: Backend: - New BaseModelType.ZImage in taxonomy - Z-Image model config classes (ZImageTransformerConfig, Qwen3TextEncoderConfig) - Model loader for Z-Image transformer and Qwen3 text encoder - Z-Image conditioning data structures - Step callback support for Z-Image with FLUX latent RGB factors Invocations: - z_image_model_loader: Load Z-Image transformer and Qwen3 encoder - z_image_text_encoder: Encode prompts using Qwen3 with chat template - z_image_denoise: Flow matching denoising with time-shifted sigmas - z_image_image_to_latents: Encode images to 16-channel latents - z_image_latents_to_image: Decode latents using FLUX VAE Frontend: - Z-Image graph builder for text-to-image generation - Model picker and validation updates for z-image base type - CFG scale now allows 0 (required for Z-Image-Turbo) - Clip skip disabled for Z-Image (uses Qwen3, not CLIP) - Optimal dimension settings for Z-Image (1024x1024) Technical details: - Uses Qwen3 text encoder (not CLIP/T5) - 16 latent channels with FLUX-compatible VAE - Flow matching scheduler with dynamic time shift - 8 inference steps recommended for Turbo variant - bfloat16 inference dtype ## Summary <!--A description of the changes in this PR. Include the kind of change (fix, feature, docs, etc), the "why" and the "how". Screenshots or videos are useful for frontend changes.--> ## Related Issues / Discussions <!--WHEN APPLICABLE: List any related issues or discussions on github or discord. If this PR closes an issue, please use the "Closes #1234" format, so that the issue will be automatically closed when the PR merges.--> ## QA Instructions - Install a Z-Image-Turbo model (e.g., from HuggingFace) - Select the model in the Model Picker - Generate a text-to-image with: - CFG Scale: 0 - Steps: 8 - Resolution: 1024x1024 - Verify the generated image is coherent (not noise) ## Merge Plan Standard merge, no special considerations needed. ## Checklist - [x] _The PR has a short but descriptive title, suitable for a changelog_ - [ ] _Tests added / updated (if applicable)_ - [ ] _❗Changes to a redux slice have a corresponding migration_ - [ ] _Documentation added / updated (if applicable)_ - [ ] _Updated `What's New` copy (if doing a release after this PR)_	2025-12-21 22:11:37 +05:30
blessedcoolant	93a587da90	Merge branch 'main' into feat/z-image-turbo-support	2025-12-21 21:58:22 +05:30
blessedcoolant	87bebf9c28	chore: upgrade diffusers to 0.36.0 to support z image	2025-12-21 21:54:47 +05:30
Alexander Eichhorn	f417c269d1	fix(vae): Fix dtype mismatch in FP32 VAE decode mode The previous mixed-precision optimization for FP32 mode only converted some VAE decoder layers (post_quant_conv, conv_in, mid_block) to the latents dtype while leaving others (up_blocks, conv_norm_out) in float32. This caused "expected scalar type Half but found Float" errors after recent diffusers updates. Simplify FP32 mode to consistently use float32 for both VAE and latents, removing the incomplete mixed-precision logic. This trades some VRAM usage for stability and correctness. Also removes now-unused attention processor imports.	2025-12-16 15:58:48 +01:00

1 2 3 4 5 ...

18626 Commits