InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI.git synced 2026-04-23 03:00:31 -04:00

Author	SHA1	Message	Date
Lincoln Stein	8a6d593fe8	Merge branch 'main' into copilot/add-unload-model-option	2026-01-03 22:48:36 -05:00
Alexander Eichhorn	9f8f9965f9	fix(model-loaders): add local_files_only=True to prevent network requests (#8735)	2026-01-03 22:21:42 -05:00
Jonathan	44a21a348d	Merge branch 'main' into copilot/add-unload-model-option	2026-01-03 22:00:11 -05:00
Alexander Eichhorn	be0cbe046c	feat(flux): add scheduler selection for Flux models (#8704) * feat(flux): add scheduler selection for Flux models Add support for alternative diffusers Flow Matching schedulers: - Euler (default, 1st order) - Heun (2nd order, better quality, 2x slower) - LCM (optimized for few steps) Backend: - Add schedulers.py with scheduler type definitions and class mapping - Modify denoise.py to accept optional scheduler parameter - Add scheduler InputField to flux_denoise invocation (v4.2.0) Frontend: - Add fluxScheduler to Redux state and paramsSlice - Create ParamFluxScheduler component for Linear UI - Add scheduler to buildFLUXGraph for generation * fix(flux): prevent progress percentage overflow with LCM scheduler LCM scheduler may have more internal timesteps than user-facing steps, causing user_step to exceed total_steps. This resulted in progress percentage > 1.0, which caused a pydantic validation error. Fix: Only call step_callback when user_step <= total_steps. * Ruff format * fix(flux): remove initial step-0 callback for consistent step count Remove the initial step_callback at step=0 to match SD/SDXL behavior. Previously Flux showed N+1 steps (step 0 + N denoising steps), while SD/SDXL showed only N steps. Now all models display N steps consistently. * feat(flux): add scheduler support with metadata recall - Handle LCM scheduler by using num_inference_steps instead of custom sigmas - Fix progress bar to show user-facing steps instead of internal scheduler steps - Pass scheduler parameter to Flux denoise node in graph builder - Add model-aware metadata recall for Flux scheduler --------- Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2026-01-03 15:52:00 -05:00
Jonathan	e39b880f6d	Merge branch 'main' into copilot/add-unload-model-option	2026-01-03 15:41:59 -05:00
Alexander Eichhorn	689953e3cf	Feature/zimage scheduler support (#8705) * feat(flux): add scheduler selection for Flux models Add support for alternative diffusers Flow Matching schedulers: - Euler (default, 1st order) - Heun (2nd order, better quality, 2x slower) - LCM (optimized for few steps) Backend: - Add schedulers.py with scheduler type definitions and class mapping - Modify denoise.py to accept optional scheduler parameter - Add scheduler InputField to flux_denoise invocation (v4.2.0) Frontend: - Add fluxScheduler to Redux state and paramsSlice - Create ParamFluxScheduler component for Linear UI - Add scheduler to buildFLUXGraph for generation * feat(z-image): add scheduler selection for Z-Image models Add support for alternative diffusers Flow Matching schedulers for Z-Image: - Euler (default) - 1st order, optimized for Z-Image-Turbo (8 steps) - Heun (2nd order) - Better quality, 2x slower - LCM - Optimized for few-step generation Backend: - Extend schedulers.py with Z-Image scheduler types and mapping - Add scheduler InputField to z_image_denoise invocation (v1.3.0) - Refactor denoising loop to support diffusers schedulers Frontend: - Add zImageScheduler to Redux state in paramsSlice - Create ParamZImageScheduler component for Linear UI - Add scheduler to buildZImageGraph for generation * fix ruff check * fix(schedulers): prevent progress percentage overflow with LCM scheduler LCM scheduler may have more internal timesteps than user-facing steps, causing user_step to exceed total_steps. This resulted in progress percentage > 1.0, which caused a pydantic validation error. Fix: Only call step_callback when user_step <= total_steps. * Ruff format * fix(schedulers): remove initial step-0 callback for consistent step count Remove the initial step_callback at step=0 to match SD/SDXL behavior. Previously Flux/Z-Image showed N+1 steps (step 0 + N denoising steps), while SD/SDXL showed only N steps. Now all models display N steps consistently in the server log. * feat(z-image): add scheduler support with metadata recall - Handle LCM scheduler by using num_inference_steps instead of custom sigmas - Fix progress bar to show user-facing steps instead of internal scheduler steps - Pass scheduler parameter to Z-Image denoise node in graph builder - Add model-aware metadata recall for Flux and Z-Image schedulers --------- Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2026-01-03 20:37:04 +00:00
Lincoln Stein	8cf4c6944a	(style) ruff fix	2026-01-03 14:54:15 -05:00
Lincoln Stein	db228ddc4f	(style) add @record_activity and @synchronized to locked methods	2026-01-03 14:52:31 -05:00
Jonathan	f49e1b8dae	Merge branch 'main' into copilot/add-unload-model-option	2026-01-01 21:31:08 -05:00
Alexander Eichhorn	3b2d2ef10a	fix(gguf): ensure dequantized tensors are on correct device for MPS (#8713) When using GGUF-quantized models on MPS (Apple Silicon), the dequantized tensors could end up on a different device than the other operands in math operations, causing "Expected all tensors to be on the same device" errors. This fix ensures that after dequantization, tensors are moved to the same device as the other tensors in the operation. Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2026-01-02 00:45:50 +00:00
Alexander Eichhorn	66974841f1	fix(model-manager): support offline Qwen3 tokenizer loading for Z-Image (#8719) Add local_files_only fallback for Qwen3 tokenizer loading in both Checkpoint and GGUF loaders. This ensures Z-Image models can generate images offline after the initial tokenizer download. The tokenizer is now loaded with local_files_only=True first, falling back to network download only if files aren't cached yet. Fixes #8716 Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2026-01-02 00:40:08 +00:00
Lincoln Stein	d44b99ae0a	Merge branch 'main' into copilot/add-unload-model-option	2025-12-28 22:39:45 -05:00
blessedcoolant	1675712094	Implement PBR Maps Node (#8700) * feat: Implement PBR Maps Generation Node * feat(ui): Add PBR Maps Generation to UI * chore: fix typegen checks * chore: possible fix for nvidia 5000 series cards * fix: Use safetensor models for PBR maps instead of pickles. * fix: incorrect naming of upconv_block for PBR network * fix: incorrect naming of displacement map variable * chore: add relevant docs to the PBR generate function * fix: clear cuda cache after loading state_dict for PBR maps * fix: load torch_device only once as multiple models are loaded * chore(ui): update the filter icon for PBR to CubeBold More relevant --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-29 02:11:46 +00:00
Lincoln Stein	f1624a6215	Merge branch 'main' into copilot/add-unload-model-option	2025-12-28 20:38:42 -05:00
Alexander Eichhorn	d7d051200f	fix(z_image): use unrestricted image self-attention for regional prompting (#8718) Changes image self-attention from restricted (region-isolated) to unrestricted (all image tokens can attend to each other), similar to the FLUX approach. This fixes the issue where ZImage-Turbo with multiple regional guidance layers would generate two separate/disconnected images instead of compositing them into a single unified image. The regional text-image attention remains restricted so that each region still responds to its corresponding prompt. Fixes #8715	2025-12-28 11:32:50 -05:00
Lincoln Stein	56fd1da888	Merge branch 'main' into copilot/add-unload-model-option	2025-12-27 21:08:17 -05:00
Alexander Eichhorn	355c985cc3	fix(model-manager): add Z-Image LoRA/DoRA detection and loading support Two fixes for Z-Image LoRA support: 1. Override _validate_looks_like_lora in LoRA_LyCORIS_ZImage_Config to recognize Z-Image specific LoRA formats that use different key patterns than SD/SDXL LoRAs. Z-Image LoRAs use lora_down.weight/lora_up.weight and dora_scale suffixes instead of lora_A.weight/lora_B.weight. 2. Fix _group_by_layer in z_image_lora_conversion_utils.py to correctly group LoRA keys by layer name. The previous logic used rsplit with maxsplit=2 which incorrectly grouped keys like: - "to_k.alpha" -> layer "diffusion_model.layers.17.attention" - "lora_down.weight" -> layer "diffusion_model.layers.17.attention.to_k" Now uses suffix matching to ensure all keys for a layer are grouped together (alpha, dora_scale, lora_down.weight, lora_up.weight).	2025-12-27 09:17:29 +01:00
Lincoln Stein	a7205e4e36	Merge branch 'main' into copilot/add-unload-model-option	2025-12-25 21:33:59 -05:00
Alexander Eichhorn	65efc3db7d	Feature: Add Z-Image-Turbo regional guidance (#8672) * feat: Add Regional Guidance support for Z-Image model Implements regional prompting for Z-Image (S3-DiT Transformer) allowing different prompts to affect different image regions using attention masks. Backend changes: - Add ZImageRegionalPromptingExtension for mask preparation - Add ZImageTextConditioning and ZImageRegionalTextConditioning data classes - Patch transformer forward to inject 4D regional attention masks - Use additive float mask (0.0 attend, -inf block) in bfloat16 for compatibility - Alternate regional/full attention layers for global coherence Frontend changes: - Update buildZImageGraph to support regional conditioning collectors - Update addRegions to create z_image_text_encoder nodes for regions - Update addZImageLoRAs to handle optional negCond when guidance_scale=0 - Add Z-Image validation (no IP adapters, no autoNegative) * @Pfannkuchensack Fix windows path again * ruff check fix * ruff formating * fix(ui): Z-Image CFG guidance_scale check uses > 1 instead of > 0 Changed the guidance_scale check from > 0 to > 1 for Z-Image models. Since Z-Image uses guidance_scale=1.0 as "no CFG" (matching FLUX convention), negative conditioning should only be created when guidance_scale > 1. --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-26 02:25:38 +00:00
Lincoln Stein	b9493ddce7	Workaround for Windows being unable to remove tmp directories when installing GGUF files (#8699) * (bugfix)(mm) work around Windows being unable to rmtree tmp directories after GGUF install * (style) fix ruff error * (fix) add workaround for Windows Permission Denied on GGUF file move() call * (fix) perform torch copy() in GGUF reader to avoid deletion failures on Windows * (style) fix ruff formatting issues	2025-12-26 02:02:39 +00:00
Lincoln Stein	5b69403ba8	Merge branch 'main' into copilot/add-unload-model-option	2025-12-24 15:39:46 -05:00
Alexander Eichhorn	ac245cbf6c	feat(backend): add support for xlabs Flux LoRA format (#8686) Add support for loading Flux LoRA models in the xlabs format, which uses keys like `double_blocks.X.processor.{qkv|proj}_lora{1|2}.{down|up}.weight`. The xlabs format maps: - lora1 -> img_attn (image attention stream) - lora2 -> txt_attn (text attention stream) - qkv -> query/key/value projection - proj -> output projection Changes: - Add FluxLoRAFormat.XLabs enum value - Add flux_xlabs_lora_conversion_utils.py with detection and conversion - Update formats.py to detect xlabs format - Update lora.py loader to handle xlabs format - Update model probe to accept recognized Flux LoRA formats - Add unit tests for xlabs format detection and conversion Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-24 20:18:11 +00:00
copilot-swe-agent[bot]	4987b4da1c	Fix timeout message appearing during active generation Only log "Clearing model cache" message when there are actually unlocked models to clear. This prevents the misleading message from appearing during active generation when all models are locked. Changes: - Check for unlocked models before logging clear message - Add count of unlocked models in log message - Add debug log when all models are locked - Improves user experience by avoiding confusing messages Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 05:31:11 +00:00
Lincoln Stein	1e15b8c106	Merge branch 'main' into copilot/add-unload-model-option	2025-12-24 00:14:45 -05:00
Alexander Eichhorn	21138e5d52	fix support multi-subfolder downloads for Z-Image Qwen3 encoder (#8692) * fix(model-install): support multi-subfolder downloads for Z-Image Qwen3 encoder The Z-Image Qwen3 text encoder requires both text_encoder and tokenizer subfolders from the HuggingFace repo, but the previous implementation only downloaded the text_encoder subfolder, causing model identification to fail. Changes: - Add subfolders property to HFModelSource supporting '+' separated paths - Extend filter_files() and download_urls() to handle multiple subfolders - Update _multifile_download() to preserve subfolder structure - Make Qwen3Encoder probe check both nested and direct config.json paths - Update Qwen3EncoderLoader to handle both directory structures - Change starter model source to text_encoder+tokenizer * ruff format * fix schema description * fix schema description --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-23 23:39:43 -05:00
copilot-swe-agent[bot]	8d76b4e4d4	Fix ruff whitespace errors and improve timeout logging - Remove all trailing whitespace (W293 errors) - Add debug logging when timeout fires but activity detected - Add debug logging when timeout fires but cache is empty - Only log "Clearing model cache" message when actually clearing - Prevents misleading timeout messages during active generation Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 04:05:57 +00:00
copilot-swe-agent[bot]	c3217d8a08	Address code review feedback - Remove unused variable in test - Add clarifying comment for daemon thread setting - Add detailed comment explaining cache clearing with 1000 GB value - Improve code documentation Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 00:27:39 +00:00
copilot-swe-agent[bot]	2500153ed8	Fix race condition in timeout mechanism - Added clarifying comment that _record_activity is called with lock held - Enhanced double-check in _on_timeout for thread safety - Added lock protection to shutdown method - Improved handling of edge cases where timer fires during activity Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 00:26:01 +00:00
copilot-swe-agent[bot]	9bbd2b3f11	Add model_cache_keep_alive config option and timeout mechanism - Added model_cache_keep_alive config field (minutes, default 0 = infinite) - Implemented timeout tracking in ModelCache class - Added _record_activity() to track model usage - Added _on_timeout() to auto-clear cache when timeout expires - Added shutdown() method to clean up timers - Integrated timeout with get(), lock(), unlock(), and put() operations - Updated ModelManagerService to pass keep_alive parameter - Added cleanup in stop() method Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 00:22:59 +00:00
blessedcoolant	1b5d91d1cf	Merge branch 'main' into feat/z-image-starter-models	2025-12-23 08:27:25 +05:30
Alexander Eichhorn	a748519e92	feat(starter-models): add Z-Image Q8 quant and ControlNet Tile Add higher quality Q8_0 quantization option for Z-Image Turbo (~6.6GB) to complement existing Q4_K variant, providing better quality for users with more VRAM. Add dedicated Z-Image ControlNet Tile model (~6.7GB) for upscaling and detail enhancement workflows.	2025-12-23 03:27:09 +01:00
blessedcoolant	7068cf956a	Merge branch 'main' into pr/8690	2025-12-23 05:59:49 +05:30
blessedcoolant	874b547598	chore: format code for ruff checks	2025-12-23 01:04:22 +05:30
Alexander Eichhorn	f8b1f42f6d	fix(z-image): Fix padding token shape mismatch for GGUF models GGUF Z-Image models store x_pad_token and cap_pad_token with shape [dim], but diffusers ZImageTransformer2DModel expects [1, dim]. This caused a RuntimeError when loading GGUF-quantized Z-Image models. The fix dequantizes GGMLTensors first (since they don't support unsqueeze), then reshapes to add the batch dimension.	2025-12-22 18:31:57 +01:00
Alexander Eichhorn	b08accd4be	feat(starter-models): add Z-Image Turbo starter models Add Z-Image Turbo and related models to the starter models list: - Z-Image Turbo (full precision, ~13GB) - Z-Image Turbo quantized (GGUF Q4_K, ~4GB) - Z-Image Qwen3 Text Encoder (full precision, ~8GB) - Z-Image Qwen3 Text Encoder quantized (GGUF Q6_K, ~3.3GB) - Z-Image ControlNet Union (Canny, HED, Depth, Pose, MLSD, Inpainting) The quantized Turbo model includes the quantized Qwen3 encoder as a dependency for automatic installation.	2025-12-22 15:04:27 +01:00
Alexander Eichhorn	3668d5b83b	feat(z-image): add Extension-based Z-Image ControlNet support Implement Z-Image ControlNet as an Extension pattern (similar to FLUX ControlNet) instead of merging control weights into the base transformer. This provides: - Lower memory usage (no weight duplication) - Flexibility to enable/disable control per step - Cleaner architecture with separate control adapter Key implementation details: - ZImageControlNetExtension: computes control hints per denoising step - z_image_forward_with_control: custom forward pass with hint injection - patchify_control_context: utility for control image patchification - ZImageControlAdapter: standalone adapter with control_layers and noise_refiner Architecture matches original VideoX-Fun implementation: - Hints computed ONCE using INITIAL unified state (before main layers) - Hints injected at every other main transformer layer (15 control blocks) - Control signal added after each designated layer's forward pass V2.0 ControlNet support (control_in_dim=33): - Channels 0-15: control image latents - Channels 16-31: reference image (zeros for pure control) - Channel 32: inpaint mask (1.0 = don't inpaint, use control signal)	2025-12-21 22:30:28 +01:00
Alexander Eichhorn	1c13ca8159	style: apply ruff formatting	2025-12-21 18:52:12 +01:00
Alexander Eichhorn	3ed0e55d9d	fix: resolve linting errors in Z-Image ControlNet support - Add missing ControlNet_Checkpoint_ZImage_Config import - Remove unused imports (Any, Dict, ADALN_EMBED_DIM, is_torch_version) - Add strict=True to zip() calls - Replace mutable list defaults with immutable tuples - Replace dict() calls with literal syntax - Sort imports in z_image_denoise.py	2025-12-21 18:50:43 +01:00
Alexander Eichhorn	456d578f20	WIP not working. feat: Add Z-Image ControlNet support with spatial conditioning Add comprehensive ControlNet support for Z-Image models including: Backend: - New ControlNet_Checkpoint_ZImage_Config for Z-Image control adapter models - Z-Image control key detection (_has_z_image_control_keys) to identify control layers - ZImageControlAdapter loader for standalone control models - ZImageControlTransformer2DModel combining base transformer with control layers - Memory-efficient model loading by building combined state dict	2025-12-21 18:43:02 +01:00
blessedcoolant	8785d9a3a9	chore: fix ruff checks	2025-12-14 19:51:22 +05:30
Alexander Eichhorn	1e72feb744	Remove unneeded Loggging	2025-12-14 06:44:29 +01:00
Alexander Eichhorn	3ee24cbdde	Remove the ParamScheduler for z-images Fixed the DEFAULT_TOKENIZER_SOURCE to Qwen/Qwen3-4B	2025-12-13 04:23:34 +01:00
Alexander Eichhorn	f9605e18a0	z-image-turbo-fp8-e5m2 works. the z-image-turbo_fp8_scaled_e4m3fn_KJ dont.	2025-12-10 17:15:54 +01:00
Alexander Eichhorn	fb1a99b650	feat(cache): add partial loading support for Z-Image RMSNorm and LayerNorm - Add CustomDiffusersRMSNorm for diffusers.models.normalization.RMSNorm - Add CustomLayerNorm for torch.nn.LayerNorm - Register both in AUTOCAST_MODULE_TYPE_MAPPING Enables partial loading (enable_partial_loading: true) for Z-Image models by wrapping their normalization layers with device autocast support	2025-12-10 03:45:42 +01:00
Alexander Eichhorn	3b5d9c26d3	feat(z-image): add Qwen3 GGUF text encoder support and default parameters - Add Qwen3EncoderGGUFLoader for llama.cpp GGUF quantized text encoders - Convert llama.cpp key format (blk.X., token_embd) to PyTorch format - Handle tied embeddings (lm_head.weight ↔ embed_tokens.weight) - Dequantize embed_tokens for embedding lookups (GGMLTensor limitation) - Add QK normalization key mappings (q_norm, k_norm) for Qwen3 - Set Z-Image defaults: steps=9, cfg_scale=0.0, width/height=1024 - Allow cfg_scale >= 0 (was >= 1) for Z-Image Turbo compatibility - Add GGUF format detection for Qwen3 model probing	2025-12-10 03:07:07 +01:00
Alexander Eichhorn	ba2475c3f0	fix(z-image): improve device/dtype compatibility and error handling Add robust device capability detection for bfloat16, replacing hardcoded dtype with runtime checks that fallback to float16/float32 on unsupported hardware. This prevents runtime failures on GPUs and CPUs without bfloat16. Key changes: - Add TorchDevice.choose_bfloat16_safe_dtype() helper for safe dtype selection - Fix LoRA device mismatch in layer_patcher.py (add device= to .to() call) - Replace all assert statements with descriptive exceptions (TypeError/ValueError) - Add hidden_states bounds check and apply_chat_template fallback in text encoder - Add GGUF QKV tensor validation (divisible by 3 check) - Fix CPU noise generation to use float32 for compatibility - Remove verbose debug logging from LoRA conversion utils	2025-12-09 07:37:06 +01:00
Alexander Eichhorn	e9d52734d1	feat(z-image): add single-file checkpoint support for Z-Image models Add support for loading Z-Image transformer and Qwen3 encoder models from single-file safetensors format (in addition to existing diffusers directory format). Changes: - Add Main_Checkpoint_ZImage_Config and Main_GGUF_ZImage_Config for single-file Z-Image transformer models - Add Qwen3Encoder_Checkpoint_Config for single-file Qwen3 text encoder - Add ZImageCheckpointModel and ZImageGGUFCheckpointModel loaders with automatic key conversion from original to diffusers format - Add Qwen3EncoderCheckpointLoader using Qwen3ForCausalLM with fast loading via init_empty_weights and proper weight tying for lm_head - Update z_image_denoise to accept Checkpoint format models	2025-12-09 06:32:51 +01:00
Alexander Eichhorn	2e0cd4d68c	Patch from @lstein for the update of diffusers	2025-12-06 03:12:50 +01:00
Alexander Eichhorn	280202908a	feat: Add GGUF quantized Z-Image support and improve VAE/encoder flexibility Add comprehensive support for GGUF quantized Z-Image models and improve component flexibility: Backend: - New Main_GGUF_ZImage_Config for GGUF quantized Z-Image transformers - Z-Image key detection (_has_z_image_keys) to identify S3-DiT models - GGUF quantization detection and sidecar LoRA patching for quantized models - Qwen3Encoder_Qwen3Encoder_Config for standalone Qwen3 encoder models Model Loader: - Split Z-Image model	2025-12-02 20:31:11 +01:00
Alexander Eichhorn	2b062b21cd	fix: Improve Flux AI Toolkit LoRA detection to prevent Z-Image misidentification Move Flux layer structure check before metadata check to prevent misidentifying Z-Image LoRAs (which use `diffusion_model.layers.X`) as Flux AI Toolkit format. Flux models use `double_blocks` and `single_blocks` patterns which are now checked first regardless of metadata presence.	2025-12-02 15:50:01 +01:00
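
Commit 355c985cc3 above describes why `rsplit` with `maxsplit=2` mis-groups Z-Image LoRA keys and why suffix matching fixes it. The sketch below reproduces that behaviour on the example keys from the commit message. It is an illustration only: the function names and the `KNOWN_SUFFIXES` tuple are hypothetical and are not taken from `z_image_lora_conversion_utils.py`.

```python
from collections import defaultdict

# Suffixes named in the commit message. The list used by InvokeAI itself
# may differ; this tuple is an assumption made for the illustration.
KNOWN_SUFFIXES = (".lora_down.weight", ".lora_up.weight", ".dora_scale", ".alpha")


def group_by_layer_rsplit(keys):
    """Grouping as the commit describes the old logic: drop the last two dot-separated parts."""
    grouped = defaultdict(list)
    for key in keys:
        layer = key.rsplit(".", maxsplit=2)[0]
        grouped[layer].append(key)
    return dict(grouped)


def group_by_layer_suffix(keys):
    """Grouping by known suffix, so every key of a layer shares one layer name."""
    grouped = defaultdict(list)
    for key in keys:
        for suffix in KNOWN_SUFFIXES:
            if key.endswith(suffix):
                grouped[key[: -len(suffix)]].append(key)
                break
        else:
            raise ValueError(f"Unrecognised LoRA key suffix: {key}")
    return dict(grouped)


keys = [
    "diffusion_model.layers.17.attention.to_k.alpha",
    "diffusion_model.layers.17.attention.to_k.lora_down.weight",
    "diffusion_model.layers.17.attention.to_k.lora_up.weight",
]

# rsplit puts "alpha" under "...attention" and the weights under "...attention.to_k".
print(sorted(group_by_layer_rsplit(keys)))
# Suffix matching keeps all three keys under "...attention.to_k".
print(sorted(group_by_layer_suffix(keys)))
```

The one-part suffix `.alpha` and the two-part suffix `.lora_down.weight` sit at different depths below the layer name, which is why a fixed `maxsplit` cannot serve both.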


2521 Commits
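
Commits be0cbe046c and 689953e3cf in the log above fix a progress overflow by calling `step_callback` only when `user_step <= total_steps`. The following is a minimal sketch of that guard, assuming a loop that advances one user-facing step per internal scheduler timestep. The function `run_denoise_loop` and its parameters are hypothetical and do not mirror InvokeAI's `denoise.py`.

```python
def run_denoise_loop(internal_timesteps, total_steps, step_callback):
    """Iterate over internal scheduler timesteps, reporting only user-facing steps.

    internal_timesteps: number of timesteps the scheduler iterates over (assumed).
    total_steps: number of steps the user requested.
    step_callback: called with (user_step, total_steps) for progress reporting.
    """
    user_step = 0
    for _ in range(internal_timesteps):
        # ... one scheduler/model step would run here ...
        user_step += 1
        # Guard from the commit message: never report a step beyond total_steps,
        # so the progress fraction cannot exceed 1.0.
        if user_step <= total_steps:
            step_callback(user_step, total_steps)


progress = []
# Simulate a scheduler with more internal timesteps (6) than requested steps (4).
run_denoise_loop(6, 4, lambda step, total: progress.append(step / total))
print(progress)
```

No callback is issued before the first step, in line with the commits' removal of the initial step-0 callback so that N steps are reported rather than N+1.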