InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI.git synced 2026-04-23 03:00:31 -04:00

Author	SHA1	Message	Date
Lincoln Stein	56fd7bc7c4	docs(z-image) add Z-Image requirements and starter bundle (#8734) * docs(z-image) add minimum requirements for Z-Image and create Z-Image starter bundle * fix(model manager) add flux VAE to Z-Image bundle * docs(model manager) remove out-of-date model info link * chore: fix frontendchecks * chore: lint:prettier * docs(model manager): clarify minimum hardware for z-image turbo * (fix) add flux VAE to ZIT starter dependencies & tweak UI docs	2026-01-04 10:17:26 -05:00
Alexander Eichhorn	9f8f9965f9	fix(model-loaders): add local_files_only=True to prevent network requests (#8735)	2026-01-03 22:21:42 -05:00
Alexander Eichhorn	be0cbe046c	feat(flux): add scheduler selection for Flux models (#8704) * feat(flux): add scheduler selection for Flux models Add support for alternative diffusers Flow Matching schedulers: - Euler (default, 1st order) - Heun (2nd order, better quality, 2x slower) - LCM (optimized for few steps) Backend: - Add schedulers.py with scheduler type definitions and class mapping - Modify denoise.py to accept optional scheduler parameter - Add scheduler InputField to flux_denoise invocation (v4.2.0) Frontend: - Add fluxScheduler to Redux state and paramsSlice - Create ParamFluxScheduler component for Linear UI - Add scheduler to buildFLUXGraph for generation * fix(flux): prevent progress percentage overflow with LCM scheduler LCM scheduler may have more internal timesteps than user-facing steps, causing user_step to exceed total_steps. This resulted in progress percentage > 1.0, which caused a pydantic validation error. Fix: Only call step_callback when user_step <= total_steps. * Ruff format * fix(flux): remove initial step-0 callback for consistent step count Remove the initial step_callback at step=0 to match SD/SDXL behavior. Previously Flux showed N+1 steps (step 0 + N denoising steps), while SD/SDXL showed only N steps. Now all models display N steps consistently. * feat(flux): add scheduler support with metadata recall - Handle LCM scheduler by using num_inference_steps instead of custom sigmas - Fix progress bar to show user-facing steps instead of internal scheduler steps - Pass scheduler parameter to Flux denoise node in graph builder - Add model-aware metadata recall for Flux scheduler --------- Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2026-01-03 15:52:00 -05:00
Alexander Eichhorn	689953e3cf	Feature/zimage scheduler support (#8705) * feat(flux): add scheduler selection for Flux models Add support for alternative diffusers Flow Matching schedulers: - Euler (default, 1st order) - Heun (2nd order, better quality, 2x slower) - LCM (optimized for few steps) Backend: - Add schedulers.py with scheduler type definitions and class mapping - Modify denoise.py to accept optional scheduler parameter - Add scheduler InputField to flux_denoise invocation (v4.2.0) Frontend: - Add fluxScheduler to Redux state and paramsSlice - Create ParamFluxScheduler component for Linear UI - Add scheduler to buildFLUXGraph for generation * feat(z-image): add scheduler selection for Z-Image models Add support for alternative diffusers Flow Matching schedulers for Z-Image: - Euler (default) - 1st order, optimized for Z-Image-Turbo (8 steps) - Heun (2nd order) - Better quality, 2x slower - LCM - Optimized for few-step generation Backend: - Extend schedulers.py with Z-Image scheduler types and mapping - Add scheduler InputField to z_image_denoise invocation (v1.3.0) - Refactor denoising loop to support diffusers schedulers Frontend: - Add zImageScheduler to Redux state in paramsSlice - Create ParamZImageScheduler component for Linear UI - Add scheduler to buildZImageGraph for generation * fix ruff check * fix(schedulers): prevent progress percentage overflow with LCM scheduler LCM scheduler may have more internal timesteps than user-facing steps, causing user_step to exceed total_steps. This resulted in progress percentage > 1.0, which caused a pydantic validation error. Fix: Only call step_callback when user_step <= total_steps. * Ruff format * fix(schedulers): remove initial step-0 callback for consistent step count Remove the initial step_callback at step=0 to match SD/SDXL behavior. Previously Flux/Z-Image showed N+1 steps (step 0 + N denoising steps), while SD/SDXL showed only N steps. Now all models display N steps consistently in the server log.
* feat(z-image): add scheduler support with metadata recall - Handle LCM scheduler by using num_inference_steps instead of custom sigmas - Fix progress bar to show user-facing steps instead of internal scheduler steps - Pass scheduler parameter to Z-Image denoise node in graph builder - Add model-aware metadata recall for Flux and Z-Image schedulers --------- Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2026-01-03 20:37:04 +00:00
Alexander Eichhorn	3b2d2ef10a	fix(gguf): ensure dequantized tensors are on correct device for MPS (#8713) When using GGUF-quantized models on MPS (Apple Silicon), the dequantized tensors could end up on a different device than the other operands in math operations, causing "Expected all tensors to be on the same device" errors. This fix ensures that after dequantization, tensors are moved to the same device as the other tensors in the operation. Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2026-01-02 00:45:50 +00:00
Alexander Eichhorn	66974841f1	fix(model-manager): support offline Qwen3 tokenizer loading for Z-Image (#8719) Add local_files_only fallback for Qwen3 tokenizer loading in both Checkpoint and GGUF loaders. This ensures Z-Image models can generate images offline after the initial tokenizer download. The tokenizer is now loaded with local_files_only=True first, falling back to network download only if files aren't cached yet. Fixes #8716 Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2026-01-02 00:40:08 +00:00
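The offline-first loading order described in 66974841f1 can be sketched as below. This is a minimal illustration, not the code from that commit: `load_offline_first`, `fake_loader`, and `CACHE` are hypothetical names, and the assumption that a cache miss surfaces as `OSError` should be checked against the loader actually in use.

```python
# Hypothetical stand-in for a local model cache.
CACHE = set()


def fake_loader(source, local_files_only):
    """Toy loader: fails offline when the source is not cached yet."""
    if local_files_only and source not in CACHE:
        raise OSError(f"{source} is not available in the local cache")
    CACHE.add(source)  # a "download" populates the cache
    return f"tokenizer:{source}"


def load_offline_first(load, source):
    """Try the local cache first; fall back to the network only on a miss."""
    try:
        return load(source, local_files_only=True)
    except OSError:
        # Assumed failure mode: the loader raises OSError on a cache miss.
        return load(source, local_files_only=False)


first = load_offline_first(fake_loader, "Qwen/Qwen3-4B")   # falls back to "download"
second = load_offline_first(fake_loader, "Qwen/Qwen3-4B")  # served from the cache
```

With a real loader that accepts a `local_files_only` keyword, the same wrapper would attempt the download only on the first run.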
blessedcoolant	1675712094	Implement PBR Maps Node (#8700) * feat: Implement PBR Maps Generation Node * feat(ui): Add PBR Maps Generation to UI * chore: fix typegen checks * chore: possible fix for nvidia 5000 series cards * fix: Use safetensor models for PBR maps instead of pickles. * fix: incorrect naming of upconv_block for PBR network * fix: incorrect naming of displacement map variable * chore: add relevant docs to the PBR generate function * fix: clear cuda cache after loading state_dict for PBR maps * fix: load torch_device only once as multiple models are loaded * chore(ui): update the filter icon for PBR to CubeBold More relevant --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-29 02:11:46 +00:00
Alexander Eichhorn	d7d051200f	fix(z_image): use unrestricted image self-attention for regional prompting (#8718) Changes image self-attention from restricted (region-isolated) to unrestricted (all image tokens can attend to each other), similar to the FLUX approach. This fixes the issue where ZImage-Turbo with multiple regional guidance layers would generate two separate/disconnected images instead of compositing them into a single unified image. The regional text-image attention remains restricted so that each region still responds to its corresponding prompt. Fixes #8715	2025-12-28 11:32:50 -05:00
Alexander Eichhorn	355c985cc3	fix(model-manager): add Z-Image LoRA/DoRA detection and loading support Two fixes for Z-Image LoRA support: 1. Override _validate_looks_like_lora in LoRA_LyCORIS_ZImage_Config to recognize Z-Image specific LoRA formats that use different key patterns than SD/SDXL LoRAs. Z-Image LoRAs use lora_down.weight/lora_up.weight and dora_scale suffixes instead of lora_A.weight/lora_B.weight. 2. Fix _group_by_layer in z_image_lora_conversion_utils.py to correctly group LoRA keys by layer name. The previous logic used rsplit with maxsplit=2 which incorrectly grouped keys like: - "to_k.alpha" -> layer "diffusion_model.layers.17.attention" - "lora_down.weight" -> layer "diffusion_model.layers.17.attention.to_k" Now uses suffix matching to ensure all keys for a layer are grouped together (alpha, dora_scale, lora_down.weight, lora_up.weight).	2025-12-27 09:17:29 +01:00
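The grouping bug fixed in 355c985cc3 can be reproduced with plain strings. The sketch below is an illustration under assumptions, not the code from `z_image_lora_conversion_utils.py`: the function names are hypothetical, and treating the four suffixes named in the commit message as the complete set is an assumption.

```python
from collections import defaultdict

# Suffixes named in the commit message; assumed here to be the full set.
KNOWN_SUFFIXES = ("lora_down.weight", "lora_up.weight", "dora_scale", "alpha")


def group_by_layer_rsplit(keys):
    """Old behaviour as described: always split off the last two dotted parts."""
    groups = defaultdict(dict)
    for key in keys:
        parts = key.rsplit(".", maxsplit=2)
        groups[parts[0]][".".join(parts[1:])] = key
    return dict(groups)


def group_by_layer_suffix(keys):
    """Fixed behaviour as described: strip a known suffix to get the layer name."""
    groups = defaultdict(dict)
    for key in keys:
        for suffix in KNOWN_SUFFIXES:
            if key.endswith("." + suffix):
                layer = key[: -(len(suffix) + 1)]
                groups[layer][suffix] = key
                break  # keys with no known suffix are skipped in this sketch
    return dict(groups)


layer = "diffusion_model.layers.17.attention.to_k"
keys = [f"{layer}.{suffix}" for suffix in KNOWN_SUFFIXES]

old = group_by_layer_rsplit(keys)   # splits one layer across two groups
new = group_by_layer_suffix(keys)   # one group holding all four keys
```

One-part suffixes such as `alpha` and two-part suffixes such as `lora_down.weight` cannot both be handled by a fixed split count, which is why matching on the suffix itself keeps a layer's keys together.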
Alexander Eichhorn	65efc3db7d	Feature: Add Z-Image-Turbo regional guidance (#8672) * feat: Add Regional Guidance support for Z-Image model Implements regional prompting for Z-Image (S3-DiT Transformer) allowing different prompts to affect different image regions using attention masks. Backend changes: - Add ZImageRegionalPromptingExtension for mask preparation - Add ZImageTextConditioning and ZImageRegionalTextConditioning data classes - Patch transformer forward to inject 4D regional attention masks - Use additive float mask (0.0 attend, -inf block) in bfloat16 for compatibility - Alternate regional/full attention layers for global coherence Frontend changes: - Update buildZImageGraph to support regional conditioning collectors - Update addRegions to create z_image_text_encoder nodes for regions - Update addZImageLoRAs to handle optional negCond when guidance_scale=0 - Add Z-Image validation (no IP adapters, no autoNegative) * @Pfannkuchensack Fix windows path again * ruff check fix * ruff formatting * fix(ui): Z-Image CFG guidance_scale check uses > 1 instead of > 0 Changed the guidance_scale check from > 0 to > 1 for Z-Image models. Since Z-Image uses guidance_scale=1.0 as "no CFG" (matching FLUX convention), negative conditioning should only be created when guidance_scale > 1. --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-26 02:25:38 +00:00
Lincoln Stein	b9493ddce7	Workaround for Windows being unable to remove tmp directories when installing GGUF files (#8699) * (bugfix)(mm) work around Windows being unable to rmtree tmp directories after GGUF install * (style) fix ruff error * (fix) add workaround for Windows Permission Denied on GGUF file move() call * (fix) perform torch copy() in GGUF reader to avoid deletion failures on Windows * (style) fix ruff formatting issues	2025-12-26 02:02:39 +00:00
Alexander Eichhorn	ac245cbf6c	feat(backend): add support for xlabs Flux LoRA format (#8686) Add support for loading Flux LoRA models in the xlabs format, which uses keys like `double_blocks.X.processor.{qkv|proj}_lora{1|2}.{down|up}.weight`. The xlabs format maps: - lora1 -> img_attn (image attention stream) - lora2 -> txt_attn (text attention stream) - qkv -> query/key/value projection - proj -> output projection Changes: - Add FluxLoRAFormat.XLabs enum value - Add flux_xlabs_lora_conversion_utils.py with detection and conversion - Update formats.py to detect xlabs format - Update lora.py loader to handle xlabs format - Update model probe to accept recognized Flux LoRA formats - Add unit tests for xlabs format detection and conversion Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-24 20:18:11 +00:00
Alexander Eichhorn	21138e5d52	fix support multi-subfolder downloads for Z-Image Qwen3 encoder (#8692) * fix(model-install): support multi-subfolder downloads for Z-Image Qwen3 encoder The Z-Image Qwen3 text encoder requires both text_encoder and tokenizer subfolders from the HuggingFace repo, but the previous implementation only downloaded the text_encoder subfolder, causing model identification to fail. Changes: - Add subfolders property to HFModelSource supporting '+' separated paths - Extend filter_files() and download_urls() to handle multiple subfolders - Update _multifile_download() to preserve subfolder structure - Make Qwen3Encoder probe check both nested and direct config.json paths - Update Qwen3EncoderLoader to handle both directory structures - Change starter model source to text_encoder+tokenizer * ruff format * fix schema description * fix schema description --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-12-23 23:39:43 -05:00
blessedcoolant	1b5d91d1cf	Merge branch 'main' into feat/z-image-starter-models	2025-12-23 08:27:25 +05:30
Alexander Eichhorn	a748519e92	feat(starter-models): add Z-Image Q8 quant and ControlNet Tile Add higher quality Q8_0 quantization option for Z-Image Turbo (~6.6GB) to complement existing Q4_K variant, providing better quality for users with more VRAM. Add dedicated Z-Image ControlNet Tile model (~6.7GB) for upscaling and detail enhancement workflows.	2025-12-23 03:27:09 +01:00
blessedcoolant	7068cf956a	Merge branch 'main' into pr/8690	2025-12-23 05:59:49 +05:30
blessedcoolant	874b547598	chore: format code for ruff checks	2025-12-23 01:04:22 +05:30
Alexander Eichhorn	f8b1f42f6d	fix(z-image): Fix padding token shape mismatch for GGUF models GGUF Z-Image models store x_pad_token and cap_pad_token with shape [dim], but diffusers ZImageTransformer2DModel expects [1, dim]. This caused a RuntimeError when loading GGUF-quantized Z-Image models. The fix dequantizes GGMLTensors first (since they don't support unsqueeze), then reshapes to add the batch dimension.	2025-12-22 18:31:57 +01:00
Alexander Eichhorn	b08accd4be	feat(starter-models): add Z-Image Turbo starter models Add Z-Image Turbo and related models to the starter models list: - Z-Image Turbo (full precision, ~13GB) - Z-Image Turbo quantized (GGUF Q4_K, ~4GB) - Z-Image Qwen3 Text Encoder (full precision, ~8GB) - Z-Image Qwen3 Text Encoder quantized (GGUF Q6_K, ~3.3GB) - Z-Image ControlNet Union (Canny, HED, Depth, Pose, MLSD, Inpainting) The quantized Turbo model includes the quantized Qwen3 encoder as a dependency for automatic installation.	2025-12-22 15:04:27 +01:00
Alexander Eichhorn	3668d5b83b	feat(z-image): add Extension-based Z-Image ControlNet support Implement Z-Image ControlNet as an Extension pattern (similar to FLUX ControlNet) instead of merging control weights into the base transformer. This provides: - Lower memory usage (no weight duplication) - Flexibility to enable/disable control per step - Cleaner architecture with separate control adapter Key implementation details: - ZImageControlNetExtension: computes control hints per denoising step - z_image_forward_with_control: custom forward pass with hint injection - patchify_control_context: utility for control image patchification - ZImageControlAdapter: standalone adapter with control_layers and noise_refiner Architecture matches original VideoX-Fun implementation: - Hints computed ONCE using INITIAL unified state (before main layers) - Hints injected at every other main transformer layer (15 control blocks) - Control signal added after each designated layer's forward pass V2.0 ControlNet support (control_in_dim=33): - Channels 0-15: control image latents - Channels 16-31: reference image (zeros for pure control) - Channel 32: inpaint mask (1.0 = don't inpaint, use control signal)	2025-12-21 22:30:28 +01:00
Alexander Eichhorn	1c13ca8159	style: apply ruff formatting	2025-12-21 18:52:12 +01:00
Alexander Eichhorn	3ed0e55d9d	fix: resolve linting errors in Z-Image ControlNet support - Add missing ControlNet_Checkpoint_ZImage_Config import - Remove unused imports (Any, Dict, ADALN_EMBED_DIM, is_torch_version) - Add strict=True to zip() calls - Replace mutable list defaults with immutable tuples - Replace dict() calls with literal syntax - Sort imports in z_image_denoise.py	2025-12-21 18:50:43 +01:00
Alexander Eichhorn	456d578f20	WIP not working. feat: Add Z-Image ControlNet support with spatial conditioning Add comprehensive ControlNet support for Z-Image models including: Backend: - New ControlNet_Checkpoint_ZImage_Config for Z-Image control adapter models - Z-Image control key detection (_has_z_image_control_keys) to identify control layers - ZImageControlAdapter loader for standalone control models - ZImageControlTransformer2DModel combining base transformer with control layers - Memory-efficient model loading by building combined state dict	2025-12-21 18:43:02 +01:00
blessedcoolant	8785d9a3a9	chore: fix ruff checks	2025-12-14 19:51:22 +05:30
Alexander Eichhorn	1e72feb744	Remove unneeded logging	2025-12-14 06:44:29 +01:00
Alexander Eichhorn	3ee24cbdde	Remove the ParamScheduler for Z-Image. Fix the DEFAULT_TOKENIZER_SOURCE to Qwen/Qwen3-4B	2025-12-13 04:23:34 +01:00
Alexander Eichhorn	f9605e18a0	z-image-turbo-fp8-e5m2 works; z-image-turbo_fp8_scaled_e4m3fn_KJ does not.	2025-12-10 17:15:54 +01:00
Alexander Eichhorn	fb1a99b650	feat(cache): add partial loading support for Z-Image RMSNorm and LayerNorm - Add CustomDiffusersRMSNorm for diffusers.models.normalization.RMSNorm - Add CustomLayerNorm for torch.nn.LayerNorm - Register both in AUTOCAST_MODULE_TYPE_MAPPING Enables partial loading (enable_partial_loading: true) for Z-Image models by wrapping their normalization layers with device autocast support	2025-12-10 03:45:42 +01:00
Alexander Eichhorn	3b5d9c26d3	feat(z-image): add Qwen3 GGUF text encoder support and default parameters - Add Qwen3EncoderGGUFLoader for llama.cpp GGUF quantized text encoders - Convert llama.cpp key format (blk.X., token_embd) to PyTorch format - Handle tied embeddings (lm_head.weight ↔ embed_tokens.weight) - Dequantize embed_tokens for embedding lookups (GGMLTensor limitation) - Add QK normalization key mappings (q_norm, k_norm) for Qwen3 - Set Z-Image defaults: steps=9, cfg_scale=0.0, width/height=1024 - Allow cfg_scale >= 0 (was >= 1) for Z-Image Turbo compatibility - Add GGUF format detection for Qwen3 model probing	2025-12-10 03:07:07 +01:00
Alexander Eichhorn	ba2475c3f0	fix(z-image): improve device/dtype compatibility and error handling Add robust device capability detection for bfloat16, replacing hardcoded dtype with runtime checks that fallback to float16/float32 on unsupported hardware. This prevents runtime failures on GPUs and CPUs without bfloat16. Key changes: - Add TorchDevice.choose_bfloat16_safe_dtype() helper for safe dtype selection - Fix LoRA device mismatch in layer_patcher.py (add device= to .to() call) - Replace all assert statements with descriptive exceptions (TypeError/ValueError) - Add hidden_states bounds check and apply_chat_template fallback in text encoder - Add GGUF QKV tensor validation (divisible by 3 check) - Fix CPU noise generation to use float32 for compatibility - Remove verbose debug logging from LoRA conversion utils	2025-12-09 07:37:06 +01:00
Alexander Eichhorn	e9d52734d1	feat(z-image): add single-file checkpoint support for Z-Image models Add support for loading Z-Image transformer and Qwen3 encoder models from single-file safetensors format (in addition to existing diffusers directory format). Changes: - Add Main_Checkpoint_ZImage_Config and Main_GGUF_ZImage_Config for single-file Z-Image transformer models - Add Qwen3Encoder_Checkpoint_Config for single-file Qwen3 text encoder - Add ZImageCheckpointModel and ZImageGGUFCheckpointModel loaders with automatic key conversion from original to diffusers format - Add Qwen3EncoderCheckpointLoader using Qwen3ForCausalLM with fast loading via init_empty_weights and proper weight tying for lm_head - Update z_image_denoise to accept Checkpoint format models	2025-12-09 06:32:51 +01:00
Alexander Eichhorn	2e0cd4d68c	Patch from @lstein for the update of diffusers	2025-12-06 03:12:50 +01:00
Alexander Eichhorn	280202908a	feat: Add GGUF quantized Z-Image support and improve VAE/encoder flexibility Add comprehensive support for GGUF quantized Z-Image models and improve component flexibility: Backend: - New Main_GGUF_ZImage_Config for GGUF quantized Z-Image transformers - Z-Image key detection (_has_z_image_keys) to identify S3-DiT models - GGUF quantization detection and sidecar LoRA patching for quantized models - Qwen3Encoder_Qwen3Encoder_Config for standalone Qwen3 encoder models Model Loader: - Split Z-Image model	2025-12-02 20:31:11 +01:00
Alexander Eichhorn	2b062b21cd	fix: Improve Flux AI Toolkit LoRA detection to prevent Z-Image misidentification Move Flux layer structure check before metadata check to prevent misidentifying Z-Image LoRAs (which use `diffusion_model.layers.X`) as Flux AI Toolkit format. Flux models use `double_blocks` and `single_blocks` patterns which are now checked first regardless of metadata presence.	2025-12-02 15:50:01 +01:00
Alexander Eichhorn	f05ea28cbd	feat: Add Z-Image LoRA support Add comprehensive LoRA support for Z-Image models including: Backend: - New Z-Image LoRA config classes (LoRA_LyCORIS_ZImage_Config, LoRA_Diffusers_ZImage_Config) - Z-Image LoRA conversion utilities with key mapping for transformer and Qwen3 encoder - LoRA prefix constants (Z_IMAGE_LORA_TRANSFORMER_PREFIX, Z_IMAGE_LORA_QWEN3_PREFIX) - LoRA detection logic to distinguish Z-Image from Flux models - Layer patcher improvements for proper dtype conversion and parameter	2025-12-01 22:23:30 +01:00
Alexander Eichhorn	eb3f1c9a61	feat: Add Z-Image-Turbo model support Add comprehensive support for Z-Image-Turbo (S3-DiT) models including: Backend: - New BaseModelType.ZImage in taxonomy - Z-Image model config classes (ZImageTransformerConfig, Qwen3TextEncoderConfig) - Model loader for Z-Image transformer and Qwen3 text encoder - Z-Image conditioning data structures - Step callback support for Z-Image with FLUX latent RGB factors Invocations: - z_image_model_loader: Load Z-Image transformer and Qwen3 encoder - z_image_text_encoder: Encode prompts using Qwen3 with chat template - z_image_denoise: Flow matching denoising with time-shifted sigmas - z_image_image_to_latents: Encode images to 16-channel latents - z_image_latents_to_image: Decode latents using FLUX VAE Frontend: - Z-Image graph builder for text-to-image generation - Model picker and validation updates for z-image base type - CFG scale now allows 0 (required for Z-Image-Turbo) - Clip skip disabled for Z-Image (uses Qwen3, not CLIP) - Optimal dimension settings for Z-Image (1024x1024) Technical details: - Uses Qwen3 text encoder (not CLIP/T5) - 16 latent channels with FLUX-compatible VAE - Flow matching scheduler with dynamic time shift - 8 inference steps recommended for Turbo variant - bfloat16 inference dtype	2025-12-01 00:22:32 +01:00
gogurtenjoyer	382d85ee23	Fix memory issues when installing models on Windows (#8652) * Wrap GGUF loader for context managed close() Wrap gguf.GGUFReader and then use a context manager to load memory-mapped GGUF files, so that they will automatically close properly when no longer needed. Should prevent the 'file in use in another process' errors on Windows. * Additional check for cached state_dict Additional check for cached state_dict as path is now optional - should solve model manager 'missing' this and the resultant memory errors. * Appease ruff * Further ruff appeasement * ruff * loaders.py fix for linux No longer attempting to delete internal object. * loaders.py - one more _mmap ref removed --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-11-16 09:25:52 -05:00
DustyShoe	8d6e00533e	Fix to enable loading fp16 repo variant ControlNets (#8643) * Fix ControlNet repo variant detection for fp16 weights * Remove ControlNet diffusers fp16 regression test * Update invokeai/backend/model_manager/configs/controlnet.py Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com> * style: ruff format controlnet.py --------- Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>	2025-11-03 21:23:35 -05:00
psychedelicious	093f8d6720	fix(mm): ignore files in hidden directories when identifying models	2025-10-16 10:33:02 +11:00
psychedelicious	c8d9cdc22e	docs(mm): add readme for updating or adding new model support	2025-10-16 08:08:44 +11:00
psychedelicious	875aba8979	tidy(mm): remove unused class	2025-10-16 08:08:44 +11:00
psychedelicious	7cff5da2c0	tidy: removing unused code paths 1	2025-10-15 10:46:16 +11:00
psychedelicious	454d05bbde	refactor: model manager v3 (#8607) * feat(mm): add UnknownModelConfig * refactor(ui): move model categorisation-ish logic to central location, simplify model manager models list * refactor(ui): more cleanup of model categories * refactor(ui): remove unused excludeSubmodels I can't remember what this was for and don't see any reference to it. Maybe it's just remnants from a previous implementation? * feat(nodes): add unknown as model base * chore(ui): typegen * feat(ui): add unknown model base support in ui * feat(ui): allow changing model type in MM, fix up base and variant selects * feat(mm): omit model description instead of making it "base type filename model" * feat(app): add setting to allow unknown models * feat(ui): allow changing model format in MM * feat(app): add the installed model config to install complete events * chore(ui): typegen * feat(ui): toast warning when installed model is unidentified * docs: update config docstrings * chore(ui): typegen * tests(mm): fix test for MM, leave the UnknownModelConfig class in the list of configs * tidy(ui): prefer types from zod schemas for model attrs * chore(ui): lint * fix(ui): wrong translation string * feat(mm): normalized model storage Store models in a flat directory structure. Each model is in a dir named after its unique key (a UUID). Inside that dir is either the model file or the model dir.
* feat(mm): add migration to flat model storage * fix(mm): normalized multi-file/diffusers model installation no worky now worky * refactor: port MM probes to new api - Add concept of match certainty to new probe - Port CLIP Embed models to new API - Fiddle with stuff * feat(mm): port TIs to new API * tidy(mm): remove unused probes * feat(mm): port spandrel to new API * fix(mm): parsing for spandrel * fix(mm): loader for clip embed * fix(mm): tis use existing weight_files method * feat(mm): port vae to new API * fix(mm): vae class inheritance and config_path * tidy(mm): patcher types and import paths * feat(mm): better errors when invalid model config found in db * feat(mm): port t5 to new API * feat(mm): make config_path optional * refactor(mm): simplify model classification process Previously, we had a multi-phase strategy to identify models from their files on disk: 1. Run each model config class's `matches()` method on the files. It checks if the model could possibly be identified as the candidate model type. This was intended to be a quick check. Break on the first match. 2. If we have a match, run the config class's `parse()` method. It derives some additional model config attrs from the model files. This was intended to encapsulate heavier operations that may require loading the model into memory. 3. Derive the common model config attrs, like name, description, calculate the hash, etc. Some of these are also heavier operations. This strategy has some issues: - It is not clear how the pieces fit together. There is some back-and-forth between different methods and the config base class. It is hard to trace the flow of logic until you fully wrap your head around the system and therefore difficult to add a model architecture to the probe. - The assumption that we could do quick, lightweight checks before heavier checks is incorrect. We often _must_ load the model state dict in the `matches()` method. 
So there is no practical perf benefit to splitting up the responsibility of `matches()` and `parse()`. - Sometimes we need to do the same checks in `matches()` and `parse()`. In these cases, splitting the logic has a negative perf impact because we are doing the same work twice. - As we introduce the concept of an "unknown" model config (i.e. a model that we cannot identify, but still record in the db; see #8582), we will _always_ run _all_ the checks for every model. Therefore we need not try to defer heavier checks or resource-intensive ops like hashing. We are going to do them anyways. - There are situations where a model may match multiple configs. One known case is SD pipeline models with merged LoRAs. In the old probe API, we relied on the implicit order of checks to know that if a model matched for pipeline _and_ LoRA, we prefer the pipeline match. But, in the new API, we do not have this implicit ordering of checks. To resolve this in a resilient way, we need to get all matches up front, then use tie-breaker logic to figure out which should win (or add "differential diagnosis" logic to the matchers). - Field overrides weren't handled well by this strategy. They were only applied at the very end, if a model matched successfully. This means we cannot tell the system "Hey, this model is type X with base Y. Trust me bro.". We cannot override the match logic. As we move towards letting users correct mis-identified models (see #8582), this is a requirement. We can simplify the process significantly and better support "unknown" models. Firstly, model config classes now have a single `from_model_on_disk()` method that attempts to construct an instance of the class from the model files. This replaces the `matches()` and `parse()` methods. If we fail to create the config instance, a special exception is raised that indicates why we think the files cannot be identified as the given model config class. 
Next, the flow for model identification is a bit simpler: - Derive all the common fields up-front (name, desc, hash, etc). - Merge in overrides. - Call `from_model_on_disk()` for every config class, passing in the fields. Overrides are handled in this method. - Record the results for each config class and choose the best one. The identification logic is a bit more verbose, with the special exceptions and handling of overrides, but it is very clear what is happening. The one downside I can think of for this strategy is we do need to check every model type, instead of stopping at the first match. It's a bit less efficient. In practice, however, this isn't a hot code path, and the improved clarity is worth far more than perf optimizations that the end user will likely never notice. * refactor(mm): remove unused methods in config.py * refactor(mm): add model config parsing utils * fix(mm): abstractmethod bork * tidy(mm): clarify that model id utils are private * fix(mm): fall back to UnknownModelConfig correctly * feat(mm): port CLIPVisionDiffusersConfig to new api * feat(mm): port SigLIPDiffusersConfig to new api * feat(mm): make match helpers more succinct * feat(mm): port flux redux to new api * feat(mm): port ip adapter to new api * tidy(mm): skip optimistic override handling for now * refactor(mm): continue iterating on config * feat(mm): port flux "control lora" and t2i adapter to new api * tidy(ui): use Extract to get model config types * fix(mm): t2i base determination * feat(mm): port cnet to new api * refactor(mm): add config validation utils, make it all consistent and clean * feat(mm): wip port of main models to new api * feat(mm): wip port of main models to new api * feat(mm): wip port of main models to new api * docs(mm): add todos * tidy(mm): removed unused model merge class * feat(mm): wip port main models to new api * tidy(mm): clean up model heuristic utils * tidy(mm): clean up ModelOnDisk caching * tidy(mm): flux lora format util * refactor(mm): make 
config classes narrow Simpler logic to identify, less complexity to add new model, fewer useless attrs that do not relate to the model arch, etc * refactor(mm): diffusers loras w * feat(mm): consistent naming for all model config classes * fix(mm): tag generation & scattered probe fixes * tidy(mm): consistent class names * refactor(mm): split configs into separate files * docs(mm): add comments for identification utils * chore(ui): typegen * refactor(mm): remove legacy probe, new configs dir structure, update imports * fix(mm): inverted condition * docs(mm): update docstrings in factory.py * docs(mm): document flux variant attr * feat(mm): add helper method for legacy configs * feat(mm): satisfy type checker in flux denoise * docs(mm): remove extraneous comment * fix(mm): ensure unknown model configs get unknown attrs * fix(mm): t5 identification * fix(mm): sdxl ip adapter identification * feat(mm): more flexible config matching utils * fix(mm): clip vision identification * feat(mm): add sanity checks before probing paths * docs(mm): add reminder for self for field migrations * feat(mm): clearer naming for main config class hierarchy * feat(mm): fix clip vision starter model bases, add ref to actual models * feat(mm): add model config schema migration logic * fix(mm): duplicate import * refactor(mm): split big migration into 3 Split the big migration that did all of these things into 3: - Migration 22: Remove unique constraint on base/name/type in models table - Migration 23: Migrate configs to v6.8.0 schemas - Migration 24: Normalize file storage * fix(mm): pop base/type/format when creating unknown model config * fix(db): migration 22 insert only real cols * fix(db): migration 23 fall back to unknown model when config change fails * feat(db): run migrations 23 and 24 * fix(mm): false negative on flux lora * fix(mm): vae checkpoint probe checking for dir instead of file * fix(mm): ModelOnDisk skips dirs when looking for weights Previously a path w/ any of the 
known weights suffixes would be seen as a weights file, even if it was a directory. We now check to ensure the candidate path is actually a file before adding it to the list of weights. * feat(mm): add method to get main model defaults from a base * feat(mm): do not log when multiple non-unknown model matches * refactor(mm): continued iteration on model identification * tests(mm): refactor model identification tests Overhaul of model identification (probing) tests. Previously we didn't test the correctness of probing except in a few narrow cases - now we do. See tests/model_identification/README.md for a detailed overview of the new test setup. It includes instructions for adding a new test case. In brief: - Download the model you want to add as a test case - Run a script against it to generate the test model files - Fill in the expected model type/format/base/etc in the generated test metadata JSON file Included test cases: - All starter models - A handful of other models that I had installed - Models present in the previous test cases as smoke tests, now also tested for correctness * fix(mm): omit type/format/base when creating unknown config instance * feat(mm): use ValueError for model id sanity checks * feat(mm): add flag for updating models to allow class changes * tests(mm): fix remaining MM tests * feat: allow users to edit models freely * feat(ui): add warning for model settings edit * tests(mm): flux state dict tests * tidy: remove unused file * fix(mm): lora state dict loading in model id * feat(ui): use translation string for model edit warning * docs(db): update version numbers in migration comments * chore: bump version to v6.9.0a1 * docs: update model id readme * tests(mm): attempt to fix windows model id tests * fix(mm): issue with deleting single file models * feat(mm): just delete the dir w/ rmtree when deleting model * tests(mm): windows CI issue * fix(ui): typegen schema sync * fix(mm): fixes for migration 23 - Handle CLIP Embed and Main SD 
models missing variant field - Handle errors when calling the discriminator function, previously only handled ValidationError but it could be a ValueError or something else - Better logging for config migration * chore: bump version to v6.9.0a2 * chore: bump version to v6.9.0a3	2025-10-15 10:18:53 +11:00
Iq1pl	8c742a6e38	ruff format	2025-09-18 11:05:32 +10:00
Iq1pl	693373f1c1	Update ip_adapter.py added support for NOOB-IPA-MARK1	2025-09-18 11:05:32 +10:00
psychedelicious	4880a1d946	feat(nodes): accept neg coords for bbox This actually works fine for SAM.	2025-09-11 12:15:41 +10:00
psychedelicious	f8ad62b5eb	tidy(backend) cleanup sam pipelines	2025-09-11 12:15:41 +10:00
psychedelicious	ec1a058dbe	fix(backend): issue w/ multiple bbox and sam1	2025-09-11 12:15:41 +10:00
psychedelicious	d828502bc8	refactor(backend): simplify segment anything APIs There was a really confusing aspect of the SAM pipeline classes where they accepted deeply nested lists of different dimensions (bbox, points, and labels). The lengths of the lists are related; each point must have a corresponding label, and if bboxes are provided with points, they must be the same length. I've refactored the backend API to take a single list of SAMInput objects. This class has a bbox and/or a list of points, making it much simpler to provide the right shape of inputs. Internally, the pipeline classes rejigger these input classes to have the correct nesting. The Nodes still have an awkward API where you can provide both bboxes and points of different lengths, so I added a pydantic validator that enforces correct lengths.	2025-09-11 12:15:41 +10:00
psychedelicious	a3625efd3a	chore: ruff	2025-09-11 12:15:41 +10:00
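
Commit d828502bc8 above describes replacing parallel nested lists with a single list of SAMInput objects. The sketch below is only an illustration of that idea, not the InvokeAI implementation: `BoundingBox`, `LabeledPoint`, the field names, `to_nested()` and the nesting it produces are all hypothetical, and it uses stdlib dataclasses where the commit mentions a pydantic validator.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical box type; negative coordinates are allowed."""

    x_min: int
    y_min: int
    x_max: int
    y_max: int


@dataclass(frozen=True)
class LabeledPoint:
    """Hypothetical point type; storing the label with the point keeps them in sync."""

    x: int
    y: int
    label: int  # assumed convention: 1 = include, 0 = exclude


@dataclass
class SAMInput:
    """One prompt for the segmenter: a bbox and/or a list of labeled points."""

    bbox: Optional[BoundingBox] = None
    points: list[LabeledPoint] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject an input that carries neither a box nor any points.
        if self.bbox is None and not self.points:
            raise ValueError("SAMInput needs a bbox, points, or both")


def to_nested(inputs: list[SAMInput]):
    """Flatten SAMInput objects into one possible set of nested lists."""
    boxes = [
        [i.bbox.x_min, i.bbox.y_min, i.bbox.x_max, i.bbox.y_max]
        for i in inputs
        if i.bbox is not None
    ]
    points = [[[p.x, p.y] for p in i.points] for i in inputs if i.points]
    labels = [[p.label for p in i.points] for i in inputs if i.points]
    return boxes, points, labels
```

Because each point carries its own label, the "every point needs a label" rule holds by construction, and the length checks reduce to one check per input object.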

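The model-manager refactor commit dated 2025-10-15 above outlines an identification flow: derive the common fields, merge overrides, call `from_model_on_disk()` on every config class, record the results, and pick the best match. A minimal sketch of that flow follows, under loud assumptions: only the names `from_model_on_disk` and `UnknownModelConfig` come from the commit message, while `NotAMatch`, the toy config classes, the filename-based matching and the "first non-unknown match wins" rule are hypothetical stand-ins.

```python
from pathlib import Path
from typing import Any


class NotAMatch(Exception):
    """Hypothetical signal that a config class does not describe the model."""


class LoRAConfig:
    @classmethod
    def from_model_on_disk(cls, path: str, fields: dict[str, Any]) -> dict[str, Any]:
        # Toy heuristic: a real probe would inspect the weights, not the filename.
        if "lora" not in Path(path).stem.lower():
            raise NotAMatch(cls.__name__)
        return {**fields, "type": "lora"}


class VAEConfig:
    @classmethod
    def from_model_on_disk(cls, path: str, fields: dict[str, Any]) -> dict[str, Any]:
        if "vae" not in Path(path).stem.lower():
            raise NotAMatch(cls.__name__)
        return {**fields, "type": "vae"}


class UnknownModelConfig:
    @classmethod
    def from_model_on_disk(cls, path: str, fields: dict[str, Any]) -> dict[str, Any]:
        # The fallback never rejects a model.
        return {**fields, "type": "unknown"}


CANDIDATES = [LoRAConfig, VAEConfig]


def identify(path: str, overrides: "dict[str, Any] | None" = None) -> dict[str, Any]:
    # 1. Derive the common fields up front.
    fields: dict[str, Any] = {"name": Path(path).stem, "path": path}
    # 2. Merge in overrides.
    fields.update(overrides or {})
    # 3. Try every config class instead of stopping at the first match.
    matches: dict[str, dict[str, Any]] = {}
    for config_cls in CANDIDATES:
        try:
            matches[config_cls.__name__] = config_cls.from_model_on_disk(path, fields)
        except NotAMatch:
            continue
    # 4. Choose a result (assumed rule: first recorded match), else fall back.
    for config in matches.values():
        return config
    return UnknownModelConfig.from_model_on_disk(path, fields)
```

For example, `identify("models/sdxl_vae.safetensors")` yields a config whose `type` is `"vae"`, while a path that matches nothing falls back to the unknown config. Checking every class costs a little more than short-circuiting, which the commit message accepts because this is not a hot code path.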

2505 Commits