704 Commits

Author SHA1 Message Date
Alexander Eichhorn
c83f29362e fix(flux2-vae): support FLUX.2 small-decoder VAE variant (#9032)
Infer encoder and decoder block_out_channels independently from the
state dict and rebuild the decoder submodule when its channel widths
differ from the encoder, so the asymmetric full_encoder_small_decoder
checkpoint from black-forest-labs/FLUX.2-small-decoder loads correctly.

Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-04-20 21:58:16 -04:00
Alexander Eichhorn
e521817aa4 Fix Z-Image LoRA detection for Kohya and ComfyUI formats (#9007)
Add support for Kohya-format Z-Image LoRAs (lora_unet__ prefix) by
adding key detection and conversion to dot-notation module paths.
Fix ComfyUI-format Z-Image LoRAs being misidentified as main models
by ensuring LoRA-specific suffixes (including .alpha) are checked
before Z-Image key matching in _has_z_image_keys().
2026-04-21 00:28:10 +00:00
Jonathan
0d7205ff79 Handle mixed-dtype mismatches in autocast linear and conv wrappers (#9006)
* Handle CustomConv2d bias dtype mismatches

* Fix mixed-dtype autocast regressions

* Format custom_conv2d with ruff
2026-04-20 20:31:35 +00:00
CypherNaugh_0x
9deb545cc1 External models (Gemini Nano Banana & OpenAI GPT Image) (#8633) (#8884)
* feat: initial external model support

* feat: support reference images for external models

* fix: sorting lint error

* chore: hide Reidentify button for external models

* review: enable auto-install/remove fro external models

* feat: show external mode name during install

* review: model descriptions

* review: implemented review comments

* review: added optional seed control for external models

* chore: fix linter warning

* review: save api keys to a seperate file

* docs: updated external model docs

* chore: fix linter errors

* fix: sync configured external starter models on startup

* feat(ui): add provider-specific external generation nodes

* feat: expose external panel schemas in model configs

* feat(ui): drive external panels from panel schema

* docs: sync app config docstring order

* feat: add gemini 3.1 flash image preview starter model

* feat: update gemini image model limits

* fix: resolve TypeScript errors and move external provider config to api_keys.yaml

Add 'external', 'external_image_generator', and 'external_api' to Zod
enum schemas (zBaseModelType, zModelType, zModelFormat) to match the
generated OpenAPI types. Remove redundant union workarounds from
component prop types and Record definitions.

Fix type errors in ModelEdit (react-hook-form Control invariance),
parsing.tsx (model identifier narrowing), buildExternalGraph (edge
typing), and ModelSettings import/export buttons.

Move external_gemini_base_url and external_openai_base_url into
api_keys.yaml alongside the API keys so all external provider config
lives in one dedicated file, separate from invokeai.yaml.

* feat: add resolution presets and imageConfig support for Gemini 3 models

Add combined resolution preset selector for external models that maps
aspect ratio + image size to fixed dimensions. Gemini 3 Pro and 3.1 Flash
now send imageConfig (aspectRatio + imageSize) via generationConfig instead
of text-based aspect ratio hints used by Gemini 2.5 Flash.

Backend: ExternalResolutionPreset model, resolution_presets capability field,
image_size on ExternalGenerationRequest, and Gemini provider imageConfig logic.

Frontend: ExternalSettingsAccordion with combo resolution select, dimension
slider disabling for fixed-size models, and panel schema constraint wiring
for Steps/Guidance/Seed controls.

* Remove unused external model fields and add provider-specific parameters

- Remove negative_prompt, steps, guidance, reference_image_weights,
  reference_image_modes from external model nodes (unused by any provider)
- Remove supports_negative_prompt, supports_steps, supports_guidance
  from ExternalModelCapabilities
- Add provider_options dict to ExternalGenerationRequest for
  provider-specific parameters
- Add OpenAI-specific fields: quality, background, input_fidelity
- Add Gemini-specific fields: temperature, thinking_level
- Add new OpenAI starter models: GPT Image 1.5, GPT Image 1 Mini,
  DALL-E 3, DALL-E 2
- Fix OpenAI provider to use output_format (GPT Image) vs
  response_format (DALL-E) and send model ID in requests
- Add fixed aspect ratio sizes for OpenAI models (bucketing)
- Add ExternalProviderRateLimitError with retry logic for 429 responses
- Add provider-specific UI components in ExternalSettingsAccordion
- Simplify ParamSteps/ParamGuidance by removing dead external overrides
- Update all backend and frontend tests

* Chore Ruff check & format

* Chore typegen

* feat: full canvas workflow integration for external models

- Add missing aspect ratios (4:5, 5:4, 8:1, 4:1, 1:4, 1:8) to type
  system for external model support
- Sync canvas bbox when external model resolution preset is selected
- Use params preset dimensions in buildExternalGraph to prevent
  "unsupported aspect ratio" errors
- Lock all bbox controls (resize handles, aspect ratio select,
  width/height sliders, swap/optimal buttons) for external models
  with fixed dimension presets
- Disable denoise strength slider for external models (not applicable)
- Sync bbox aspect ratio changes back to paramsSlice for external models
- Initialize bbox dimensions when switching to an external model

* Chore typegen Linux seperator

* feat: full canvas workflow integration for external models
- Update buildExternalGraph test to include dimensions in mock params

* Merge remote-tracking branch 'upstream/main' into external-models

* Chore pnpm fix

* add missing parameter

* docs: add External Models guide with Gemini and OpenAI provider pages

* fix(external-models): address PR review feedback

- Gemini recall: write temperature, thinking_level, image_size to image metadata;
  wire external graph as metadata receiver; add recall handlers.
- Canvas: gate regional guidance, inpaint mask, and control layer for external models.
- Canvas: throw a clear error on outpainting for external models (was falling back to
  inpaint and hitting an API-side mask/image size mismatch).
- Workflow editor: add ui_model_provider_id filter so OpenAI and Gemini nodes only
  list their own provider's models.
- Workflow editor: silently drop seed when the selected model does not support it
  instead of raising a capability error.
- Remove the legacy external_image_generation invocation and the graph-builder
  fallback; providers must register a dedicated node.
- Regenerate schema.ts.
- remove Gemini debug dumps to outputs/external_debug

* fix(external-models): resolve TSC errors in metadata parsing and external graph

- Export imageSizeChanged from paramsSlice (required by the new ImageSize
  recall handler).
- Emit the external graph's metadata model entry via zModelIdentifierField
  since ExternalApiModelConfig is not part of the AnyModelConfig union.

* chore: prettier format ModelIdentifierFieldInputComponent

* fix: remove unsupported thinkingConfig from Gemini image models and restrict GPT Image models to txt2img

* chore typegen

* chore(docs): regenerate settings.json for external provider fields

* fix(external): fix mask handling and mode support for external providers

- Remove img2img and inpaint modes from Gemini models (Gemini has no
  bitmap mask or dedicated edit API; image editing works via reference
  images in the UI)
- Fix DALL-E 2 inpainting: convert grayscale mask to RGBA with alpha
  channel transparency (OpenAI expects transparent=edit area) and
  convert init image to RGBA when mask is present

* fix(external): update mode support and UI for external providers

- Remove DALL-E 2 from starter models (deprecated, shutdown May 12 2026)
- Enable img2img for GPT Image 1/1.5/1-mini (supports edits endpoint)
- Set Gemini models to txt2img only (no mask/edit API; editing via
  ref images)
- Hide mode/init_image/mask_image fields on Gemini node (not usable)
- Hide mask_image field on OpenAI node (no model supports inpaint)

* Chore typegen

* fix(external): improve OpenAI node UX and disable cache by default

- Hide OpenAI node's mode and init_image fields: OpenAI's API has no
  img2img/inpaint distinction (the edits endpoint is invoked
  automatically when reference images are provided). init_image is
  functionally identical to a reference image and was misleading users.
- Default use_cache to False for external image generation nodes:
  external API calls are non-deterministic and incur usage costs.
  Cache hits returned stale image references that did not produce new
  gallery entries on repeat invokes.

* fix(external): duplicate cached images on cache hit instead of skipping

External image generation nodes use the standard invocation cache, but
returning the cached output (with stale image_name references) on cache
hits resulted in no new gallery entries — the Invoke button would spin
indefinitely on repeat invokes with identical parameters.

Override invoke_internal so that on cache hit, the cached images are
loaded and re-saved as new gallery entries. The expensive API call is
still skipped (cost saving), but the user sees a new image as expected.

* Chore typegen + ruff

* CHore ruff format

* fix(external): restore OpenAI advanced settings on Remix recall

Remix recall iterates through ImageMetadataHandlers but only Gemini's
temperature handler was wired up — OpenAI's quality, background, and
input_fidelity were stored in image metadata but never parsed back into
the params slice. Add the three missing handlers so Remix restores
these settings as expected.

---------

Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Alexander Eichhorn <alex@code-with.us>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-04-20 17:13:26 +00:00
Lincoln Stein
b42274a57e Feat[model support]: Qwen Image — full pipeline with edit, generate LoRA, GGUF, quantization, and UI (#9000) 2026-04-12 14:39:13 +02:00
Jonathan
ee600973ed Broaden text encoder partial-load recovery (#9034) 2026-04-09 20:09:40 -04:00
4pointoh
f0d09c34a8 feat: add Anima model support (#8961)
* feat: add Anima model support

* schema

* image to image

* regional guidance

* loras

* last fixes

* tests

* fix attributions

* fix attributions

* refactor to use diffusers reference

* fix an additional lora type

* some adjustments to follow flux 2 paper implementation

* use t5 from model manager instead of downloading

* make lora identification more reliable

* fix: resolve lint errors in anima module

Remove unused variable, fix import ordering, inline dict() call,
and address minor lint issues across anima-related files.

* Chore Ruff format again

* fix regional guidance error

* fix(anima): validate unexpected keys after strict=False checkpoint loading

Capture the load_state_dict result and raise RuntimeError on unexpected
keys (indicating a corrupted or incompatible checkpoint), while logging
a warning for missing keys (expected for inv_freq buffers regenerated
at runtime).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): make model loader submodel fields required instead of Optional

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): add Classification.Prototype to LoRA loaders, fix exception types

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): fix replace-all in key conversion, warn on DoRA+LoKR, unify grouping functions

- Use key.replace(old, new, 1) in _convert_kohya_unet_key and _convert_kohya_te_key to avoid replacing multiple occurrences
- Upgrade DoRA+LoKR dora_scale strip from logger.debug to logger.warning since it represents data loss
- Replace _group_kohya_keys and _group_by_layer with a single _group_keys_by_layer function parameterized by extra_suffixes, with _KOHYA_KNOWN_SUFFIXES and _PEFT_EXTRA_SUFFIXES constants
- Add test_empty_state_dict_returns_empty_model to verify empty input produces a model with no layers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): add safety cap for Qwen3 sequence length to prevent OOM

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): add denoising range validation, fix closure capture, add edge case tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): add T5 to metadata, fix dead code, decouple scheduler type guard

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): update VAE field description for required field

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: regenerate frontend types after upstream merge

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: ruff format anima_denoise.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(anima): add T5 encoder metadata recall handler

The T5 encoder was added to generation metadata but had no recall
handler, so it wasn't restored when recalling from metadata.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore(frontend): add regression test for buildAnimaGraph

Add tests for CFG gating (negative conditioning omitted when cfgScale <= 1)
and basic graph structure (model loader, text encoder, denoise nodes).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* only show 0.6b for anima

* dont show 0.6b for other models

* schema

* Anima preview 3

* fix ci

---------

Co-authored-by: Your Name <you@example.com>
Co-authored-by: kappacommit <samwolfe40@gmail.com>
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-04-09 12:04:11 -04:00
Alexander Eichhorn
80be1b7282 fix: correct inaccurate download size estimates in starter models (#8968)
Verified model sizes against Hugging Face repositories and corrected
11 descriptions that had wrong or outdated download size estimates.

Key corrections:
- T5-XXL base encoder: ~8GB → ~9.5GB
- FLUX.2 VAE: ~335MB → ~168MB (was confused with FLUX.1 VAE)
- FLUX.1 Krea dev: ~33GB → ~29GB (uses quantized T5, not full)
- FLUX.2 Klein 4B/9B Diffusers: ~10GB/~20GB → ~16GB/~35GB
- SD3.5 Medium/Large: ~15GB/~19G → ~16GB/~28GB
- CogView4: ~29GB → ~31GB
- Z-Image Turbo: ~30.6GB → ~33GB
- FLUX.1 Kontext/Krea quantized: ~14GB → ~12GB
2026-04-07 03:09:29 +00:00
Alexander Eichhorn
dbbf28925b fix: detect FLUX.2 Klein 9B Base variant via filename heuristic (#9011)
Klein 9B Base (undistilled) and Klein 9B (distilled) have identical
architectures and cannot be distinguished from the state dict alone.
Use a filename heuristic ("base" in the name) to detect the Base
variant for checkpoint, GGUF, and diffusers format models.

Also fixes the incorrect guidance_embeds-based detection for diffusers
format, since both variants have guidance_embeds=False.
2026-04-07 02:31:33 +00:00
Alexander Eichhorn
f08b802968 feat: add support for OneTrainer BFL Flux LoRA format (#8984)
* feat: add support for OneTrainer BFL Flux LoRA format

Newer versions of OneTrainer export Flux LoRAs using BFL internal key
names (double_blocks, single_blocks, img_attn, etc.) with a
'transformer.' prefix and split QKV projections (qkv.0/1/2, linear1.0/1/2/3).
This format was not recognized by any existing detector.

Add detection and conversion for this format, merging split QKV and
linear1 layers into MergedLayerPatch instances for the fused BFL model.

* chore ruff
2026-04-07 02:04:48 +00:00
Alexander Eichhorn
ae42182246 fix: detect Z-Image LoRAs with transformer.layers prefix (#8986)
OneTrainer exports Z-Image LoRAs with 'transformer.layers.' key prefix
instead of 'diffusion_model.layers.'. Add this prefix (and the
PEFT-wrapped 'base_model.model.transformer.layers.' variant) to the
Z-Image LoRA probe so these models are correctly identified and loaded.
2026-04-07 01:52:06 +00:00
Jonathan
dc5007fe95 Fix/model cache Qwen/CogView4 cancel repair (#8959)
* Repair partially loaded Qwen models after cancel to avoid device mismatches

* ruff

* Repair CogView4 text encoder after canceled partial loads

* Avoid MPS CI crash in repair regression test

* Fix MPS device assertion in repair test
2026-03-15 10:04:15 -04:00
Alexander Eichhorn
274d9b3a74 fix(model_manager): detect Flux 2 Klein LoRAs in Kohya format with transformer-only keys (#8938)
LoRAs trained with musubi-tuner (and potentially other trainers) that
only target transformer blocks (double_blocks/single_blocks) without
embedding layers (txt_in/vector_in/context_embedder) were incorrectly
classified as Flux 1. Add fallback detection using attention projection
hidden_size and MLP ratio from transformer block tensors

Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-03-07 01:52:25 +00:00
DustyShoe
445c6a3c36 Fix(MM): Fixed incorrect advertised model size for Z-Image Turbo (#8934) 2026-03-01 21:31:53 -05:00
Alexander Eichhorn
54c1609687 Filter non-transformer keys from Z-Image checkpoint state dicts (#8918)
Merged Z-Image checkpoints (e.g. models with LoRAs baked in) may bundle
text encoder weights (text_encoders.*) or other non-transformer keys
alongside the transformer weights. These cause load_state_dict() to fail
with strict=True. Instead of disabling strict mode, explicitly whitelist
valid ZImageTransformer2DModel key prefixes and discard everything else.

Also moves RAM allocation after filtering so it doesn't over-allocate
for discarded keys.

Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
2026-02-28 16:22:29 +00:00
Lincoln Stein
dfc66b7142 Feature: Add FLUX.2 LOKR model support (detection and loading) (#8909)
* Add FLUX.2 LOKR model support (detection and loading) (#88)

Fix BFL LOKR models being misidentified as AIToolkit format



Fix alpha key warning in LOKR QKV split layers

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

* Fix BFL→diffusers key mapping for non-block layers in FLUX.2 LoRA/LoKR

BFL's FLUX.2 model uses different names than diffusers' Flux2Transformer2DModel
for top-level modules (embedders, modulations, output layers). The existing
conversion only handled block-level renames (double_blocks→transformer_blocks),
causing "Failed to find module" warnings for non-block LoRA keys like img_in,
txt_in, modulation.lin, time_in, and final_layer.

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
2026-02-27 00:45:13 +00:00
Alexander Eichhorn
afbd45ace7 Feature: flux2 klein lora support (#8862)
* WIP: Add FLUX.2 Klein LoRA support (BFL PEFT format)

Initial implementation for loading and applying LoRA models trained
with BFL's PEFT format for FLUX.2 Klein transformers.

Changes:
- Add LoRA_Diffusers_Flux2_Config and LoRA_LyCORIS_Flux2_Config
- Add BflPeft format to FluxLoRAFormat taxonomy
- Add flux_bfl_peft_lora_conversion_utils for weight conversion
- Add Flux2KleinLoraLoaderInvocation node

Status: Work in progress - not yet fully tested

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(flux2): add LoRA support for FLUX.2 Klein models

Add BFL PEFT LoRA support for FLUX.2 Klein, including runtime conversion
of BFL-format keys to diffusers format with fused QKV splitting, improved
detection of Klein 4B LoRAs via MLP ratio check, and frontend graph wiring.

* feat(flux2): detect Klein LoRA variant (4B/9B) and filter by compatibility

Auto-detect FLUX.2 Klein LoRA variant from tensor dimensions during model
probe, warn on variant mismatch at load time, and filter the LoRA picker
to only show variant-compatible LoRAs.

* Chore Ruff

* Chore pnpm

* Fix detection and loading of 3 unrecognized Flux.2 Klein LoRA formats

Three Flux.2 Klein LoRAs were either unrecognized or misclassified due to
format detection gaps:

1. PEFT-wrapped BFL format (base_model.model.* prefix) was not recognized
   because the detector only accepted the diffusion_model.* prefix.
2. Klein 4B LoRAs with hidden_size=3072 were misidentified as Flux.1 due to
   a break statement exiting the detection loop before txt_in/vector_in
   dimensions could be checked.
3. Flux2 native diffusers format (to_qkv_mlp_proj, ff.linear_in) was not
   detected because the detector only checked for Flux.1 diffusers keys.

Also handles mixed PEFT/standard LoRA suffix formats within the same file.

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-02-24 02:55:39 +00:00
Alexander Eichhorn
b0f7b555b7 feat(z-image): add Z-Image Base (undistilled) model variant support (#8799)
* feat(z-image): add Z-Image Base (undistilled) model variant support

- Add ZImageVariantType enum with 'turbo' and 'zbase' variants
- Auto-detect variant on import via scheduler_config.json shift value (3.0=turbo, 6.0=zbase)
- Add database migration to populate variant field for existing Z-Image models
- Re-add LCM scheduler with variant-aware filtering (LCM hidden for zbase)
- Auto-reset scheduler to Euler when switching to zbase model if LCM selected
- Update frontend to show/hide LCM option based on model variant
- Add toast notification when scheduler is auto-reset

Z-Image Base models are undistilled and require more steps (28-50) with higher
guidance (3.0-5.0), while Z-Image Turbo is distilled for ~8 steps with CFG 1.0.
LCM scheduler only works with distilled (Turbo) models.

* Chore ruff format

* Chore fix windows path

* feat(z-image): filter LoRAs by variant compatibility and warn on mismatch

LoRA picker now hides Z-Image LoRAs with incompatible variants (e.g. ZBase
LoRAs when using Turbo model). LoRAs without a variant are always shown.
Backend loaders warn at runtime if a LoRA variant doesn't match the
transformer variant.

* Chore typegen

---------

Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-02-20 00:32:38 +00:00
Lincoln Stein
76b0838094 Feature(backend): Add user toggle to run encoder models on CPU (#8777)
* feature(backend) Add user toggle to run encoder models on CPU

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Add frontend UI for CPU-only model execution toggle

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

* chore(frontend): remove package lock file created by npm

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>
2026-02-04 15:13:29 -05:00
Alexander Eichhorn
0ae7392c81 fix(model_manager): detect Flux1/2 VAE by latent space dimensions instead of filename (#8790)
* fix(model_manager): detect Flux VAE by latent space dimensions instead of filename

VAE detection previously relied solely on filename pattern matching, which failed
for Flux VAE files with generic names like "ae.safetensors". Now probes the model's
decoder.conv_in weight shape to determine the latent space dimensions:
- 16 channels -> Flux VAE
- 4 channels -> SD/SDXL VAE (with filename fallback for SD1/SD2/SDXL distinction)

* fix(model_manager): add latent space probing for Flux2 VAE detection

Extend Flux2 VAE detection to also check for 32-dimensional latent space
(decoder.conv_in with 32 input channels) in addition to BatchNorm layers.
This provides more robust detection for Flux2 VAE files regardless of filename.

* Chore Ruff format

---------

Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-01-27 05:20:50 +00:00
Alexander Eichhorn
b92c6ae633 feat(flux2): add FLUX.2 klein model support (#8768)
* WIP: feat(flux2): add FLUX 2 Kontext model support

- Add new invocation nodes for FLUX 2:
  - flux2_denoise: Denoising invocation for FLUX 2
  - flux2_klein_model_loader: Model loader for Klein architecture
  - flux2_klein_text_encoder: Text encoder for Qwen3-based encoding
  - flux2_vae_decode: VAE decoder for FLUX 2

- Add backend support:
  - New flux2 module with denoise and sampling utilities
  - Extended model manager configs for FLUX 2 models
  - Updated model loaders for Klein architecture

- Update frontend:
  - Extended graph builder for FLUX 2 support
  - Added FLUX 2 model types and configurations
  - Updated readiness checks and UI components

* fix(flux2): correct VAE decode with proper BN denormalization

FLUX.2 VAE uses Batch Normalization in the patchified latent space
(128 channels). The decode must:
1. Patchify latents from (B, 32, H, W) to (B, 128, H/2, W/2)
2. Apply BN denormalization using running_mean/running_var
3. Unpatchify back to (B, 32, H, W) for VAE decode

Also fixed image normalization from [-1, 1] to [0, 255].

This fixes washed-out colors in generated FLUX.2 Klein images.

* feat(flux2): add FLUX.2 Klein model support with ComfyUI checkpoint compatibility

- Add FLUX.2 transformer loader with BFL-to-diffusers weight conversion
- Fix AdaLayerNorm scale-shift swap for final_layer.adaLN_modulation weights
- Add VAE batch normalization handling for FLUX.2 latent normalization
- Add Qwen3 text encoder loader with ComfyUI FP8 quantization support
- Add frontend components for FLUX.2 Klein model selection
- Update configs and schema for FLUX.2 model types

* Chore Ruff

* Fix Flux1 vae probing

* Fix Windows Paths schema.ts

* Add 4B und 9B klein to Starter Models.

* feat(flux2): add non-commercial license indicator for FLUX.2 Klein 9B

- Add isFlux2Klein9BMainModelConfig and isNonCommercialMainModelConfig functions
- Update MainModelPicker and InitialStateMainModelPicker to show license icon
- Update license tooltip text to include FLUX.2 Klein 9B

* feat(flux2): add Klein/Qwen3 variant support and encoder filtering

Backend:
- Add klein_4b/klein_9b variants for FLUX.2 Klein models
- Add qwen3_4b/qwen3_8b variants for Qwen3 encoder models
- Validate encoder variant matches Klein model (4B↔4B, 9B↔8B)
- Auto-detect Qwen3 variant from hidden_size during probing

Frontend:
- Show variant field for all model types in ModelView
- Filter Qwen3 encoder dropdown to only show compatible variants
- Update variant type definitions (zFlux2VariantType, zQwen3VariantType)
- Remove unused exports (isFluxDevMainModelConfig, isFlux2Klein9BMainModelConfig)

* Chore Ruff

* feat(flux2): add Klein 9B Base (undistilled) variant support

Distinguish between FLUX.2 Klein 9B (distilled) and Klein 9B Base (undistilled)
models by checking guidance_embeds in diffusers config or guidance_in keys in
safetensors. Klein 9B Base requires more steps but offers higher quality.

* feat(flux2): improve diffusers compatibility and distilled model support

Backend changes:
- Update text encoder layers from [9,18,27] to (10,20,30) matching diffusers
- Use apply_chat_template with system message instead of manual formatting
- Change position IDs from ones to zeros to match diffusers implementation
- Add get_schedule_flux2() with empirical mu computation for proper schedule shifting
- Add txt_embed_scale parameter for Qwen3 embedding magnitude control
- Add shift_schedule toggle for base (28+ steps) vs distilled (4 steps) models
- Zero out guidance_embedder weights for Klein models without guidance_embeds

UI changes:
- Clear Klein VAE and Qwen3 encoder when switching away from flux2 base
- Clear Qwen3 encoder when switching between different Klein model variants
- Add toast notification informing user to select compatible encoder

* feat(flux2): fix distilled model scheduling with proper dynamic shifting

- Configure scheduler with FLUX.2 Klein parameters from scheduler_config.json
  (use_dynamic_shifting=True, shift=3.0, time_shift_type="exponential")
- Pass mu parameter to scheduler.set_timesteps() for resolution-aware shifting
- Remove manual shift_schedule parameter (scheduler handles this automatically)
- Simplify get_schedule_flux2() to return linear sigmas only
- Remove txt_embed_scale parameter (no longer needed)

This matches the diffusers Flux2KleinPipeline behavior where the
FlowMatchEulerDiscreteScheduler applies dynamic timestep shifting
based on image resolution via the mu parameter.

Fixes 4-step distilled Klein 9B model quality issues.

* fix(ui): fix FLUX.1 graph building with posCondCollect node lookup

The posCondCollect node was created with getPrefixedId() which generates
a random suffix (e.g., 'pos_cond_collect:abc123'), but g.getNode() was
called with the plain string 'pos_cond_collect', causing a node lookup
failure.

Fix by declaring posCondCollect as a module-scoped variable and
referencing it directly instead of using g.getNode().

* Remove Flux2 Klein Base from Starter Models

* Remove Logging

* Add Default Values for Flux2 Klein and add variant as additional info to from_base

* Add migrations for the z-image qwen3 encoder without a variant value

* Add img2img, inpainting and outpainting support for FLUX.2 Klein

- Add flux2_vae_encode invocation for encoding images to FLUX.2 latents
- Integrate inpaint_extension into FLUX.2 denoise loop for proper mask handling
- Apply BN normalization to init_latents and noise for consistency in inpainting
- Use manual Euler stepping for img2img/inpaint to preserve exact timestep schedule
- Add flux2_img2img, flux2_inpaint, flux2_outpaint generation modes
- Expand starter models with FP8 variants, standalone transformers, and separate VAE/encoders
- Fix outpainting to always use full denoising (0-1) since strength doesn't apply
- Improve error messages in model loader with clear guidance for standalone models

* Add GGUF quantized model support and Diffusers VAE loader for FLUX.2 Klein

- Add Main_GGUF_Flux2_Config for GGUF-quantized FLUX.2 transformer models
- Add VAE_Diffusers_Flux2_Config for FLUX.2 VAE in diffusers format
- Add Flux2GGUFCheckpointModel loader with BFL-to-diffusers conversion
- Add Flux2VAEDiffusersLoader for AutoencoderKLFlux2
- Add FLUX.2 Klein 4B/9B hardware requirements to documentation
- Update starter model descriptions to clarify dependencies install together
- Update frontend schema for new model configs

* Fix FLUX.2 model detection and add FP8 weight dequantization support

- Improve FLUX.2 variant detection for GGUF/checkpoint models (BFL format keys)
- Fix guidance_embeds logic: distilled=False, undistilled=True
- Add FP8 weight dequantization for ComfyUI-style quantized models
- Prevent FLUX.2 models from being misidentified as FLUX.1
- Preserve user-editable fields (name, description, etc.) on model reidentify
- Improve Qwen3Encoder detection by variant in starter models
- Add defensive checks for tensor operations

* Chore ruff format

* Chore Typegen

* Fix FLUX.2 Klein 9B model loading by detecting hidden_size from weights

Previously num_attention_heads was hardcoded to 24, which is correct for
Klein 4B but causes size mismatches when loading Klein 9B checkpoints.

Now dynamically calculates num_attention_heads from the hidden_size
dimension of context_embedder weights:
- Klein 4B: hidden_size=3072 → num_attention_heads=24
- Klein 9B: hidden_size=4096 → num_attention_heads=32

Fixes both Checkpoint and GGUF loaders for FLUX.2 models.

* Only clear Qwen3 encoder when FLUX.2 Klein variant changes

Previously the encoder was cleared whenever switching between any Klein
models, even if they had the same variant. Now compares the variant of
the old and new model and only clears the encoder when switching between
different variants (e.g., klein_4b to klein_9b).

This allows users to switch between different Klein 9B models without
having to re-select the Qwen3 encoder each time.

* Add metadata recall support for FLUX.2 Klein parameters

The scheduler, VAE model, and Qwen3 encoder model were not being
recalled correctly for FLUX.2 Klein images. This adds dedicated
metadata handlers for the Klein-specific parameters.

* Fix FLUX.2 Klein denoising scaling and Z-Image VAE compatibility

- Apply exponential denoising scaling (exponent 0.2) to FLUX.2 Klein,
  matching FLUX.1 behavior for more intuitive inpainting strength
- Add isFlux1VAEModelConfig type guard to filter FLUX 1.0 VAEs only
- Restrict Z-Image VAE selection to FLUX 1.0 VAEs, excluding FLUX.2
  Klein 32-channel VAEs which are incompatible

* chore pnpm fix

* Add FLUX.2 Klein to starter bundles and documentation

- Add FLUX.2 Klein hardware requirements to quick start guide
- Create flux2_klein_bundle with GGUF Q4 model, VAE, and Qwen3 encoder
- Add "What's New" entry announcing FLUX.2 Klein support

* Add FLUX.2 Klein built-in reference image editing support

FLUX.2 Klein has native multi-reference image editing without requiring
a separate model (unlike FLUX.1 which needs a Kontext model).

Backend changes:
- Add Flux2RefImageExtension for encoding reference images with FLUX.2 VAE
- Apply BN normalization to reference image latents for correct scaling
- Use T-coordinate offset scale=10 like diffusers (T=10, 20, 30...)
- Concatenate reference latents with generated image during denoising
- Extract only generated portion in step callback for correct preview

Frontend changes:
- Add flux2_reference_image config type without model field
- Hide model selector for FLUX.2 reference images (built-in support)
- Add type guards to handle configs without model property
- Update validators to skip model validation for FLUX.2
- Add 'flux2' to SUPPORTS_REF_IMAGES_BASE_MODELS

* Chore windows path fix

* Add reference image resizing for FLUX.2 Klein

Resize large reference images to match BFL FLUX.2 sampling.py limits:
- Single reference: max 2024² pixels (~4.1M)
- Multiple references: max 1024² pixels (~1M)

Uses same scaling approach as BFL's cap_pixels() function.
2026-01-26 23:21:37 -05:00
Alexander Eichhorn
b2b8820519 fix(model_manager): prevent Z-Image LoRAs from being misclassified as main models (#8754)
* fix(model_manager): prevent Z-Image LoRAs from being misclassified as main models

Z-Image LoRAs containing keys like `diffusion_model.context_refiner.*` were being
incorrectly classified as main checkpoint models instead of LoRAs. This happened
because the `_has_z_image_keys()` function checked for Z-Image specific keys
(like `context_refiner`) without verifying if the file was actually a LoRA.

Since main models have higher priority than LoRAs in the classification sort order,
the incorrect main model classification would win.

The fix adds detection of LoRA-specific weight suffixes (`.lora_down.weight`,
`.lora_up.weight`, `.lora_A.weight`, `.lora_B.weight`, `.dora_scale`) and returns
False if any are found, ensuring LoRAs are correctly classified.

* refactor(mm): simplify _has_z_image_keys with early return

Return True directly when a Z-Image key is found instead of using an
intermediate variable.
2026-01-14 22:35:17 -05:00
Lincoln Stein
d6ad6a2dcb fix(invocation stats): Report delta VRAM for each invocation and fix reporting of RAM cache size 2026-01-10 11:32:37 -05:00
Lincoln Stein
0021404639 chore: remove dangling debug statements (#8738) 2026-01-05 00:47:46 +00:00
Jonathan
3a21e7699f Merge branch 'main' into copilot/add-unload-model-option 2026-01-04 10:22:44 -05:00
Lincoln Stein
56fd7bc7c4 docs(z-image) add Z-Image requirements and starter bundle (#8734)
* docs(z-image) add minimum requirements for Z-Image and create Z-Image starter bundle

* fix(model manager) add flux VAE to Z-Image bundle

* docs(model manager) remove out-of-date model info link

* chore: fix frontendchecks

* chore: lint:prettier

* docs(model manager): clarify minimum hardware for z-image turbo

* (fix) add flux VAE to ZIT starter dependencies & tweak UI docs
2026-01-04 10:17:26 -05:00
Lincoln Stein
8a6d593fe8 Merge branch 'main' into copilot/add-unload-model-option 2026-01-03 22:48:36 -05:00
Alexander Eichhorn
9f8f9965f9 fix(model-loaders): add local_files_only=True to prevent network requests (#8735) 2026-01-03 22:21:42 -05:00
Lincoln Stein
8cf4c6944a (style) ruff fix 2026-01-03 14:54:15 -05:00
Lincoln Stein
db228ddc4f (style) add @record_activity and @synchronized to locked methods 2026-01-03 14:52:31 -05:00
Jonathan
f49e1b8dae Merge branch 'main' into copilot/add-unload-model-option 2026-01-01 21:31:08 -05:00
Alexander Eichhorn
66974841f1 fix(model-manager): support offline Qwen3 tokenizer loading for Z-Image (#8719)
Add local_files_only fallback for Qwen3 tokenizer loading in both
Checkpoint and GGUF loaders. This ensures Z-Image models can generate
images offline after the initial tokenizer download.

The tokenizer is now loaded with local_files_only=True first, falling
back to network download only if files aren't cached yet.

Fixes #8716

Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-01-02 00:40:08 +00:00
Lincoln Stein
56fd1da888 Merge branch 'main' into copilot/add-unload-model-option 2025-12-27 21:08:17 -05:00
Alexander Eichhorn
355c985cc3 fix(model-manager): add Z-Image LoRA/DoRA detection and loading support
Two fixes for Z-Image LoRA support:

1. Override _validate_looks_like_lora in LoRA_LyCORIS_ZImage_Config to
   recognize Z-Image specific LoRA formats that use different key patterns
   than SD/SDXL LoRAs. Z-Image LoRAs use lora_down.weight/lora_up.weight
   and dora_scale suffixes instead of lora_A.weight/lora_B.weight.

2. Fix _group_by_layer in z_image_lora_conversion_utils.py to correctly
   group LoRA keys by layer name. The previous logic used rsplit with
   maxsplit=2 which incorrectly grouped keys like:
   - "to_k.alpha" -> layer "diffusion_model.layers.17.attention"
   - "lora_down.weight" -> layer "diffusion_model.layers.17.attention.to_k"

   Now uses suffix matching to ensure all keys for a layer are grouped
   together (alpha, dora_scale, lora_down.weight, lora_up.weight).
2025-12-27 09:17:29 +01:00
Lincoln Stein
5b69403ba8 Merge branch 'main' into copilot/add-unload-model-option 2025-12-24 15:39:46 -05:00
Alexander Eichhorn
ac245cbf6c feat(backend): add support for xlabs Flux LoRA format (#8686)
Add support for loading Flux LoRA models in the xlabs format, which uses
keys like `double_blocks.X.processor.{qkv|proj}_lora{1|2}.{down|up}.weight`.

The xlabs format maps:
- lora1 -> img_attn (image attention stream)
- lora2 -> txt_attn (text attention stream)
- qkv -> query/key/value projection
- proj -> output projection

Changes:
- Add FluxLoRAFormat.XLabs enum value
- Add flux_xlabs_lora_conversion_utils.py with detection and conversion
- Update formats.py to detect xlabs format
- Update lora.py loader to handle xlabs format
- Update model probe to accept recognized Flux LoRA formats
- Add unit tests for xlabs format detection and conversion

Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2025-12-24 20:18:11 +00:00
copilot-swe-agent[bot]
4987b4da1c Fix timeout message appearing during active generation
Only log "Clearing model cache" message when there are actually unlocked
models to clear. This prevents the misleading message from appearing during
active generation when all models are locked.

Changes:
- Check for unlocked models before logging clear message
- Add count of unlocked models in log message
- Add debug log when all models are locked
- Improves user experience by avoiding confusing messages

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
2025-12-24 05:31:11 +00:00
Lincoln Stein
1e15b8c106 Merge branch 'main' into copilot/add-unload-model-option 2025-12-24 00:14:45 -05:00
Alexander Eichhorn
21138e5d52 fix support multi-subfolder downloads for Z-Image Qwen3 encoder (#8692)
* fix(model-install): support multi-subfolder downloads for Z-Image Qwen3 encoder

The Z-Image Qwen3 text encoder requires both text_encoder and tokenizer
subfolders from the HuggingFace repo, but the previous implementation
only downloaded the text_encoder subfolder, causing model identification
to fail.

Changes:
- Add subfolders property to HFModelSource supporting '+' separated paths
- Extend filter_files() and download_urls() to handle multiple subfolders
- Update _multifile_download() to preserve subfolder structure
- Make Qwen3Encoder probe check both nested and direct config.json paths
- Update Qwen3EncoderLoader to handle both directory structures
- Change starter model source to text_encoder+tokenizer

* ruff format

* fix schema description

* fix schema description

---------

Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2025-12-23 23:39:43 -05:00
copilot-swe-agent[bot]
8d76b4e4d4 Fix ruff whitespace errors and improve timeout logging
- Remove all trailing whitespace (W293 errors)
- Add debug logging when timeout fires but activity detected
- Add debug logging when timeout fires but cache is empty
- Only log "Clearing model cache" message when actually clearing
- Prevents misleading timeout messages during active generation

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
2025-12-24 04:05:57 +00:00
copilot-swe-agent[bot]
c3217d8a08 Address code review feedback
- Remove unused variable in test
- Add clarifying comment for daemon thread setting
- Add detailed comment explaining cache clearing with 1000 GB value
- Improve code documentation

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
2025-12-24 00:27:39 +00:00
copilot-swe-agent[bot]
2500153ed8 Fix race condition in timeout mechanism
- Added clarifying comment that _record_activity is called with lock held
- Enhanced double-check in _on_timeout for thread safety
- Added lock protection to shutdown method
- Improved handling of edge cases where timer fires during activity

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
2025-12-24 00:26:01 +00:00
copilot-swe-agent[bot]
9bbd2b3f11 Add model_cache_keep_alive config option and timeout mechanism
- Added model_cache_keep_alive config field (minutes, default 0 = infinite)
- Implemented timeout tracking in ModelCache class
- Added _record_activity() to track model usage
- Added _on_timeout() to auto-clear cache when timeout expires
- Added shutdown() method to clean up timers
- Integrated timeout with get(), lock(), unlock(), and put() operations
- Updated ModelManagerService to pass keep_alive parameter
- Added cleanup in stop() method

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
2025-12-24 00:22:59 +00:00
blessedcoolant
1b5d91d1cf Merge branch 'main' into feat/z-image-starter-models 2025-12-23 08:27:25 +05:30
Alexander Eichhorn
a748519e92 feat(starter-models): add Z-Image Q8 quant and ControlNet Tile
Add higher quality Q8_0 quantization option for Z-Image Turbo (~6.6GB)
to complement existing Q4_K variant, providing better quality for users
with more VRAM.

Add dedicated Z-Image ControlNet Tile model (~6.7GB) for upscaling and
detail enhancement workflows.
2025-12-23 03:27:09 +01:00
blessedcoolant
7068cf956a Merge branch 'main' into pr/8690 2025-12-23 05:59:49 +05:30
Alexander Eichhorn
f8b1f42f6d fix(z-image): Fix padding token shape mismatch for GGUF models
GGUF Z-Image models store x_pad_token and cap_pad_token with shape [dim],
but diffusers ZImageTransformer2DModel expects [1, dim]. This caused a
RuntimeError when loading GGUF-quantized Z-Image models.

The fix dequantizes GGMLTensors first (since they don't support unsqueeze),
then reshapes to add the batch dimension.
2025-12-22 18:31:57 +01:00
Alexander Eichhorn
b08accd4be feat(starter-models): add Z-Image Turbo starter models
Add Z-Image Turbo and related models to the starter models list:
- Z-Image Turbo (full precision, ~13GB)
- Z-Image Turbo quantized (GGUF Q4_K, ~4GB)
- Z-Image Qwen3 Text Encoder (full precision, ~8GB)
- Z-Image Qwen3 Text Encoder quantized (GGUF Q6_K, ~3.3GB)
- Z-Image ControlNet Union (Canny, HED, Depth, Pose, MLSD, Inpainting)

The quantized Turbo model includes the quantized Qwen3 encoder as a
dependency for automatic installation.
2025-12-22 15:04:27 +01:00
Alexander Eichhorn
3668d5b83b feat(z-image): add Extension-based Z-Image ControlNet support
Implement Z-Image ControlNet as an Extension pattern (similar to FLUX ControlNet)
instead of merging control weights into the base transformer. This provides:
- Lower memory usage (no weight duplication)
- Flexibility to enable/disable control per step
- Cleaner architecture with separate control adapter

Key implementation details:
- ZImageControlNetExtension: computes control hints per denoising step
- z_image_forward_with_control: custom forward pass with hint injection
- patchify_control_context: utility for control image patchification
- ZImageControlAdapter: standalone adapter with control_layers and noise_refiner

Architecture matches original VideoX-Fun implementation:
- Hints computed ONCE using INITIAL unified state (before main layers)
- Hints injected at every other main transformer layer (15 control blocks)
- Control signal added after each designated layer's forward pass

V2.0 ControlNet support (control_in_dim=33):
- Channels 0-15: control image latents
- Channels 16-31: reference image (zeros for pure control)
- Channel 32: inpaint mask (1.0 = don't inpaint, use control signal)
2025-12-21 22:30:28 +01:00
Alexander Eichhorn
1c13ca8159 style: apply ruff formatting 2025-12-21 18:52:12 +01:00