37 Commits

Author SHA1 Message Date
Lincoln Stein
b42274a57e Feat[model support]: Qwen Image — full pipeline with edit, generate LoRA, GGUF, quantization, and UI (#9000) 2026-04-12 14:39:13 +02:00
4pointoh
f0d09c34a8 feat: add Anima model support (#8961)
* feat: add Anima model support

* schema

* image to image

* regional guidance

* loras

* last fixes

* tests

* fix attributions

* fix attributions

* refactor to use diffusers reference

* fix an additional lora type

* some adjustments to follow flux 2 paper implementation

* use t5 from model manager instead of downloading

* make lora identification more reliable

* fix: resolve lint errors in anima module

Remove unused variable, fix import ordering, inline dict() call,
and address minor lint issues across anima-related files.

* Chore Ruff format again

* fix regional guidance error

* fix(anima): validate unexpected keys after strict=False checkpoint loading

Capture the load_state_dict result and raise RuntimeError on unexpected
keys (indicating a corrupted or incompatible checkpoint), while logging
a warning for missing keys (expected for inv_freq buffers regenerated
at runtime).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): make model loader submodel fields required instead of Optional

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): add Classification.Prototype to LoRA loaders, fix exception types

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): fix replace-all in key conversion, warn on DoRA+LoKR, unify grouping functions

- Use key.replace(old, new, 1) in _convert_kohya_unet_key and _convert_kohya_te_key to avoid replacing multiple occurrences
- Upgrade DoRA+LoKR dora_scale strip from logger.debug to logger.warning since it represents data loss
- Replace _group_kohya_keys and _group_by_layer with a single _group_keys_by_layer function parameterized by extra_suffixes, with _KOHYA_KNOWN_SUFFIXES and _PEFT_EXTRA_SUFFIXES constants
- Add test_empty_state_dict_returns_empty_model to verify empty input produces a model with no layers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): add safety cap for Qwen3 sequence length to prevent OOM

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): add denoising range validation, fix closure capture, add edge case tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): add T5 to metadata, fix dead code, decouple scheduler type guard

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(anima): update VAE field description for required field

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: regenerate frontend types after upstream merge

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: ruff format anima_denoise.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(anima): add T5 encoder metadata recall handler

The T5 encoder was added to generation metadata but had no recall
handler, so it wasn't restored when recalling from metadata.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore(frontend): add regression test for buildAnimaGraph

Add tests for CFG gating (negative conditioning omitted when cfgScale <= 1)
and basic graph structure (model loader, text encoder, denoise nodes).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* only show 0.6b for anima

* dont show 0.6b for other models

* schema

* Anima preview 3

* fix ci

---------

Co-authored-by: Your Name <you@example.com>
Co-authored-by: kappacommit <samwolfe40@gmail.com>
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-04-09 12:04:11 -04:00
Alexander Eichhorn
eb3f1c9a61 feat: Add Z-Image-Turbo model support
Add comprehensive support for Z-Image-Turbo (S3-DiT) models including:

Backend:
- New BaseModelType.ZImage in taxonomy
- Z-Image model config classes (ZImageTransformerConfig, Qwen3TextEncoderConfig)
- Model loader for Z-Image transformer and Qwen3 text encoder
- Z-Image conditioning data structures
- Step callback support for Z-Image with FLUX latent RGB factors

Invocations:
- z_image_model_loader: Load Z-Image transformer and Qwen3 encoder
- z_image_text_encoder: Encode prompts using Qwen3 with chat template
- z_image_denoise: Flow matching denoising with time-shifted sigmas
- z_image_image_to_latents: Encode images to 16-channel latents
- z_image_latents_to_image: Decode latents using FLUX VAE

Frontend:
- Z-Image graph builder for text-to-image generation
- Model picker and validation updates for z-image base type
- CFG scale now allows 0 (required for Z-Image-Turbo)
- Clip skip disabled for Z-Image (uses Qwen3, not CLIP)
- Optimal dimension settings for Z-Image (1024x1024)

Technical details:
- Uses Qwen3 text encoder (not CLIP/T5)
- 16 latent channels with FLUX-compatible VAE
- Flow matching scheduler with dynamic time shift
- 8 inference steps recommended for Turbo variant
- bfloat16 inference dtype
2025-12-01 00:22:32 +01:00
Kent Keirsey
3bfb497764 ruff fixes 2025-05-19 13:50:04 +10:00
Kent Keirsey
b02ea1a898 Expanded styles & updated UI 2025-05-19 13:50:04 +10:00
Kent Keirsey
d709040f4b Matt3o base changes 2025-05-19 13:50:04 +10:00
Ryan Dick
bac05a7885 Add CogView4TextEncoderInvocation 2025-04-10 10:50:13 +10:00
psychedelicious
5fa2cf59e2 fix(app): add trusted classes to torch safe globals to prevent errors when loading them
In `ObjectSerializerDisk`, we use `torch.load` to load serialized objects from disk. With torch 2.6.0, torch defaults to `weights_only=True`. As a result, torch will raise when attempting to deserialize anything with an unrecognized class.

For example, our `ConditioningFieldData` class is untrusted. When we load conditioning from disk, we will get a runtime error.

Torch provides a method to add trusted classes to an allowlist. This change adds an arg to `ObjectSerializerDisk` to add a list of safe globals to the allowlist and uses it for both `ObjectSerializerDisk` instances.

Note: My first attempt inferred the class from the generic type arg that `ObjectSerializerDisk` accepts, and added that to the allowlist. Unfortunately, this doesn't work.

For example, `ConditioningFieldData` has a `conditionings` attribute that may be one some other untrusted classes representing model-specific conditioning data. So, even if we allowlist `ConditioningFieldData`, loading will fail when torch deserializes the `conditionings` attribute.
2025-04-04 18:42:13 +11:00
Ryan Dick
2d86298b7f Add first draft of Sd3TextEncoderInvocation. 2024-11-04 12:42:09 -05:00
Ryan Dick
4e4b6c6dbc Tidy variable management and dtype handling in FluxTextToImageInvocation. 2024-08-29 19:08:18 +00:00
Ryan Dick
a52c899c6d Split a FluxTextEncoderInvocation out from the FluxTextToImageInvocation. This has the advantage that we benfit from automatic caching when the prompt isn't changed. 2024-08-26 20:17:50 -04:00
Sergey Borisov
3f79467f7b Ruff format 2024-07-17 04:24:45 +03:00
Sergey Borisov
2c2ec8f0bc Comments, a bit refactor 2024-07-17 04:20:31 +03:00
Sergey Borisov
03e22c257b Convert conditioning_mode to enum 2024-07-17 03:37:11 +03:00
Sergey Borisov
ae6d4fbc78 Move out _concat_conditionings_for_batch submethods 2024-07-17 03:31:26 +03:00
Sergey Borisov
d623bd429b Fix condtionings logic 2024-07-16 00:31:56 +03:00
Sergey Borisov
3a9dda9177 Renames 2024-07-12 22:44:00 +03:00
Sergey Borisov
0bc60378d3 A bit rework conditioning convert to unet kwargs 2024-07-12 20:43:32 +03:00
Sergey Borisov
9cc852cf7f Base code from draft PR 2024-07-12 20:31:26 +03:00
blessedcoolant
6ea183f0d4 wip: Initial Implementation IP Adapter Style & Comp Modes 2024-04-13 11:09:45 +05:30
Ryan Dick
0bdbfd4d1d Add support for IP-Adapter masks. 2024-04-09 15:06:51 -04:00
Ryan Dick
2e27ed5f3d Pass IP-Adapter scales through the cross_attn_kwargs pathway, since they are the same for all attention layers. This change also helps to prepare for adding IP-Adapter region masks. 2024-04-09 15:06:51 -04:00
Ryan Dick
4a828818da Remove support for Prompt-to-Prompt cross-attention control (aka .swap()). This feature is not widely used. It does not work with SDXL and is incompatible with IP-Adapter and regional prompting. The implementation is also intertwined with both text embedding and the UNet attention layers, resulting in a high maintenance burden. For all of these reasons, we have decided to drop support. 2024-04-09 10:57:02 -04:00
Ryan Dick
d1e45585d0 Add TextConditioningRegions to the TextConditioningData data structure. 2024-04-09 08:12:12 -04:00
Ryan Dick
aba023e0c5 Improve documentation of conditioning_data.py. 2024-04-09 08:12:12 -04:00
Ryan Dick
e354c29b52 Rename ConditioningData -> TextConditioningData. 2024-04-09 08:12:12 -04:00
Ryan Dick
a7f363e654 Split ip_adapter_conditioning out from ConditioningData. 2024-04-09 08:12:12 -04:00
Ryan Dick
9b2162e564 Remove scheduler_args from ConditioningData structure. 2024-04-09 08:12:12 -04:00
Ryan Dick
145bb45858 Remove dead code related to an old symmetry feature. 2024-03-10 00:13:18 -06:00
Ryan Dick
ad96857e0f Fix avoid storing extra conditioning info in two places. 2024-03-01 15:12:03 -05:00
psychedelicious
05fb485d33 feat(nodes): move ConditioningFieldData to conditioning_data.py 2024-03-01 10:42:33 +11:00
Damian Stewart
0beb08686c Add CFG Rescale option for supporting zero-terminal SNR models (#4335)
* add support for CFG rescale

* fix typo

* move rescale position and tweak docs

* move input position

* implement suggestions from github and discord

* cleanup unused code

* add back dropped FieldDescription

* fix(ui): revert unrelated UI changes

* chore(nodes): bump denoise_latents version 1.4.0 -> 1.5.0

* feat(nodes): add cfg_rescale_multiplier to metadata node

* feat(ui): add cfg rescale multiplier to linear UI

- add param to state
- update graph builders
- add UI under advanced
- add metadata handling & recall
- regen types

* chore: black

* fix(backend): make `StableDiffusionGeneratorPipeline._rescale_cfg()` staticmethod

This doesn't need access to class.

* feat(backend): add docstring for `_rescale_cfg()` method

* feat(ui): update cfg rescale mult translation string

---------

Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2023-11-30 20:55:20 +11:00
Ryan Dick
8464450a53 Add support for multi-image IP-Adapter. 2023-10-14 12:50:33 -04:00
Ryan Dick
78828b6b9c WIP - Accept a list of IPAdapterFields in DenoiseLatents. 2023-10-06 20:43:43 -04:00
Ryan Dick
50a0691514 flake8 2023-09-08 18:05:31 -04:00
Ryan Dick
b2d5b53b5f Pass IP-Adapter conditioning via cross_attention_kwargs instead of concatenating to the text embedding. This avoids interference with other features that manipulate the text embedding (e.g. long prompts). 2023-09-08 11:47:36 -04:00
Ryan Dick
ddc148b70b Move ConditioningData and its field classes to their own file. This will allow new conditioning types to be added more cleanly without introducing circular dependencies. 2023-09-08 11:00:11 -04:00