InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI.git synced 2026-04-23 03:00:31 -04:00

Files

Alexander Eichhorn eb3f1c9a61 feat: Add Z-Image-Turbo model support

Add comprehensive support for Z-Image-Turbo (S3-DiT) models including:

Backend:
- New BaseModelType.ZImage in taxonomy
- Z-Image model config classes (ZImageTransformerConfig, Qwen3TextEncoderConfig)
- Model loader for Z-Image transformer and Qwen3 text encoder
- Z-Image conditioning data structures
- Step callback support for Z-Image with FLUX latent RGB factors

Invocations:
- z_image_model_loader: Load Z-Image transformer and Qwen3 encoder
- z_image_text_encoder: Encode prompts using Qwen3 with chat template
- z_image_denoise: Flow matching denoising with time-shifted sigmas
- z_image_image_to_latents: Encode images to 16-channel latents
- z_image_latents_to_image: Decode latents using FLUX VAE

Frontend:
- Z-Image graph builder for text-to-image generation
- Model picker and validation updates for z-image base type
- CFG scale now allows 0 (required for Z-Image-Turbo)
- Clip skip disabled for Z-Image (uses Qwen3, not CLIP)
- Optimal dimension settings for Z-Image (1024x1024)

Technical details:
- Uses Qwen3 text encoder (not CLIP/T5)
- 16 latent channels with FLUX-compatible VAE
- Flow matching scheduler with dynamic time shift
- 8 inference steps recommended for Turbo variant
- bfloat16 inference dtype

2025-12-01 00:22:32 +01:00

__init__.py

Apply ruff rule to disallow all relative imports.

2024-07-04 09:35:37 -04:00

conditioning_data.py

feat: Add Z-Image-Turbo model support

2025-12-01 00:22:32 +01:00

custom_atttention.py

ruff fixes

2025-05-19 13:50:04 +10:00

regional_ip_data.py

Fix the padding behavior when max-pooling regional IP-Adapter masks to mirror the downscaling behavior of SD and SDXL. Prior to this change, denoising with input latent dimensions that were not evenly divisible by 8 would raise an exception.