mirror of
https://github.com/invoke-ai/InvokeAI.git
synced 2026-04-23 03:00:31 -04:00
main
4 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a42fdb0f44 |
fix(flux2): Fix FLUX.2 Klein image generation quality (#8838)
* fix(flux2): Fix image quality degradation at resolutions > 1024x1024
  This commit addresses severe quality degradation and artifacts when generating images larger than 1024x1024 with FLUX.2 Klein models.
  Root causes fixed:
  1. Dynamic max_image_seq_len in scheduler (flux2_denoise.py)
     - Previously hardcoded to 4096 (1024x1024 only)
     - Now dynamically calculated based on actual resolution
     - Allows proper schedule shifting at all resolutions
  2. Smoothed mu calculation discontinuity (sampling_utils.py)
     - Eliminated 40-50% mu value drop at seq_len 4300 threshold
     - Implemented smooth cosine interpolation (4096-4500 transition zone)
     - Gradual blend between low-res and high-res formulas
  Impact:
  - FLUX.2 Klein 9B: Major quality improvement at high resolutions
  - FLUX.2 Klein 4B: Improved quality at high resolutions
  - Baseline 1024x1024: Unchanged (no regression)
  - All generation modes: T2I and Kontext (reference images)
  Fixes: Community-reported quality degradation issue
  See: Discord discussions in #garbage-bin and #devchat
  Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix(flux2): Fix high-resolution quality degradation for FLUX.2 Klein
  Fixes grid/diamond artifacts and color loss at resolutions > 1024x1024.
  Root causes identified and fixed:
  - BN normalization was incorrectly applied to random noise input (diffusers only normalizes image latents from VAE.encode)
  - BN denormalization must be applied to output before VAE decode
  - mu parameter was resolution-dependent causing over-shifted schedules at high resolutions (now fixed to 2.02, matching ComfyUI)
  Changes:
  - Remove BN normalization on noise input (not needed for N(0,1) noise)
  - Preserve BN denormalization on denoised output (required for VAE)
  - Fix mu to constant 2.02 for all resolutions (matches ComfyUI)
  Tested at 2048x2048 with FLUX.2 Klein 4B
* Chore Ruff
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com> |
||
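The first change described in commit a42fdb0f44 (a resolution-dependent sequence length and a cosine blend across the 4096-4500 transition zone) can be sketched roughly as follows. This is an illustration only, not the code from flux2_denoise.py or sampling_utils.py: the 16-pixels-per-token-side factor is an assumption inferred from the 4096 ↔ 1024x1024 pairing in the message, the function names are hypothetical, and the low-res and high-res mu formulas are not given in the message, so they appear only as plain inputs. The second change in the same commit replaces this blend with a constant mu of 2.02.

```python
import math

# Assumption: one image token per 16x16 pixel patch, inferred from the
# commit message's "4096 (1024x1024 only)". Not taken from the source code.
PIXELS_PER_TOKEN_SIDE = 16


def image_seq_len(height: int, width: int) -> int:
    """Number of image tokens for a given pixel resolution (hypothetical helper)."""
    return (height // PIXELS_PER_TOKEN_SIDE) * (width // PIXELS_PER_TOKEN_SIDE)


def cosine_blend_weight(seq_len: int, start: int = 4096, end: int = 4500) -> float:
    """Weight of the high-res value: 0.0 at `start`, 1.0 at `end`, smooth in between."""
    if seq_len <= start:
        return 0.0
    if seq_len >= end:
        return 1.0
    t = (seq_len - start) / (end - start)
    return 0.5 * (1.0 - math.cos(math.pi * t))


def blend(low_res_value: float, high_res_value: float, seq_len: int) -> float:
    """Blend two candidate values without a jump at either end of the zone."""
    w = cosine_blend_weight(seq_len)
    return (1.0 - w) * low_res_value + w * high_res_value
```

Because the weight has zero slope at both ends of the zone, the blended value joins each formula without the step change the message describes at the old threshold.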
|
|
33779f3072 |
fix(flux2): support Heun scheduler for FLUX.2 Klein models (#8794)
* fix(flux2): support Heun scheduler for FLUX.2 Klein models
  FlowMatchHeunDiscreteScheduler does not support dynamic shifting parameters (use_dynamic_shifting, base_shift, max_shift, etc.) or sigmas/mu in set_timesteps. This caused FLUX.2 Klein to fail when using Heun scheduler.
  - Create Heun scheduler with only num_train_timesteps and shift parameters
  - Use num_inference_steps instead of sigmas for Heun's set_timesteps call
  - Euler and LCM schedulers continue to use full dynamic shifting support
* fix(flux2): fix Heun scheduler detection using inspect.signature
  The previous hasattr check for state_in_first_order failed because the attribute doesn't exist before set_timesteps() is called. Now using inspect.signature to check for sigmas parameter support, matching the FLUX1 implementation.
---------
Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com> |
||
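The detection approach named in commit 33779f3072 (checking the set_timesteps signature for a sigmas parameter instead of probing an attribute that only exists after set_timesteps() runs) might look something like this minimal sketch. The two scheduler classes are hypothetical stand-ins so the example runs without diffusers, and `supports_sigmas` and `configure_timesteps` are illustrative names, not InvokeAI functions.

```python
import inspect


class EulerLikeScheduler:
    """Hypothetical stand-in: set_timesteps accepts sigmas and mu."""

    def set_timesteps(self, num_inference_steps=None, sigmas=None, mu=None):
        self.call = {"num_inference_steps": num_inference_steps, "sigmas": sigmas, "mu": mu}


class HeunLikeScheduler:
    """Hypothetical stand-in: set_timesteps only accepts a step count."""

    def set_timesteps(self, num_inference_steps):
        self.call = {"num_inference_steps": num_inference_steps}


def supports_sigmas(scheduler) -> bool:
    """True if the scheduler's set_timesteps declares a `sigmas` parameter."""
    return "sigmas" in inspect.signature(scheduler.set_timesteps).parameters


def configure_timesteps(scheduler, sigmas, mu):
    """Pass sigmas/mu when supported, otherwise fall back to a step count."""
    if supports_sigmas(scheduler):
        scheduler.set_timesteps(sigmas=sigmas, mu=mu)
    else:
        scheduler.set_timesteps(num_inference_steps=len(sigmas))
    return scheduler.call
```

Inspecting the signature works on a freshly constructed scheduler, which is the property the hasattr check lacked.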
|
|
0ecb903ae2 |
fix: Klein 2 Inpainting breaking when there is a reference image (#8803)
Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com> |
||
|
|
b92c6ae633 |
feat(flux2): add FLUX.2 klein model support (#8768)
* WIP: feat(flux2): add FLUX 2 Kontext model support
  - Add new invocation nodes for FLUX 2:
    - flux2_denoise: Denoising invocation for FLUX 2
    - flux2_klein_model_loader: Model loader for Klein architecture
    - flux2_klein_text_encoder: Text encoder for Qwen3-based encoding
    - flux2_vae_decode: VAE decoder for FLUX 2
  - Add backend support:
    - New flux2 module with denoise and sampling utilities
    - Extended model manager configs for FLUX 2 models
    - Updated model loaders for Klein architecture
  - Update frontend:
    - Extended graph builder for FLUX 2 support
    - Added FLUX 2 model types and configurations
    - Updated readiness checks and UI components
* fix(flux2): correct VAE decode with proper BN denormalization
  FLUX.2 VAE uses Batch Normalization in the patchified latent space (128 channels). The decode must:
  1. Patchify latents from (B, 32, H, W) to (B, 128, H/2, W/2)
  2. Apply BN denormalization using running_mean/running_var
  3. Unpatchify back to (B, 32, H, W) for VAE decode
  Also fixed image normalization from [-1, 1] to [0, 255]. This fixes washed-out colors in generated FLUX.2 Klein images.
* feat(flux2): add FLUX.2 Klein model support with ComfyUI checkpoint compatibility
  - Add FLUX.2 transformer loader with BFL-to-diffusers weight conversion
  - Fix AdaLayerNorm scale-shift swap for final_layer.adaLN_modulation weights
  - Add VAE batch normalization handling for FLUX.2 latent normalization
  - Add Qwen3 text encoder loader with ComfyUI FP8 quantization support
  - Add frontend components for FLUX.2 Klein model selection
  - Update configs and schema for FLUX.2 model types
* Chore Ruff
* Fix Flux1 vae probing
* Fix Windows Paths schema.ts
* Add 4B and 9B klein to Starter Models.
* feat(flux2): add non-commercial license indicator for FLUX.2 Klein 9B
  - Add isFlux2Klein9BMainModelConfig and isNonCommercialMainModelConfig functions
  - Update MainModelPicker and InitialStateMainModelPicker to show license icon
  - Update license tooltip text to include FLUX.2 Klein 9B
* feat(flux2): add Klein/Qwen3 variant support and encoder filtering
  Backend:
  - Add klein_4b/klein_9b variants for FLUX.2 Klein models
  - Add qwen3_4b/qwen3_8b variants for Qwen3 encoder models
  - Validate encoder variant matches Klein model (4B↔4B, 9B↔8B)
  - Auto-detect Qwen3 variant from hidden_size during probing
  Frontend:
  - Show variant field for all model types in ModelView
  - Filter Qwen3 encoder dropdown to only show compatible variants
  - Update variant type definitions (zFlux2VariantType, zQwen3VariantType)
  - Remove unused exports (isFluxDevMainModelConfig, isFlux2Klein9BMainModelConfig)
* Chore Ruff
* feat(flux2): add Klein 9B Base (undistilled) variant support
  Distinguish between FLUX.2 Klein 9B (distilled) and Klein 9B Base (undistilled) models by checking guidance_embeds in diffusers config or guidance_in keys in safetensors. Klein 9B Base requires more steps but offers higher quality.
* feat(flux2): improve diffusers compatibility and distilled model support
  Backend changes:
  - Update text encoder layers from [9,18,27] to (10,20,30) matching diffusers
  - Use apply_chat_template with system message instead of manual formatting
  - Change position IDs from ones to zeros to match diffusers implementation
  - Add get_schedule_flux2() with empirical mu computation for proper schedule shifting
  - Add txt_embed_scale parameter for Qwen3 embedding magnitude control
  - Add shift_schedule toggle for base (28+ steps) vs distilled (4 steps) models
  - Zero out guidance_embedder weights for Klein models without guidance_embeds
  UI changes:
  - Clear Klein VAE and Qwen3 encoder when switching away from flux2 base
  - Clear Qwen3 encoder when switching between different Klein model variants
  - Add toast notification informing user to select compatible encoder
* feat(flux2): fix distilled model scheduling with proper dynamic shifting
  - Configure scheduler with FLUX.2 Klein parameters from scheduler_config.json (use_dynamic_shifting=True, shift=3.0, time_shift_type="exponential")
  - Pass mu parameter to scheduler.set_timesteps() for resolution-aware shifting
  - Remove manual shift_schedule parameter (scheduler handles this automatically)
  - Simplify get_schedule_flux2() to return linear sigmas only
  - Remove txt_embed_scale parameter (no longer needed)
  This matches the diffusers Flux2KleinPipeline behavior where the FlowMatchEulerDiscreteScheduler applies dynamic timestep shifting based on image resolution via the mu parameter. Fixes 4-step distilled Klein 9B model quality issues.
* fix(ui): fix FLUX.1 graph building with posCondCollect node lookup
  The posCondCollect node was created with getPrefixedId() which generates a random suffix (e.g., 'pos_cond_collect:abc123'), but g.getNode() was called with the plain string 'pos_cond_collect', causing a node lookup failure.
  Fix by declaring posCondCollect as a module-scoped variable and referencing it directly instead of using g.getNode().
* Remove Flux2 Klein Base from Starter Models
* Remove Logging
* Add Default Values for Flux2 Klein and add variant as additional info to from_base
* Add migrations for the z-image qwen3 encoder without a variant value
* Add img2img, inpainting and outpainting support for FLUX.2 Klein
  - Add flux2_vae_encode invocation for encoding images to FLUX.2 latents
  - Integrate inpaint_extension into FLUX.2 denoise loop for proper mask handling
  - Apply BN normalization to init_latents and noise for consistency in inpainting
  - Use manual Euler stepping for img2img/inpaint to preserve exact timestep schedule
  - Add flux2_img2img, flux2_inpaint, flux2_outpaint generation modes
  - Expand starter models with FP8 variants, standalone transformers, and separate VAE/encoders
  - Fix outpainting to always use full denoising (0-1) since strength doesn't apply
  - Improve error messages in model loader with clear guidance for standalone models
* Add GGUF quantized model support and Diffusers VAE loader for FLUX.2 Klein
  - Add Main_GGUF_Flux2_Config for GGUF-quantized FLUX.2 transformer models
  - Add VAE_Diffusers_Flux2_Config for FLUX.2 VAE in diffusers format
  - Add Flux2GGUFCheckpointModel loader with BFL-to-diffusers conversion
  - Add Flux2VAEDiffusersLoader for AutoencoderKLFlux2
  - Add FLUX.2 Klein 4B/9B hardware requirements to documentation
  - Update starter model descriptions to clarify dependencies install together
  - Update frontend schema for new model configs
* Fix FLUX.2 model detection and add FP8 weight dequantization support
  - Improve FLUX.2 variant detection for GGUF/checkpoint models (BFL format keys)
  - Fix guidance_embeds logic: distilled=False, undistilled=True
  - Add FP8 weight dequantization for ComfyUI-style quantized models
  - Prevent FLUX.2 models from being misidentified as FLUX.1
  - Preserve user-editable fields (name, description, etc.) on model reidentify
  - Improve Qwen3Encoder detection by variant in starter models
  - Add defensive checks for tensor operations
* Chore ruff format
* Chore Typegen
* Fix FLUX.2 Klein 9B model loading by detecting hidden_size from weights
  Previously num_attention_heads was hardcoded to 24, which is correct for Klein 4B but causes size mismatches when loading Klein 9B checkpoints. Now dynamically calculates num_attention_heads from the hidden_size dimension of context_embedder weights:
  - Klein 4B: hidden_size=3072 → num_attention_heads=24
  - Klein 9B: hidden_size=4096 → num_attention_heads=32
  Fixes both Checkpoint and GGUF loaders for FLUX.2 models.
* Only clear Qwen3 encoder when FLUX.2 Klein variant changes
  Previously the encoder was cleared whenever switching between any Klein models, even if they had the same variant. Now compares the variant of the old and new model and only clears the encoder when switching between different variants (e.g., klein_4b to klein_9b). This allows users to switch between different Klein 9B models without having to re-select the Qwen3 encoder each time.
* Add metadata recall support for FLUX.2 Klein parameters
  The scheduler, VAE model, and Qwen3 encoder model were not being recalled correctly for FLUX.2 Klein images. This adds dedicated metadata handlers for the Klein-specific parameters.
* Fix FLUX.2 Klein denoising scaling and Z-Image VAE compatibility
  - Apply exponential denoising scaling (exponent 0.2) to FLUX.2 Klein, matching FLUX.1 behavior for more intuitive inpainting strength
  - Add isFlux1VAEModelConfig type guard to filter FLUX 1.0 VAEs only
  - Restrict Z-Image VAE selection to FLUX 1.0 VAEs, excluding FLUX.2 Klein 32-channel VAEs which are incompatible
* chore pnpm fix
* Add FLUX.2 Klein to starter bundles and documentation
  - Add FLUX.2 Klein hardware requirements to quick start guide
  - Create flux2_klein_bundle with GGUF Q4 model, VAE, and Qwen3 encoder
  - Add "What's New" entry announcing FLUX.2 Klein support
* Add FLUX.2 Klein built-in reference image editing support
  FLUX.2 Klein has native multi-reference image editing without requiring a separate model (unlike FLUX.1 which needs a Kontext model).
  Backend changes:
  - Add Flux2RefImageExtension for encoding reference images with FLUX.2 VAE
  - Apply BN normalization to reference image latents for correct scaling
  - Use T-coordinate offset scale=10 like diffusers (T=10, 20, 30...)
  - Concatenate reference latents with generated image during denoising
  - Extract only generated portion in step callback for correct preview
  Frontend changes:
  - Add flux2_reference_image config type without model field
  - Hide model selector for FLUX.2 reference images (built-in support)
  - Add type guards to handle configs without model property
  - Update validators to skip model validation for FLUX.2
  - Add 'flux2' to SUPPORTS_REF_IMAGES_BASE_MODELS
* Chore windows path fix
* Add reference image resizing for FLUX.2 Klein
  Resize large reference images to match BFL FLUX.2 sampling.py limits:
  - Single reference: max 2024² pixels (~4.1M)
  - Multiple references: max 1024² pixels (~1M)
  Uses same scaling approach as BFL's cap_pixels() function. |
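The reference image limits in the final item of commit b92c6ae633 can be illustrated with a small aspect-preserving cap. This is a sketch under stated assumptions, not BFL's cap_pixels() or the InvokeAI implementation: the name `capped_size`, the floor rounding, and the choice to leave already-small images untouched are all hypothetical; only the two pixel limits come from the commit message.

```python
import math

# Limits quoted in the commit message.
MAX_PIXELS_SINGLE_REF = 2024 ** 2  # ~4.1M pixels
MAX_PIXELS_MULTI_REF = 1024 ** 2   # ~1M pixels


def capped_size(width: int, height: int, num_refs: int) -> tuple[int, int]:
    """Return a (width, height) that keeps the aspect ratio within the pixel limit.

    Hypothetical helper: rounding down is an assumption made so the result
    never exceeds the limit.
    """
    max_pixels = MAX_PIXELS_SINGLE_REF if num_refs == 1 else MAX_PIXELS_MULTI_REF
    pixels = width * height
    if pixels <= max_pixels:
        return width, height  # already within the limit, leave unchanged
    scale = math.sqrt(max_pixels / pixels)  # same factor on both axes
    return max(1, int(width * scale)), max(1, int(height * scale))
```

Scaling both sides by the square root of the pixel ratio reduces the area to the limit while keeping the aspect ratio approximately unchanged.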