Improve FLUX image-to-image (Trajectory Guidance) (#6900)

## Summary

This PR improves the FLUX image-to-image and inpainting behaviours.

Changes:
- Expand the inpainting region at a cutoff timestep. This improves seam
coherence around inpainted regions (see the sketch below).
- Add Trajectory Guidance to give finer control over how much an image
gets modified during image-to-image/inpainting (see the code for a more
technical explanation - it's well-documented).
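
For illustration, the seam-coherence change snaps all non-zero inpaint-mask values to 1.0 once denoising passes a cutoff timestep. A minimal sketch of that behaviour, simplified from the new `TrajectoryGuidanceExtension` (the `0.5` cutoff and epsilon match the code in the diff):

```python
import torch

def promote_mask(inpaint_mask: torch.Tensor, t_prev: float) -> torch.Tensor:
    """Promote soft mask gradients to 1.0 late in denoising for coherent seams."""
    eps = 1e-4  # tolerance for "effectively zero" mask values
    mask_gradient_t_cutoff = 0.5
    if t_prev > mask_gradient_t_cutoff:
        # Early in denoising (high t), use the soft mask as-is.
        return inpaint_mask
    # After the cutoff, promote every non-zero value to fully inpainted.
    return torch.where(inpaint_mask <= eps, inpaint_mask, torch.ones_like(inpaint_mask))
```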

## `trajectory_guidance_strength` Usage

- The `trajectory_guidance_strength` param has been added to the `FLUX
Denoise` invocation.
- `trajectory_guidance_strength` defaults to `0.0` and should be in the
range [0, 1].
- `trajectory_guidance_strength = 0.0` has no effect on the denoising
process.
- `trajectory_guidance_strength = 1.0` will guide strongly towards the
original image.
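
Under the hood, the strength maps to a guidance "change schedule" via two manually-tuned linear ramps (the `build_line` mappings in the diff reduce to the closed forms below):

```python
# Sketch of the schedule endpoints, mirroring TrajectoryGuidanceExtension.__init__.
for s in (0.0, 0.5, 1.0):  # trajectory_guidance_strength
    change_ratio_at_t_1 = 1.0 - s  # change allowed at t=1.0 (1.0 = vanilla, 0.0 = hold trajectory)
    t_cutoff = 1.0 - 0.5 * s       # timestep at/below which change_ratio is pinned to 1.0
    print(f"strength={s}: change_ratio at t=1.0 is {change_ratio_at_t_1}, full change for t <= {t_cutoff}")
```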

## FLUX image-to-image usage tips

- As always, prompt matters a lot.
- If you are trying to make minor perturbations to an image, use
vanilla image-to-image by setting the `denoising_start` param.
- If you are trying to make significant changes to an image, using
trajectory guidance will give more control than using vanilla
image-to-image. Set `denoising_start=0.0` and adjust
`trajectory_guidance_strength` to control the amount of change in the
image.
- The 'transition point' where the image changes the most as you adjust
`trajectory_guidance_strength` or `denoising_start` varies depending on
the noise. So, set a fixed noise seed, then tune those params.
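
As a concrete starting point (the values below are illustrative, not tuned recommendations), the two regimes differ on the `FLUX Denoise` node roughly like this:

```python
# Minor perturbations: vanilla image-to-image, no trajectory guidance.
minor_tweaks = {"denoising_start": 0.8, "trajectory_guidance_strength": 0.0, "seed": 42}

# Large-but-controlled changes: denoise from t=1.0 and let TGS pull toward the original.
big_changes = {"denoising_start": 0.0, "trajectory_guidance_strength": 0.6, "seed": 42}
```

Holding the seed fixed while sweeping `trajectory_guidance_strength` makes the transition point much easier to find.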


## QA Instructions

- [x] Vanilla image-to-image - No change in output
- [x] Vanilla inpainting - No change in output
- [x] Vanilla outpainting - No change in output
- Trajectory Guidance image-to-image
    - [x] TGS = 0.0 is identical to Vanilla case
    - [x] TGS = 1.0 guides close to the original image
      - Not as close as I'd like, but it's not broken.
    - [x] Smooth transition as TGS varies
    - [x] Smoke test: TGS with denoising_start > 0.0
- TG inpainting
    - [x] TGS = 0.0 is identical to Vanilla case
    - [x] TGS = 1.0 guides close to the original image
      - Not as close as I'd like, but it's not broken.
    - [x] Smooth transition as TGS varies
    - [x] Smoke test: TGS with denoising_start > 0.0
- TG outpainting
    - [x] TGS = 0.0 is identical to Vanilla case
    - [x] Smoke test TGS outpainting
- [x] Smoke test FLUX text-to-image
- [x] Preview images look OK for all of the above.

## Known issues (will be addressed in follow-up PRs)

- The current TGS scale biases towards creating more change than desired
in the image. More tuning of the TG change schedule is required.
- TGS does not work very well for outpainting right now. This _might_ be
solvable, but more likely we'll just want to discourage it in the Linear
UI.

## Merge Plan

No special instructions.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
Ryan Dick committed 2024-09-20 18:47:32 -04:00 (committed by GitHub)

14 changed files with 425 additions and 138 deletions

View File

@@ -20,7 +20,6 @@ from invokeai.app.invocations.model import TransformerField
from invokeai.app.invocations.primitives import LatentsOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.flux.denoise import denoise
from invokeai.backend.flux.inpaint_extension import InpaintExtension
from invokeai.backend.flux.model import Flux
from invokeai.backend.flux.sampling_utils import (
clip_timestep_schedule,
@@ -30,6 +29,7 @@ from invokeai.backend.flux.sampling_utils import (
pack,
unpack,
)
from invokeai.backend.flux.trajectory_guidance_extension import TrajectoryGuidanceExtension
from invokeai.backend.lora.lora_model_raw import LoRAModelRaw
from invokeai.backend.lora.lora_patcher import LoRAPatcher
from invokeai.backend.model_manager.config import ModelFormat
@@ -43,7 +43,7 @@ from invokeai.backend.util.devices import TorchDevice
title="FLUX Denoise",
tags=["image", "flux"],
category="image",
version="2.0.0",
version="2.1.0",
classification=Classification.Prototype,
)
class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
@@ -68,6 +68,12 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
description=FieldDescriptions.denoising_start,
)
denoising_end: float = InputField(default=1.0, ge=0, le=1, description=FieldDescriptions.denoising_end)
trajectory_guidance_strength: float = InputField(
default=0.0,
ge=0.0,
le=1.0,
description="Value indicating how strongly to guide the denoising process towards the initial latents (during image-to-image). Range [0, 1]. A value of 0.0 is equivalent to vanilla image-to-image. A value of 1.0 will guide the denoising process very close to the original latents.",
)
transformer: TransformerField = InputField(
description=FieldDescriptions.flux_model,
input=Input.Connection,
@@ -181,14 +187,13 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
# Now that we have 'packed' the latent tensors, verify that we calculated the image_seq_len correctly.
assert image_seq_len == x.shape[1]
# Prepare inpaint extension.
inpaint_extension: InpaintExtension | None = None
if inpaint_mask is not None:
assert init_latents is not None
inpaint_extension = InpaintExtension(
# Prepare trajectory guidance extension.
traj_guidance_extension: TrajectoryGuidanceExtension | None = None
if init_latents is not None:
traj_guidance_extension = TrajectoryGuidanceExtension(
init_latents=init_latents,
inpaint_mask=inpaint_mask,
noise=noise,
trajectory_guidance_strength=self.trajectory_guidance_strength,
)
with (
@@ -236,7 +241,7 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
timesteps=timesteps,
step_callback=self._build_step_callback(context),
guidance=self.guidance,
inpaint_extension=inpaint_extension,
traj_guidance_extension=traj_guidance_extension,
)
x = unpack(x.float(), self.height, self.width)

View File

@@ -2,7 +2,7 @@
"name": "FLUX Image to Image",
"author": "InvokeAI",
"description": "A simple image-to-image workflow using a FLUX dev model. ",
"version": "1.0.4",
"version": "1.1.0",
"contact": "",
"tags": "image2image, flux, image-to-image",
"notes": "Prerequisite model downloads: T5 Encoder, CLIP-L Encoder, and FLUX VAE. Quantized and un-quantized versions can be found in the starter models tab within your Model Manager. We recommend using FLUX dev models for image-to-image workflows. The image-to-image performance with FLUX schnell models is poor.",
@@ -23,17 +23,13 @@
"nodeId": "f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90",
"fieldName": "vae_model"
},
{
"nodeId": "ace0258f-67d7-4eee-a218-6fff27065214",
"fieldName": "denoising_start"
},
{
"nodeId": "01f674f8-b3d1-4df1-acac-6cb8e0bfb63c",
"fieldName": "prompt"
},
{
"nodeId": "ace0258f-67d7-4eee-a218-6fff27065214",
"fieldName": "num_steps"
"nodeId": "2981a67c-480f-4237-9384-26b68dbf912b",
"fieldName": "image"
}
],
"meta": {
@@ -42,48 +38,18 @@
},
"nodes": [
{
"id": "2981a67c-480f-4237-9384-26b68dbf912b",
"id": "eebd7252-0bd8-401a-bb26-2b8bc64892fa",
"type": "invocation",
"data": {
"id": "2981a67c-480f-4237-9384-26b68dbf912b",
"type": "flux_vae_encode",
"version": "1.0.0",
"label": "",
"notes": "",
"isOpen": true,
"isIntermediate": true,
"useCache": true,
"inputs": {
"image": {
"name": "image",
"label": "",
"value": {
"image_name": "8a5c62aa-9335-45d2-9c71-89af9fc1f8d4.png"
}
},
"vae": {
"name": "vae",
"label": ""
}
}
},
"position": {
"x": 732.7680166609682,
"y": -24.37398171806909
}
},
{
"id": "ace0258f-67d7-4eee-a218-6fff27065214",
"type": "invocation",
"data": {
"id": "ace0258f-67d7-4eee-a218-6fff27065214",
"id": "eebd7252-0bd8-401a-bb26-2b8bc64892fa",
"type": "flux_denoise",
"version": "1.0.0",
"version": "2.1.0",
"label": "",
"notes": "",
"isOpen": true,
"isIntermediate": true,
"useCache": true,
"nodePack": "invokeai",
"inputs": {
"board": {
"name": "board",
@@ -111,6 +77,11 @@
"label": "",
"value": 1
},
"trajectory_guidance_strength": {
"name": "trajectory_guidance_strength",
"label": "",
"value": 0.0
},
"transformer": {
"name": "transformer",
"label": ""
@@ -131,7 +102,7 @@
},
"num_steps": {
"name": "num_steps",
"label": "Steps (Recommend 30 for Dev, 4 for Schnell)",
"label": "",
"value": 30
},
"guidance": {
@@ -147,8 +118,36 @@
}
},
"position": {
"x": 1182.8836633018684,
"y": -251.38882958913183
"x": 1159.584057771928,
"y": -175.90561201366845
}
},
{
"id": "2981a67c-480f-4237-9384-26b68dbf912b",
"type": "invocation",
"data": {
"id": "2981a67c-480f-4237-9384-26b68dbf912b",
"type": "flux_vae_encode",
"version": "1.0.0",
"label": "",
"notes": "",
"isOpen": true,
"isIntermediate": true,
"useCache": true,
"inputs": {
"image": {
"name": "image",
"label": ""
},
"vae": {
"name": "vae",
"label": ""
}
}
},
"position": {
"x": 732.7680166609682,
"y": -24.37398171806909
}
},
{
@@ -202,18 +201,32 @@
"inputs": {
"model": {
"name": "model",
"label": "Model (dev variant recommended for Image-to-Image)"
"label": "Model (dev variant recommended for Image-to-Image)",
"value": {
"key": "b4990a6c-0899-48e9-969b-d6f3801acc6a",
"hash": "random:aad8f7bc19ce76541dfb394b62a30f77722542b66e48064a9f25453263b45fba",
"name": "FLUX Dev (Quantized)_2",
"base": "flux",
"type": "main"
}
},
"t5_encoder_model": {
"name": "t5_encoder_model",
"label": ""
"label": "",
"value": {
"key": "d18d5575-96b6-4da3-b3d8-eb58308d6705",
"hash": "random:f2f9ed74acdfb4bf6fec200e780f6c25f8dd8764a35e65d425d606912fdf573a",
"name": "t5_bnb_int8_quantized_encoder",
"base": "any",
"type": "t5_encoder"
}
},
"clip_embed_model": {
"name": "clip_embed_model",
"label": "",
"value": {
"key": "fa23a584-b623-415d-832a-21b5098ff1a1",
"hash": "blake3:17c19f0ef941c3b7609a9c94a659ca5364de0be364a91d4179f0e39ba17c3b70",
"key": "5a19d7e5-8d98-43cd-8a81-87515e4b3b4e",
"hash": "random:4bd08514c08fb6ff04088db9aeb45def3c488e8b5fd09a35f2cc4f2dc346f99f",
"name": "clip-vit-large-patch14",
"base": "any",
"type": "clip_embed"
@@ -223,8 +236,8 @@
"name": "vae_model",
"label": "",
"value": {
"key": "74fc82ba-c0a8-479d-a890-2126f82da758",
"hash": "blake3:ce21cb76364aa6e2421311cf4a4b5eb052a76c4f1cd207b50703d8978198a068",
"key": "9172beab-5c1d-43f0-b2f0-6e0b956710d9",
"hash": "random:c54dde288e5fa2e6137f1c92e9d611f598049e6f16e360207b6d96c9f5a67ba0",
"name": "FLUX.1-schnell_ae",
"base": "flux",
"type": "vae"
@@ -308,28 +321,60 @@
],
"edges": [
{
"id": "reactflow__edge-2981a67c-480f-4237-9384-26b68dbf912bheight-ace0258f-67d7-4eee-a218-6fff27065214height",
"id": "reactflow__edge-eebd7252-0bd8-401a-bb26-2b8bc64892falatents-7e5172eb-48c1-44db-a770-8fd83e1435d1latents",
"type": "default",
"source": "2981a67c-480f-4237-9384-26b68dbf912b",
"target": "ace0258f-67d7-4eee-a218-6fff27065214",
"sourceHandle": "height",
"targetHandle": "height"
"source": "eebd7252-0bd8-401a-bb26-2b8bc64892fa",
"target": "7e5172eb-48c1-44db-a770-8fd83e1435d1",
"sourceHandle": "latents",
"targetHandle": "latents"
},
{
"id": "reactflow__edge-2981a67c-480f-4237-9384-26b68dbf912bwidth-ace0258f-67d7-4eee-a218-6fff27065214width",
"id": "reactflow__edge-f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90transformer-eebd7252-0bd8-401a-bb26-2b8bc64892fatransformer",
"type": "default",
"source": "f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90",
"target": "eebd7252-0bd8-401a-bb26-2b8bc64892fa",
"sourceHandle": "transformer",
"targetHandle": "transformer"
},
{
"id": "reactflow__edge-01f674f8-b3d1-4df1-acac-6cb8e0bfb63cconditioning-eebd7252-0bd8-401a-bb26-2b8bc64892fapositive_text_conditioning",
"type": "default",
"source": "01f674f8-b3d1-4df1-acac-6cb8e0bfb63c",
"target": "eebd7252-0bd8-401a-bb26-2b8bc64892fa",
"sourceHandle": "conditioning",
"targetHandle": "positive_text_conditioning"
},
{
"id": "reactflow__edge-2981a67c-480f-4237-9384-26b68dbf912blatents-eebd7252-0bd8-401a-bb26-2b8bc64892falatents",
"type": "default",
"source": "2981a67c-480f-4237-9384-26b68dbf912b",
"target": "ace0258f-67d7-4eee-a218-6fff27065214",
"target": "eebd7252-0bd8-401a-bb26-2b8bc64892fa",
"sourceHandle": "latents",
"targetHandle": "latents"
},
{
"id": "reactflow__edge-2981a67c-480f-4237-9384-26b68dbf912bwidth-eebd7252-0bd8-401a-bb26-2b8bc64892fawidth",
"type": "default",
"source": "2981a67c-480f-4237-9384-26b68dbf912b",
"target": "eebd7252-0bd8-401a-bb26-2b8bc64892fa",
"sourceHandle": "width",
"targetHandle": "width"
},
{
"id": "reactflow__edge-2981a67c-480f-4237-9384-26b68dbf912blatents-ace0258f-67d7-4eee-a218-6fff27065214latents",
"id": "reactflow__edge-2981a67c-480f-4237-9384-26b68dbf912bheight-eebd7252-0bd8-401a-bb26-2b8bc64892faheight",
"type": "default",
"source": "2981a67c-480f-4237-9384-26b68dbf912b",
"target": "ace0258f-67d7-4eee-a218-6fff27065214",
"sourceHandle": "latents",
"targetHandle": "latents"
"target": "eebd7252-0bd8-401a-bb26-2b8bc64892fa",
"sourceHandle": "height",
"targetHandle": "height"
},
{
"id": "reactflow__edge-4754c534-a5f3-4ad0-9382-7887985e668cvalue-eebd7252-0bd8-401a-bb26-2b8bc64892faseed",
"type": "default",
"source": "4754c534-a5f3-4ad0-9382-7887985e668c",
"target": "eebd7252-0bd8-401a-bb26-2b8bc64892fa",
"sourceHandle": "value",
"targetHandle": "seed"
},
{
"id": "reactflow__edge-f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90vae-2981a67c-480f-4237-9384-26b68dbf912bvae",
@@ -339,38 +384,6 @@
"sourceHandle": "vae",
"targetHandle": "vae"
},
{
"id": "reactflow__edge-ace0258f-67d7-4eee-a218-6fff27065214latents-7e5172eb-48c1-44db-a770-8fd83e1435d1latents",
"type": "default",
"source": "ace0258f-67d7-4eee-a218-6fff27065214",
"target": "7e5172eb-48c1-44db-a770-8fd83e1435d1",
"sourceHandle": "latents",
"targetHandle": "latents"
},
{
"id": "reactflow__edge-4754c534-a5f3-4ad0-9382-7887985e668cvalue-ace0258f-67d7-4eee-a218-6fff27065214seed",
"type": "default",
"source": "4754c534-a5f3-4ad0-9382-7887985e668c",
"target": "ace0258f-67d7-4eee-a218-6fff27065214",
"sourceHandle": "value",
"targetHandle": "seed"
},
{
"id": "reactflow__edge-f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90transformer-ace0258f-67d7-4eee-a218-6fff27065214transformer",
"type": "default",
"source": "f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90",
"target": "ace0258f-67d7-4eee-a218-6fff27065214",
"sourceHandle": "transformer",
"targetHandle": "transformer"
},
{
"id": "reactflow__edge-01f674f8-b3d1-4df1-acac-6cb8e0bfb63cconditioning-ace0258f-67d7-4eee-a218-6fff27065214positive_text_conditioning",
"type": "default",
"source": "01f674f8-b3d1-4df1-acac-6cb8e0bfb63c",
"target": "ace0258f-67d7-4eee-a218-6fff27065214",
"sourceHandle": "conditioning",
"targetHandle": "positive_text_conditioning"
},
{
"id": "reactflow__edge-f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90vae-7e5172eb-48c1-44db-a770-8fd83e1435d1vae",
"type": "default",

View File

@@ -2,7 +2,7 @@
"name": "FLUX Text to Image",
"author": "InvokeAI",
"description": "A simple text-to-image workflow using FLUX dev or schnell models.",
"version": "1.0.4",
"version": "1.1.0",
"contact": "",
"tags": "text2image, flux",
"notes": "Prerequisite model downloads: T5 Encoder, CLIP-L Encoder, and FLUX VAE. Quantized and un-quantized versions can be found in the starter models tab within your Model Manager. We recommend 4 steps for FLUX schnell models and 30 steps for FLUX dev models.",
@@ -26,10 +26,6 @@
{
"nodeId": "01f674f8-b3d1-4df1-acac-6cb8e0bfb63c",
"fieldName": "prompt"
},
{
"nodeId": "4fe24f07-f906-4f55-ab2c-9beee56ef5bd",
"fieldName": "num_steps"
}
],
"meta": {
@@ -38,17 +34,18 @@
},
"nodes": [
{
"id": "4fe24f07-f906-4f55-ab2c-9beee56ef5bd",
"id": "4ecda92d-ee0e-45ca-aa35-6e9410ac39b9",
"type": "invocation",
"data": {
"id": "4fe24f07-f906-4f55-ab2c-9beee56ef5bd",
"id": "4ecda92d-ee0e-45ca-aa35-6e9410ac39b9",
"type": "flux_denoise",
"version": "1.0.0",
"version": "2.1.0",
"label": "",
"notes": "",
"isOpen": true,
"isIntermediate": true,
"useCache": true,
"nodePack": "invokeai",
"inputs": {
"board": {
"name": "board",
@@ -76,6 +73,11 @@
"label": "",
"value": 1
},
"trajectory_guidance_strength": {
"name": "trajectory_guidance_strength",
"label": "",
"value": 0
},
"transformer": {
"name": "transformer",
"label": ""
@@ -96,8 +98,8 @@
},
"num_steps": {
"name": "num_steps",
"label": "Steps (Recommend 30 for Dev, 4 for Schnell)",
"value": 30
"label": "",
"value": 4
},
"guidance": {
"name": "guidance",
@@ -112,8 +114,8 @@
}
},
"position": {
"x": 1186.1868226120378,
"y": -214.9459927686657
"x": 1161.0101524413685,
"y": -223.33548695623742
}
},
{
@@ -167,19 +169,47 @@
"inputs": {
"model": {
"name": "model",
"label": ""
"label": "",
"value": {
"key": "b4990a6c-0899-48e9-969b-d6f3801acc6a",
"hash": "random:aad8f7bc19ce76541dfb394b62a30f77722542b66e48064a9f25453263b45fba",
"name": "FLUX Dev (Quantized)_2",
"base": "flux",
"type": "main"
}
},
"t5_encoder_model": {
"name": "t5_encoder_model",
"label": ""
"label": "",
"value": {
"key": "d18d5575-96b6-4da3-b3d8-eb58308d6705",
"hash": "random:f2f9ed74acdfb4bf6fec200e780f6c25f8dd8764a35e65d425d606912fdf573a",
"name": "t5_bnb_int8_quantized_encoder",
"base": "any",
"type": "t5_encoder"
}
},
"clip_embed_model": {
"name": "clip_embed_model",
"label": ""
"label": "",
"value": {
"key": "5a19d7e5-8d98-43cd-8a81-87515e4b3b4e",
"hash": "random:4bd08514c08fb6ff04088db9aeb45def3c488e8b5fd09a35f2cc4f2dc346f99f",
"name": "clip-vit-large-patch14",
"base": "any",
"type": "clip_embed"
}
},
"vae_model": {
"name": "vae_model",
"label": ""
"label": "",
"value": {
"key": "9172beab-5c1d-43f0-b2f0-6e0b956710d9",
"hash": "random:c54dde288e5fa2e6137f1c92e9d611f598049e6f16e360207b6d96c9f5a67ba0",
"name": "FLUX.1-schnell_ae",
"base": "flux",
"type": "vae"
}
}
}
},
@@ -259,33 +289,33 @@
],
"edges": [
{
"id": "reactflow__edge-f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90transformer-4fe24f07-f906-4f55-ab2c-9beee56ef5bdtransformer",
"id": "reactflow__edge-f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90transformer-4ecda92d-ee0e-45ca-aa35-6e9410ac39b9transformer",
"type": "default",
"source": "f8d9d7c8-9ed7-4bd7-9e42-ab0e89bfac90",
"target": "4fe24f07-f906-4f55-ab2c-9beee56ef5bd",
"target": "4ecda92d-ee0e-45ca-aa35-6e9410ac39b9",
"sourceHandle": "transformer",
"targetHandle": "transformer"
},
{
"id": "reactflow__edge-01f674f8-b3d1-4df1-acac-6cb8e0bfb63cconditioning-4fe24f07-f906-4f55-ab2c-9beee56ef5bdpositive_text_conditioning",
"id": "reactflow__edge-01f674f8-b3d1-4df1-acac-6cb8e0bfb63cconditioning-4ecda92d-ee0e-45ca-aa35-6e9410ac39b9positive_text_conditioning",
"type": "default",
"source": "01f674f8-b3d1-4df1-acac-6cb8e0bfb63c",
"target": "4fe24f07-f906-4f55-ab2c-9beee56ef5bd",
"target": "4ecda92d-ee0e-45ca-aa35-6e9410ac39b9",
"sourceHandle": "conditioning",
"targetHandle": "positive_text_conditioning"
},
{
"id": "reactflow__edge-4754c534-a5f3-4ad0-9382-7887985e668cvalue-4fe24f07-f906-4f55-ab2c-9beee56ef5bdseed",
"id": "reactflow__edge-4754c534-a5f3-4ad0-9382-7887985e668cvalue-4ecda92d-ee0e-45ca-aa35-6e9410ac39b9seed",
"type": "default",
"source": "4754c534-a5f3-4ad0-9382-7887985e668c",
"target": "4fe24f07-f906-4f55-ab2c-9beee56ef5bd",
"target": "4ecda92d-ee0e-45ca-aa35-6e9410ac39b9",
"sourceHandle": "value",
"targetHandle": "seed"
},
{
"id": "reactflow__edge-4fe24f07-f906-4f55-ab2c-9beee56ef5bdlatents-7e5172eb-48c1-44db-a770-8fd83e1435d1latents",
"id": "reactflow__edge-4ecda92d-ee0e-45ca-aa35-6e9410ac39b9latents-7e5172eb-48c1-44db-a770-8fd83e1435d1latents",
"type": "default",
"source": "4fe24f07-f906-4f55-ab2c-9beee56ef5bd",
"source": "4ecda92d-ee0e-45ca-aa35-6e9410ac39b9",
"target": "7e5172eb-48c1-44db-a770-8fd83e1435d1",
"sourceHandle": "latents",
"targetHandle": "latents"

View File

@@ -3,8 +3,8 @@ from typing import Callable
import torch
from tqdm import tqdm
from invokeai.backend.flux.inpaint_extension import InpaintExtension
from invokeai.backend.flux.model import Flux
from invokeai.backend.flux.trajectory_guidance_extension import TrajectoryGuidanceExtension
from invokeai.backend.stable_diffusion.diffusers_pipeline import PipelineIntermediateState
@@ -20,7 +20,7 @@ def denoise(
timesteps: list[float],
step_callback: Callable[[PipelineIntermediateState], None],
guidance: float,
inpaint_extension: InpaintExtension | None,
traj_guidance_extension: TrajectoryGuidanceExtension | None, # noqa: F821
):
step = 0
# guidance_vec is ignored for schnell.
@@ -36,13 +36,15 @@ def denoise(
timesteps=t_vec,
guidance=guidance_vec,
)
if traj_guidance_extension is not None:
pred = traj_guidance_extension.update_noise(
t_curr_latents=img, pred_noise=pred, t_curr=t_curr, t_prev=t_prev
)
preview_img = img - t_curr * pred
img = img + (t_prev - t_curr) * pred
if inpaint_extension is not None:
img = inpaint_extension.merge_intermediate_latents_with_init_latents(img, t_prev)
preview_img = inpaint_extension.merge_intermediate_latents_with_init_latents(preview_img, 0.0)
step_callback(
PipelineIntermediateState(
step=step,

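For reference, this loop follows the flow-matching convention where `pred` is the velocity pointing from image toward noise: one Euler step is `img + (t_prev - t_curr) * pred`, and extrapolating straight to `t=0` gives the preview estimate `img - t_curr * pred`. A minimal sketch of those two updates:

```python
import torch

x_t = torch.randn(8)   # current latents at time t_curr
pred = torch.randn(8)  # model-predicted velocity (image -> noise direction)
t_curr, t_prev = 0.7, 0.6

x_prev = x_t + (t_prev - t_curr) * pred  # one Euler step toward t=0
x0_estimate = x_t - t_curr * pred        # jump all the way to t=0 for previews
```
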
View File

@@ -0,0 +1,134 @@
import torch
from invokeai.backend.util.build_line import build_line
class TrajectoryGuidanceExtension:
"""An implementation of trajectory guidance for FLUX.
What is trajectory guidance?
----------------------------
With SD 1 and SDXL, the amount of change in image-to-image denoising is largely controlled by the denoising_start
parameter. Doing the same thing with the FLUX model does not work as well, because the FLUX model converges very
quickly (roughly time 1.0 to 0.9) to the structure of the final image. The result of this model characteristic is
that you typically get one of two outcomes:
1) a result that is very similar to the original image
2) a result that is very different from the original image, as though it was generated from the text prompt with
pure noise.
To address this issue with image-to-image workflows with FLUX, we employ the concept of trajectory guidance. The
idea is that in addition to controlling the denoising_start parameter (i.e. the amount of noise added to the
original image), we can also guide the denoising process to stay close to the trajectory that would reproduce the
original. By controlling the strength of the trajectory guidance throughout the denoising process, we can achieve
FLUX image-to-image behavior with the same level of control offered by SD1 and SDXL.
What is the trajectory_guidance_strength?
-----------------------------------------
In the limit, we could apply a different trajectory guidance 'strength' for every latent value in every timestep.
This would be impractical for a user, so instead we have engineered a strength schedule that is more convenient to
use. The `trajectory_guidance_strength` parameter is a single scalar value that maps to a schedule. The engineered
schedule is defined as:
1) An initial change_ratio at t=1.0.
2) A linear ramp up to change_ratio=1.0 at t = t_cutoff.
3) A constant change_ratio=1.0 after t = t_cutoff.
"""
def __init__(
self, init_latents: torch.Tensor, inpaint_mask: torch.Tensor | None, trajectory_guidance_strength: float
):
"""Initialize TrajectoryGuidanceExtension.
Args:
init_latents (torch.Tensor): The initial latents (i.e. un-noised at timestep 0). In 'packed' format.
inpaint_mask (torch.Tensor | None): A mask specifying which elements to inpaint. Range [0, 1]. Values of 1
will be re-generated. Values of 0 will remain unchanged. Values between 0 and 1 can be used to blend the
inpainted region with the background. In 'packed' format. If None, will be treated as a mask of all 1s.
trajectory_guidance_strength (float): A value in [0, 1] specifying the strength of the trajectory guidance.
A value of 0.0 is equivalent to vanilla image-to-image. A value of 1.0 will guide the denoising process
very close to the original latents.
"""
assert 0.0 <= trajectory_guidance_strength <= 1.0
self._init_latents = init_latents
if inpaint_mask is None:
# The inpaint mask is None, so we initialize a mask with a single value of 1.0.
# This value will be broadcasted and treated as a mask of all 1s.
self._inpaint_mask = torch.ones(1, device=init_latents.device, dtype=init_latents.dtype)
else:
self._inpaint_mask = inpaint_mask
# Calculate the params that define the trajectory guidance schedule.
# These mappings from trajectory_guidance_strength have no theoretical basis - they were tuned manually.
self._trajectory_guidance_strength = trajectory_guidance_strength
self._change_ratio_at_t_1 = build_line(x1=0.0, y1=1.0, x2=1.0, y2=0.0)(self._trajectory_guidance_strength)
self._change_ratio_at_cutoff = 1.0
self._t_cutoff = build_line(x1=0.0, y1=1.0, x2=1.0, y2=0.5)(self._trajectory_guidance_strength)
def _apply_mask_gradient_adjustment(self, t_prev: float) -> torch.Tensor:
"""Applies inpaint mask gradient adjustment and returns the inpaint mask to be used at the current timestep."""
# As we progress through the denoising process, we promote gradient regions of the mask to have a full weight of
# 1.0. This helps to produce more coherent seams around the inpainted region. We experimented with a (small)
# number of promotion strategies (e.g. gradual promotion based on timestep), but found that a simple cutoff
# threshold worked well.
# We use a small epsilon to avoid any potential issues with floating point precision.
eps = 1e-4
mask_gradient_t_cutoff = 0.5
if t_prev > mask_gradient_t_cutoff:
# Early in the denoising process, use the inpaint mask as-is.
return self._inpaint_mask
else:
# After the cut-off, promote all non-zero mask values to 1.0.
mask = self._inpaint_mask.where(self._inpaint_mask <= (0.0 + eps), 1.0)
return mask
def _get_change_ratio(self, t_prev: float) -> float:
"""Get the change_ratio for t_prev based on the change schedule."""
change_ratio = 1.0
if t_prev > self._t_cutoff:
# If we are before the cutoff, linearly interpolate between the change_ratio at t=1.0 and the change_ratio
# at the cutoff.
change_ratio = build_line(
x1=1.0, y1=self._change_ratio_at_t_1, x2=self._t_cutoff, y2=self._change_ratio_at_cutoff
)(t_prev)
# The change_ratio should be in the range [0, 1]. Assert that we didn't make any mistakes.
eps = 1e-5
assert 0.0 - eps <= change_ratio <= 1.0 + eps
return change_ratio
def update_noise(
self, t_curr_latents: torch.Tensor, pred_noise: torch.Tensor, t_curr: float, t_prev: float
) -> torch.Tensor:
# Handle gradient cutoff.
mask = self._apply_mask_gradient_adjustment(t_prev)
mask = mask * self._get_change_ratio(t_prev)
# NOTE(ryand): During inpainting, it is common to guide the denoising process by noising the initial latents for
# the current timestep and then blending the predicted intermediate latents with the noised initial latents.
# For example:
# ```
# noised_init_latents = self._noise * t_prev + (1.0 - t_prev) * self._init_latents
# return t_prev_latents * self._inpaint_mask + noised_init_latents * (1.0 - self._inpaint_mask)
# ```
# Instead of guiding based on the noised initial latents, we have decided to guide based on the noise prediction
# that points towards the initial latents. The difference between these guidance strategies is minor, but
# qualitatively we found the latter to produce slightly better results. When change_ratio is 0.0 or 1.0 there is
# no difference between the two strategies.
#
# We experimented with a number of related guidance strategies, but not exhaustively. It's entirely possible
# that there's a much better way to do this.
# Calculate noise guidance
# What noise should the model have predicted at this timestep to step towards self._init_latents?
# Derivation:
# > t_prev_latents = t_curr_latents + (t_prev - t_curr) * pred_noise
# > t_0_latents = t_curr_latents + (0 - t_curr) * init_traj_noise
# > t_0_latents = t_curr_latents - t_curr * init_traj_noise
# > init_traj_noise = (t_curr_latents - t_0_latents) / t_curr
init_traj_noise = (t_curr_latents - self._init_latents) / t_curr
# Blend the init_traj_noise with the pred_noise according to the inpaint mask and the trajectory guidance.
noise = pred_noise * mask + init_traj_noise * (1.0 - mask)
return noise
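
To sanity-check the derivation in `update_noise`: `init_traj_noise` is exactly the velocity that, stepped from `t_curr` straight to `t=0`, reproduces the initial latents - which is what makes it "the noise the model should have predicted". A minimal sketch, reusing the mappings from `__init__` above:

```python
import torch

def build_line(x1: float, y1: float, x2: float, y2: float):
    return lambda x: (y2 - y1) / (x2 - x1) * (x - x1) + y1

# Schedule endpoints for a mid-range strength (matches __init__ above).
s = 0.5
change_ratio_at_t_1 = build_line(0.0, 1.0, 1.0, 0.0)(s)  # -> 0.5
t_cutoff = build_line(0.0, 1.0, 1.0, 0.5)(s)             # -> 0.75

# init_traj_noise reaches init_latents exactly at t=0.
x_t = torch.randn(8)           # current latents at t_curr
init_latents = torch.randn(8)  # image-to-image source latents
t_curr = 0.7
init_traj_noise = (x_t - init_latents) / t_curr
assert torch.allclose(x_t - t_curr * init_traj_noise, init_latents, atol=1e-6)

# Blend: mask * change_ratio of the model's prediction, remainder from the
# original-image trajectory. mask * change_ratio == 1.0 reduces to vanilla denoising.
pred_noise = torch.randn(8)
mask = 1.0
noise = pred_noise * (mask * change_ratio_at_t_1) + init_traj_noise * (1.0 - mask * change_ratio_at_t_1)
```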

View File

@@ -0,0 +1,6 @@
from typing import Callable
def build_line(x1: float, y1: float, x2: float, y2: float) -> Callable[[float], float]:
"""Build a linear function given two points on the line (x1, y1) and (x2, y2)."""
return lambda x: (y2 - y1) / (x2 - x1) * (x - x1) + y1

View File

@@ -1042,6 +1042,7 @@
"strength": "Strength",
"symmetry": "Symmetry",
"tileSize": "Tile Size",
"optimizedInpainting": "Optimized Inpainting",
"type": "Type",
"postProcessing": "Post-Processing (Shift + U)",
"processImage": "Process Image",
@@ -1547,6 +1548,12 @@
"paragraphs": [
"FLUX.1 [dev] models are licensed under the FLUX [dev] non-commercial license. To use this model type for commercial purposes in Invoke, visit our website to learn more."
]
},
"optimizedDenoising": {
"heading": "Optimized Inpainting",
"paragraphs": [
"Enable optimized denoising for enhanced inpainting transformations with Flux models. This setting improves detail and clarity during generation, but may be turned off to preserve more of your original image."
]
}
},
"unifiedCanvas": {

View File

@@ -60,6 +60,7 @@ export type Feature =
| 'scale'
| 'creativity'
| 'structure'
| 'optimizedDenoising'
| 'fluxDevLicense';
export type PopoverData = PopoverProps & {

View File

@@ -40,6 +40,7 @@ export type ParamsState = {
cfgRescaleMultiplier: ParameterCFGRescaleMultiplier;
guidance: ParameterGuidance;
img2imgStrength: ParameterStrength;
optimizedDenoisingEnabled: boolean;
iterations: number;
scheduler: ParameterScheduler;
seed: ParameterSeed;
@@ -83,6 +84,7 @@ const initialState: ParamsState = {
cfgRescaleMultiplier: 0,
guidance: 4,
img2imgStrength: 0.75,
optimizedDenoisingEnabled: true,
iterations: 1,
scheduler: 'euler',
seed: 0,
@@ -141,6 +143,9 @@ export const paramsSlice = createSlice({
setImg2imgStrength: (state, action: PayloadAction<number>) => {
state.img2imgStrength = action.payload;
},
setOptimizedDenoisingEnabled: (state, action: PayloadAction<boolean>) => {
state.optimizedDenoisingEnabled = action.payload;
},
setSeamlessXAxis: (state, action: PayloadAction<boolean>) => {
state.seamlessXAxis = action.payload;
},
@@ -273,6 +278,7 @@ export const {
setScheduler,
setSeed,
setImg2imgStrength,
setOptimizedDenoisingEnabled,
setSeamlessXAxis,
setSeamlessYAxis,
setShouldRandomizeSeed,
@@ -341,6 +347,7 @@ export const selectInfillPatchmatchDownscaleSize = createParamsSelector(
);
export const selectInfillColorValue = createParamsSelector((params) => params.infillColorValue);
export const selectImg2imgStrength = createParamsSelector((params) => params.img2imgStrength);
export const selectOptimizedDenoisingEnabled = createParamsSelector((params) => params.optimizedDenoisingEnabled);
export const selectPositivePrompt = createParamsSelector((params) => params.positivePrompt);
export const selectNegativePrompt = createParamsSelector((params) => params.negativePrompt);
export const selectPositivePrompt2 = createParamsSelector((params) => params.positivePrompt2);

View File

@@ -37,7 +37,17 @@ export const buildFLUXGraph = async (
const { originalSize, scaledSize } = getSizes(bbox);
const { model, guidance, seed, steps, fluxVAE, t5EncoderModel, clipEmbedModel, img2imgStrength } = params;
const {
model,
guidance,
seed,
steps,
fluxVAE,
t5EncoderModel,
clipEmbedModel,
img2imgStrength,
optimizedDenoisingEnabled,
} = params;
assert(model, 'No model found in state');
assert(t5EncoderModel, 'No T5 Encoder model found in state');
@@ -68,7 +78,8 @@ export const buildFLUXGraph = async (
guidance,
num_steps: steps,
seed,
denoising_start: 0, // denoising_start should be 0 when latents are not provided
trajectory_guidance_strength: 0,
denoising_start: 0,
denoising_end: 1,
width: scaledSize.width,
height: scaledSize.height,
@@ -113,6 +124,8 @@ export const buildFLUXGraph = async (
clip_embed_model: clipEmbedModel,
});
const denoisingValue = 1 - img2imgStrength;
if (generationMode === 'txt2img') {
canvasOutput = addTextToImage(g, l2i, originalSize, scaledSize);
} else if (generationMode === 'img2img') {
@@ -125,7 +138,7 @@ export const buildFLUXGraph = async (
originalSize,
scaledSize,
bbox,
1 - img2imgStrength,
denoisingValue,
false
);
} else if (generationMode === 'inpaint') {
@@ -139,9 +152,15 @@ export const buildFLUXGraph = async (
modelLoader,
originalSize,
scaledSize,
1 - img2imgStrength,
denoisingValue,
false
);
if (optimizedDenoisingEnabled) {
g.updateNode(noise, {
denoising_start: 0,
trajectory_guidance_strength: denoisingValue,
});
}
} else if (generationMode === 'outpaint') {
canvasOutput = await addOutpaint(
state,
@@ -153,7 +172,7 @@ export const buildFLUXGraph = async (
modelLoader,
originalSize,
scaledSize,
1 - img2imgStrength,
denoisingValue,
false
);
}
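
For clarity, when optimized denoising is enabled for FLUX inpainting, the existing strength slider is remapped onto trajectory guidance instead of `denoising_start`. A sketch of that mapping (the helper name is hypothetical; the field values match the `updateNode` call above):

```python
def flux_inpaint_denoise_params(img2img_strength: float, optimized: bool) -> dict:
    denoising_value = 1.0 - img2img_strength
    if optimized:
        # Denoise from pure noise; the strength slider drives trajectory guidance.
        return {"denoising_start": 0.0, "trajectory_guidance_strength": denoising_value}
    # Vanilla behaviour: strength sets where denoising starts.
    return {"denoising_start": denoising_value, "trajectory_guidance_strength": 0.0}
```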

View File

@@ -0,0 +1,35 @@
import { FormControl, FormLabel, Switch } from '@invoke-ai/ui-library';
import { useAppDispatch, useAppSelector } from 'app/store/storeHooks';
import { InformationalPopover } from 'common/components/InformationalPopover/InformationalPopover';
import {
selectOptimizedDenoisingEnabled,
setOptimizedDenoisingEnabled,
} from 'features/controlLayers/store/paramsSlice';
import type { ChangeEvent } from 'react';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
export const ParamOptimizedDenoisingToggle = memo(() => {
const optimizedDenoisingEnabled = useAppSelector(selectOptimizedDenoisingEnabled);
const dispatch = useAppDispatch();
const onChange = useCallback(
(event: ChangeEvent<HTMLInputElement>) => {
dispatch(setOptimizedDenoisingEnabled(event.target.checked));
},
[dispatch]
);
const { t } = useTranslation();
return (
<FormControl w="min-content">
<InformationalPopover feature="optimizedDenoising">
<FormLabel m={0}>{t('parameters.optimizedInpainting')}</FormLabel>
</InformationalPopover>
<Switch isChecked={optimizedDenoisingEnabled} onChange={onChange} />
</FormControl>
);
});
ParamOptimizedDenoisingToggle.displayName = 'ParamOptimizedDenoisingToggle';

View File

@@ -3,8 +3,9 @@ import { Expander, Flex, FormControlGroup, StandaloneAccordion } from '@invoke-a
import { EMPTY_ARRAY } from 'app/store/constants';
import { createMemoizedSelector } from 'app/store/createMemoizedSelector';
import { useAppSelector } from 'app/store/storeHooks';
import { selectParamsSlice } from 'features/controlLayers/store/paramsSlice';
import { selectIsFLUX, selectParamsSlice } from 'features/controlLayers/store/paramsSlice';
import { selectCanvasSlice, selectScaleMethod } from 'features/controlLayers/store/selectors';
import { ParamOptimizedDenoisingToggle } from 'features/parameters/components/Advanced/ParamOptimizedDenoisingToggle';
import BboxScaledHeight from 'features/parameters/components/Bbox/BboxScaledHeight';
import BboxScaledWidth from 'features/parameters/components/Bbox/BboxScaledWidth';
import BboxScaleMethod from 'features/parameters/components/Bbox/BboxScaleMethod';
@@ -59,6 +60,7 @@ export const ImageSettingsAccordion = memo(() => {
id: 'image-settings-advanced',
defaultIsOpen: false,
});
const isFLUX = useAppSelector(selectIsFLUX);
return (
<StandaloneAccordion
@@ -77,6 +79,7 @@ export const ImageSettingsAccordion = memo(() => {
<ParamDenoisingStrength />
<Expander label={t('accordions.advanced.options')} isOpen={isOpenExpander} onToggle={onToggleExpander}>
<Flex gap={4} pb={4} flexDir="column">
{isFLUX && <ParamOptimizedDenoisingToggle />}
<BboxScaleMethod />
{scaleMethod !== 'none' && (
<FormControlGroup formLabelProps={scalingLabelProps}>

View File

@@ -6340,6 +6340,12 @@ export type components = {
* @default 1
*/
denoising_end?: number;
/**
* Trajectory Guidance Strength
* @description Value indicating how strongly to guide the denoising process towards the initial latents (during image-to-image). Range [0, 1]. A value of 0.0 is equivalent to vanilla image-to-image. A value of 1.0 will guide the denoising process very close to the original latents.
* @default 0
*/
trajectory_guidance_strength?: number;
/**
* Transformer
* @description Flux model (Transformer) to load

View File

@@ -0,0 +1,19 @@
import math
import pytest
from invokeai.backend.util.build_line import build_line
@pytest.mark.parametrize(
["x1", "y1", "x2", "y2", "x3", "y3"],
[
(0, 0, 1, 1, 2, 2), # y = x
(0, 1, 1, 2, 2, 3), # y = x + 1
(0, 0, 1, 2, 2, 4), # y = 2x
(0, 1, 1, 0, 2, -1), # y = -x + 1
(0, 5, 1, 5, 2, 5), # y = 0
],
)
def test_build_line(x1: float, y1: float, x2: float, y2: float, x3: float, y3: float):
assert math.isclose(build_line(x1, y1, x2, y2)(x3), y3, rel_tol=1e-9)