Compare commits

..

96 Commits

Author SHA1 Message Date
psychedelicious
825f163492 chore: bump version to v5.4.0a1 2024-10-30 11:06:01 +11:00
psychedelicious
bc42205593 fix(ui): remember to disable isFiltering when finishing filtering 2024-10-30 09:19:30 +11:00
psychedelicious
2e3cba6416 fix(ui): flash of original layer when applying filter/segment
Let the parent module adopt the filtered/segmented image instead of destroying it and making the parent re-create it, which results in a brief flash of the parent layer's original objects before the new image is rendered.
2024-10-30 09:19:30 +11:00
psychedelicious
7852aacd11 fix(ui): track whether graph succeeded in runGraphAndReturnImageOutput
This prevents extraneous graph cancel requests when cleaning up the abort signal after a successful run of a graph.
2024-10-30 09:19:30 +11:00
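A minimal Python sketch of the pattern described in the commit above (the real code is TypeScript; `run_graph` and `cancel_graph` are hypothetical stand-ins, not InvokeAI APIs): record whether the run finished so the cleanup path only sends a cancel request for runs that did not.

```python
# Illustrative only; names are assumptions, not the actual UI code.
def run_graph_and_return_image_output(run_graph, cancel_graph, graph):
    succeeded = False
    try:
        result = run_graph(graph)  # may raise, or be aborted externally
        succeeded = True
        return result
    finally:
        # Only cancel if the graph did not complete; otherwise cleanup would
        # fire an extraneous cancel request after a successful run.
        if not succeeded:
            cancel_graph(graph)
```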
psychedelicious
6cccd67ecd feat(ui): update SAM module with minor improvements from filter module 2024-10-30 09:19:30 +11:00
psychedelicious
a7a89c9de1 feat(ui): use more resilient logic in canvas filter module, same as in SAM module 2024-10-30 09:19:30 +11:00
psychedelicious
5ca8eed89e tidy(ui): remove all buffer renderer interactions in SAM module
We don't use the buffer renderer in this module; there's no reason to clear it.
2024-10-30 09:19:30 +11:00
psychedelicious
c885c3c9a6 fix(ui): filter layer data pushed to parent rendered when saving as 2024-10-30 09:19:30 +11:00
Mary Hipp
d81c38c350 update announcements 2024-10-29 09:53:13 -04:00
Riku
92d5b73215 fix(ui): seamless zod parameter cleanup 2024-10-29 20:43:44 +11:00
Riku
097e92db6a fix(ui): always write seamless metadata
Ensure images without seamless enabled correctly reset the setting
when all parameters are recalled
2024-10-29 20:43:44 +11:00
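A hedged illustration of the idea in the commit above (the key names here are assumptions, not the exact metadata fields): write the seamless flags unconditionally, so a full parameter recall turns seamless back off instead of leaving a stale value.

```python
def build_seamless_metadata(params: dict) -> dict:
    # Hypothetical keys for illustration; always include the flags,
    # even when disabled, so recalling all parameters resets them.
    return {
        "seamless_x": bool(params.get("seamless_x", False)),
        "seamless_y": bool(params.get("seamless_y", False)),
    }
```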
Riku
84c6209a45 feat(ui): display seamless values in metadata viewer 2024-10-29 20:43:44 +11:00
Riku
107e48808a fix(ui): recall seamless settings 2024-10-29 20:43:44 +11:00
dunkeroni
47168b5505 chore: make ruff 2024-10-29 14:07:20 +11:00
dunkeroni
58152ec981 fix preview progress bar pre-denoise 2024-10-29 14:07:20 +11:00
dunkeroni
c74afbf332 convert to bgr on sdxl t2i 2024-10-29 14:07:20 +11:00
psychedelicious
7cdda00a54 feat(ui): rearrange canvas paste back nodes to save an image step
We were scaling the unscaled image and mask down before doing the paste-back, but this adds an extraneous step & image output.

We can do the paste-back first, then scale to output size after. So instead of 2 resizes before the paste-back, we have 1 resize after.

The end result is the same.
2024-10-29 11:13:31 +11:00
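A rough PIL-level sketch of the reordering described in the commit above (the actual change is in the UI graph builder, so treat this as conceptual): composite the generated region back onto the full-size original first, then do a single resize to the output size.

```python
from PIL import Image

def paste_back_then_scale(original: Image.Image, generated: Image.Image,
                          mask: Image.Image, output_size: tuple[int, int]) -> Image.Image:
    """Paste the generated region onto the original using the mask, then
    resize once. Equivalent to resizing the image and mask separately before
    pasting, but with one resize instead of two and no extra intermediate image."""
    pasted = original.copy()
    pasted.paste(generated, (0, 0), mask)  # mask should be mode "L" or "1"
    return pasted.resize(output_size, Image.Resampling.LANCZOS)
```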
psychedelicious
a74282bce6 feat(ui): graph builders use objects for arg instead of many args 2024-10-29 11:13:31 +11:00
psychedelicious
107f048c7a feat(ui): extract canvas output node prefix to constant 2024-10-29 11:13:31 +11:00
Ryan Dick
a2486a5f06 Remove unused prediction_type and upcast_attention from from_single_file(...) calls. 2024-10-28 13:05:17 -04:00
Ryan Dick
07ab116efb Remove load_safety_checker=False from calls to from_single_file(...).
This param has been deprecated, and by including it (even when set to
False) the safety checker automatically gets downloaded.
2024-10-28 13:05:17 -04:00
Ryan Dick
1a13af3c7a Fix huggingface_hub.errors imports after version bump. 2024-10-28 13:05:17 -04:00
Ryan Dick
f2966a2594 Fix changed import for FromOriginalControlNetMixin after diffusers bump. 2024-10-28 13:05:17 -04:00
Ryan Dick
58bb97e3c6 Bump diffusers, accelerate, and huggingface-hub. 2024-10-28 13:05:17 -04:00
psychedelicious
a84aa5c049 fix(ui): canvas alerts blocking metadata panel 2024-10-27 09:46:01 +11:00
psychedelicious
aebcec28e0 chore: bump version to v5.3.0 2024-10-25 22:37:59 -04:00
psychedelicious
db1c5a94f7 feat(ui): image ctx -> New from Image -> Canvas as Raster/Control Layer 2024-10-25 22:27:00 -04:00
psychedelicious
56222a8493 feat(ui): organize layer context menu items 2024-10-25 22:27:00 -04:00
psychedelicious
b7510ce709 feat(ui): filter, select object and transform UI buttons
- Restore dedicated `Apply` buttons
- Remove icons from the buttons; they add too much noise when the words are short and clear
- Update loading state to show a spinner next to the `Process` button instead of on _every_ button
2024-10-25 22:27:00 -04:00
psychedelicious
5739799e2e fix(ui): close viewer when transforming 2024-10-25 22:27:00 -04:00
psychedelicious
813cf87920 feat(ui): move canvas alerts to top-left corner 2024-10-25 22:27:00 -04:00
psychedelicious
c95b151daf feat(ui): add layer title heading for canvas ctx menu 2024-10-25 22:27:00 -04:00
psychedelicious
a0f823a3cf feat(ui): reset shouldShowStagedImage flag when starting staging 2024-10-25 22:27:00 -04:00
Hippalectryon
64e0f6d688 Improve dev install docs
Fix numbering
2024-10-25 08:27:26 -04:00
psychedelicious
ddd5b1087c fix(nodes): return copies of objects in invocation ctx
Closes #6820
2024-10-25 08:26:09 -04:00
psychedelicious
008be9b846 feat(ui): add all save as options to filter 2024-10-25 08:12:14 -04:00
psychedelicious
8e7cabdc04 feat(ui): add Replace Current option to Select Object -> Save As 2024-10-25 08:12:14 -04:00
psychedelicious
a4c4237f99 feat(ui): use PiPlayFill for process buttons for filter & select object 2024-10-25 08:12:14 -04:00
psychedelicious
bda3740dcd feat(ui): use fill style icons for Filter 2024-10-25 08:12:14 -04:00
psychedelicious
5b4633baa9 feat(ui): use PiShapesFill icon for Select Object 2024-10-25 08:12:14 -04:00
psychedelicious
96351181cb feat(ui): make canvas layer toolbar icons a bit larger 2024-10-25 08:12:14 -04:00
psychedelicious
957d591d99 feat(ui): "Auto-Mask" -> "Select Object" 2024-10-25 08:12:14 -04:00
psychedelicious
75f605ba1a feat(ui): support inverted selection in auto-mask 2024-10-25 08:12:14 -04:00
psychedelicious
ab898a7180 chore(ui): typegen 2024-10-25 08:12:14 -04:00
psychedelicious
c9a4516ab1 feat(nodes): add invert to apply_tensor_mask_to_image 2024-10-25 08:12:14 -04:00
psychedelicious
fe97c0d5eb tweak(ui): default settings verbiage 2024-10-25 16:09:59 +11:00
psychedelicious
6056764840 feat(ui): disable default settings button when synced
A blue button is begging to be clicked, but clicking it will do nothing. Instead, we should communicate that no action is needed by disabling the button when the default settings are already in use.
2024-10-25 16:09:59 +11:00
psychedelicious
8747c0dbb0 fix(ui): handle no model selection in default settings tooltip 2024-10-25 16:09:59 +11:00
psychedelicious
c5cdd5f9c6 fix(ui): use const EMPTY_OBJECT to prevent rerenders 2024-10-25 16:09:59 +11:00
psychedelicious
abc5d53159 fix(ui): use explicit null check when comparing default settings
Using `&&` will result in false negatives for settings where a falsy value might be valid. For example, any setting for which 0 is a valid number. To be on the safe side, just use an explicit null check on all values.
2024-10-25 16:09:59 +11:00
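The commit above is TypeScript, but the same pitfall is easy to show in Python terms: a truthiness-based guard treats 0 (a perfectly valid setting value) as "missing", while an explicit None check does not.

```python
def is_out_of_sync_truthy(default_value, current_value) -> bool:
    # Buggy: a falsy-but-valid default (e.g. 0) short-circuits the check,
    # so the setting is reported as "in sync" even when it is not.
    return bool(default_value and default_value != current_value)

def is_out_of_sync_explicit(default_value, current_value) -> bool:
    # Correct: only skip the comparison when the default is truly unset.
    return default_value is not None and default_value != current_value

assert is_out_of_sync_truthy(0, 7) is False      # false negative
assert is_out_of_sync_explicit(0, 7) is True     # flagged correctly
```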
psychedelicious
2f76019a89 tweak(ui): defaults sync tooltip styling 2024-10-25 16:09:59 +11:00
Mary Hipp
3f45beb1ed feat(ui): add out of sync details to model default settings button 2024-10-25 16:09:59 +11:00
Mary Hipp
bc1126a85b (ui): add setting for showing model descriptions in dropdown defaulted to true 2024-10-25 14:52:33 +11:00
psychedelicious
380017041e fix(app): mutating an image also changes the in-memory cached image
We use an in-memory cache for PIL images to reduce I/O. If a node mutates the image in any way, the cached image object is also updated (but the on-disk image file is not).

We've lucked out that this hasn't caused major issues in the past (well, maybe it has but we didn't understand them?) mainly because of a happy accident. When you call `context.images.get_pil` in a node, if you provide an image mode (e.g. `mode="RGB"`), we call `convert` on the image. This returns a copy. The node can do whatever it wants to that copy and nothing breaks.

However, when mode is not specified, we return the image directly. This is where we get into trouble: nodes that load the image this way and then mutate it also update the cache. Other nodes that reference that same image will now get the mutated version of it.

The fix is super simple - we make sure to return only copies from `get_pil`.
2024-10-25 10:22:22 +11:00
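A minimal sketch of the hazard and the fix described in the commit above, using a toy in-memory cache rather than InvokeAI's actual image service:

```python
from PIL import Image

_cache: dict[str, Image.Image] = {}

def get_pil_unsafe(name: str) -> Image.Image:
    # Returns the cached object itself: any mutation by the caller
    # (e.g. paste, putpixel) is visible to every later caller.
    return _cache[name]

def get_pil(name: str, mode: str | None = None) -> Image.Image:
    image = _cache[name]
    if mode and mode != image.mode:
        return image.convert(mode)  # convert() already returns a copy
    return image.copy()             # otherwise, copy explicitly

_cache["img"] = Image.new("RGB", (8, 8), "white")
get_pil_unsafe("img").putpixel((0, 0), (255, 0, 0))  # mutates the cached image
get_pil("img").putpixel((0, 0), (0, 255, 0))         # mutates only a copy
```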
psychedelicious
ab7cdbb7e0 fix(ui): do not delete point on right-mouse click 2024-10-25 10:22:22 +11:00
psychedelicious
e5b78d0221 fix(ui): canvas drop area grid layout 2024-10-25 10:22:22 +11:00
psychedelicious
1acaa6c486 chore: bump version to v5.3.0rc2 2024-10-25 07:50:58 +11:00
psychedelicious
b0381076b7 revert(ui): drop targets for inpaint mask and rg 2024-10-25 07:42:46 +11:00
psychedelicious
ffff2d6dbb feat(ui): add New from Image submenu for image ctx menu 2024-10-25 07:42:46 +11:00
psychedelicious
afa9f07649 fix(ui): missing cursor when transforming 2024-10-25 07:42:46 +11:00
psychedelicious
addb5c49ea feat(ui): support dnd images onto inpaint mask/rg entities 2024-10-25 07:42:46 +11:00
psychedelicious
a112d2d55b feat(ui): add logging to useCopyLayerToClipboard 2024-10-25 07:42:46 +11:00
psychedelicious
619a271c8a feat(ui): disable copy to clipboard when layer is empty 2024-10-25 07:42:46 +11:00
psychedelicious
909f2ee36d feat(ui): add help tooltip to automask 2024-10-25 07:42:46 +11:00
psychedelicious
b4cf3d9d03 fix(ui): canvas context menu w/ eraser tool erases 2024-10-25 07:42:46 +11:00
psychedelicious
e6ab6e0293 chore(ui): lint 2024-10-24 08:39:29 -04:00
psychedelicious
66d9c7c631 fix(ui): icon for automask save as 2024-10-24 08:39:29 -04:00
psychedelicious
fec45f3eb6 feat(ui): animate automask preview overlay 2024-10-24 08:39:29 -04:00
psychedelicious
7211d1a6fc feat(ui): add context menu options for layer type convert/copy 2024-10-24 08:39:29 -04:00
psychedelicious
f3069754a9 feat(ui): add logic to convert/copy between all layer types 2024-10-24 08:39:29 -04:00
psychedelicious
4f43152aeb fix(ui): handle pen/touch events on submenu 2024-10-24 08:39:29 -04:00
psychedelicious
7125055d02 fix(ui): icon menu item group spacing 2024-10-24 08:39:29 -04:00
psychedelicious
c91a9ce390 feat(ui): add pull bbox to global ref image ctx menu 2024-10-24 08:39:29 -04:00
psychedelicious
3e7b73da2c feat(ui): add entity context menu as canvas context menu sub-menu 2024-10-24 08:39:29 -04:00
psychedelicious
61ac50c00d feat(ui): use sub-menu for image metadata recall 2024-10-24 08:39:29 -04:00
psychedelicious
c1201f0bce feat(ui): add useSubMenu hook to abstract logic for sub-menus 2024-10-24 08:39:29 -04:00
psychedelicious
acdffac5ad feat(ui): close viewer when filtering/transforming/automasking 2024-10-24 08:39:29 -04:00
psychedelicious
e420300fa4 feat(ui): replace automask apply w/ save as menu 2024-10-24 08:39:29 -04:00
psychedelicious
260a5a4f9a feat(ui): add automask button to toolbar 2024-10-24 08:39:29 -04:00
psychedelicious
ed0c2006fe feat(ui): rename "foreground"/"background" -> "include"/"exclude" 2024-10-24 08:39:29 -04:00
psychedelicious
9ffd888c86 feat(ui): remove neutral points 2024-10-24 08:39:29 -04:00
psychedelicious
175a9dc28d feat(ui): more resilient auto-masking processing
- Use a hash of the last processed points instead of a `hasProcessed` flag to determine whether or not we should re-process a given set of points.
- Store point coords in state instead of pulling them out of the konva node positions. This makes moving a point a more explicit action in code.
- Add a `roundCoord` util to round the x and y values of a coordinate.
- Ensure we always re-process when $points changes.
2024-10-24 08:39:29 -04:00
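A hedged sketch of the first two bullets above (the real code is TypeScript in the canvas SAM module): hash the rounded point coordinates and only re-run processing when that hash changes.

```python
import hashlib

def round_coord(coord: tuple[float, float]) -> tuple[int, int]:
    """Round the x and y values of a coordinate (akin to the roundCoord util)."""
    return (round(coord[0]), round(coord[1]))

def points_hash(points: list[tuple[float, float]]) -> str:
    """Stable fingerprint of the rounded points."""
    rounded = [round_coord(p) for p in points]
    return hashlib.sha1(repr(rounded).encode("utf-8")).hexdigest()

last_processed_hash = None

def maybe_process(points, process) -> None:
    """Re-run the (expensive) processing only when the points actually changed."""
    global last_processed_hash
    current = points_hash(points)
    if current != last_processed_hash:
        process(points)
        last_processed_hash = current
```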
psychedelicious
5764e4f7f2 chore(ui): lint 2024-10-24 23:34:06 +11:00
psychedelicious
4275a494b9 tweak(ui): bundle info icon 2024-10-24 23:34:06 +11:00
psychedelicious
a3deb8d30d tweak(ui): bundle tooltip styling 2024-10-24 23:34:06 +11:00
Mary Hipp
aafdb0a37b update popover copy 2024-10-24 23:34:06 +11:00
Mary Hipp
56a815719a update schema 2024-10-24 23:34:06 +11:00
Mary Hipp
4db26bfa3a (ui): add information popovers for other layer types 2024-10-24 23:34:06 +11:00
Mary Hipp
8d84ccb12b bump UI dep for combobox descriptions 2024-10-24 23:34:06 +11:00
Mary Hipp
3321d14997 undo show descriptions for now 2024-10-24 23:34:06 +11:00
maryhipp
43cc4684e1 (api) make sure all controlnet starter models will still have pre-processors correctly assigned when probed based on name 2024-10-24 23:34:06 +11:00
Mary Hipp
afa5a4b17c (ui): add informational popover for controlnet layers 2024-10-24 23:34:06 +11:00
Mary Hipp
33c433fe59 (ui): show models in starter bundles on hover, use previous_names for isInstalled logic, allow grouped model combobox to optionally show descriptions 2024-10-24 23:34:06 +11:00
maryhipp
9cd47fa857 (api): update names of starter models, add ability to track previous_names so it does not mess up logic that prevents dupe starter model installs 2024-10-24 23:34:06 +11:00
psychedelicious
32d9abe802 tweak(ui): prevent show/hide boards button cutoff
The use of hard 25% widths caused issues for some translations. Adjusted styling to not rely on any hard numbers. Tested with a project name and URL.
2024-10-24 08:21:16 -04:00
psychedelicious
3947d4a165 fix(ui): normalize infill alpha to 0-255 when building infill nodes
The browser/UI uses a float in the 0-1 range for alpha, while the backend uses 0-255. We need to normalize the value when building the infill nodes for outpainting.
2024-10-24 19:22:36 +11:00
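The conversion is just a scale and clamp; a small sketch (not the actual graph-builder code):

```python
def normalize_alpha(ui_alpha: float) -> int:
    """Map the UI's 0.0-1.0 alpha to the backend's 0-255 integer range."""
    return max(0, min(255, round(ui_alpha * 255)))

assert normalize_alpha(0.0) == 0
assert normalize_alpha(1.0) == 255
```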
141 changed files with 3471 additions and 5433 deletions

View File

@@ -17,46 +17,49 @@ If you just want to use Invoke, you should use the [installer][installer link].
## Setup
1. Run through the [requirements][requirements link].
2. [Fork and clone][forking link] the [InvokeAI repo][repo link].
3. Create a directory for user data (images, models, db, etc). This is typically at `~/invokeai`, but if you already have a non-dev install, you may want to create a separate directory for the dev install.
4. Create a python virtual environment inside the directory you just created:
    ```sh
    python3 -m venv .venv --prompt InvokeAI-Dev
    ```
5. Activate the venv (you'll need to do this every time you want to run the app):
    ```sh
    source .venv/bin/activate
    ```
6. Install the repo as an [editable install][editable install link]:
    ```sh
    pip install -e ".[dev,test,xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu121
    ```
    Refer to the [manual installation][manual install link] instructions for help determining the correct install options. `xformers` is optional, but `dev` and `test` are not.
7. Install the frontend dev toolchain:
    - [`nodejs`](https://nodejs.org/) (recommend v20 LTS)
    - [`pnpm`](https://pnpm.io/8.x/installation) (must be v8 - not v9!)
8. Do a production build of the frontend:
    ```sh
    cd PATH_TO_INVOKEAI_REPO/invokeai/frontend/web
    pnpm i
    pnpm build
    ```
9. Start the application:
    ```sh
    cd PATH_TO_INVOKEAI_REPO
    python scripts/invokeai-web.py
    ```
10. Access the UI at `localhost:9090`.
## Updating the UI

View File

@@ -808,7 +808,11 @@ def get_is_installed(
for model in installed_models:
if model.source == starter_model.source:
return True
if model.name == starter_model.name and model.base == starter_model.base and model.type == starter_model.type:
if (
(model.name == starter_model.name or model.name in starter_model.previous_names)
and model.base == starter_model.base
and model.type == starter_model.type
):
return True
return False

View File

@@ -13,6 +13,7 @@ from diffusers.models.unets.unet_2d_condition import UNet2DConditionModel
from diffusers.schedulers.scheduling_dpmsolver_sde import DPMSolverSDEScheduler
from diffusers.schedulers.scheduling_tcd import TCDScheduler
from diffusers.schedulers.scheduling_utils import SchedulerMixin as Scheduler
from PIL import Image
from pydantic import field_validator
from torchvision.transforms.functional import resize as tv_resize
from transformers import CLIPVisionModelWithProjection
@@ -510,6 +511,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
context: InvocationContext,
t2i_adapters: Optional[Union[T2IAdapterField, list[T2IAdapterField]]],
ext_manager: ExtensionsManager,
bgr_mode: bool = False,
) -> None:
if t2i_adapters is None:
return
@@ -519,6 +521,10 @@ class DenoiseLatentsInvocation(BaseInvocation):
t2i_adapters = [t2i_adapters]
for t2i_adapter_field in t2i_adapters:
image = context.images.get_pil(t2i_adapter_field.image.image_name)
if bgr_mode: # SDXL t2i trained on cv2's BGR outputs, but PIL won't convert straight to BGR
r, g, b = image.split()
image = Image.merge("RGB", (b, g, r))
ext_manager.add_extension(
T2IAdapterExt(
node_context=context,
@@ -623,6 +629,10 @@ class DenoiseLatentsInvocation(BaseInvocation):
max_unet_downscale = 8
elif t2i_adapter_model_config.base == BaseModelType.StableDiffusionXL:
max_unet_downscale = 4
# SDXL adapters are trained on cv2's BGR outputs
r, g, b = image.split()
image = Image.merge("RGB", (b, g, r))
else:
raise ValueError(f"Unexpected T2I-Adapter base model type: '{t2i_adapter_model_config.base}'.")
@@ -900,7 +910,8 @@ class DenoiseLatentsInvocation(BaseInvocation):
# ext = extension_field.to_extension(exit_stack, context, ext_manager)
# ext_manager.add_extension(ext)
self.parse_controlnet_field(exit_stack, context, self.control, ext_manager)
self.parse_t2i_adapter_field(exit_stack, context, self.t2i_adapter, ext_manager)
bgr_mode = self.unet.unet.base == BaseModelType.StableDiffusionXL
self.parse_t2i_adapter_field(exit_stack, context, self.t2i_adapter, ext_manager, bgr_mode)
# ext: t2i/ip adapter
ext_manager.run_callback(ExtensionCallbackType.SETUP, denoise_ctx)

View File

@@ -133,7 +133,6 @@ class FieldDescriptions:
clip_embed_model = "CLIP Embed loader"
unet = "UNet (scheduler, LoRAs)"
transformer = "Transformer"
mmditx = "MMDiTX"
vae = "VAE"
cond = "Conditioning tensor"
controlnet_model = "ControlNet model to load"
@@ -141,7 +140,6 @@ class FieldDescriptions:
lora_model = "LoRA model to load"
main_model = "Main model (UNet, VAE, CLIP) to load"
flux_model = "Flux model (Transformer) to load"
sd3_model = "SD3 model (MMDiTX) to load"
sdxl_main_model = "SDXL Main model (UNet, VAE, CLIP1, CLIP2) to load"
sdxl_refiner_model = "SDXL Refiner Main Modde (UNet, VAE, CLIP2) to load"
onnx_main_model = "ONNX Main model (UNet, VAE, CLIP) to load"

View File

@@ -1,86 +0,0 @@
from typing import Literal
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, OutputField, UIType
from invokeai.app.invocations.model import CLIPField, ModelIdentifierField, T5EncoderField, TransformerField, VAEField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.flux.util import max_seq_lengths
from invokeai.backend.model_manager.config import CheckpointConfigBase, SubModelType
@invocation_output("flux_model_loader_output")
class FluxModelLoaderOutput(BaseInvocationOutput):
"""Flux base model loader output"""
transformer: TransformerField = OutputField(description=FieldDescriptions.transformer, title="Transformer")
clip: CLIPField = OutputField(description=FieldDescriptions.clip, title="CLIP")
t5_encoder: T5EncoderField = OutputField(description=FieldDescriptions.t5_encoder, title="T5 Encoder")
vae: VAEField = OutputField(description=FieldDescriptions.vae, title="VAE")
max_seq_len: Literal[256, 512] = OutputField(
description="The max sequence length to used for the T5 encoder. (256 for schnell transformer, 512 for dev transformer)",
title="Max Seq Length",
)
@invocation(
"flux_model_loader",
title="Flux Main Model",
tags=["model", "flux"],
category="model",
version="1.0.4",
classification=Classification.Prototype,
)
class FluxModelLoaderInvocation(BaseInvocation):
"""Loads a flux base model, outputting its submodels."""
model: ModelIdentifierField = InputField(
description=FieldDescriptions.flux_model,
ui_type=UIType.FluxMainModel,
input=Input.Direct,
)
t5_encoder_model: ModelIdentifierField = InputField(
description=FieldDescriptions.t5_encoder, ui_type=UIType.T5EncoderModel, input=Input.Direct, title="T5 Encoder"
)
clip_embed_model: ModelIdentifierField = InputField(
description=FieldDescriptions.clip_embed_model,
ui_type=UIType.CLIPEmbedModel,
input=Input.Direct,
title="CLIP Embed",
)
vae_model: ModelIdentifierField = InputField(
description=FieldDescriptions.vae_model, ui_type=UIType.FluxVAEModel, title="VAE"
)
def invoke(self, context: InvocationContext) -> FluxModelLoaderOutput:
for key in [self.model.key, self.t5_encoder_model.key, self.clip_embed_model.key, self.vae_model.key]:
if not context.models.exists(key):
raise ValueError(f"Unknown model: {key}")
transformer = self.model.model_copy(update={"submodel_type": SubModelType.Transformer})
vae = self.vae_model.model_copy(update={"submodel_type": SubModelType.VAE})
tokenizer = self.clip_embed_model.model_copy(update={"submodel_type": SubModelType.Tokenizer})
clip_encoder = self.clip_embed_model.model_copy(update={"submodel_type": SubModelType.TextEncoder})
tokenizer2 = self.t5_encoder_model.model_copy(update={"submodel_type": SubModelType.Tokenizer2})
t5_encoder = self.t5_encoder_model.model_copy(update={"submodel_type": SubModelType.TextEncoder2})
transformer_config = context.models.get_config(transformer)
assert isinstance(transformer_config, CheckpointConfigBase)
return FluxModelLoaderOutput(
transformer=TransformerField(transformer=transformer, loras=[]),
clip=CLIPField(tokenizer=tokenizer, text_encoder=clip_encoder, loras=[], skipped_layers=0),
t5_encoder=T5EncoderField(tokenizer=tokenizer2, text_encoder=t5_encoder),
vae=VAEField(vae=vae),
max_seq_len=max_seq_lengths[transformer_config.config_path],
)

View File

@@ -165,6 +165,7 @@ class ApplyMaskTensorToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
mask: TensorField = InputField(description="The mask tensor to apply.")
image: ImageField = InputField(description="The image to apply the mask to.")
invert: bool = InputField(default=False, description="Whether to invert the mask.")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name, mode="RGBA")
@@ -179,6 +180,9 @@ class ApplyMaskTensorToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
mask = mask > 0.5
mask_np = (mask.float() * 255).byte().cpu().numpy().astype(np.uint8)
if self.invert:
mask_np = 255 - mask_np
# Apply the mask only to the alpha channel where the original alpha is non-zero. This preserves the original
# image's transparency - else the transparent regions would end up as opaque black.

View File

@@ -1,5 +1,5 @@
import copy
from typing import List, Optional
from typing import List, Literal, Optional
from pydantic import BaseModel, Field
@@ -13,9 +13,11 @@ from invokeai.app.invocations.baseinvocation import (
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, OutputField, UIType
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.shared.models import FreeUConfig
from invokeai.backend.flux.util import max_seq_lengths
from invokeai.backend.model_manager.config import (
AnyModelConfig,
BaseModelType,
CheckpointConfigBase,
ModelType,
SubModelType,
)
@@ -137,6 +139,78 @@ class ModelIdentifierInvocation(BaseInvocation):
return ModelIdentifierOutput(model=self.model)
@invocation_output("flux_model_loader_output")
class FluxModelLoaderOutput(BaseInvocationOutput):
"""Flux base model loader output"""
transformer: TransformerField = OutputField(description=FieldDescriptions.transformer, title="Transformer")
clip: CLIPField = OutputField(description=FieldDescriptions.clip, title="CLIP")
t5_encoder: T5EncoderField = OutputField(description=FieldDescriptions.t5_encoder, title="T5 Encoder")
vae: VAEField = OutputField(description=FieldDescriptions.vae, title="VAE")
max_seq_len: Literal[256, 512] = OutputField(
description="The max sequence length to used for the T5 encoder. (256 for schnell transformer, 512 for dev transformer)",
title="Max Seq Length",
)
@invocation(
"flux_model_loader",
title="Flux Main Model",
tags=["model", "flux"],
category="model",
version="1.0.4",
classification=Classification.Prototype,
)
class FluxModelLoaderInvocation(BaseInvocation):
"""Loads a flux base model, outputting its submodels."""
model: ModelIdentifierField = InputField(
description=FieldDescriptions.flux_model,
ui_type=UIType.FluxMainModel,
input=Input.Direct,
)
t5_encoder_model: ModelIdentifierField = InputField(
description=FieldDescriptions.t5_encoder, ui_type=UIType.T5EncoderModel, input=Input.Direct, title="T5 Encoder"
)
clip_embed_model: ModelIdentifierField = InputField(
description=FieldDescriptions.clip_embed_model,
ui_type=UIType.CLIPEmbedModel,
input=Input.Direct,
title="CLIP Embed",
)
vae_model: ModelIdentifierField = InputField(
description=FieldDescriptions.vae_model, ui_type=UIType.FluxVAEModel, title="VAE"
)
def invoke(self, context: InvocationContext) -> FluxModelLoaderOutput:
for key in [self.model.key, self.t5_encoder_model.key, self.clip_embed_model.key, self.vae_model.key]:
if not context.models.exists(key):
raise ValueError(f"Unknown model: {key}")
transformer = self.model.model_copy(update={"submodel_type": SubModelType.Transformer})
vae = self.vae_model.model_copy(update={"submodel_type": SubModelType.VAE})
tokenizer = self.clip_embed_model.model_copy(update={"submodel_type": SubModelType.Tokenizer})
clip_encoder = self.clip_embed_model.model_copy(update={"submodel_type": SubModelType.TextEncoder})
tokenizer2 = self.t5_encoder_model.model_copy(update={"submodel_type": SubModelType.Tokenizer2})
t5_encoder = self.t5_encoder_model.model_copy(update={"submodel_type": SubModelType.TextEncoder2})
transformer_config = context.models.get_config(transformer)
assert isinstance(transformer_config, CheckpointConfigBase)
return FluxModelLoaderOutput(
transformer=TransformerField(transformer=transformer, loras=[]),
clip=CLIPField(tokenizer=tokenizer, text_encoder=clip_encoder, loras=[], skipped_layers=0),
t5_encoder=T5EncoderField(tokenizer=tokenizer2, text_encoder=t5_encoder),
vae=VAEField(vae=vae),
max_seq_len=max_seq_lengths[transformer_config.config_path],
)
@invocation(
"main_model_loader",
title="Main Model",

View File

@@ -1,102 +0,0 @@
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, OutputField, UIType
from invokeai.app.invocations.model import CLIPField, ModelIdentifierField, T5EncoderField, TransformerField, VAEField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager.config import CheckpointConfigBase, SubModelType
@invocation_output("sd3_model_loader_output")
class Sd3ModelLoaderOutput(BaseInvocationOutput):
"""SD3 base model loader output."""
mmditx: TransformerField = OutputField(description=FieldDescriptions.mmditx, title="MMDiTX")
clip_l: CLIPField = OutputField(description=FieldDescriptions.clip, title="CLIP L")
clip_g: CLIPField = OutputField(description=FieldDescriptions.clip, title="CLIP G")
t5_encoder: T5EncoderField = OutputField(description=FieldDescriptions.t5_encoder, title="T5 Encoder")
vae: VAEField = OutputField(description=FieldDescriptions.vae, title="VAE")
@invocation(
"sd3_model_loader",
title="SD3 Main Model",
tags=["model", "sd3"],
category="model",
version="1.0.0",
classification=Classification.Prototype,
)
class Sd3ModelLoaderInvocation(BaseInvocation):
"""Loads a SD3 base model, outputting its submodels."""
# TODO(ryand): Create a UIType.Sd3MainModelField to use here.
model: ModelIdentifierField = InputField(
description=FieldDescriptions.sd3_model,
ui_type=UIType.MainModel,
input=Input.Direct,
)
# TODO(ryand): Make the text encoders optional.
# Note: The text encoders are optional for SD3. The model was trained with dropout, so any can be left out at
# inference time. Typically, only the T5 encoder is omitted, since it is the largest by far.
t5_encoder_model: ModelIdentifierField = InputField(
description=FieldDescriptions.t5_encoder, ui_type=UIType.T5EncoderModel, input=Input.Direct, title="T5 Encoder"
)
clip_l_embed_model: ModelIdentifierField = InputField(
description=FieldDescriptions.clip_embed_model,
ui_type=UIType.CLIPEmbedModel,
input=Input.Direct,
title="CLIP L Embed",
)
clip_g_embed_model: ModelIdentifierField = InputField(
description=FieldDescriptions.clip_embed_model,
ui_type=UIType.CLIPEmbedModel,
input=Input.Direct,
title="CLIP G Embed",
)
# TODO(ryand): Create a UIType.Sd3VaModelField to use here.
vae_model: ModelIdentifierField = InputField(
description=FieldDescriptions.vae_model, ui_type=UIType.VAEModel, title="VAE"
)
def invoke(self, context: InvocationContext) -> Sd3ModelLoaderOutput:
for key in [
self.model.key,
self.t5_encoder_model.key,
self.clip_l_embed_model.key,
self.clip_g_embed_model.key,
self.vae_model.key,
]:
if not context.models.exists(key):
raise ValueError(f"Unknown model: {key}")
# TODO(ryand): Figure out the sub-model types for SD3.
mmditx = self.model.model_copy(update={"submodel_type": SubModelType.Transformer})
vae = self.vae_model.model_copy(update={"submodel_type": SubModelType.VAE})
tokenizer_l = self.clip_l_embed_model.model_copy(update={"submodel_type": SubModelType.Tokenizer})
clip_encoder_l = self.clip_l_embed_model.model_copy(update={"submodel_type": SubModelType.TextEncoder})
tokenizer_g = self.clip_g_embed_model.model_copy(update={"submodel_type": SubModelType.Tokenizer})
clip_encoder_g = self.clip_g_embed_model.model_copy(update={"submodel_type": SubModelType.TextEncoder})
tokenizer_t5 = self.t5_encoder_model.model_copy(update={"submodel_type": SubModelType.Tokenizer2})
t5_encoder = self.t5_encoder_model.model_copy(update={"submodel_type": SubModelType.TextEncoder2})
transformer_config = context.models.get_config(mmditx)
assert isinstance(transformer_config, CheckpointConfigBase)
return Sd3ModelLoaderOutput(
mmditx=TransformerField(transformer=mmditx, loras=[]),
clip_l=CLIPField(tokenizer=tokenizer_l, text_encoder=clip_encoder_l, loras=[], skipped_layers=0),
clip_g=CLIPField(tokenizer=tokenizer_g, text_encoder=clip_encoder_g, loras=[], skipped_layers=0),
t5_encoder=T5EncoderField(tokenizer=tokenizer_t5, text_encoder=t5_encoder),
vae=VAEField(vae=vae),
)

View File

@@ -1,3 +1,4 @@
from copy import deepcopy
from dataclasses import dataclass
from pathlib import Path
from typing import TYPE_CHECKING, Callable, Optional, Union
@@ -221,7 +222,7 @@ class ImagesInterface(InvocationContextInterface):
)
def get_pil(self, image_name: str, mode: IMAGE_MODES | None = None) -> Image:
"""Gets an image as a PIL Image object.
"""Gets an image as a PIL Image object. This method returns a copy of the image.
Args:
image_name: The name of the image to get.
@@ -233,11 +234,15 @@ class ImagesInterface(InvocationContextInterface):
image = self._services.images.get_pil_image(image_name)
if mode and mode != image.mode:
try:
# convert makes a copy!
image = image.convert(mode)
except ValueError:
self._services.logger.warning(
f"Could not convert image from {image.mode} to {mode}. Using original mode instead."
)
else:
# copy the image to prevent the user from modifying the original
image = image.copy()
return image
def get_metadata(self, image_name: str) -> Optional[MetadataField]:
@@ -290,15 +295,15 @@ class TensorsInterface(InvocationContextInterface):
return name
def load(self, name: str) -> Tensor:
"""Loads a tensor by name.
"""Loads a tensor by name. This method returns a copy of the tensor.
Args:
name: The name of the tensor to load.
Returns:
The loaded tensor.
The tensor.
"""
return self._services.tensors.load(name)
return self._services.tensors.load(name).clone()
class ConditioningInterface(InvocationContextInterface):
@@ -316,16 +321,16 @@ class ConditioningInterface(InvocationContextInterface):
return name
def load(self, name: str) -> ConditioningFieldData:
"""Loads conditioning data by name.
"""Loads conditioning data by name. This method returns a copy of the conditioning data.
Args:
name: The name of the conditioning data to load.
Returns:
The loaded conditioning data.
The conditioning data.
"""
return self._services.conditioning.load(name)
return deepcopy(self._services.conditioning.load(name))
class ModelsInterface(InvocationContextInterface):

View File

@@ -53,8 +53,6 @@ class BaseModelType(str, Enum):
Any = "any"
StableDiffusion1 = "sd-1"
StableDiffusion2 = "sd-2"
# TODO(ryand): Should this just be StableDiffusion3?
StableDiffusion35 = "sd-3.5"
StableDiffusionXL = "sdxl"
StableDiffusionXLRefiner = "sdxl-refiner"
Flux = "flux"

View File

@@ -1,55 +0,0 @@
from pathlib import Path
from typing import Optional
from invokeai.backend.model_manager.config import (
AnyModel,
AnyModelConfig,
BaseModelType,
CheckpointConfigBase,
MainCheckpointConfig,
ModelFormat,
ModelType,
SubModelType,
)
from invokeai.backend.model_manager.load.load_default import ModelLoader
from invokeai.backend.model_manager.load.model_loader_registry import ModelLoaderRegistry
@ModelLoaderRegistry.register(base=BaseModelType.StableDiffusion35, type=ModelType.Main, format=ModelFormat.Checkpoint)
class FluxCheckpointModel(ModelLoader):
"""Class to load main models."""
def _load_model(
self,
config: AnyModelConfig,
submodel_type: Optional[SubModelType] = None,
) -> AnyModel:
if not isinstance(config, CheckpointConfigBase):
raise ValueError("Only CheckpointConfigBase models are currently supported here.")
match submodel_type:
case SubModelType.Transformer:
return self._load_from_singlefile(config)
raise ValueError(
f"Only Transformer submodels are currently supported. Received: {submodel_type.value if submodel_type else 'None'}"
)
def _load_from_singlefile(
self,
config: AnyModelConfig,
) -> AnyModel:
assert isinstance(config, MainCheckpointConfig)
model_path = Path(config.path)
# model = Flux(params[config.config_path])
# sd = load_file(model_path)
# if "model.diffusion_model.double_blocks.0.img_attn.norm.key_norm.scale" in sd:
# sd = convert_bundle_to_flux_transformer_checkpoint(sd)
# new_sd_size = sum([ten.nelement() * torch.bfloat16.itemsize for ten in sd.values()])
# self._ram_cache.make_room(new_sd_size)
# for k in sd.keys():
# # We need to cast to bfloat16 due to it being the only currently supported dtype for inference
# sd[k] = sd[k].to(torch.bfloat16)
# model.load_state_dict(sd, assign=True)
return model

View File

@@ -117,8 +117,6 @@ class StableDiffusionDiffusersModel(GenericDiffusersLoader):
load_class = load_classes[config.base][config.variant]
except KeyError as e:
raise Exception(f"No diffusers pipeline known for base={config.base}, variant={config.variant}") from e
prediction_type = config.prediction_type.value
upcast_attention = config.upcast_attention
# Without SilenceWarnings we get log messages like this:
# site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
@@ -129,13 +127,7 @@ class StableDiffusionDiffusersModel(GenericDiffusersLoader):
# ['text_model.embeddings.position_ids']
with SilenceWarnings():
pipeline = load_class.from_single_file(
config.path,
torch_dtype=self._torch_dtype,
prediction_type=prediction_type,
upcast_attention=upcast_attention,
load_safety_checker=False,
)
pipeline = load_class.from_single_file(config.path, torch_dtype=self._torch_dtype)
if not submodel_type:
return pipeline

View File

@@ -20,7 +20,7 @@ from typing import Optional
import requests
from huggingface_hub import HfApi, configure_http_backend, hf_hub_url
from huggingface_hub.utils._errors import RepositoryNotFoundError, RevisionNotFoundError
from huggingface_hub.errors import RepositoryNotFoundError, RevisionNotFoundError
from pydantic.networks import AnyHttpUrl
from requests.sessions import Session

View File

@@ -37,7 +37,6 @@ from invokeai.backend.model_manager.config import (
from invokeai.backend.model_manager.util.model_util import lora_token_vector_length, read_checkpoint_meta
from invokeai.backend.quantization.gguf.ggml_tensor import GGMLTensor
from invokeai.backend.quantization.gguf.loaders import gguf_sd_loader
from invokeai.backend.sd3.sd3_state_dict_utils import is_sd3_checkpoint
from invokeai.backend.spandrel_image_to_image_model import SpandrelImageToImageModel
from invokeai.backend.util.silence_warnings import SilenceWarnings
@@ -121,7 +120,6 @@ class ModelProbe(object):
"T2IAdapter": ModelType.T2IAdapter,
"CLIPModel": ModelType.CLIPEmbed,
"CLIPTextModel": ModelType.CLIPEmbed,
"CLIPTextModelWithProjection": ModelType.CLIPEmbed,
"T5EncoderModel": ModelType.T5Encoder,
"FluxControlNetModel": ModelType.ControlNet,
}
@@ -243,11 +241,6 @@ class ModelProbe(object):
for key in [str(k) for k in ckpt.keys()]:
if key.startswith(
(
# The following prefixes appear when multiple models have been bundled together in a single file (I
# believe the format originated in ComfyUI).
# first_stage_model = VAE
# cond_stage_model = Text Encoder
# model.diffusion_model = UNet / Transformer
"cond_stage_model.",
"first_stage_model.",
"model.diffusion_model.",
@@ -404,9 +397,6 @@ class ModelProbe(object):
# is used rather than attempting to support flux with separate model types and format
# If changed in the future, please fix me
config_file = "flux-schnell"
elif base_type == BaseModelType.StableDiffusion35:
# TODO(ryand): Think about what to do here.
config_file = "sd3.5-large"
else:
config_file = LEGACY_CONFIGS[base_type][variant_type]
if isinstance(config_file, dict): # need another tier for sd-2.x models
@@ -472,8 +462,9 @@ MODEL_NAME_TO_PREPROCESSOR = {
"normal": "normalbae_image_processor",
"sketch": "pidi_image_processor",
"scribble": "lineart_image_processor",
"lineart": "lineart_image_processor",
"lineart anime": "lineart_anime_image_processor",
"lineart_anime": "lineart_anime_image_processor",
"lineart": "lineart_image_processor",
"softedge": "hed_image_processor",
"hed": "hed_image_processor",
"shuffle": "content_shuffle_image_processor",
@@ -526,7 +517,7 @@ class CheckpointProbeBase(ProbeBase):
def get_variant_type(self) -> ModelVariantType:
model_type = ModelProbe.get_model_type_from_checkpoint(self.model_path, self.checkpoint)
base_type = self.get_base_type()
if model_type != ModelType.Main or base_type in (BaseModelType.Flux, BaseModelType.StableDiffusion35):
if model_type != ModelType.Main or base_type == BaseModelType.Flux:
return ModelVariantType.Normal
state_dict = self.checkpoint.get("state_dict") or self.checkpoint
in_channels = state_dict["model.diffusion_model.input_blocks.0.0.weight"].shape[1]
@@ -551,10 +542,6 @@ class PipelineCheckpointProbe(CheckpointProbeBase):
or "model.diffusion_model.double_blocks.0.img_attn.norm.key_norm.scale" in state_dict
):
return BaseModelType.Flux
if is_sd3_checkpoint(state_dict):
return BaseModelType.StableDiffusion35
key_name = "model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_k.weight"
if key_name in state_dict and state_dict[key_name].shape[-1] == 768:
return BaseModelType.StableDiffusion1

View File

@@ -13,6 +13,9 @@ class StarterModelWithoutDependencies(BaseModel):
type: ModelType
format: Optional[ModelFormat] = None
is_installed: bool = False
# allows us to track what models a user has installed across name changes within starter models
# if you update a starter model name, please add the old one to this list for that starter model
previous_names: list[str] = []
class StarterModel(StarterModelWithoutDependencies):
@@ -243,44 +246,49 @@ easy_neg_sd1 = StarterModel(
# endregion
# region IP Adapter
ip_adapter_sd1 = StarterModel(
name="IP Adapter",
name="Standard Reference (IP Adapter)",
base=BaseModelType.StableDiffusion1,
source="https://huggingface.co/InvokeAI/ip_adapter_sd15/resolve/main/ip-adapter_sd15.safetensors",
description="IP-Adapter for SD 1.5 models",
description="References images with a more generalized/looser degree of precision.",
type=ModelType.IPAdapter,
dependencies=[ip_adapter_sd_image_encoder],
previous_names=["IP Adapter"],
)
ip_adapter_plus_sd1 = StarterModel(
name="IP Adapter Plus",
name="Precise Reference (IP Adapter Plus)",
base=BaseModelType.StableDiffusion1,
source="https://huggingface.co/InvokeAI/ip_adapter_plus_sd15/resolve/main/ip-adapter-plus_sd15.safetensors",
description="Refined IP-Adapter for SD 1.5 models",
description="References images with a higher degree of precision.",
type=ModelType.IPAdapter,
dependencies=[ip_adapter_sd_image_encoder],
previous_names=["IP Adapter Plus"],
)
ip_adapter_plus_face_sd1 = StarterModel(
name="IP Adapter Plus Face",
name="Face Reference (IP Adapter Plus Face)",
base=BaseModelType.StableDiffusion1,
source="https://huggingface.co/InvokeAI/ip_adapter_plus_face_sd15/resolve/main/ip-adapter-plus-face_sd15.safetensors",
description="Refined IP-Adapter for SD 1.5 models, adapted for faces",
description="References images with a higher degree of precision, adapted for faces",
type=ModelType.IPAdapter,
dependencies=[ip_adapter_sd_image_encoder],
previous_names=["IP Adapter Plus Face"],
)
ip_adapter_sdxl = StarterModel(
name="IP Adapter SDXL",
name="Standard Reference (IP Adapter ViT-H)",
base=BaseModelType.StableDiffusionXL,
source="https://huggingface.co/InvokeAI/ip_adapter_sdxl_vit_h/resolve/main/ip-adapter_sdxl_vit-h.safetensors",
description="IP-Adapter for SDXL models",
description="References images with a higher degree of precision.",
type=ModelType.IPAdapter,
dependencies=[ip_adapter_sdxl_image_encoder],
previous_names=["IP Adapter SDXL"],
)
ip_adapter_flux = StarterModel(
name="XLabs FLUX IP-Adapter",
name="Standard Reference (XLabs FLUX IP-Adapter)",
base=BaseModelType.Flux,
source="https://huggingface.co/XLabs-AI/flux-ip-adapter/resolve/main/flux-ip-adapter.safetensors",
description="FLUX IP-Adapter",
description="References images with a more generalized/looser degree of precision.",
type=ModelType.IPAdapter,
dependencies=[clip_vit_l_image_encoder],
previous_names=["XLabs FLUX IP-Adapter"],
)
# endregion
# region ControlNet
@@ -299,157 +307,162 @@ qr_code_cnet_sdxl = StarterModel(
type=ModelType.ControlNet,
)
canny_sd1 = StarterModel(
name="canny",
name="Hard Edge Detection (canny)",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_canny",
description="ControlNet weights trained on sd-1.5 with canny conditioning.",
description="Uses detected edges in the image to control composition.",
type=ModelType.ControlNet,
previous_names=["canny"],
)
inpaint_cnet_sd1 = StarterModel(
name="inpaint",
name="Inpainting",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_inpaint",
description="ControlNet weights trained on sd-1.5 with canny conditioning, inpaint version",
type=ModelType.ControlNet,
previous_names=["inpaint"],
)
mlsd_sd1 = StarterModel(
name="mlsd",
name="Line Drawing (mlsd)",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_mlsd",
description="ControlNet weights trained on sd-1.5 with canny conditioning, MLSD version",
description="Uses straight line detection for controlling the generation.",
type=ModelType.ControlNet,
previous_names=["mlsd"],
)
depth_sd1 = StarterModel(
name="depth",
name="Depth Map",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11f1p_sd15_depth",
description="ControlNet weights trained on sd-1.5 with depth conditioning",
description="Uses depth information in the image to control the depth in the generation.",
type=ModelType.ControlNet,
previous_names=["depth"],
)
normal_bae_sd1 = StarterModel(
name="normal_bae",
name="Lighting Detection (Normals)",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_normalbae",
description="ControlNet weights trained on sd-1.5 with normalbae image conditioning",
description="Uses detected lighting information to guide the lighting of the composition.",
type=ModelType.ControlNet,
previous_names=["normal_bae"],
)
seg_sd1 = StarterModel(
name="seg",
name="Segmentation Map",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_seg",
description="ControlNet weights trained on sd-1.5 with seg image conditioning",
description="Uses segmentation maps to guide the structure of the composition.",
type=ModelType.ControlNet,
previous_names=["seg"],
)
lineart_sd1 = StarterModel(
name="lineart",
name="Lineart",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_lineart",
description="ControlNet weights trained on sd-1.5 with lineart image conditioning",
description="Uses lineart detection to guide the lighting of the composition.",
type=ModelType.ControlNet,
previous_names=["lineart"],
)
lineart_anime_sd1 = StarterModel(
name="lineart_anime",
name="Lineart Anime",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15s2_lineart_anime",
description="ControlNet weights trained on sd-1.5 with anime image conditioning",
description="Uses anime lineart detection to guide the lighting of the composition.",
type=ModelType.ControlNet,
previous_names=["lineart_anime"],
)
openpose_sd1 = StarterModel(
name="openpose",
name="Pose Detection (openpose)",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_openpose",
description="ControlNet weights trained on sd-1.5 with openpose image conditioning",
description="Uses pose information to control the pose of human characters in the generation.",
type=ModelType.ControlNet,
previous_names=["openpose"],
)
scribble_sd1 = StarterModel(
name="scribble",
name="Contour Detection (scribble)",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_scribble",
description="ControlNet weights trained on sd-1.5 with scribble image conditioning",
description="Uses edges, contours, or line art in the image to control composition.",
type=ModelType.ControlNet,
previous_names=["scribble"],
)
softedge_sd1 = StarterModel(
name="softedge",
name="Soft Edge Detection (softedge)",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_softedge",
description="ControlNet weights trained on sd-1.5 with soft edge conditioning",
description="Uses a soft edge detection map to control composition.",
type=ModelType.ControlNet,
previous_names=["softedge"],
)
shuffle_sd1 = StarterModel(
name="shuffle",
name="Remix (shuffle)",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11e_sd15_shuffle",
description="ControlNet weights trained on sd-1.5 with shuffle image conditioning",
type=ModelType.ControlNet,
previous_names=["shuffle"],
)
tile_sd1 = StarterModel(
name="tile",
name="Tile",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11f1e_sd15_tile",
description="ControlNet weights trained on sd-1.5 with tiled image conditioning",
type=ModelType.ControlNet,
)
ip2p_sd1 = StarterModel(
name="ip2p",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11e_sd15_ip2p",
description="ControlNet weights trained on sd-1.5 with ip2p conditioning.",
description="Uses image data to replicate exact colors/structure in the resulting generation.",
type=ModelType.ControlNet,
previous_names=["tile"],
)
canny_sdxl = StarterModel(
name="canny-sdxl",
name="Hard Edge Detection (canny)",
base=BaseModelType.StableDiffusionXL,
source="xinsir/controlNet-canny-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 with canny conditioning, by Xinsir.",
description="Uses detected edges in the image to control composition.",
type=ModelType.ControlNet,
previous_names=["canny-sdxl"],
)
depth_sdxl = StarterModel(
name="depth-sdxl",
name="Depth Map",
base=BaseModelType.StableDiffusionXL,
source="diffusers/controlNet-depth-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 with depth conditioning.",
description="Uses depth information in the image to control the depth in the generation.",
type=ModelType.ControlNet,
previous_names=["depth-sdxl"],
)
softedge_sdxl = StarterModel(
name="softedge-dexined-sdxl",
name="Soft Edge Detection (softedge)",
base=BaseModelType.StableDiffusionXL,
source="SargeZT/controlNet-sd-xl-1.0-softedge-dexined",
description="ControlNet weights trained on sdxl-1.0 with dexined soft edge preprocessing.",
type=ModelType.ControlNet,
)
depth_zoe_16_sdxl = StarterModel(
name="depth-16bit-zoe-sdxl",
base=BaseModelType.StableDiffusionXL,
source="SargeZT/controlNet-sd-xl-1.0-depth-16bit-zoe",
description="ControlNet weights trained on sdxl-1.0 with Zoe's preprocessor (16 bits).",
type=ModelType.ControlNet,
)
depth_zoe_32_sdxl = StarterModel(
name="depth-zoe-sdxl",
base=BaseModelType.StableDiffusionXL,
source="diffusers/controlNet-zoe-depth-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 with Zoe's preprocessor (32 bits).",
description="Uses a soft edge detection map to control composition.",
type=ModelType.ControlNet,
previous_names=["softedge-dexined-sdxl"],
)
openpose_sdxl = StarterModel(
name="openpose-sdxl",
name="Pose Detection (openpose)",
base=BaseModelType.StableDiffusionXL,
source="xinsir/controlNet-openpose-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 compatible with the DWPose processor by Xinsir.",
description="Uses pose information to control the pose of human characters in the generation.",
type=ModelType.ControlNet,
previous_names=["openpose-sdxl", "controlnet-openpose-sdxl"],
)
scribble_sdxl = StarterModel(
name="scribble-sdxl",
name="Contour Detection (scribble)",
base=BaseModelType.StableDiffusionXL,
source="xinsir/controlNet-scribble-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 compatible with various lineart processors and black/white sketches by Xinsir.",
description="Uses edges, contours, or line art in the image to control composition.",
type=ModelType.ControlNet,
previous_names=["scribble-sdxl", "controlnet-scribble-sdxl"],
)
tile_sdxl = StarterModel(
name="tile-sdxl",
name="Tile",
base=BaseModelType.StableDiffusionXL,
source="xinsir/controlNet-tile-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 with tiled image conditioning",
description="Uses image data to replicate exact colors/structure in the resulting generation.",
type=ModelType.ControlNet,
previous_names=["tile-sdxl"],
)
union_cnet_sdxl = StarterModel(
name="Multi-Guidance Detection (Union Pro)",
base=BaseModelType.StableDiffusionXL,
source="InvokeAI/Xinsir-SDXL_Controlnet_Union",
description="A unified ControlNet for SDXL model that supports 10+ control types",
type=ModelType.ControlNet,
)
union_cnet_flux = StarterModel(
@@ -462,60 +475,52 @@ union_cnet_flux = StarterModel(
# endregion
# region T2I Adapter
t2i_canny_sd1 = StarterModel(
name="canny-sd15",
name="Hard Edge Detection (canny)",
base=BaseModelType.StableDiffusion1,
source="TencentARC/t2iadapter_canny_sd15v2",
description="T2I Adapter weights trained on sd-1.5 with canny conditioning.",
description="Uses detected edges in the image to control composition",
type=ModelType.T2IAdapter,
previous_names=["canny-sd15"],
)
t2i_sketch_sd1 = StarterModel(
name="sketch-sd15",
name="Sketch",
base=BaseModelType.StableDiffusion1,
source="TencentARC/t2iadapter_sketch_sd15v2",
description="T2I Adapter weights trained on sd-1.5 with sketch conditioning.",
description="Uses a sketch to control composition",
type=ModelType.T2IAdapter,
previous_names=["sketch-sd15"],
)
t2i_depth_sd1 = StarterModel(
name="depth-sd15",
name="Depth Map",
base=BaseModelType.StableDiffusion1,
source="TencentARC/t2iadapter_depth_sd15v2",
description="T2I Adapter weights trained on sd-1.5 with depth conditioning.",
type=ModelType.T2IAdapter,
)
t2i_zoe_depth_sd1 = StarterModel(
name="zoedepth-sd15",
base=BaseModelType.StableDiffusion1,
source="TencentARC/t2iadapter_zoedepth_sd15v1",
description="T2I Adapter weights trained on sd-1.5 with zoe depth conditioning.",
description="Uses depth information in the image to control the depth in the generation.",
type=ModelType.T2IAdapter,
previous_names=["depth-sd15"],
)
t2i_canny_sdxl = StarterModel(
name="canny-sdxl",
name="Hard Edge Detection (canny)",
base=BaseModelType.StableDiffusionXL,
source="TencentARC/t2i-adapter-canny-sdxl-1.0",
description="T2I Adapter weights trained on sdxl-1.0 with canny conditioning.",
type=ModelType.T2IAdapter,
)
t2i_zoe_depth_sdxl = StarterModel(
name="zoedepth-sdxl",
base=BaseModelType.StableDiffusionXL,
source="TencentARC/t2i-adapter-depth-zoe-sdxl-1.0",
description="T2I Adapter weights trained on sdxl-1.0 with zoe depth conditioning.",
description="Uses detected edges in the image to control composition",
type=ModelType.T2IAdapter,
previous_names=["canny-sdxl"],
)
t2i_lineart_sdxl = StarterModel(
name="lineart-sdxl",
name="Lineart",
base=BaseModelType.StableDiffusionXL,
source="TencentARC/t2i-adapter-lineart-sdxl-1.0",
description="T2I Adapter weights trained on sdxl-1.0 with lineart conditioning.",
description="Uses lineart detection to guide the lighting of the composition.",
type=ModelType.T2IAdapter,
previous_names=["lineart-sdxl"],
)
t2i_sketch_sdxl = StarterModel(
name="sketch-sdxl",
name="Sketch",
base=BaseModelType.StableDiffusionXL,
source="TencentARC/t2i-adapter-sketch-sdxl-1.0",
description="T2I Adapter weights trained on sdxl-1.0 with sketch conditioning.",
description="Uses a sketch to control composition",
type=ModelType.T2IAdapter,
previous_names=["sketch-sdxl"],
)
# endregion
# region SpandrelImageToImage
@@ -600,22 +605,18 @@ STARTER_MODELS: list[StarterModel] = [
softedge_sd1,
shuffle_sd1,
tile_sd1,
ip2p_sd1,
canny_sdxl,
depth_sdxl,
softedge_sdxl,
depth_zoe_16_sdxl,
depth_zoe_32_sdxl,
openpose_sdxl,
scribble_sdxl,
tile_sdxl,
union_cnet_sdxl,
union_cnet_flux,
t2i_canny_sd1,
t2i_sketch_sd1,
t2i_depth_sd1,
t2i_zoe_depth_sd1,
t2i_canny_sdxl,
t2i_zoe_depth_sdxl,
t2i_lineart_sdxl,
t2i_sketch_sdxl,
realesrgan_x4,
@@ -646,7 +647,6 @@ sd1_bundle: list[StarterModel] = [
softedge_sd1,
shuffle_sd1,
tile_sd1,
ip2p_sd1,
swinir,
]
@@ -657,8 +657,6 @@ sdxl_bundle: list[StarterModel] = [
canny_sdxl,
depth_sdxl,
softedge_sdxl,
depth_zoe_16_sdxl,
depth_zoe_32_sdxl,
openpose_sdxl,
scribble_sdxl,
tile_sdxl,

View File

@@ -1,891 +0,0 @@
# This file was originally copied from:
# https://github.com/Stability-AI/sd3.5/blob/19bf11c4e1e37324c5aa5a61f010d4127848a09c/mmditx.py
### This file contains impls for MM-DiT, the core model component of SD3
import math
from typing import Dict, List, Optional
import numpy as np
import torch
from einops import rearrange, repeat
from invokeai.backend.sd3.other_impls import Mlp, attention
class PatchEmbed(torch.nn.Module):
"""2D Image to Patch Embedding"""
def __init__(
self,
img_size: Optional[int] = 224,
patch_size: int = 16,
in_chans: int = 3,
embed_dim: int = 768,
flatten: bool = True,
bias: bool = True,
strict_img_size: bool = True,
dynamic_img_pad: bool = False,
dtype: torch.dtype | None = None,
device: torch.device | None = None,
):
super().__init__()
self.patch_size = (patch_size, patch_size)
if img_size is not None:
self.img_size = (img_size, img_size)
self.grid_size = tuple([s // p for s, p in zip(self.img_size, self.patch_size, strict=False)])
self.num_patches = self.grid_size[0] * self.grid_size[1]
else:
self.img_size = None
self.grid_size = None
self.num_patches = None
# flatten spatial dim and transpose to channels last, kept for bwd compat
self.flatten = flatten
self.strict_img_size = strict_img_size
self.dynamic_img_pad = dynamic_img_pad
self.proj = torch.nn.Conv2d(
in_chans,
embed_dim,
kernel_size=patch_size,
stride=patch_size,
bias=bias,
dtype=dtype,
device=device,
)
def forward(self, x: torch.Tensor) -> torch.Tensor:
x = self.proj(x)
if self.flatten:
x = x.flatten(2).transpose(1, 2) # NCHW -> NLC
return x
def modulate(x: torch.Tensor, shift: torch.Tensor | None, scale: torch.Tensor) -> torch.Tensor:
if shift is None:
shift = torch.zeros_like(scale)
return x * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
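# A minimal sketch of the adaLN modulation above (toy shapes, illustrative only):
# modulate() broadcasts a per-sample scale/shift over the token dimension, so a
# zero scale and zero shift leave the input unchanged.
def _modulate_example():
    x = torch.randn(2, 16, 8)  # (batch, tokens, channels)
    scale = torch.zeros(2, 8)
    shift = torch.zeros(2, 8)
    assert torch.allclose(modulate(x, shift, scale), x)
    return x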
#################################################################################
# Sine/Cosine Positional Embedding Functions #
#################################################################################
def get_2d_sincos_pos_embed(
embed_dim: int,
grid_size: int,
cls_token: bool = False,
extra_tokens: int = 0,
scaling_factor: Optional[float] = None,
offset: Optional[float] = None,
):
"""
grid_size: int of the grid height and width
return:
pos_embed: [grid_size*grid_size, embed_dim] or [1+grid_size*grid_size, embed_dim] (w/ or w/o cls_token)
"""
grid_h = np.arange(grid_size, dtype=np.float32)
grid_w = np.arange(grid_size, dtype=np.float32)
grid = np.meshgrid(grid_w, grid_h) # here w goes first
grid = np.stack(grid, axis=0)
if scaling_factor is not None:
grid = grid / scaling_factor
if offset is not None:
grid = grid - offset
grid = grid.reshape([2, 1, grid_size, grid_size])
pos_embed = get_2d_sincos_pos_embed_from_grid(embed_dim, grid)
if cls_token and extra_tokens > 0:
pos_embed = np.concatenate([np.zeros([extra_tokens, embed_dim]), pos_embed], axis=0)
return pos_embed
def get_2d_sincos_pos_embed_from_grid(embed_dim: int, grid):
assert embed_dim % 2 == 0
# use half of dimensions to encode grid_h
emb_h = get_1d_sincos_pos_embed_from_grid(embed_dim // 2, grid[0]) # (H*W, D/2)
emb_w = get_1d_sincos_pos_embed_from_grid(embed_dim // 2, grid[1]) # (H*W, D/2)
emb = np.concatenate([emb_h, emb_w], axis=1) # (H*W, D)
return emb
def get_1d_sincos_pos_embed_from_grid(embed_dim: int, pos):
"""
embed_dim: output dimension for each position
pos: a list of positions to be encoded: size (M,)
out: (M, D)
"""
assert embed_dim % 2 == 0
omega = np.arange(embed_dim // 2, dtype=np.float64)
omega /= embed_dim / 2.0
omega = 1.0 / 10000**omega # (D/2,)
pos = pos.reshape(-1) # (M,)
out = np.einsum("m,d->md", pos, omega) # (M, D/2), outer product
emb_sin = np.sin(out) # (M, D/2)
emb_cos = np.cos(out) # (M, D/2)
return np.concatenate([emb_sin, emb_cos], axis=1) # (M, D)
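# A shape sketch for the sin/cos positional-embedding helpers above; the 4x4 grid and
# 64-dim embedding are illustrative values, not an SD3 configuration.
def _pos_embed_shape_example():
    pos = get_2d_sincos_pos_embed(embed_dim=64, grid_size=4)
    assert pos.shape == (16, 64)  # one row per grid cell, half of the dims per axis
    return pos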
#################################################################################
# Embedding Layers for Timesteps and Class Labels #
#################################################################################
class TimestepEmbedder(torch.nn.Module):
"""Embeds scalar timesteps into vector representations."""
def __init__(self, hidden_size, frequency_embedding_size=256, dtype=None, device=None):
super().__init__()
self.mlp = torch.nn.Sequential(
torch.nn.Linear(
frequency_embedding_size,
hidden_size,
bias=True,
dtype=dtype,
device=device,
),
torch.nn.SiLU(),
torch.nn.Linear(hidden_size, hidden_size, bias=True, dtype=dtype, device=device),
)
self.frequency_embedding_size = frequency_embedding_size
@staticmethod
def timestep_embedding(t, dim, max_period=10000):
"""
Create sinusoidal timestep embeddings.
:param t: a 1-D Tensor of N indices, one per batch element.
These may be fractional.
:param dim: the dimension of the output.
:param max_period: controls the minimum frequency of the embeddings.
:return: an (N, D) Tensor of positional embeddings.
"""
half = dim // 2
freqs = torch.exp(-math.log(max_period) * torch.arange(start=0, end=half, dtype=torch.float32) / half).to(
device=t.device
)
args = t[:, None].float() * freqs[None]
embedding = torch.cat([torch.cos(args), torch.sin(args)], dim=-1)
if dim % 2:
embedding = torch.cat([embedding, torch.zeros_like(embedding[:, :1])], dim=-1)
if torch.is_floating_point(t):
embedding = embedding.to(dtype=t.dtype)
return embedding
def forward(self, t, dtype, **kwargs):
t_freq = self.timestep_embedding(t, self.frequency_embedding_size).to(dtype)
t_emb = self.mlp(t_freq)
return t_emb
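# A small sketch of the sinusoidal timestep embedding above (illustrative sizes only):
def _timestep_embedding_example():
    t = torch.tensor([0.0, 500.0, 999.0])
    emb = TimestepEmbedder.timestep_embedding(t, dim=256)
    assert emb.shape == (3, 256)  # one 256-dim cos/sin frequency embedding per timestep
    return emb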
class VectorEmbedder(torch.nn.Module):
"""Embeds a flat vector of dimension input_dim"""
def __init__(self, input_dim: int, hidden_size: int, dtype=None, device=None):
super().__init__()
self.mlp = torch.nn.Sequential(
torch.nn.Linear(input_dim, hidden_size, bias=True, dtype=dtype, device=device),
torch.nn.SiLU(),
torch.nn.Linear(hidden_size, hidden_size, bias=True, dtype=dtype, device=device),
)
def forward(self, x: torch.Tensor) -> torch.Tensor:
return self.mlp(x)
#################################################################################
# Core DiT Model #
#################################################################################
def split_qkv(qkv, head_dim):
qkv = qkv.reshape(qkv.shape[0], qkv.shape[1], 3, -1, head_dim).movedim(2, 0)
return qkv[0], qkv[1], qkv[2]
def optimized_attention(qkv, num_heads):
return attention(qkv[0], qkv[1], qkv[2], num_heads)
class SelfAttention(torch.nn.Module):
ATTENTION_MODES = ("xformers", "torch", "torch-hb", "math", "debug")
def __init__(
self,
dim: int,
num_heads: int = 8,
qkv_bias: bool = False,
qk_scale: Optional[float] = None,
attn_mode: str = "xformers",
pre_only: bool = False,
qk_norm: Optional[str] = None,
rmsnorm: bool = False,
dtype=None,
device=None,
):
super().__init__()
self.num_heads = num_heads
self.head_dim = dim // num_heads
self.qkv = torch.nn.Linear(dim, dim * 3, bias=qkv_bias, dtype=dtype, device=device)
if not pre_only:
self.proj = torch.nn.Linear(dim, dim, dtype=dtype, device=device)
assert attn_mode in self.ATTENTION_MODES
self.attn_mode = attn_mode
self.pre_only = pre_only
if qk_norm == "rms":
self.ln_q = RMSNorm(
self.head_dim,
elementwise_affine=True,
eps=1.0e-6,
dtype=dtype,
device=device,
)
self.ln_k = RMSNorm(
self.head_dim,
elementwise_affine=True,
eps=1.0e-6,
dtype=dtype,
device=device,
)
elif qk_norm == "ln":
self.ln_q = torch.nn.LayerNorm(
self.head_dim,
elementwise_affine=True,
eps=1.0e-6,
dtype=dtype,
device=device,
)
self.ln_k = torch.nn.LayerNorm(
self.head_dim,
elementwise_affine=True,
eps=1.0e-6,
dtype=dtype,
device=device,
)
elif qk_norm is None:
self.ln_q = torch.nn.Identity()
self.ln_k = torch.nn.Identity()
else:
raise ValueError(qk_norm)
def pre_attention(self, x: torch.Tensor):
B, L, C = x.shape
qkv = self.qkv(x)
q, k, v = split_qkv(qkv, self.head_dim)
q = self.ln_q(q).reshape(q.shape[0], q.shape[1], -1)
k = self.ln_k(k).reshape(q.shape[0], q.shape[1], -1)
return (q, k, v)
def post_attention(self, x: torch.Tensor) -> torch.Tensor:
assert not self.pre_only
x = self.proj(x)
return x
def forward(self, x: torch.Tensor) -> torch.Tensor:
(q, k, v) = self.pre_attention(x)
x = attention(q, k, v, self.num_heads)
x = self.post_attention(x)
return x
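# A shape sketch for SelfAttention with toy sizes (qk_norm left at its default of None):
def _self_attention_shape_example():
    attn_layer = SelfAttention(dim=64, num_heads=8)
    x = torch.randn(2, 16, 64)  # (batch, tokens, channels)
    assert attn_layer(x).shape == (2, 16, 64)  # attention preserves the input shape
    return attn_layer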
class RMSNorm(torch.nn.Module):
def __init__(
self,
dim: int,
elementwise_affine: bool = False,
eps: float = 1e-6,
device=None,
dtype=None,
):
"""
Initialize the RMSNorm normalization layer.
Args:
dim (int): The dimension of the input tensor.
eps (float, optional): A small value added to the denominator for numerical stability. Default is 1e-6.
Attributes:
eps (float): A small value added to the denominator for numerical stability.
weight (torch.nn.Parameter): Learnable scaling parameter.
"""
super().__init__()
self.eps = eps
self.learnable_scale = elementwise_affine
if self.learnable_scale:
self.weight = torch.nn.Parameter(torch.empty(dim, device=device, dtype=dtype))
else:
self.register_parameter("weight", None)
def _norm(self, x):
"""
Apply the RMSNorm normalization to the input tensor.
Args:
x (torch.Tensor): The input tensor.
Returns:
torch.Tensor: The normalized tensor.
"""
return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
def forward(self, x):
"""
Forward pass through the RMSNorm layer.
Args:
x (torch.Tensor): The input tensor.
Returns:
torch.Tensor: The output tensor after applying RMSNorm.
"""
x = self._norm(x)
if self.learnable_scale:
return x * self.weight.to(device=x.device, dtype=x.dtype)
else:
return x
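# RMSNorm in one line, matching _norm above: x * rsqrt(mean(x**2) + eps); the learned
# weight only applies when elementwise_affine=True (toy shapes, illustrative only).
def _rms_norm_example():
    norm = RMSNorm(dim=8, elementwise_affine=False)
    x = torch.randn(2, 4, 8)
    expected = x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + 1e-6)
    assert torch.allclose(norm(x), expected)
    return norm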
class SwiGLUFeedForward(torch.nn.Module):
def __init__(
self,
dim: int,
hidden_dim: int,
multiple_of: int,
ffn_dim_multiplier: Optional[float] = None,
):
"""
Initialize the FeedForward module.
Args:
dim (int): Input dimension.
hidden_dim (int): Hidden dimension of the feedforward layer.
multiple_of (int): Value to ensure hidden dimension is a multiple of this value.
ffn_dim_multiplier (float, optional): Custom multiplier for hidden dimension. Defaults to None.
Attributes:
w1 (ColumnParallelLinear): Linear transformation for the first layer.
w2 (RowParallelLinear): Linear transformation for the second layer.
w3 (ColumnParallelLinear): Linear transformation for the third layer.
"""
super().__init__()
hidden_dim = int(2 * hidden_dim / 3)
# custom dim factor multiplier
if ffn_dim_multiplier is not None:
hidden_dim = int(ffn_dim_multiplier * hidden_dim)
hidden_dim = multiple_of * ((hidden_dim + multiple_of - 1) // multiple_of)
self.w1 = torch.nn.Linear(dim, hidden_dim, bias=False)
self.w2 = torch.nn.Linear(hidden_dim, dim, bias=False)
self.w3 = torch.nn.Linear(dim, hidden_dim, bias=False)
def forward(self, x):
return self.w2(torch.nn.functional.silu(self.w1(x)) * self.w3(x))
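# A shape sketch for SwiGLUFeedForward (toy sizes): the effective hidden width is
# int(2 * hidden_dim / 3) rounded up to a multiple of `multiple_of`, and the output
# keeps the input dimension.
def _swiglu_shape_example():
    ff = SwiGLUFeedForward(dim=64, hidden_dim=256, multiple_of=32)  # hidden becomes 192
    x = torch.randn(2, 10, 64)
    assert ff(x).shape == (2, 10, 64)
    return ff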
class DismantledBlock(torch.nn.Module):
"""A DiT block with gated adaptive layer norm (adaLN) conditioning."""
ATTENTION_MODES = ("xformers", "torch", "torch-hb", "math", "debug")
def __init__(
self,
hidden_size: int,
num_heads: int,
mlp_ratio: float = 4.0,
attn_mode: str = "xformers",
qkv_bias: bool = False,
pre_only: bool = False,
rmsnorm: bool = False,
scale_mod_only: bool = False,
swiglu: bool = False,
qk_norm: Optional[str] = None,
x_block_self_attn: bool = False,
dtype=None,
device=None,
**block_kwargs,
):
super().__init__()
assert attn_mode in self.ATTENTION_MODES
if not rmsnorm:
self.norm1 = torch.nn.LayerNorm(
hidden_size,
elementwise_affine=False,
eps=1e-6,
dtype=dtype,
device=device,
)
else:
self.norm1 = RMSNorm(hidden_size, elementwise_affine=False, eps=1e-6)
self.attn = SelfAttention(
dim=hidden_size,
num_heads=num_heads,
qkv_bias=qkv_bias,
attn_mode=attn_mode,
pre_only=pre_only,
qk_norm=qk_norm,
rmsnorm=rmsnorm,
dtype=dtype,
device=device,
)
if x_block_self_attn:
assert not pre_only
assert not scale_mod_only
self.x_block_self_attn = True
self.attn2 = SelfAttention(
dim=hidden_size,
num_heads=num_heads,
qkv_bias=qkv_bias,
attn_mode=attn_mode,
pre_only=False,
qk_norm=qk_norm,
rmsnorm=rmsnorm,
dtype=dtype,
device=device,
)
else:
self.x_block_self_attn = False
if not pre_only:
if not rmsnorm:
self.norm2 = torch.nn.LayerNorm(
hidden_size,
elementwise_affine=False,
eps=1e-6,
dtype=dtype,
device=device,
)
else:
self.norm2 = RMSNorm(hidden_size, elementwise_affine=False, eps=1e-6)
mlp_hidden_dim = int(hidden_size * mlp_ratio)
if not pre_only:
if not swiglu:
self.mlp = Mlp(
in_features=hidden_size,
hidden_features=mlp_hidden_dim,
act_layer=torch.nn.GELU(approximate="tanh"),
dtype=dtype,
device=device,
)
else:
self.mlp = SwiGLUFeedForward(dim=hidden_size, hidden_dim=mlp_hidden_dim, multiple_of=256)
self.scale_mod_only = scale_mod_only
if x_block_self_attn:
assert not pre_only
assert not scale_mod_only
n_mods = 9
elif not scale_mod_only:
n_mods = 6 if not pre_only else 2
else:
n_mods = 4 if not pre_only else 1
self.adaLN_modulation = torch.nn.Sequential(
torch.nn.SiLU(),
torch.nn.Linear(hidden_size, n_mods * hidden_size, bias=True, dtype=dtype, device=device),
)
self.pre_only = pre_only
def pre_attention(self, x: torch.Tensor, c: torch.Tensor):
assert x is not None, "pre_attention called with None input"
if not self.pre_only:
if not self.scale_mod_only:
shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = self.adaLN_modulation(c).chunk(
6, dim=1
)
else:
shift_msa = None
shift_mlp = None
scale_msa, gate_msa, scale_mlp, gate_mlp = self.adaLN_modulation(c).chunk(4, dim=1)
qkv = self.attn.pre_attention(modulate(self.norm1(x), shift_msa, scale_msa))
return qkv, (x, gate_msa, shift_mlp, scale_mlp, gate_mlp)
else:
if not self.scale_mod_only:
shift_msa, scale_msa = self.adaLN_modulation(c).chunk(2, dim=1)
else:
shift_msa = None
scale_msa = self.adaLN_modulation(c)
qkv = self.attn.pre_attention(modulate(self.norm1(x), shift_msa, scale_msa))
return qkv, None
def post_attention(
self,
attn: torch.Tensor,
x: torch.Tensor,
gate_msa: torch.Tensor,
shift_mlp: torch.Tensor,
scale_mlp: torch.Tensor,
gate_mlp: torch.Tensor,
) -> torch.Tensor:
assert not self.pre_only
x = x + gate_msa.unsqueeze(1) * self.attn.post_attention(attn)
x = x + gate_mlp.unsqueeze(1) * self.mlp(modulate(self.norm2(x), shift_mlp, scale_mlp))
return x
def pre_attention_x(
self, x: torch.Tensor, c: torch.Tensor
) -> tuple[
tuple[torch.Tensor, torch.Tensor, torch.Tensor],
tuple[torch.Tensor, torch.Tensor, torch.Tensor],
tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor],
]:
assert self.x_block_self_attn
(
shift_msa,
scale_msa,
gate_msa,
shift_mlp,
scale_mlp,
gate_mlp,
shift_msa2,
scale_msa2,
gate_msa2,
) = self.adaLN_modulation(c).chunk(9, dim=1)
x_norm = self.norm1(x)
qkv = self.attn.pre_attention(modulate(x_norm, shift_msa, scale_msa))
qkv2 = self.attn2.pre_attention(modulate(x_norm, shift_msa2, scale_msa2))
return (
qkv,
qkv2,
(
x,
gate_msa,
shift_mlp,
scale_mlp,
gate_mlp,
gate_msa2,
),
)
def post_attention_x(
self,
attn: torch.Tensor,
attn2: torch.Tensor,
x: torch.Tensor,
gate_msa: torch.Tensor,
shift_mlp: torch.Tensor,
scale_mlp: torch.Tensor,
gate_mlp: torch.Tensor,
gate_msa2: torch.Tensor,
attn1_dropout: float = 0.0,
):
assert not self.pre_only
if attn1_dropout > 0.0:
# Use torch.bernoulli to implement dropout, only dropout the batch dimension
attn1_dropout = torch.bernoulli(torch.full((attn.size(0), 1, 1), 1 - attn1_dropout, device=attn.device))
attn_ = gate_msa.unsqueeze(1) * self.attn.post_attention(attn) * attn1_dropout
else:
attn_ = gate_msa.unsqueeze(1) * self.attn.post_attention(attn)
x = x + attn_
attn2_ = gate_msa2.unsqueeze(1) * self.attn2.post_attention(attn2)
x = x + attn2_
mlp_ = gate_mlp.unsqueeze(1) * self.mlp(modulate(self.norm2(x), shift_mlp, scale_mlp))
x = x + mlp_
return x, (gate_msa, gate_msa2, gate_mlp, attn_, attn2_)
def forward(self, x: torch.Tensor, c: torch.Tensor):
assert not self.pre_only
if self.x_block_self_attn:
(q, k, v), (q2, k2, v2), intermediates = self.pre_attention_x(x, c)
attn = attention(q, k, v, self.attn.num_heads)
attn2 = attention(q2, k2, v2, self.attn2.num_heads)
return self.post_attention_x(attn, attn2, *intermediates)
else:
(q, k, v), intermediates = self.pre_attention(x, c)
attn = attention(q, k, v, self.attn.num_heads)
return self.post_attention(attn, *intermediates)
def block_mixing(
context: torch.Tensor, x: torch.Tensor, context_block: DismantledBlock, x_block: DismantledBlock, c: torch.Tensor
):
assert context is not None, "block_mixing called with None context"
context_qkv, context_intermediates = context_block.pre_attention(context, c)
if x_block.x_block_self_attn:
x_qkv, x_qkv2, x_intermediates = x_block.pre_attention_x(x, c)
else:
x_qkv, x_intermediates = x_block.pre_attention(x, c)
o: list[torch.Tensor] = []
for t in range(3):
o.append(torch.cat((context_qkv[t], x_qkv[t]), dim=1))
q, k, v = tuple(o)
attn = attention(q, k, v, x_block.attn.num_heads)
context_attn, x_attn = (
attn[:, : context_qkv[0].shape[1]],
attn[:, context_qkv[0].shape[1] :],
)
if not context_block.pre_only:
context = context_block.post_attention(context_attn, *context_intermediates)
else:
context = None
    if x_block.x_block_self_attn:
        x_q2, x_k2, x_v2 = x_qkv2
        attn2 = attention(x_q2, x_k2, x_v2, x_block.attn2.num_heads)
        # fold both attention branches back into x before returning
        x, _ = x_block.post_attention_x(x_attn, attn2, *x_intermediates)
    else:
        x = x_block.post_attention(x_attn, *x_intermediates)
return context, x
class JointBlock(torch.nn.Module):
"""just a small wrapper to serve as a fsdp unit"""
def __init__(self, *args, **kwargs):
super().__init__()
pre_only = kwargs.pop("pre_only")
qk_norm = kwargs.pop("qk_norm", None)
x_block_self_attn = kwargs.pop("x_block_self_attn", False)
self.context_block = DismantledBlock(*args, pre_only=pre_only, qk_norm=qk_norm, **kwargs)
self.x_block = DismantledBlock(
*args,
pre_only=False,
qk_norm=qk_norm,
x_block_self_attn=x_block_self_attn,
**kwargs,
)
def forward(self, *args, **kwargs):
return block_mixing(*args, context_block=self.context_block, x_block=self.x_block, **kwargs)
class FinalLayer(torch.nn.Module):
"""
The final layer of DiT.
"""
def __init__(
self,
hidden_size: int,
patch_size: int,
out_channels: int,
total_out_channels: Optional[int] = None,
dtype: Optional[torch.dtype] = None,
device: Optional[torch.device] = None,
):
super().__init__()
self.norm_final = torch.nn.LayerNorm(
hidden_size, elementwise_affine=False, eps=1e-6, dtype=dtype, device=device
)
self.linear = (
torch.nn.Linear(
hidden_size,
patch_size * patch_size * out_channels,
bias=True,
dtype=dtype,
device=device,
)
if (total_out_channels is None)
else torch.nn.Linear(hidden_size, total_out_channels, bias=True, dtype=dtype, device=device)
)
self.adaLN_modulation = torch.nn.Sequential(
torch.nn.SiLU(),
torch.nn.Linear(hidden_size, 2 * hidden_size, bias=True, dtype=dtype, device=device),
)
def forward(self, x: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
shift, scale = self.adaLN_modulation(c).chunk(2, dim=1)
x = modulate(self.norm_final(x), shift, scale)
x = self.linear(x)
return x
class MMDiTX(torch.nn.Module):
"""Diffusion model with a Transformer backbone."""
def __init__(
self,
input_size: int | None = 32,
patch_size: int = 2,
in_channels: int = 4,
depth: int = 28,
mlp_ratio: float = 4.0,
learn_sigma: bool = False,
adm_in_channels: Optional[int] = None,
context_embedder_config: Optional[Dict] = None,
register_length: int = 0,
attn_mode: str = "torch",
rmsnorm: bool = False,
scale_mod_only: bool = False,
swiglu: bool = False,
out_channels: Optional[int] = None,
pos_embed_scaling_factor: Optional[float] = None,
pos_embed_offset: Optional[float] = None,
pos_embed_max_size: Optional[int] = None,
num_patches: Optional[int] = None,
qk_norm: Optional[str] = None,
x_block_self_attn_layers: Optional[List[int]] = None,
qkv_bias: bool = True,
dtype: Optional[torch.dtype] = None,
device: Optional[torch.device] = None,
verbose: bool = False,
):
super().__init__()
if verbose:
print(
f"mmdit initializing with: {input_size=}, {patch_size=}, {in_channels=}, {depth=}, {mlp_ratio=}, {learn_sigma=}, {adm_in_channels=}, {context_embedder_config=}, {register_length=}, {attn_mode=}, {rmsnorm=}, {scale_mod_only=}, {swiglu=}, {out_channels=}, {pos_embed_scaling_factor=}, {pos_embed_offset=}, {pos_embed_max_size=}, {num_patches=}, {qk_norm=}, {qkv_bias=}, {dtype=}, {device=}"
)
self.dtype = dtype
self.learn_sigma = learn_sigma
self.in_channels = in_channels
default_out_channels = in_channels * 2 if learn_sigma else in_channels
self.out_channels = out_channels if out_channels is not None else default_out_channels
self.patch_size = patch_size
self.pos_embed_scaling_factor = pos_embed_scaling_factor
self.pos_embed_offset = pos_embed_offset
self.pos_embed_max_size = pos_embed_max_size
self.x_block_self_attn_layers = x_block_self_attn_layers or []
# apply magic --> this defines a head_size of 64
hidden_size = 64 * depth
num_heads = depth
self.num_heads = num_heads
self.x_embedder = PatchEmbed(
input_size,
patch_size,
in_channels,
hidden_size,
bias=True,
strict_img_size=self.pos_embed_max_size is None,
dtype=dtype,
device=device,
)
self.t_embedder = TimestepEmbedder(hidden_size, dtype=dtype, device=device)
if adm_in_channels is not None:
assert isinstance(adm_in_channels, int)
self.y_embedder = VectorEmbedder(adm_in_channels, hidden_size, dtype=dtype, device=device)
self.context_embedder = torch.nn.Identity()
if context_embedder_config is not None:
if context_embedder_config["target"] == "torch.nn.Linear":
self.context_embedder = torch.nn.Linear(**context_embedder_config["params"], dtype=dtype, device=device)
self.register_length = register_length
if self.register_length > 0:
self.register = torch.nn.Parameter(torch.randn(1, register_length, hidden_size, dtype=dtype, device=device))
# num_patches = self.x_embedder.num_patches
# Will use fixed sin-cos embedding:
# just use a buffer already
if num_patches is not None:
self.register_buffer(
"pos_embed",
torch.zeros(1, num_patches, hidden_size, dtype=dtype, device=device),
)
else:
self.pos_embed = None
self.joint_blocks = torch.nn.ModuleList(
[
JointBlock(
hidden_size,
num_heads,
mlp_ratio=mlp_ratio,
qkv_bias=qkv_bias,
attn_mode=attn_mode,
pre_only=i == depth - 1,
rmsnorm=rmsnorm,
scale_mod_only=scale_mod_only,
swiglu=swiglu,
qk_norm=qk_norm,
x_block_self_attn=(i in self.x_block_self_attn_layers),
dtype=dtype,
device=device,
)
for i in range(depth)
]
)
self.final_layer = FinalLayer(hidden_size, patch_size, self.out_channels, dtype=dtype, device=device)
def cropped_pos_embed(self, hw: torch.Size) -> torch.Tensor:
assert self.pos_embed_max_size is not None
p = self.x_embedder.patch_size[0]
h, w = hw
# patched size
h = h // p
w = w // p
assert h <= self.pos_embed_max_size, (h, self.pos_embed_max_size)
assert w <= self.pos_embed_max_size, (w, self.pos_embed_max_size)
top = (self.pos_embed_max_size - h) // 2
left = (self.pos_embed_max_size - w) // 2
spatial_pos_embed: torch.Tensor = rearrange(
self.pos_embed,
"1 (h w) c -> 1 h w c",
h=self.pos_embed_max_size,
w=self.pos_embed_max_size,
) # type: ignore Type checking does not correctly infer the type of the self.pos_embed buffer.
spatial_pos_embed = spatial_pos_embed[:, top : top + h, left : left + w, :]
spatial_pos_embed = rearrange(spatial_pos_embed, "1 h w c -> 1 (h w) c")
return spatial_pos_embed
def unpatchify(self, x: torch.Tensor, hw: Optional[torch.Size] = None) -> torch.Tensor:
"""
x: (N, T, patch_size**2 * C)
imgs: (N, H, W, C)
"""
c = self.out_channels
p = self.x_embedder.patch_size[0]
if hw is None:
h = w = int(x.shape[1] ** 0.5)
else:
h, w = hw
h = h // p
w = w // p
assert h * w == x.shape[1]
x = x.reshape(shape=(x.shape[0], h, w, p, p, c))
x = torch.einsum("nhwpqc->nchpwq", x)
imgs = x.reshape(shape=(x.shape[0], c, h * p, w * p))
return imgs
def forward_core_with_concat(
self,
x: torch.Tensor,
c_mod: torch.Tensor,
context: Optional[torch.Tensor] = None,
) -> torch.Tensor:
if self.register_length > 0:
context = torch.cat(
(
repeat(self.register, "1 ... -> b ...", b=x.shape[0]),
context if context is not None else torch.Tensor([]).type_as(x),
),
1,
)
# context is B, L', D
# x is B, L, D
for block in self.joint_blocks:
context, x = block(context, x, c=c_mod)
x = self.final_layer(x, c_mod) # (N, T, patch_size ** 2 * out_channels)
return x
def forward(
self,
x: torch.Tensor,
t: torch.Tensor,
y: Optional[torch.Tensor] = None,
context: Optional[torch.Tensor] = None,
) -> torch.Tensor:
"""
Forward pass of DiT.
x: (N, C, H, W) tensor of spatial inputs (images or latent representations of images)
t: (N,) tensor of diffusion timesteps
y: (N,) tensor of class labels
"""
hw = x.shape[-2:]
x = self.x_embedder(x) + self.cropped_pos_embed(hw)
c = self.t_embedder(t, dtype=x.dtype) # (N, D)
if y is not None:
y = self.y_embedder(y) # (N, D)
c = c + y # (N, D)
context = self.context_embedder(context)
x = self.forward_core_with_concat(x, c, context)
x = self.unpatchify(x, hw=hw) # (N, out_channels, H, W)
return x
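# A tiny end-to-end sketch of MMDiTX with toy hyperparameters; these are illustrative
# values only (real configurations are read from the checkpoint's tensor shapes, as in
# sd3_impls.py), but they exercise patch embedding, the joint blocks and unpatchify.
def _mmditx_forward_example():
    model = MMDiTX(
        input_size=None,
        patch_size=2,
        in_channels=4,
        depth=2,  # hidden_size = 64 * depth = 128, num_heads = 2
        num_patches=64,  # 8x8 positional-embedding grid, stored as a buffer
        pos_embed_max_size=8,
    )
    x = torch.randn(1, 4, 16, 16)  # 16x16 latent -> 8x8 = 64 patches
    t = torch.tensor([500.0])
    context = torch.randn(1, 77, 128)  # context tokens must match hidden_size
    out = model(x, t, context=context)
    assert out.shape == (1, 4, 16, 16)  # out_channels defaults to in_channels
    return out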


@@ -1,795 +0,0 @@
# This file was originally copied from:
# https://github.com/Stability-AI/sd3.5/blob/19bf11c4e1e37324c5aa5a61f010d4127848a09c/other_impls.py
### This file contains impls for underlying related models (CLIP, T5, etc)
import math
from typing import Callable, Optional
import torch
from transformers import CLIPTokenizer, T5TokenizerFast
#################################################################################################
### Core/Utility
#################################################################################################
def attention(
q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, heads: int, mask: Optional[torch.Tensor] = None
) -> torch.Tensor:
"""Convenience wrapper around a basic attention operation"""
b, _, dim_head = q.shape
dim_head //= heads
q, k, v = map(lambda t: t.view(b, -1, heads, dim_head).transpose(1, 2), (q, k, v))
out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
return out.transpose(1, 2).reshape(b, -1, heads * dim_head)
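# A shape sketch for the attention() wrapper above: q/k/v come in as (batch, tokens,
# heads * dim_head) and the output has the same shape (toy sizes, illustrative only).
def _attention_shape_example():
    b, tokens, heads, dim_head = 2, 16, 4, 8
    q, k, v = (torch.randn(b, tokens, heads * dim_head) for _ in range(3))
    assert attention(q, k, v, heads).shape == (b, tokens, heads * dim_head)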
class Mlp(torch.nn.Module):
"""MLP as used in Vision Transformer, MLP-Mixer and related networks"""
def __init__(
self,
in_features: int,
hidden_features: Optional[int] = None,
out_features: Optional[int] = None,
act_layer: Callable[[torch.Tensor], torch.Tensor] | None = None,
bias: bool = True,
dtype: Optional[torch.dtype] = None,
device: Optional[torch.device] = None,
):
super().__init__()
out_features = out_features or in_features
hidden_features = hidden_features or in_features
if act_layer is None:
act_layer = torch.nn.functional.gelu
self.fc1 = torch.nn.Linear(in_features, hidden_features, bias=bias, dtype=dtype, device=device)
self.act = act_layer
self.fc2 = torch.nn.Linear(hidden_features, out_features, bias=bias, dtype=dtype, device=device)
def forward(self, x: torch.Tensor) -> torch.Tensor:
x = self.fc1(x)
x = self.act(x)
x = self.fc2(x)
return x
#################################################################################################
### CLIP
#################################################################################################
class CLIPAttention(torch.nn.Module):
def __init__(self, embed_dim, heads, dtype, device):
super().__init__()
self.heads = heads
self.q_proj = torch.nn.Linear(embed_dim, embed_dim, bias=True, dtype=dtype, device=device)
self.k_proj = torch.nn.Linear(embed_dim, embed_dim, bias=True, dtype=dtype, device=device)
self.v_proj = torch.nn.Linear(embed_dim, embed_dim, bias=True, dtype=dtype, device=device)
self.out_proj = torch.nn.Linear(embed_dim, embed_dim, bias=True, dtype=dtype, device=device)
def forward(self, x, mask=None):
q = self.q_proj(x)
k = self.k_proj(x)
v = self.v_proj(x)
out = attention(q, k, v, self.heads, mask)
return self.out_proj(out)
ACTIVATIONS = {
"quick_gelu": lambda a: a * torch.sigmoid(1.702 * a),
"gelu": torch.nn.functional.gelu,
}
class CLIPLayer(torch.nn.Module):
def __init__(
self,
embed_dim,
heads,
intermediate_size,
intermediate_activation,
dtype,
device,
):
super().__init__()
self.layer_norm1 = torch.nn.LayerNorm(embed_dim, dtype=dtype, device=device)
self.self_attn = CLIPAttention(embed_dim, heads, dtype, device)
self.layer_norm2 = torch.nn.LayerNorm(embed_dim, dtype=dtype, device=device)
# self.mlp = CLIPMLP(embed_dim, intermediate_size, intermediate_activation, dtype, device)
self.mlp = Mlp(
embed_dim,
intermediate_size,
embed_dim,
act_layer=ACTIVATIONS[intermediate_activation],
dtype=dtype,
device=device,
)
def forward(self, x, mask=None):
x += self.self_attn(self.layer_norm1(x), mask)
x += self.mlp(self.layer_norm2(x))
return x
class CLIPEncoder(torch.nn.Module):
def __init__(
self,
num_layers,
embed_dim,
heads,
intermediate_size,
intermediate_activation,
dtype,
device,
):
super().__init__()
self.layers = torch.nn.ModuleList(
[
CLIPLayer(
embed_dim,
heads,
intermediate_size,
intermediate_activation,
dtype,
device,
)
for i in range(num_layers)
]
)
def forward(self, x, mask=None, intermediate_output=None):
if intermediate_output is not None:
if intermediate_output < 0:
intermediate_output = len(self.layers) + intermediate_output
intermediate = None
for i, l in enumerate(self.layers):
x = l(x, mask)
if i == intermediate_output:
intermediate = x.clone()
return x, intermediate
class CLIPEmbeddings(torch.nn.Module):
def __init__(self, embed_dim, vocab_size=49408, num_positions=77, dtype=None, device=None):
super().__init__()
self.token_embedding = torch.nn.Embedding(vocab_size, embed_dim, dtype=dtype, device=device)
self.position_embedding = torch.nn.Embedding(num_positions, embed_dim, dtype=dtype, device=device)
def forward(self, input_tokens):
return self.token_embedding(input_tokens) + self.position_embedding.weight
class CLIPTextModel_(torch.nn.Module):
def __init__(self, config_dict, dtype, device):
num_layers = config_dict["num_hidden_layers"]
embed_dim = config_dict["hidden_size"]
heads = config_dict["num_attention_heads"]
intermediate_size = config_dict["intermediate_size"]
intermediate_activation = config_dict["hidden_act"]
super().__init__()
self.embeddings = CLIPEmbeddings(embed_dim, dtype=torch.float32, device=device)
self.encoder = CLIPEncoder(
num_layers,
embed_dim,
heads,
intermediate_size,
intermediate_activation,
dtype,
device,
)
self.final_layer_norm = torch.nn.LayerNorm(embed_dim, dtype=dtype, device=device)
def forward(self, input_tokens, intermediate_output=None, final_layer_norm_intermediate=True):
x = self.embeddings(input_tokens)
causal_mask = torch.empty(x.shape[1], x.shape[1], dtype=x.dtype, device=x.device).fill_(float("-inf")).triu_(1)
x, i = self.encoder(x, mask=causal_mask, intermediate_output=intermediate_output)
x = self.final_layer_norm(x)
if i is not None and final_layer_norm_intermediate:
i = self.final_layer_norm(i)
pooled_output = x[
torch.arange(x.shape[0], device=x.device),
input_tokens.to(dtype=torch.int, device=x.device).argmax(dim=-1),
]
return x, i, pooled_output
class CLIPTextModel(torch.nn.Module):
def __init__(self, config_dict, dtype, device):
super().__init__()
self.num_layers = config_dict["num_hidden_layers"]
self.text_model = CLIPTextModel_(config_dict, dtype, device)
embed_dim = config_dict["hidden_size"]
self.text_projection = torch.nn.Linear(embed_dim, embed_dim, bias=False, dtype=dtype, device=device)
self.text_projection.weight.copy_(torch.eye(embed_dim))
self.dtype = dtype
def get_input_embeddings(self):
return self.text_model.embeddings.token_embedding
def set_input_embeddings(self, embeddings):
self.text_model.embeddings.token_embedding = embeddings
def forward(self, *args, **kwargs):
x = self.text_model(*args, **kwargs)
out = self.text_projection(x[2])
return (x[0], x[1], out, x[2])
def parse_parentheses(string):
result = []
current_item = ""
nesting_level = 0
for char in string:
if char == "(":
if nesting_level == 0:
if current_item:
result.append(current_item)
current_item = "("
else:
current_item = "("
else:
current_item += char
nesting_level += 1
elif char == ")":
nesting_level -= 1
if nesting_level == 0:
result.append(current_item + ")")
current_item = ""
else:
current_item += char
else:
current_item += char
if current_item:
result.append(current_item)
return result
def token_weights(string, current_weight):
a = parse_parentheses(string)
out = []
for x in a:
weight = current_weight
if len(x) >= 2 and x[-1] == ")" and x[0] == "(":
x = x[1:-1]
xx = x.rfind(":")
weight *= 1.1
if xx > 0:
try:
weight = float(x[xx + 1 :])
x = x[:xx]
except:
pass
out += token_weights(x, weight)
else:
out += [(x, current_weight)]
return out
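# A small sketch of the prompt-weight syntax handled above: "(text)" multiplies the
# running weight by 1.1 and "(text:1.2)" sets it explicitly (illustrative prompt only).
def _token_weights_example():
    parsed = token_weights("a (red:1.2) ball", 1.0)
    assert parsed == [("a ", 1.0), ("red", 1.2), (" ball", 1.0)]
    return parsed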
def escape_important(text):
text = text.replace("\\)", "\0\1")
text = text.replace("\\(", "\0\2")
return text
def unescape_important(text):
text = text.replace("\0\1", ")")
text = text.replace("\0\2", "(")
return text
class SDTokenizer:
def __init__(
self,
max_length=77,
pad_with_end=True,
tokenizer=None,
has_start_token=True,
pad_to_max_length=True,
min_length=None,
extra_padding_token=None,
):
self.tokenizer = tokenizer
self.max_length = max_length
self.min_length = min_length
empty = self.tokenizer("")["input_ids"]
if has_start_token:
self.tokens_start = 1
self.start_token = empty[0]
self.end_token = empty[1]
else:
self.tokens_start = 0
self.start_token = None
self.end_token = empty[0]
self.pad_with_end = pad_with_end
self.pad_to_max_length = pad_to_max_length
self.extra_padding_token = extra_padding_token
vocab = self.tokenizer.get_vocab()
self.inv_vocab = {v: k for k, v in vocab.items()}
self.max_word_length = 8
def tokenize_with_weights(self, text: str, return_word_ids=False):
"""
Tokenize the text, with weight values - presume 1.0 for all and ignore other features here.
The details aren't relevant for a reference impl, and the weights themselves have a weak effect on SD3.
"""
if self.pad_with_end:
pad_token = self.end_token
else:
pad_token = 0
text = escape_important(text)
parsed_weights = token_weights(text, 1.0)
# tokenize words
tokens = []
for weighted_segment, weight in parsed_weights:
to_tokenize = unescape_important(weighted_segment).replace("\n", " ").split(" ")
to_tokenize = [x for x in to_tokenize if x != ""]
for word in to_tokenize:
# parse word
tokens.append([(t, weight) for t in self.tokenizer(word)["input_ids"][self.tokens_start : -1]])
# reshape token array to CLIP input size
batched_tokens = []
batch = []
if self.start_token is not None:
batch.append((self.start_token, 1.0, 0))
batched_tokens.append(batch)
for i, t_group in enumerate(tokens):
# determine if we're going to try and keep the tokens in a single batch
is_large = len(t_group) >= self.max_word_length
while len(t_group) > 0:
if len(t_group) + len(batch) > self.max_length - 1:
remaining_length = self.max_length - len(batch) - 1
# break word in two and add end token
if is_large:
batch.extend([(t, w, i + 1) for t, w in t_group[:remaining_length]])
batch.append((self.end_token, 1.0, 0))
t_group = t_group[remaining_length:]
# add end token and pad
else:
batch.append((self.end_token, 1.0, 0))
if self.pad_to_max_length:
batch.extend([(pad_token, 1.0, 0)] * (remaining_length))
# start new batch
batch = []
if self.start_token is not None:
batch.append((self.start_token, 1.0, 0))
batched_tokens.append(batch)
else:
batch.extend([(t, w, i + 1) for t, w in t_group])
t_group = []
# pad with the extra padding token first, before adding the end token
if self.extra_padding_token is not None:
batch.extend([(self.extra_padding_token, 1.0, 0)] * (self.min_length - len(batch) - 1))
# fill last batch
batch.append((self.end_token, 1.0, 0))
if self.pad_to_max_length:
batch.extend([(pad_token, 1.0, 0)] * (self.max_length - len(batch)))
if self.min_length is not None and len(batch) < self.min_length:
batch.extend([(pad_token, 1.0, 0)] * (self.min_length - len(batch)))
if not return_word_ids:
batched_tokens = [[(t, w) for t, w, _ in x] for x in batched_tokens]
return batched_tokens
def untokenize(self, token_weight_pair):
return list(map(lambda a: (a, self.inv_vocab[a[0]]), token_weight_pair))
class SDXLClipGTokenizer(SDTokenizer):
def __init__(self, tokenizer):
super().__init__(pad_with_end=False, tokenizer=tokenizer)
class SD3Tokenizer:
def __init__(self):
clip_tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
self.clip_l = SDTokenizer(tokenizer=clip_tokenizer)
self.clip_g = SDXLClipGTokenizer(clip_tokenizer)
self.t5xxl = T5XXLTokenizer()
def tokenize_with_weights(self, text: str):
out = {}
out["l"] = self.clip_l.tokenize_with_weights(text)
out["g"] = self.clip_g.tokenize_with_weights(text)
out["t5xxl"] = self.t5xxl.tokenize_with_weights(text[:226])
return out
class ClipTokenWeightEncoder:
def encode_token_weights(self, token_weight_pairs):
tokens = list(map(lambda a: a[0], token_weight_pairs[0]))
out, pooled = self([tokens])
if pooled is not None:
first_pooled = pooled[0:1].cpu()
else:
first_pooled = pooled
output = [out[0:1]]
return torch.cat(output, dim=-2).cpu(), first_pooled
class SDClipModel(torch.nn.Module, ClipTokenWeightEncoder):
"""Uses the CLIP transformer encoder for text (from huggingface)"""
LAYERS = ["last", "pooled", "hidden"]
def __init__(
self,
device="cpu",
max_length=77,
layer="last",
layer_idx=None,
textmodel_json_config=None,
dtype=None,
model_class=CLIPTextModel,
special_tokens={"start": 49406, "end": 49407, "pad": 49407},
layer_norm_hidden_state=True,
return_projected_pooled=True,
):
super().__init__()
assert layer in self.LAYERS
self.transformer = model_class(textmodel_json_config, dtype, device)
self.num_layers = self.transformer.num_layers
self.max_length = max_length
self.transformer = self.transformer.eval()
for param in self.parameters():
param.requires_grad = False
self.layer = layer
self.layer_idx = None
self.special_tokens = special_tokens
self.logit_scale = torch.nn.Parameter(torch.tensor(4.6055))
self.layer_norm_hidden_state = layer_norm_hidden_state
self.return_projected_pooled = return_projected_pooled
if layer == "hidden":
assert layer_idx is not None
assert abs(layer_idx) < self.num_layers
self.set_clip_options({"layer": layer_idx})
self.options_default = (
self.layer,
self.layer_idx,
self.return_projected_pooled,
)
def set_clip_options(self, options):
layer_idx = options.get("layer", self.layer_idx)
self.return_projected_pooled = options.get("projected_pooled", self.return_projected_pooled)
if layer_idx is None or abs(layer_idx) > self.num_layers:
self.layer = "last"
else:
self.layer = "hidden"
self.layer_idx = layer_idx
def forward(self, tokens):
backup_embeds = self.transformer.get_input_embeddings()
device = backup_embeds.weight.device
tokens = torch.LongTensor(tokens).to(device)
outputs = self.transformer(
tokens,
intermediate_output=self.layer_idx,
final_layer_norm_intermediate=self.layer_norm_hidden_state,
)
self.transformer.set_input_embeddings(backup_embeds)
if self.layer == "last":
z = outputs[0]
else:
z = outputs[1]
pooled_output = None
if len(outputs) >= 3:
if not self.return_projected_pooled and len(outputs) >= 4 and outputs[3] is not None:
pooled_output = outputs[3].float()
elif outputs[2] is not None:
pooled_output = outputs[2].float()
return z.float(), pooled_output
class SDXLClipG(SDClipModel):
"""Wraps the CLIP-G model into the SD-CLIP-Model interface"""
def __init__(self, config, device="cpu", layer="penultimate", layer_idx=None, dtype=None):
if layer == "penultimate":
layer = "hidden"
layer_idx = -2
super().__init__(
device=device,
layer=layer,
layer_idx=layer_idx,
textmodel_json_config=config,
dtype=dtype,
special_tokens={"start": 49406, "end": 49407, "pad": 0},
layer_norm_hidden_state=False,
)
class T5XXLModel(SDClipModel):
"""Wraps the T5-XXL model into the SD-CLIP-Model interface for convenience"""
def __init__(self, config, device="cpu", layer="last", layer_idx=None, dtype=None):
super().__init__(
device=device,
layer=layer,
layer_idx=layer_idx,
textmodel_json_config=config,
dtype=dtype,
special_tokens={"end": 1, "pad": 0},
model_class=T5,
)
#################################################################################################
### T5 implementation, for the T5-XXL text encoder portion, largely pulled from upstream impl
#################################################################################################
class T5XXLTokenizer(SDTokenizer):
"""Wraps the T5 Tokenizer from HF into the SDTokenizer interface"""
def __init__(self):
super().__init__(
pad_with_end=False,
tokenizer=T5TokenizerFast.from_pretrained("google/t5-v1_1-xxl"),
has_start_token=False,
pad_to_max_length=False,
max_length=99999999,
min_length=77,
)
class T5LayerNorm(torch.nn.Module):
def __init__(self, hidden_size, eps=1e-6, dtype=None, device=None):
super().__init__()
self.weight = torch.nn.Parameter(torch.ones(hidden_size, dtype=dtype, device=device))
self.variance_epsilon = eps
def forward(self, x):
variance = x.pow(2).mean(-1, keepdim=True)
x = x * torch.rsqrt(variance + self.variance_epsilon)
return self.weight.to(device=x.device, dtype=x.dtype) * x
class T5DenseGatedActDense(torch.nn.Module):
def __init__(self, model_dim, ff_dim, dtype, device):
super().__init__()
self.wi_0 = torch.nn.Linear(model_dim, ff_dim, bias=False, dtype=dtype, device=device)
self.wi_1 = torch.nn.Linear(model_dim, ff_dim, bias=False, dtype=dtype, device=device)
self.wo = torch.nn.Linear(ff_dim, model_dim, bias=False, dtype=dtype, device=device)
def forward(self, x):
hidden_gelu = torch.nn.functional.gelu(self.wi_0(x), approximate="tanh")
hidden_linear = self.wi_1(x)
x = hidden_gelu * hidden_linear
x = self.wo(x)
return x
class T5LayerFF(torch.nn.Module):
def __init__(self, model_dim, ff_dim, dtype, device):
super().__init__()
self.DenseReluDense = T5DenseGatedActDense(model_dim, ff_dim, dtype, device)
self.layer_norm = T5LayerNorm(model_dim, dtype=dtype, device=device)
def forward(self, x):
forwarded_states = self.layer_norm(x)
forwarded_states = self.DenseReluDense(forwarded_states)
x += forwarded_states
return x
class T5Attention(torch.nn.Module):
def __init__(self, model_dim, inner_dim, num_heads, relative_attention_bias, dtype, device):
super().__init__()
# Mesh TensorFlow initialization to avoid scaling before softmax
self.q = torch.nn.Linear(model_dim, inner_dim, bias=False, dtype=dtype, device=device)
self.k = torch.nn.Linear(model_dim, inner_dim, bias=False, dtype=dtype, device=device)
self.v = torch.nn.Linear(model_dim, inner_dim, bias=False, dtype=dtype, device=device)
self.o = torch.nn.Linear(inner_dim, model_dim, bias=False, dtype=dtype, device=device)
self.num_heads = num_heads
self.relative_attention_bias = None
if relative_attention_bias:
self.relative_attention_num_buckets = 32
self.relative_attention_max_distance = 128
self.relative_attention_bias = torch.nn.Embedding(
self.relative_attention_num_buckets, self.num_heads, device=device
)
@staticmethod
def _relative_position_bucket(relative_position, bidirectional=True, num_buckets=32, max_distance=128):
"""
Adapted from Mesh Tensorflow:
https://github.com/tensorflow/mesh/blob/0cb87fe07da627bf0b7e60475d59f95ed6b5be3d/mesh_tensorflow/transformer/transformer_layers.py#L593
Translate relative position to a bucket number for relative attention. The relative position is defined as
memory_position - query_position, i.e. the distance in tokens from the attending position to the attended-to
position. If bidirectional=False, then positive relative positions are invalid. We use smaller buckets for
small absolute relative_position and larger buckets for larger absolute relative_positions. All relative
positions >=max_distance map to the same bucket. All relative positions <=-max_distance map to the same bucket.
This should allow for more graceful generalization to longer sequences than the model has been trained on
Args:
relative_position: an int32 Tensor
bidirectional: a boolean - whether the attention is bidirectional
num_buckets: an integer
max_distance: an integer
Returns:
a Tensor with the same shape as relative_position, containing int32 values in the range [0, num_buckets)
"""
relative_buckets = 0
if bidirectional:
num_buckets //= 2
relative_buckets += (relative_position > 0).to(torch.long) * num_buckets
relative_position = torch.abs(relative_position)
else:
relative_position = -torch.min(relative_position, torch.zeros_like(relative_position))
# now relative_position is in the range [0, inf)
# half of the buckets are for exact increments in positions
max_exact = num_buckets // 2
is_small = relative_position < max_exact
# The other half of the buckets are for logarithmically bigger bins in positions up to max_distance
relative_position_if_large = max_exact + (
torch.log(relative_position.float() / max_exact)
/ math.log(max_distance / max_exact)
* (num_buckets - max_exact)
).to(torch.long)
relative_position_if_large = torch.min(
relative_position_if_large,
torch.full_like(relative_position_if_large, num_buckets - 1),
)
relative_buckets += torch.where(is_small, relative_position, relative_position_if_large)
return relative_buckets
def compute_bias(self, query_length, key_length, device):
"""Compute binned relative position bias"""
context_position = torch.arange(query_length, dtype=torch.long, device=device)[:, None]
memory_position = torch.arange(key_length, dtype=torch.long, device=device)[None, :]
relative_position = memory_position - context_position # shape (query_length, key_length)
relative_position_bucket = self._relative_position_bucket(
relative_position, # shape (query_length, key_length)
bidirectional=True,
num_buckets=self.relative_attention_num_buckets,
max_distance=self.relative_attention_max_distance,
)
values = self.relative_attention_bias(relative_position_bucket) # shape (query_length, key_length, num_heads)
values = values.permute([2, 0, 1]).unsqueeze(0) # shape (1, num_heads, query_length, key_length)
return values
def forward(self, x, past_bias=None):
q = self.q(x)
k = self.k(x)
v = self.v(x)
if self.relative_attention_bias is not None:
past_bias = self.compute_bias(x.shape[1], x.shape[1], x.device)
if past_bias is not None:
mask = past_bias
out = attention(q, k * ((k.shape[-1] / self.num_heads) ** 0.5), v, self.num_heads, mask)
return self.o(out), past_bias
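# A small sketch of the relative-position bucketing above: small offsets get their own
# buckets while distant offsets share logarithmically spaced ones (illustrative range).
def _relative_position_bucket_example():
    rel = torch.arange(-4, 5).unsqueeze(0)  # offsets -4 .. 4
    buckets = T5Attention._relative_position_bucket(rel, bidirectional=True, num_buckets=32)
    assert buckets.shape == rel.shape
    return buckets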
class T5LayerSelfAttention(torch.nn.Module):
def __init__(
self,
model_dim,
inner_dim,
ff_dim,
num_heads,
relative_attention_bias,
dtype,
device,
):
super().__init__()
self.SelfAttention = T5Attention(model_dim, inner_dim, num_heads, relative_attention_bias, dtype, device)
self.layer_norm = T5LayerNorm(model_dim, dtype=dtype, device=device)
def forward(self, x, past_bias=None):
output, past_bias = self.SelfAttention(self.layer_norm(x), past_bias=past_bias)
x += output
return x, past_bias
class T5Block(torch.nn.Module):
def __init__(
self,
model_dim,
inner_dim,
ff_dim,
num_heads,
relative_attention_bias,
dtype,
device,
):
super().__init__()
self.layer = torch.nn.ModuleList()
self.layer.append(
T5LayerSelfAttention(
model_dim,
inner_dim,
ff_dim,
num_heads,
relative_attention_bias,
dtype,
device,
)
)
self.layer.append(T5LayerFF(model_dim, ff_dim, dtype, device))
def forward(self, x, past_bias=None):
x, past_bias = self.layer[0](x, past_bias)
x = self.layer[-1](x)
return x, past_bias
class T5Stack(torch.nn.Module):
def __init__(
self,
num_layers,
model_dim,
inner_dim,
ff_dim,
num_heads,
vocab_size,
dtype,
device,
):
super().__init__()
self.embed_tokens = torch.nn.Embedding(vocab_size, model_dim, device=device)
self.block = torch.nn.ModuleList(
[
T5Block(
model_dim,
inner_dim,
ff_dim,
num_heads,
relative_attention_bias=(i == 0),
dtype=dtype,
device=device,
)
for i in range(num_layers)
]
)
self.final_layer_norm = T5LayerNorm(model_dim, dtype=dtype, device=device)
def forward(self, input_ids, intermediate_output=None, final_layer_norm_intermediate=True):
intermediate = None
x = self.embed_tokens(input_ids)
past_bias = None
for i, l in enumerate(self.block):
x, past_bias = l(x, past_bias)
if i == intermediate_output:
intermediate = x.clone()
x = self.final_layer_norm(x)
if intermediate is not None and final_layer_norm_intermediate:
intermediate = self.final_layer_norm(intermediate)
return x, intermediate
class T5(torch.nn.Module):
def __init__(self, config_dict, dtype, device):
super().__init__()
self.num_layers = config_dict["num_layers"]
self.encoder = T5Stack(
self.num_layers,
config_dict["d_model"],
config_dict["d_model"],
config_dict["d_ff"],
config_dict["num_heads"],
config_dict["vocab_size"],
dtype,
device,
)
self.dtype = dtype
def get_input_embeddings(self):
return self.encoder.embed_tokens
def set_input_embeddings(self, embeddings):
self.encoder.embed_tokens = embeddings
def forward(self, *args, **kwargs):
return self.encoder(*args, **kwargs)


@@ -1,609 +0,0 @@
# This file was originally copied from:
# https://github.com/Stability-AI/sd3.5/blob/19bf11c4e1e37324c5aa5a61f010d4127848a09c/sd3_impls.py
### Impls of the SD3 core diffusion model and VAE
import math
import re
import einops
import torch
from PIL import Image
from tqdm import tqdm
from invokeai.backend.sd3.mmditx import MMDiTX
#################################################################################################
### MMDiT Model Wrapping
#################################################################################################
class ModelSamplingDiscreteFlow(torch.nn.Module):
"""Helper for sampler scheduling (ie timestep/sigma calculations) for Discrete Flow models"""
def __init__(self, shift: float = 1.0):
super().__init__()
self.shift = shift
timesteps = 1000
ts = self.sigma(torch.arange(1, timesteps + 1, 1))
self.register_buffer("sigmas", ts)
@property
def sigma_min(self):
return self.sigmas[0]
@property
def sigma_max(self):
return self.sigmas[-1]
def timestep(self, sigma: torch.Tensor) -> torch.Tensor:
return sigma * 1000
def sigma(self, timestep: torch.Tensor):
timestep = timestep / 1000.0
if self.shift == 1.0:
return timestep
return self.shift * timestep / (1 + (self.shift - 1) * timestep)
def calculate_denoised(
self, sigma: torch.Tensor, model_output: torch.Tensor, model_input: torch.Tensor
) -> torch.Tensor:
sigma = sigma.view(sigma.shape[:1] + (1,) * (model_output.ndim - 1))
return model_input - model_output * sigma
def noise_scaling(self, sigma, noise, latent_image, max_denoise=False):
return sigma * noise + (1.0 - sigma) * latent_image
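# A worked example of the shifted discrete-flow schedule above: with shift=3.0 the
# midpoint timestep t=500 maps to sigma = 3 * 0.5 / (1 + 2 * 0.5) = 0.75 instead of 0.5.
def _model_sampling_example():
    sampling = ModelSamplingDiscreteFlow(shift=3.0)
    assert abs(float(sampling.sigma(torch.tensor(500.0))) - 0.75) < 1e-6
    return sampling.sigmas  # 1000 sigmas, ending at sigma_max == 1.0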
class BaseModel(torch.nn.Module):
"""Wrapper around the core MM-DiT model"""
def __init__(
self,
shift=1.0,
device=None,
dtype=torch.float32,
file=None,
prefix="",
verbose=False,
):
super().__init__()
# Important configuration values can be quickly determined by checking shapes in the source file
# Some of these will vary between models (eg 2B vs 8B primarily differ in their depth, but also other details change)
patch_size = file.get_tensor(f"{prefix}x_embedder.proj.weight").shape[2]
depth = file.get_tensor(f"{prefix}x_embedder.proj.weight").shape[0] // 64
num_patches = file.get_tensor(f"{prefix}pos_embed").shape[1]
pos_embed_max_size = round(math.sqrt(num_patches))
adm_in_channels = file.get_tensor(f"{prefix}y_embedder.mlp.0.weight").shape[1]
context_shape = file.get_tensor(f"{prefix}context_embedder.weight").shape
qk_norm = "rms" if f"{prefix}joint_blocks.0.context_block.attn.ln_k.weight" in file.keys() else None
x_block_self_attn_layers = sorted(
[
int(key.split(".x_block.attn2.ln_k.weight")[0].split(".")[-1])
for key in list(filter(re.compile(".*.x_block.attn2.ln_k.weight").match, file.keys()))
]
)
context_embedder_config = {
"target": "torch.nn.Linear",
"params": {
"in_features": context_shape[1],
"out_features": context_shape[0],
},
}
self.diffusion_model = MMDiTX(
input_size=None,
pos_embed_scaling_factor=None,
pos_embed_offset=None,
pos_embed_max_size=pos_embed_max_size,
patch_size=patch_size,
in_channels=16,
depth=depth,
num_patches=num_patches,
adm_in_channels=adm_in_channels,
context_embedder_config=context_embedder_config,
qk_norm=qk_norm,
x_block_self_attn_layers=x_block_self_attn_layers,
device=device,
dtype=dtype,
verbose=verbose,
)
self.model_sampling = ModelSamplingDiscreteFlow(shift=shift)
def apply_model(
self, x: torch.Tensor, sigma: float, c_crossattn: torch.Tensor | None = None, y: torch.Tensor | None = None
):
dtype = self.get_dtype()
timestep = self.model_sampling.timestep(sigma).float()
model_output = self.diffusion_model(x.to(dtype), timestep, context=c_crossattn.to(dtype), y=y.to(dtype)).float()
return self.model_sampling.calculate_denoised(sigma, model_output, x)
def forward(self, *args, **kwargs):
return self.apply_model(*args, **kwargs)
def get_dtype(self):
return self.diffusion_model.dtype
class CFGDenoiser(torch.nn.Module):
"""Helper for applying CFG Scaling to diffusion outputs"""
def __init__(self, model):
super().__init__()
self.model = model
def forward(self, x, timestep, cond, uncond, cond_scale):
# Run cond and uncond in a batch together
batched = self.model.apply_model(
torch.cat([x, x]),
torch.cat([timestep, timestep]),
c_crossattn=torch.cat([cond["c_crossattn"], uncond["c_crossattn"]]),
y=torch.cat([cond["y"], uncond["y"]]),
)
# Then split and apply CFG Scaling
pos_out, neg_out = batched.chunk(2)
scaled = neg_out + (pos_out - neg_out) * cond_scale
return scaled
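# The CFG combination above in isolation: the output moves from the unconditional
# prediction toward the conditional one by cond_scale (illustrative numbers only).
def _cfg_scaling_example():
    pos_out, neg_out, cond_scale = torch.tensor([1.0]), torch.tensor([0.2]), 4.0
    scaled = neg_out + (pos_out - neg_out) * cond_scale
    assert torch.allclose(scaled, torch.tensor([3.4]))
    return scaled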
class SD3LatentFormat:
"""Latents are slightly shifted from center - this class must be called after VAE Decode to correct for the shift"""
def __init__(self):
self.scale_factor = 1.5305
self.shift_factor = 0.0609
def process_in(self, latent):
return (latent - self.shift_factor) * self.scale_factor
def process_out(self, latent):
return (latent / self.scale_factor) + self.shift_factor
def decode_latent_to_preview(self, x0):
"""Quick RGB approximate preview of sd3 latents"""
factors = torch.tensor(
[
[-0.0645, 0.0177, 0.1052],
[0.0028, 0.0312, 0.0650],
[0.1848, 0.0762, 0.0360],
[0.0944, 0.0360, 0.0889],
[0.0897, 0.0506, -0.0364],
[-0.0020, 0.1203, 0.0284],
[0.0855, 0.0118, 0.0283],
[-0.0539, 0.0658, 0.1047],
[-0.0057, 0.0116, 0.0700],
[-0.0412, 0.0281, -0.0039],
[0.1106, 0.1171, 0.1220],
[-0.0248, 0.0682, -0.0481],
[0.0815, 0.0846, 0.1207],
[-0.0120, -0.0055, -0.0867],
[-0.0749, -0.0634, -0.0456],
[-0.1418, -0.1457, -0.1259],
],
device="cpu",
)
latent_image = x0[0].permute(1, 2, 0).cpu() @ factors
latents_ubyte = (
((latent_image + 1) / 2)
.clamp(0, 1) # change scale from -1..1 to 0..1
.mul(0xFF) # to 0..255
.byte()
).cpu()
return Image.fromarray(latents_ubyte.numpy())
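# process_in/process_out above are inverses: scale-and-shift on the way into the model,
# undo it on the way back out (toy latent, illustrative only).
def _latent_format_roundtrip_example():
    fmt = SD3LatentFormat()
    latent = torch.randn(1, 16, 8, 8)
    assert torch.allclose(fmt.process_out(fmt.process_in(latent)), latent, atol=1e-6)
    return fmt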
#################################################################################################
### Samplers
#################################################################################################
def append_dims(x, target_dims):
"""Appends dimensions to the end of a tensor until it has target_dims dimensions."""
dims_to_append = target_dims - x.ndim
return x[(...,) + (None,) * dims_to_append]
def to_d(x, sigma, denoised):
"""Converts a denoiser output to a Karras ODE derivative."""
return (x - denoised) / append_dims(sigma, x.ndim)
@torch.no_grad()
@torch.autocast("cuda", dtype=torch.float16)
def sample_euler(model, x, sigmas, extra_args=None):
"""Implements Algorithm 2 (Euler steps) from Karras et al. (2022)."""
extra_args = {} if extra_args is None else extra_args
s_in = x.new_ones([x.shape[0]])
for i in tqdm(range(len(sigmas) - 1)):
sigma_hat = sigmas[i]
denoised = model(x, sigma_hat * s_in, **extra_args)
d = to_d(x, sigma_hat, denoised)
dt = sigmas[i + 1] - sigma_hat
# Euler method
x = x + d * dt
return x
@torch.no_grad()
@torch.autocast("cuda", dtype=torch.float16)
def sample_dpmpp_2m(model, x, sigmas, extra_args=None):
"""DPM-Solver++(2M)."""
extra_args = {} if extra_args is None else extra_args
s_in = x.new_ones([x.shape[0]])
sigma_fn = lambda t: t.neg().exp()
t_fn = lambda sigma: sigma.log().neg()
old_denoised = None
for i in tqdm(range(len(sigmas) - 1)):
denoised = model(x, sigmas[i] * s_in, **extra_args)
t, t_next = t_fn(sigmas[i]), t_fn(sigmas[i + 1])
h = t_next - t
if old_denoised is None or sigmas[i + 1] == 0:
x = (sigma_fn(t_next) / sigma_fn(t)) * x - (-h).expm1() * denoised
else:
h_last = t - t_fn(sigmas[i - 1])
r = h_last / h
denoised_d = (1 + 1 / (2 * r)) * denoised - (1 / (2 * r)) * old_denoised
x = (sigma_fn(t_next) / sigma_fn(t)) * x - (-h).expm1() * denoised_d
old_denoised = denoised
return x
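# A minimal sketch of how these samplers are driven. The wiring below is hypothetical:
# the real pipeline builds `model` (a BaseModel), the cond/uncond dicts (holding the
# "c_crossattn" and "y" tensors) and the sigma schedule elsewhere.
def _sampler_usage_example(model, cond, uncond):
    denoiser = CFGDenoiser(model)
    sigmas = model.model_sampling.sigmas.flip(0)[:20]  # high noise -> low noise
    x = torch.randn(1, 16, 64, 64) * sigmas[0]  # start from (roughly) pure noise
    extra_args = {"cond": cond, "uncond": uncond, "cond_scale": 4.5}
    return sample_euler(denoiser, x, sigmas, extra_args=extra_args)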
#################################################################################################
### VAE
#################################################################################################
def Normalize(in_channels, num_groups=32, dtype=torch.float32, device=None):
return torch.nn.GroupNorm(
num_groups=num_groups,
num_channels=in_channels,
eps=1e-6,
affine=True,
dtype=dtype,
device=device,
)
class ResnetBlock(torch.nn.Module):
def __init__(self, *, in_channels, out_channels=None, dtype=torch.float32, device=None):
super().__init__()
self.in_channels = in_channels
out_channels = in_channels if out_channels is None else out_channels
self.out_channels = out_channels
self.norm1 = Normalize(in_channels, dtype=dtype, device=device)
self.conv1 = torch.nn.Conv2d(
in_channels,
out_channels,
kernel_size=3,
stride=1,
padding=1,
dtype=dtype,
device=device,
)
self.norm2 = Normalize(out_channels, dtype=dtype, device=device)
self.conv2 = torch.nn.Conv2d(
out_channels,
out_channels,
kernel_size=3,
stride=1,
padding=1,
dtype=dtype,
device=device,
)
if self.in_channels != self.out_channels:
self.nin_shortcut = torch.nn.Conv2d(
in_channels,
out_channels,
kernel_size=1,
stride=1,
padding=0,
dtype=dtype,
device=device,
)
else:
self.nin_shortcut = None
self.swish = torch.nn.SiLU(inplace=True)
def forward(self, x):
hidden = x
hidden = self.norm1(hidden)
hidden = self.swish(hidden)
hidden = self.conv1(hidden)
hidden = self.norm2(hidden)
hidden = self.swish(hidden)
hidden = self.conv2(hidden)
if self.in_channels != self.out_channels:
x = self.nin_shortcut(x)
return x + hidden
class AttnBlock(torch.nn.Module):
def __init__(self, in_channels, dtype=torch.float32, device=None):
super().__init__()
self.norm = Normalize(in_channels, dtype=dtype, device=device)
self.q = torch.nn.Conv2d(
in_channels,
in_channels,
kernel_size=1,
stride=1,
padding=0,
dtype=dtype,
device=device,
)
self.k = torch.nn.Conv2d(
in_channels,
in_channels,
kernel_size=1,
stride=1,
padding=0,
dtype=dtype,
device=device,
)
self.v = torch.nn.Conv2d(
in_channels,
in_channels,
kernel_size=1,
stride=1,
padding=0,
dtype=dtype,
device=device,
)
self.proj_out = torch.nn.Conv2d(
in_channels,
in_channels,
kernel_size=1,
stride=1,
padding=0,
dtype=dtype,
device=device,
)
def forward(self, x):
hidden = self.norm(x)
q = self.q(hidden)
k = self.k(hidden)
v = self.v(hidden)
b, c, h, w = q.shape
q, k, v = map(
lambda x: einops.rearrange(x, "b c h w -> b 1 (h w) c").contiguous(),
(q, k, v),
)
hidden = torch.nn.functional.scaled_dot_product_attention(q, k, v)  # scale defaults to dim ** -0.5
hidden = einops.rearrange(hidden, "b 1 (h w) c -> b c h w", h=h, w=w, c=c, b=b)
hidden = self.proj_out(hidden)
return x + hidden
class Downsample(torch.nn.Module):
def __init__(self, in_channels, dtype=torch.float32, device=None):
super().__init__()
self.conv = torch.nn.Conv2d(
in_channels,
in_channels,
kernel_size=3,
stride=2,
padding=0,
dtype=dtype,
device=device,
)
def forward(self, x):
pad = (0, 1, 0, 1)
x = torch.nn.functional.pad(x, pad, mode="constant", value=0)
x = self.conv(x)
return x
class Upsample(torch.nn.Module):
def __init__(self, in_channels, dtype=torch.float32, device=None):
super().__init__()
self.conv = torch.nn.Conv2d(
in_channels,
in_channels,
kernel_size=3,
stride=1,
padding=1,
dtype=dtype,
device=device,
)
def forward(self, x):
x = torch.nn.functional.interpolate(x, scale_factor=2.0, mode="nearest")
x = self.conv(x)
return x
class VAEEncoder(torch.nn.Module):
def __init__(
self,
ch=128,
ch_mult=(1, 2, 4, 4),
num_res_blocks=2,
in_channels=3,
z_channels=16,
dtype=torch.float32,
device=None,
):
super().__init__()
self.num_resolutions = len(ch_mult)
self.num_res_blocks = num_res_blocks
# downsampling
self.conv_in = torch.nn.Conv2d(
in_channels,
ch,
kernel_size=3,
stride=1,
padding=1,
dtype=dtype,
device=device,
)
in_ch_mult = (1,) + tuple(ch_mult)
self.in_ch_mult = in_ch_mult
self.down = torch.nn.ModuleList()
for i_level in range(self.num_resolutions):
block = torch.nn.ModuleList()
attn = torch.nn.ModuleList()
block_in = ch * in_ch_mult[i_level]
block_out = ch * ch_mult[i_level]
for i_block in range(num_res_blocks):
block.append(
ResnetBlock(
in_channels=block_in,
out_channels=block_out,
dtype=dtype,
device=device,
)
)
block_in = block_out
down = torch.nn.Module()
down.block = block
down.attn = attn
if i_level != self.num_resolutions - 1:
down.downsample = Downsample(block_in, dtype=dtype, device=device)
self.down.append(down)
# middle
self.mid = torch.nn.Module()
self.mid.block_1 = ResnetBlock(in_channels=block_in, out_channels=block_in, dtype=dtype, device=device)
self.mid.attn_1 = AttnBlock(block_in, dtype=dtype, device=device)
self.mid.block_2 = ResnetBlock(in_channels=block_in, out_channels=block_in, dtype=dtype, device=device)
# end
self.norm_out = Normalize(block_in, dtype=dtype, device=device)
self.conv_out = torch.nn.Conv2d(
block_in,
2 * z_channels,
kernel_size=3,
stride=1,
padding=1,
dtype=dtype,
device=device,
)
self.swish = torch.nn.SiLU(inplace=True)
def forward(self, x):
# downsampling
hs = [self.conv_in(x)]
for i_level in range(self.num_resolutions):
for i_block in range(self.num_res_blocks):
h = self.down[i_level].block[i_block](hs[-1])
hs.append(h)
if i_level != self.num_resolutions - 1:
hs.append(self.down[i_level].downsample(hs[-1]))
# middle
h = hs[-1]
h = self.mid.block_1(h)
h = self.mid.attn_1(h)
h = self.mid.block_2(h)
# end
h = self.norm_out(h)
h = self.swish(h)
h = self.conv_out(h)
return h
class VAEDecoder(torch.nn.Module):
def __init__(
self,
ch=128,
out_ch=3,
ch_mult=(1, 2, 4, 4),
num_res_blocks=2,
resolution=256,
z_channels=16,
dtype=torch.float32,
device=None,
):
super().__init__()
self.num_resolutions = len(ch_mult)
self.num_res_blocks = num_res_blocks
block_in = ch * ch_mult[self.num_resolutions - 1]
curr_res = resolution // 2 ** (self.num_resolutions - 1)
# z to block_in
self.conv_in = torch.nn.Conv2d(
z_channels,
block_in,
kernel_size=3,
stride=1,
padding=1,
dtype=dtype,
device=device,
)
# middle
self.mid = torch.nn.Module()
self.mid.block_1 = ResnetBlock(in_channels=block_in, out_channels=block_in, dtype=dtype, device=device)
self.mid.attn_1 = AttnBlock(block_in, dtype=dtype, device=device)
self.mid.block_2 = ResnetBlock(in_channels=block_in, out_channels=block_in, dtype=dtype, device=device)
# upsampling
self.up = torch.nn.ModuleList()
for i_level in reversed(range(self.num_resolutions)):
block = torch.nn.ModuleList()
block_out = ch * ch_mult[i_level]
for i_block in range(self.num_res_blocks + 1):
block.append(
ResnetBlock(
in_channels=block_in,
out_channels=block_out,
dtype=dtype,
device=device,
)
)
block_in = block_out
up = torch.nn.Module()
up.block = block
if i_level != 0:
up.upsample = Upsample(block_in, dtype=dtype, device=device)
curr_res = curr_res * 2
self.up.insert(0, up) # prepend to get consistent order
# end
self.norm_out = Normalize(block_in, dtype=dtype, device=device)
self.conv_out = torch.nn.Conv2d(
block_in,
out_ch,
kernel_size=3,
stride=1,
padding=1,
dtype=dtype,
device=device,
)
self.swish = torch.nn.SiLU(inplace=True)
def forward(self, z):
# z to block_in
hidden = self.conv_in(z)
# middle
hidden = self.mid.block_1(hidden)
hidden = self.mid.attn_1(hidden)
hidden = self.mid.block_2(hidden)
# upsampling
for i_level in reversed(range(self.num_resolutions)):
for i_block in range(self.num_res_blocks + 1):
hidden = self.up[i_level].block[i_block](hidden)
if i_level != 0:
hidden = self.up[i_level].upsample(hidden)
# end
hidden = self.norm_out(hidden)
hidden = self.swish(hidden)
hidden = self.conv_out(hidden)
return hidden
class SDVAE(torch.nn.Module):
def __init__(self, dtype=torch.float32, device=None):
super().__init__()
self.encoder = VAEEncoder(dtype=dtype, device=device)
self.decoder = VAEDecoder(dtype=dtype, device=device)
@torch.autocast("cuda", dtype=torch.float16)
def decode(self, latent):
return self.decoder(latent)
@torch.autocast("cuda", dtype=torch.float16)
def encode(self, image):
hidden = self.encoder(image)
mean, logvar = torch.chunk(hidden, 2, dim=1)
logvar = torch.clamp(logvar, -30.0, 20.0)
std = torch.exp(0.5 * logvar)
return mean + std * torch.randn_like(mean)
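# Usage note: encode() expects images scaled to [-1, 1] and returns a sampled latent;
# decode() produces images in roughly [-1, 1] that the caller clamps back to [0, 1].
# See vae_encode()/vae_decode() in the inference script below for the exact pre- and
# post-processing.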

View File

@@ -1,426 +0,0 @@
# This file was originally copied from:
# https://github.com/Stability-AI/sd3.5/blob/19bf11c4e1e37324c5aa5a61f010d4127848a09c/sd3_infer.py
# NOTE: Must have folder `models` with the following files:
# - `clip_g.safetensors` (openclip bigG, same as SDXL)
# - `clip_l.safetensors` (OpenAI CLIP-L, same as SDXL)
# - `t5xxl.safetensors` (google T5-v1.1-XXL)
# - `sd3_medium.safetensors` (or whichever main MMDiT model file)
# Optionally, it can also contain:
# - `sd3_vae.safetensors` (holds the VAE separately if needed)
import datetime
import math
import os
import fire
import numpy as np
import sd3_impls
import torch
from other_impls import SD3Tokenizer, SDClipModel, SDXLClipG, T5XXLModel
from PIL import Image
from safetensors import safe_open
from sd3_impls import SDVAE, BaseModel, CFGDenoiser, SD3LatentFormat
from tqdm import tqdm
#################################################################################################
### Wrappers for model parts
#################################################################################################
def load_into(f, model, prefix, device, dtype=None):
"""Just a debugging-friendly hack to apply the weights in a safetensors file to the pytorch module."""
for key in f.keys():
if key.startswith(prefix) and not key.startswith("loss."):
path = key[len(prefix) :].split(".")
obj = model
for p in path:
if isinstance(obj, list):
obj = obj[int(p)]
else:
obj = getattr(obj, p, None)
if obj is None:
print(f"Skipping key '{key}' in safetensors file as '{p}' does not exist in python model")
break
if obj is None:
continue
try:
tensor = f.get_tensor(key).to(device=device)
if dtype is not None:
tensor = tensor.to(dtype=dtype)
obj.requires_grad_(False)
obj.set_(tensor)
except Exception as e:
print(f"Failed to load key '{key}' in safetensors file: {e}")
raise e
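# Illustrative usage (the wrapper classes below follow this pattern): open a safetensors
# file and copy its tensors into an already-constructed module, e.g.
#
#   with safe_open("models/clip_g.safetensors", framework="pt", device="cpu") as f:
#       load_into(f, model.transformer, "", "cpu", torch.float32)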
CLIPG_CONFIG = {
"hidden_act": "gelu",
"hidden_size": 1280,
"intermediate_size": 5120,
"num_attention_heads": 20,
"num_hidden_layers": 32,
}
class ClipG:
def __init__(self):
with safe_open("models/clip_g.safetensors", framework="pt", device="cpu") as f:
self.model = SDXLClipG(CLIPG_CONFIG, device="cpu", dtype=torch.float32)
load_into(f, self.model.transformer, "", "cpu", torch.float32)
CLIPL_CONFIG = {
"hidden_act": "quick_gelu",
"hidden_size": 768,
"intermediate_size": 3072,
"num_attention_heads": 12,
"num_hidden_layers": 12,
}
class ClipL:
def __init__(self):
with safe_open("models/clip_l.safetensors", framework="pt", device="cpu") as f:
self.model = SDClipModel(
layer="hidden",
layer_idx=-2,
device="cpu",
dtype=torch.float32,
layer_norm_hidden_state=False,
return_projected_pooled=False,
textmodel_json_config=CLIPL_CONFIG,
)
load_into(f, self.model.transformer, "", "cpu", torch.float32)
T5_CONFIG = {
"d_ff": 10240,
"d_model": 4096,
"num_heads": 64,
"num_layers": 24,
"vocab_size": 32128,
}
class T5XXL:
def __init__(self):
with safe_open("models/t5xxl.safetensors", framework="pt", device="cpu") as f:
self.model = T5XXLModel(T5_CONFIG, device="cpu", dtype=torch.float32)
load_into(f, self.model.transformer, "", "cpu", torch.float32)
class SD3:
def __init__(self, model, shift, verbose=False):
with safe_open(model, framework="pt", device="cpu") as f:
self.model = BaseModel(
shift=shift,
file=f,
prefix="model.diffusion_model.",
device="cpu",
dtype=torch.float16,
verbose=verbose,
).eval()
load_into(f, self.model, "model.", "cpu", torch.float16)
class VAE:
def __init__(self, model):
with safe_open(model, framework="pt", device="cpu") as f:
self.model = SDVAE(device="cpu", dtype=torch.float16).eval().cpu()
prefix = ""
if any(k.startswith("first_stage_model.") for k in f.keys()):
prefix = "first_stage_model."
load_into(f, self.model, prefix, "cpu", torch.float16)
#################################################################################################
### Main inference logic
#################################################################################################
# Note: Sigma shift value, publicly released models use 3.0
SHIFT = 3.0
# Naturally, adjust to the width/height of the model you have
WIDTH = 1024
HEIGHT = 1024
# Pick your prompt
PROMPT = "a photo of a cat"
# Most models prefer a CFG scale in the 4-5 range, but still work well around 7
CFG_SCALE = 4.5
# Different models want different step counts, but most are good at 50 steps, though that is slow to run
# sd3_medium is quite decent at 28 steps
STEPS = 40
# Seed
SEED = 23
# SEEDTYPE = "fixed"
SEEDTYPE = "rand"
# SEEDTYPE = "roll"
# Actual model file path
# MODEL = "models/sd3_medium.safetensors"
# MODEL = "models/sd3.5_large_turbo.safetensors"
MODEL = "models/sd3.5_large.safetensors"
# VAE model file path, or set None to use the same model file
VAEFile = None # "models/sd3_vae.safetensors"
# Optional init image file path
INIT_IMAGE = None
# If init_image is given, this is the fraction of denoising steps to run (1.0 = full denoise, 0.0 = no denoise at all)
DENOISE = 0.6
# Output file path
OUTDIR = "outputs"
# SAMPLER
# SAMPLER = "euler"
SAMPLER = "dpmpp_2m"
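# Optional sanity check (illustrative sketch): the text-encoder paths below are hard-coded by
# the wrapper classes above, so fail fast with a clear error if any of them are missing.
for _required in ("models/clip_g.safetensors", "models/clip_l.safetensors", "models/t5xxl.safetensors"):
    if not os.path.isfile(_required):
        raise FileNotFoundError(f"Missing expected model file: {_required}")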
class SD3Inferencer:
def print(self, txt):
if self.verbose:
print(txt)
def load(self, model=MODEL, vae=VAEFile, shift=SHIFT, verbose=False):
self.verbose = verbose
print("Loading tokenizers...")
# NOTE: if you need a reference impl for a high performance CLIP tokenizer instead of just using the HF transformers one,
# check https://github.com/Stability-AI/StableSwarmUI/blob/master/src/Utils/CliplikeTokenizer.cs
# (T5 tokenizer is different though)
self.tokenizer = SD3Tokenizer()
print("Loading OpenAI CLIP L...")
self.clip_l = ClipL()
print("Loading OpenCLIP bigG...")
self.clip_g = ClipG()
print("Loading Google T5-v1-XXL...")
self.t5xxl = T5XXL()
print(f"Loading SD3 model {os.path.basename(model)}...")
self.sd3 = SD3(model, shift, verbose)
print("Loading VAE model...")
self.vae = VAE(vae or model)
print("Models loaded.")
def get_empty_latent(self, width, height):
self.print("Prep an empty latent...")
return torch.ones(1, 16, height // 8, width // 8, device="cpu") * 0.0609
def get_sigmas(self, sampling, steps):
start = sampling.timestep(sampling.sigma_max)
end = sampling.timestep(sampling.sigma_min)
timesteps = torch.linspace(start, end, steps)
sigs = []
for x in range(len(timesteps)):
ts = timesteps[x]
sigs.append(sampling.sigma(ts))
sigs += [0.0]
return torch.FloatTensor(sigs)
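# The returned schedule has steps + 1 values, descending from sigma_max to sigma_min and
# terminated with 0.0, which is the form the sample_* functions expect.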
def get_noise(self, seed, latent):
generator = torch.manual_seed(seed)
self.print(f"dtype = {latent.dtype}, layout = {latent.layout}, device = {latent.device}")
return torch.randn(
latent.size(),
dtype=torch.float32,
layout=latent.layout,
generator=generator,
device="cpu",
).to(latent.dtype)
def get_cond(self, prompt):
self.print("Encode prompt...")
tokens = self.tokenizer.tokenize_with_weights(prompt)
l_out, l_pooled = self.clip_l.model.encode_token_weights(tokens["l"])
g_out, g_pooled = self.clip_g.model.encode_token_weights(tokens["g"])
t5_out, t5_pooled = self.t5xxl.model.encode_token_weights(tokens["t5xxl"])
lg_out = torch.cat([l_out, g_out], dim=-1)
lg_out = torch.nn.functional.pad(lg_out, (0, 4096 - lg_out.shape[-1]))
return torch.cat([lg_out, t5_out], dim=-2), torch.cat((l_pooled, g_pooled), dim=-1)
def max_denoise(self, sigmas):
max_sigma = float(self.sd3.model.model_sampling.sigma_max)
sigma = float(sigmas[0])
return math.isclose(max_sigma, sigma, rel_tol=1e-05) or sigma > max_sigma
def fix_cond(self, cond):
cond, pooled = (cond[0].half().cuda(), cond[1].half().cuda())
return {"c_crossattn": cond, "y": pooled}
def do_sampling(
self,
latent,
seed,
conditioning,
neg_cond,
steps,
cfg_scale,
sampler="dpmpp_2m",
denoise=1.0,
) -> torch.Tensor:
self.print("Sampling...")
latent = latent.half().cuda()
self.sd3.model = self.sd3.model.cuda()
noise = self.get_noise(seed, latent).cuda()
sigmas = self.get_sigmas(self.sd3.model.model_sampling, steps).cuda()
sigmas = sigmas[int(steps * (1 - denoise)) :]
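# Worked example: with steps=40 and denoise=0.6, int(40 * 0.4) = 16, so the first 16 sigmas
# are skipped and the remaining 25 values drive 24 sampling steps.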
conditioning = self.fix_cond(conditioning)
neg_cond = self.fix_cond(neg_cond)
extra_args = {"cond": conditioning, "uncond": neg_cond, "cond_scale": cfg_scale}
noise_scaled = self.sd3.model.model_sampling.noise_scaling(sigmas[0], noise, latent, self.max_denoise(sigmas))
sample_fn = getattr(sd3_impls, f"sample_{sampler}")
latent = sample_fn(CFGDenoiser(self.sd3.model), noise_scaled, sigmas, extra_args=extra_args)
latent = SD3LatentFormat().process_out(latent)
self.sd3.model = self.sd3.model.cpu()
self.print("Sampling done")
return latent
def vae_encode(self, image) -> torch.Tensor:
self.print("Encoding image to latent...")
image = image.convert("RGB")
image_np = np.array(image).astype(np.float32) / 255.0
image_np = np.moveaxis(image_np, 2, 0)
batch_images = np.expand_dims(image_np, axis=0).repeat(1, axis=0)
image_torch = torch.from_numpy(batch_images)
image_torch = 2.0 * image_torch - 1.0
image_torch = image_torch.cuda()
self.vae.model = self.vae.model.cuda()
latent = self.vae.model.encode(image_torch).cpu()
self.vae.model = self.vae.model.cpu()
self.print("Encoded")
return latent
def vae_decode(self, latent) -> Image.Image:
self.print("Decoding latent to image...")
latent = latent.cuda()
self.vae.model = self.vae.model.cuda()
image = self.vae.model.decode(latent)
image = image.float()
self.vae.model = self.vae.model.cpu()
image = torch.clamp((image + 1.0) / 2.0, min=0.0, max=1.0)[0]
decoded_np = 255.0 * np.moveaxis(image.cpu().numpy(), 0, 2)
decoded_np = decoded_np.astype(np.uint8)
out_image = Image.fromarray(decoded_np)
self.print("Decoded")
return out_image
def gen_image(
self,
prompts=[PROMPT],
width=WIDTH,
height=HEIGHT,
steps=STEPS,
cfg_scale=CFG_SCALE,
sampler=SAMPLER,
seed=SEED,
seed_type=SEEDTYPE,
out_dir=OUTDIR,
init_image=INIT_IMAGE,
denoise=DENOISE,
):
latent = self.get_empty_latent(width, height)
if init_image:
image_data = Image.open(init_image)
image_data = image_data.resize((width, height), Image.LANCZOS)
latent = self.vae_encode(image_data)
latent = SD3LatentFormat().process_in(latent)
neg_cond = self.get_cond("")
seed_num = None
pbar = tqdm(enumerate(prompts), total=len(prompts), position=0, leave=True)
for i, prompt in pbar:
if seed_type == "roll":
seed_num = seed if seed_num is None else seed_num + 1
elif seed_type == "rand":
seed_num = torch.randint(0, 100000, (1,)).item()
else: # fixed
seed_num = seed
conditioning = self.get_cond(prompt)
sampled_latent = self.do_sampling(
latent,
seed_num,
conditioning,
neg_cond,
steps,
cfg_scale,
sampler,
denoise if init_image else 1.0,
)
image = self.vae_decode(sampled_latent)
save_path = os.path.join(out_dir, f"{i:06d}.png")
self.print(f"Will save to {save_path}")
image.save(save_path)
self.print("Done")
CONFIGS = {
"sd3_medium": {
"shift": 1.0,
"cfg": 5.0,
"steps": 50,
"sampler": "dpmpp_2m",
},
"sd3.5_large": {
"shift": 3.0,
"cfg": 4.5,
"steps": 40,
"sampler": "dpmpp_2m",
},
"sd3.5_large_turbo": {"shift": 3.0, "cfg": 1.0, "steps": 4, "sampler": "euler"},
}
@torch.no_grad()
def main(
prompt=PROMPT,
model=MODEL,
out_dir=OUTDIR,
postfix=None,
seed=SEED,
seed_type=SEEDTYPE,
sampler=None,
steps=None,
cfg=None,
shift=None,
width=WIDTH,
height=HEIGHT,
vae=VAEFile,
init_image=INIT_IMAGE,
denoise=DENOISE,
verbose=False,
):
steps = steps or CONFIGS[os.path.splitext(os.path.basename(model))[0]]["steps"]
cfg = cfg or CONFIGS[os.path.splitext(os.path.basename(model))[0]]["cfg"]
shift = shift or CONFIGS[os.path.splitext(os.path.basename(model))[0]]["shift"]
sampler = sampler or CONFIGS[os.path.splitext(os.path.basename(model))[0]]["sampler"]
inferencer = SD3Inferencer()
inferencer.load(model, vae, shift, verbose)
if isinstance(prompt, str):
if os.path.splitext(prompt)[-1] == ".txt":
with open(prompt, "r") as f:
prompts = [l.strip() for l in f.readlines()]
else:
prompts = [prompt]
out_dir = os.path.join(
out_dir,
os.path.splitext(os.path.basename(model))[0],
os.path.splitext(os.path.basename(prompt))[0][:50]
+ (postfix or datetime.datetime.now().strftime("_%Y-%m-%dT%H-%M-%S")),
)
print(f"Saving images to {out_dir}")
os.makedirs(out_dir, exist_ok=False)
inferencer.gen_image(
prompts,
width,
height,
steps,
cfg,
sampler,
seed,
seed_type,
out_dir,
init_image,
denoise,
)
fire.Fire(main)

View File

@@ -1,72 +0,0 @@
from dataclasses import dataclass
from typing import Literal, TypedDict
import torch
from invokeai.backend.sd3.mmditx import MMDiTX
from invokeai.backend.sd3.sd3_impls import ModelSamplingDiscreteFlow
class ContextEmbedderConfig(TypedDict):
target: Literal["torch.nn.Linear"]
params: dict[str, int]
@dataclass
class Sd3MMDiTXParams:
patch_size: int
depth: int
num_patches: int
pos_embed_max_size: int
adm_in_channels: int
context_shape: tuple[int, int]
qk_norm: Literal["rms", None]
x_block_self_attn_layers: list[int]
context_embedder_config: ContextEmbedderConfig
class Sd3MMDiTX(torch.nn.Module):
"""This class is based closely on
https://github.com/Stability-AI/sd3.5/blob/19bf11c4e1e37324c5aa5a61f010d4127848a09c/sd3_impls.py#L53
but has more standard model loading semantics.
"""
def __init__(
self,
params: Sd3MMDiTXParams,
shift: float = 1.0,
device: torch.device | None = None,
dtype: torch.dtype | None = None,
verbose: bool = False,
):
super().__init__()
self.diffusion_model = MMDiTX(
input_size=None,
pos_embed_scaling_factor=None,
pos_embed_offset=None,
pos_embed_max_size=params.pos_embed_max_size,
patch_size=params.patch_size,
in_channels=16,
depth=params.depth,
num_patches=params.num_patches,
adm_in_channels=params.adm_in_channels,
context_embedder_config=params.context_embedder_config,
qk_norm=params.qk_norm,
x_block_self_attn_layers=params.x_block_self_attn_layers,
device=device,
dtype=dtype,
verbose=verbose,
)
self.model_sampling = ModelSamplingDiscreteFlow(shift=shift)
def apply_model(self, x: torch.Tensor, sigma: torch.Tensor, c_crossattn: torch.Tensor, y: torch.Tensor):
dtype = self.get_dtype()
timestep = self.model_sampling.timestep(sigma).float()
model_output = self.diffusion_model(x.to(dtype), timestep, context=c_crossattn.to(dtype), y=y.to(dtype)).float()
return self.model_sampling.calculate_denoised(sigma, model_output, x)
def forward(self, x: torch.Tensor, sigma: float, c_crossattn: torch.Tensor, y: torch.Tensor):
return self.apply_model(x=x, sigma=sigma, c_crossattn=c_crossattn, y=y)
def get_dtype(self):
return self.diffusion_model.dtype

View File

@@ -1,70 +0,0 @@
import math
import re
from typing import Any, Dict
from invokeai.backend.sd3.sd3_mmditx import ContextEmbedderConfig, Sd3MMDiTXParams
def is_sd3_checkpoint(sd: Dict[str, Any]) -> bool:
"""Is the state dict for an SD3 checkpoint like this one?:
https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/sd3.5_large.safetensors
Note that this checkpoint format contains both the VAE and the MMDiTX model.
This is intended to be a reasonably high-precision detector, but it is not guaranteed to have perfect precision.
"""
# If all of the expected keys are present, then this is very likely an SD3 checkpoint.
expected_keys = {
# VAE decoder and encoder keys.
"first_stage_model.decoder.conv_in.bias",
"first_stage_model.decoder.conv_in.weight",
"first_stage_model.encoder.conv_in.bias",
"first_stage_model.encoder.conv_in.weight",
# MMDiTX keys.
"model.diffusion_model.final_layer.linear.bias",
"model.diffusion_model.final_layer.linear.weight",
"model.diffusion_model.joint_blocks.0.context_block.attn.ln_k.weight",
"model.diffusion_model.joint_blocks.0.context_block.attn.ln_q.weight",
}
return expected_keys.issubset(sd.keys())
def infer_sd3_mmditx_params(sd: Dict[str, Any], prefix: str = "model.diffusion_model.") -> Sd3MMDiTXParams:
"""Infer the MMDiTX model parameters from the state dict.
This logic is based on:
https://github.com/Stability-AI/sd3.5/blob/19bf11c4e1e37324c5aa5a61f010d4127848a09c/sd3_impls.py#L68-L88
"""
patch_size = sd[f"{prefix}x_embedder.proj.weight"].shape[2]
depth = sd[f"{prefix}x_embedder.proj.weight"].shape[0] // 64
num_patches = sd[f"{prefix}pos_embed"].shape[1]
pos_embed_max_size = round(math.sqrt(num_patches))
adm_in_channels = sd[f"{prefix}y_embedder.mlp.0.weight"].shape[1]
context_shape = sd[f"{prefix}context_embedder.weight"].shape
qk_norm = "rms" if f"{prefix}joint_blocks.0.context_block.attn.ln_k.weight" in sd else None
x_block_self_attn_layers = sorted(
[
int(key.split(".x_block.attn2.ln_k.weight")[0].split(".")[-1])
for key in list(filter(re.compile(".*.x_block.attn2.ln_k.weight").match, sd.keys()))
]
)
context_embedder_config: ContextEmbedderConfig = {
"target": "torch.nn.Linear",
"params": {
"in_features": context_shape[1],
"out_features": context_shape[0],
},
}
return Sd3MMDiTXParams(
patch_size=patch_size,
depth=depth,
num_patches=num_patches,
pos_embed_max_size=pos_embed_max_size,
adm_in_channels=adm_in_channels,
context_shape=context_shape,
qk_norm=qk_norm,
x_block_self_attn_layers=x_block_self_attn_layers,
context_embedder_config=context_embedder_config,
)
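# Illustrative usage (hypothetical checkpoint path): load a state dict and run the detector
# before inferring parameters. Loading every tensor just for shapes is wasteful, but it keeps
# the sketch short.
#
#   from safetensors import safe_open
#
#   with safe_open("models/sd3.5_large.safetensors", framework="pt", device="cpu") as f:
#       sd = {key: f.get_tensor(key) for key in f.keys()}
#   if is_sd3_checkpoint(sd):
#       params = infer_sd3_mmditx_params(sd)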

View File

@@ -33,7 +33,7 @@ class PreviewExt(ExtensionBase):
def initial_preview(self, ctx: DenoiseContext):
self.callback(
PipelineIntermediateState(
step=-1,
step=0,
order=ctx.scheduler.order,
total_steps=len(ctx.inputs.timesteps),
timestep=int(ctx.scheduler.config.num_train_timesteps), # TODO: is there any code which uses it?

View File

@@ -3,7 +3,7 @@ from typing import Any, Dict, List, Optional, Tuple, Union
import diffusers
import torch
from diffusers.configuration_utils import ConfigMixin, register_to_config
from diffusers.loaders import FromOriginalControlNetMixin
from diffusers.loaders.single_file_model import FromOriginalModelMixin
from diffusers.models.attention_processor import AttentionProcessor, AttnProcessor
from diffusers.models.controlnet import ControlNetConditioningEmbedding, ControlNetOutput, zero_module
from diffusers.models.embeddings import (
@@ -32,7 +32,9 @@ from invokeai.backend.util.logging import InvokeAILogger
logger = InvokeAILogger.get_logger(__name__)
class ControlNetModel(ModelMixin, ConfigMixin, FromOriginalControlNetMixin):
# NOTE(ryand): I'm not the original author of this code, but for future reference, it appears that this class was copied
# from diffusers in order to add support for the encoder_attention_mask argument.
class ControlNetModel(ModelMixin, ConfigMixin, FromOriginalModelMixin):
"""
A ControlNet model.

View File

@@ -58,7 +58,7 @@
"@dnd-kit/sortable": "^8.0.0",
"@dnd-kit/utilities": "^3.2.2",
"@fontsource-variable/inter": "^5.1.0",
"@invoke-ai/ui-library": "^0.0.42",
"@invoke-ai/ui-library": "^0.0.43",
"@nanostores/react": "^0.7.3",
"@reduxjs/toolkit": "2.2.3",
"@roarr/browser-log-writer": "^1.3.0",

View File

@@ -24,8 +24,8 @@ dependencies:
specifier: ^5.1.0
version: 5.1.0
'@invoke-ai/ui-library':
specifier: ^0.0.42
version: 0.0.42(@chakra-ui/form-control@2.2.0)(@chakra-ui/icon@3.2.0)(@chakra-ui/media-query@3.3.0)(@chakra-ui/menu@2.2.1)(@chakra-ui/spinner@2.1.0)(@chakra-ui/system@2.6.2)(@fontsource-variable/inter@5.1.0)(@types/react@18.3.11)(i18next@23.15.1)(react-dom@18.3.1)(react@18.3.1)
specifier: ^0.0.43
version: 0.0.43(@chakra-ui/form-control@2.2.0)(@chakra-ui/icon@3.2.0)(@chakra-ui/media-query@3.3.0)(@chakra-ui/menu@2.2.1)(@chakra-ui/spinner@2.1.0)(@chakra-ui/system@2.6.2)(@fontsource-variable/inter@5.1.0)(@types/react@18.3.11)(i18next@23.15.1)(react-dom@18.3.1)(react@18.3.1)
'@nanostores/react':
specifier: ^0.7.3
version: 0.7.3(nanostores@0.11.3)(react@18.3.1)
@@ -1696,20 +1696,20 @@ packages:
prettier: 3.3.3
dev: true
/@invoke-ai/ui-library@0.0.42(@chakra-ui/form-control@2.2.0)(@chakra-ui/icon@3.2.0)(@chakra-ui/media-query@3.3.0)(@chakra-ui/menu@2.2.1)(@chakra-ui/spinner@2.1.0)(@chakra-ui/system@2.6.2)(@fontsource-variable/inter@5.1.0)(@types/react@18.3.11)(i18next@23.15.1)(react-dom@18.3.1)(react@18.3.1):
resolution: {integrity: sha512-OuDXRipBO5mu+Nv4qN8cd8MiwiGBdq6h4PirVgPI9/ltbdcIzePgUJ0dJns26lflHSTRWW38I16wl4YTw3mNWA==}
/@invoke-ai/ui-library@0.0.43(@chakra-ui/form-control@2.2.0)(@chakra-ui/icon@3.2.0)(@chakra-ui/media-query@3.3.0)(@chakra-ui/menu@2.2.1)(@chakra-ui/spinner@2.1.0)(@chakra-ui/system@2.6.2)(@fontsource-variable/inter@5.1.0)(@types/react@18.3.11)(i18next@23.15.1)(react-dom@18.3.1)(react@18.3.1):
resolution: {integrity: sha512-t3fPYyks07ue3dEBPJuTHbeDLnDckDCOrtvc07mMDbLOnlPEZ0StaeiNGH+oO8qLzAuMAlSTdswgHfzTc2MmPw==}
peerDependencies:
'@fontsource-variable/inter': ^5.0.16
react: ^18.2.0
react-dom: ^18.2.0
dependencies:
'@chakra-ui/anatomy': 2.2.2
'@chakra-ui/anatomy': 2.3.4
'@chakra-ui/icons': 2.2.4(@chakra-ui/react@2.10.2)(react@18.3.1)
'@chakra-ui/layout': 2.3.1(@chakra-ui/system@2.6.2)(react@18.3.1)
'@chakra-ui/portal': 2.1.0(react-dom@18.3.1)(react@18.3.1)
'@chakra-ui/react': 2.10.2(@emotion/react@11.13.3)(@emotion/styled@11.13.0)(@types/react@18.3.11)(framer-motion@11.10.0)(react-dom@18.3.1)(react@18.3.1)
'@chakra-ui/styled-system': 2.9.2
'@chakra-ui/theme-tools': 2.1.2(@chakra-ui/styled-system@2.9.2)
'@chakra-ui/styled-system': 2.11.2(react@18.3.1)
'@chakra-ui/theme-tools': 2.2.6(@chakra-ui/styled-system@2.11.2)(react@18.3.1)
'@emotion/react': 11.13.3(@types/react@18.3.11)(react@18.3.1)
'@emotion/styled': 11.13.0(@emotion/react@11.13.3)(@types/react@18.3.11)(react@18.3.1)
'@fontsource-variable/inter': 5.1.0

View File

@@ -94,6 +94,7 @@
"close": "Close",
"copy": "Copy",
"copyError": "$t(gallery.copy) Error",
"clipboard": "Clipboard",
"on": "On",
"off": "Off",
"or": "or",
@@ -681,7 +682,8 @@
"recallParameters": "Recall Parameters",
"recallParameter": "Recall {{label}}",
"scheduler": "Scheduler",
"seamless": "Seamless",
"seamlessXAxis": "Seamless X Axis",
"seamlessYAxis": "Seamless Y Axis",
"seed": "Seed",
"steps": "Steps",
"strength": "Image to image strength",
@@ -712,8 +714,12 @@
"convertToDiffusersHelpText4": "This is a one time process only. It might take around 30s-60s depending on the specifications of your computer.",
"convertToDiffusersHelpText5": "Please make sure you have enough disk space. Models generally vary between 2GB-7GB in size.",
"convertToDiffusersHelpText6": "Do you wish to convert this model?",
"noDefaultSettings": "No default settings configured for this model. Visit the Model Manager to add default settings.",
"defaultSettings": "Default Settings",
"defaultSettingsSaved": "Default Settings Saved",
"defaultSettingsOutOfSync": "Some settings do not match the model's defaults:",
"restoreDefaultSettings": "Click to use the model's default settings.",
"usingDefaultSettings": "Using model's default settings",
"delete": "Delete",
"deleteConfig": "Delete Config",
"deleteModel": "Delete Model",
@@ -798,7 +804,6 @@
"uploadImage": "Upload Image",
"urlOrLocalPath": "URL or Local Path",
"urlOrLocalPathHelper": "URLs should point to a single file. Local paths can point to a single file or folder for a single diffusers model.",
"useDefaultSettings": "Use Default Settings",
"vae": "VAE",
"vaePrecision": "VAE Precision",
"variant": "Variant",
@@ -1108,6 +1113,9 @@
"enableInformationalPopovers": "Enable Informational Popovers",
"informationalPopoversDisabled": "Informational Popovers Disabled",
"informationalPopoversDisabledDesc": "Informational popovers have been disabled. Enable them in Settings.",
"enableModelDescriptions": "Enable Model Descriptions in Dropdowns",
"modelDescriptionsDisabled": "Model Descriptions in Dropdowns Disabled",
"modelDescriptionsDisabledDesc": "Model descriptions in dropdowns have been disabled. Enable them in Settings.",
"enableInvisibleWatermark": "Enable Invisible Watermark",
"enableNSFWChecker": "Enable NSFW Checker",
"general": "General",
@@ -1251,6 +1259,33 @@
"heading": "Mask Adjustments",
"paragraphs": ["Adjust the mask."]
},
"inpainting": {
"heading": "Inpainting",
"paragraphs": ["Controls which area is modified, guided by Denoising Strength."]
},
"rasterLayer": {
"heading": "Raster Layer",
"paragraphs": ["Pixel-based content of your canvas, used during image generation."]
},
"regionalGuidance": {
"heading": "Regional Guidance",
"paragraphs": ["Brush to guide where elements from global prompts should appear."]
},
"regionalGuidanceAndReferenceImage": {
"heading": "Regional Guidance and Regional Reference Image",
"paragraphs": [
"For Regional Guidance, brush to guide where elements from global prompts should appear.",
"For Regional Reference Image, brush to apply a reference image to specific areas."
]
},
"globalReferenceImage": {
"heading": "Global Reference Image",
"paragraphs": ["Applies a reference image to influence the entire generation."]
},
"regionalReferenceImage": {
"heading": "Regional Reference Image",
"paragraphs": ["Brush to apply a reference image to specific areas."]
},
"controlNet": {
"heading": "ControlNet",
"paragraphs": [
@@ -1648,6 +1683,8 @@
"controlLayer": "Control Layer",
"inpaintMask": "Inpaint Mask",
"regionalGuidance": "Regional Guidance",
"canvasAsRasterLayer": "$t(controlLayers.canvas) as $t(controlLayers.rasterLayer)",
"canvasAsControlLayer": "$t(controlLayers.canvas) as $t(controlLayers.controlLayer)",
"referenceImage": "Reference Image",
"regionalReferenceImage": "Regional Reference Image",
"globalReferenceImage": "Global Reference Image",
@@ -1688,8 +1725,18 @@
"layer_other": "Layers",
"layer_withCount_one": "Layer ({{count}})",
"layer_withCount_other": "Layers ({{count}})",
"convertToControlLayer": "Convert to Control Layer",
"convertToRasterLayer": "Convert to Raster Layer",
"convertRasterLayerTo": "Convert $t(controlLayers.rasterLayer) To",
"convertControlLayerTo": "Convert $t(controlLayers.controlLayer) To",
"convertInpaintMaskTo": "Convert $t(controlLayers.inpaintMask) To",
"convertRegionalGuidanceTo": "Convert $t(controlLayers.regionalGuidance) To",
"copyRasterLayerTo": "Copy $t(controlLayers.rasterLayer) To",
"copyControlLayerTo": "Copy $t(controlLayers.controlLayer) To",
"copyInpaintMaskTo": "Copy $t(controlLayers.inpaintMask) To",
"copyRegionalGuidanceTo": "Copy $t(controlLayers.regionalGuidance) To",
"newRasterLayer": "New $t(controlLayers.rasterLayer)",
"newControlLayer": "New $t(controlLayers.controlLayer)",
"newInpaintMask": "New $t(controlLayers.inpaintMask)",
"newRegionalGuidance": "New $t(controlLayers.regionalGuidance)",
"transparency": "Transparency",
"enableTransparencyEffect": "Enable Transparency Effect",
"disableTransparencyEffect": "Disable Transparency Effect",
@@ -1713,6 +1760,7 @@
"newGallerySessionDesc": "This will clear the canvas and all settings except for your model selection. Generations will be sent to the gallery.",
"newCanvasSession": "New Canvas Session",
"newCanvasSessionDesc": "This will clear the canvas and all settings except for your model selection. Generations will be staged on the canvas.",
"replaceCurrent": "Replace Current",
"controlMode": {
"controlMode": "Control Mode",
"balanced": "Balanced",
@@ -1842,16 +1890,24 @@
"apply": "Apply",
"cancel": "Cancel"
},
"segment": {
"autoMask": "Auto Mask",
"selectObject": {
"selectObject": "Select Object",
"pointType": "Point Type",
"foreground": "Foreground",
"background": "Background",
"invertSelection": "Invert Selection",
"include": "Include",
"exclude": "Exclude",
"neutral": "Neutral",
"reset": "Reset",
"apply": "Apply",
"reset": "Reset",
"saveAs": "Save As",
"cancel": "Cancel",
"process": "Process"
"process": "Process",
"help1": "Select a single target object. Add <Bold>Include</Bold> and <Bold>Exclude</Bold> points to indicate which parts of the layer are part of the target object.",
"help2": "Start with one <Bold>Include</Bold> point within the target object. Add more points to refine the selection. Fewer points typically produce better results.",
"help3": "Invert the selection to select everything except the target object.",
"clickToAdd": "Click on the layer to add a point",
"dragToMove": "Drag a point to move it",
"clickToRemove": "Click on a point to remove it"
},
"settings": {
"snapToGrid": {
@@ -1892,6 +1948,8 @@
"newRegionalReferenceImage": "New Regional Reference Image",
"newControlLayer": "New Control Layer",
"newRasterLayer": "New Raster Layer",
"newInpaintMask": "New Inpaint Mask",
"newRegionalGuidance": "New Regional Guidance",
"cropCanvasToBbox": "Crop Canvas to Bbox"
},
"stagingArea": {
@@ -2024,13 +2082,11 @@
},
"whatsNew": {
"whatsNewInInvoke": "What's New in Invoke",
"canvasV2Announcement": {
"newCanvas": "A powerful new control canvas",
"newLayerTypes": "New layer types for even more control",
"fluxSupport": "Support for the Flux family of models",
"readReleaseNotes": "Read Release Notes",
"watchReleaseVideo": "Watch Release Video",
"watchUiUpdatesOverview": "Watch UI Updates Overview"
}
"line1": "<ItalicComponent>Select Object</ItalicComponent> tool for precise object selection and editing",
"line2": "Expanded Flux support, now with Global Reference Images",
"line3": "Improved tooltips and context menus",
"readReleaseNotes": "Read Release Notes",
"watchRecentReleaseVideos": "Watch Recent Release Videos",
"watchUiUpdatesOverview": "Watch UI Updates Overview"
}
}

View File

@@ -8,6 +8,7 @@ import {
controlLayerAdded,
entityRasterized,
entitySelected,
inpaintMaskAdded,
rasterLayerAdded,
referenceImageAdded,
referenceImageIPAdapterImageChanged,
@@ -17,6 +18,7 @@ import {
import { selectCanvasSlice } from 'features/controlLayers/store/selectors';
import type {
CanvasControlLayerState,
CanvasInpaintMaskState,
CanvasRasterLayerState,
CanvasReferenceImageState,
CanvasRegionalGuidanceState,
@@ -110,6 +112,46 @@ export const addImageDroppedListener = (startAppListening: AppStartListening) =>
return;
}
/**
* Image dropped on Inpaint Mask
*/
if (
overData.actionType === 'ADD_INPAINT_MASK_FROM_IMAGE' &&
activeData.payloadType === 'IMAGE_DTO' &&
activeData.payload.imageDTO
) {
const imageObject = imageDTOToImageObject(activeData.payload.imageDTO);
const { x, y } = selectCanvasSlice(getState()).bbox.rect;
const overrides: Partial<CanvasInpaintMaskState> = {
objects: [imageObject],
position: { x, y },
};
dispatch(inpaintMaskAdded({ overrides, isSelected: true }));
return;
}
/**
* Image dropped on Regional Guidance
*/
if (
overData.actionType === 'ADD_REGIONAL_GUIDANCE_FROM_IMAGE' &&
activeData.payloadType === 'IMAGE_DTO' &&
activeData.payload.imageDTO
) {
const imageObject = imageDTOToImageObject(activeData.payload.imageDTO);
const { x, y } = selectCanvasSlice(getState()).bbox.rect;
const overrides: Partial<CanvasRegionalGuidanceState> = {
objects: [imageObject],
position: { x, y },
};
dispatch(rgAdded({ overrides, isSelected: true }));
return;
}
/**
* Image dropped on Raster layer
*/

View File

@@ -26,5 +26,9 @@ export const IconMenuItem = ({ tooltip, icon, ...props }: Props) => {
};
export const IconMenuItemGroup = ({ children }: { children: ReactNode }) => {
return <Flex gap={2}>{children}</Flex>;
return (
<Flex gap={2} justifyContent="space-between">
{children}
</Flex>
);
};

View File

@@ -23,8 +23,10 @@ export type Feature =
| 'dynamicPrompts'
| 'dynamicPromptsMaxPrompts'
| 'dynamicPromptsSeedBehaviour'
| 'globalReferenceImage'
| 'imageFit'
| 'infillMethod'
| 'inpainting'
| 'ipAdapterMethod'
| 'lora'
| 'loraWeight'
@@ -46,6 +48,7 @@ export type Feature =
| 'paramVAEPrecision'
| 'paramWidth'
| 'patchmatchDownScaleSize'
| 'rasterLayer'
| 'refinerModel'
| 'refinerNegativeAestheticScore'
| 'refinerPositiveAestheticScore'
@@ -53,6 +56,9 @@ export type Feature =
| 'refinerStart'
| 'refinerSteps'
| 'refinerCfgScale'
| 'regionalGuidance'
| 'regionalGuidanceAndReferenceImage'
| 'regionalReferenceImage'
| 'scaleBeforeProcessing'
| 'seamlessTilingXAxis'
| 'seamlessTilingYAxis'
@@ -76,6 +82,24 @@ export const POPOVER_DATA: { [key in Feature]?: PopoverData } = {
clipSkip: {
href: 'https://support.invoke.ai/support/solutions/articles/151000178161-advanced-settings',
},
inpainting: {
href: 'https://support.invoke.ai/support/solutions/articles/151000096702-inpainting-outpainting-and-bounding-box',
},
rasterLayer: {
href: 'https://support.invoke.ai/support/solutions/articles/151000094998-raster-layers-and-initial-images',
},
regionalGuidance: {
href: 'https://support.invoke.ai/support/solutions/articles/151000165024-regional-guidance-layers',
},
regionalGuidanceAndReferenceImage: {
href: 'https://support.invoke.ai/support/solutions/articles/151000165024-regional-guidance-layers',
},
globalReferenceImage: {
href: 'https://support.invoke.ai/support/solutions/articles/151000159340-global-and-regional-reference-images-ip-adapters-',
},
regionalReferenceImage: {
href: 'https://support.invoke.ai/support/solutions/articles/151000159340-global-and-regional-reference-images-ip-adapters-',
},
controlNet: {
href: 'https://support.invoke.ai/support/solutions/articles/151000105880',
},

View File

@@ -127,8 +127,6 @@ export const buildUseDisclosure = (defaultIsOpen: boolean): [() => UseDisclosure
*
* Hook to manage a boolean state. Use this for a local boolean state.
* @param defaultIsOpen Initial state of the disclosure
*
* @knipignore
*/
export const useDisclosure = (defaultIsOpen: boolean): UseDisclosure => {
const [isOpen, set] = useState(defaultIsOpen);

View File

@@ -4,6 +4,7 @@ import { useAppSelector } from 'app/store/storeHooks';
import type { GroupBase } from 'chakra-react-select';
import { selectParamsSlice } from 'features/controlLayers/store/paramsSlice';
import type { ModelIdentifierField } from 'features/nodes/types/common';
import { selectSystemShouldEnableModelDescriptions } from 'features/system/store/systemSlice';
import { groupBy, reduce } from 'lodash-es';
import { useCallback, useMemo } from 'react';
import { useTranslation } from 'react-i18next';
@@ -37,6 +38,7 @@ export const useGroupedModelCombobox = <T extends AnyModelConfig>(
): UseGroupedModelComboboxReturn => {
const { t } = useTranslation();
const base = useAppSelector(selectBaseWithSDXLFallback);
const shouldShowModelDescriptions = useAppSelector(selectSystemShouldEnableModelDescriptions);
const { modelConfigs, selectedModel, getIsDisabled, onChange, isLoading, groupByType = false } = arg;
const options = useMemo<GroupBase<ComboboxOption>[]>(() => {
if (!modelConfigs) {
@@ -51,6 +53,7 @@ export const useGroupedModelCombobox = <T extends AnyModelConfig>(
options: val.map((model) => ({
label: model.name,
value: model.key,
description: (shouldShowModelDescriptions && model.description) || undefined,
isDisabled: getIsDisabled ? getIsDisabled(model) : false,
})),
});
@@ -60,7 +63,7 @@ export const useGroupedModelCombobox = <T extends AnyModelConfig>(
);
_options.sort((a) => (a.label?.split('/')[0]?.toLowerCase().includes(base) ? -1 : 1));
return _options;
}, [modelConfigs, groupByType, getIsDisabled, base]);
}, [modelConfigs, groupByType, getIsDisabled, base, shouldShowModelDescriptions]);
const value = useMemo(
() =>

View File

@@ -1,5 +1,7 @@
import type { ComboboxOnChange, ComboboxOption } from '@invoke-ai/ui-library';
import { useAppSelector } from 'app/store/storeHooks';
import type { ModelIdentifierField } from 'features/nodes/types/common';
import { selectSystemShouldEnableModelDescriptions } from 'features/system/store/systemSlice';
import { useCallback, useMemo } from 'react';
import { useTranslation } from 'react-i18next';
import type { AnyModelConfig } from 'services/api/types';
@@ -24,13 +26,16 @@ type UseModelComboboxReturn = {
export const useModelCombobox = <T extends AnyModelConfig>(arg: UseModelComboboxArg<T>): UseModelComboboxReturn => {
const { t } = useTranslation();
const { modelConfigs, selectedModel, getIsDisabled, onChange, isLoading, optionsFilter = () => true } = arg;
const shouldShowModelDescriptions = useAppSelector(selectSystemShouldEnableModelDescriptions);
const options = useMemo<ComboboxOption[]>(() => {
return modelConfigs.filter(optionsFilter).map((model) => ({
label: model.name,
value: model.key,
description: (shouldShowModelDescriptions && model.description) || undefined,
isDisabled: getIsDisabled ? getIsDisabled(model) : false,
}));
}, [optionsFilter, getIsDisabled, modelConfigs]);
}, [optionsFilter, getIsDisabled, modelConfigs, shouldShowModelDescriptions]);
const value = useMemo(
() => options.find((m) => (selectedModel ? m.value === selectedModel.key : false)),

View File

@@ -0,0 +1,161 @@
import type { MenuButtonProps, MenuItemProps, MenuListProps, MenuProps } from '@invoke-ai/ui-library';
import { Box, Flex, Icon, Text } from '@invoke-ai/ui-library';
import { useDisclosure } from 'common/hooks/useBoolean';
import type { FocusEventHandler, PointerEvent, RefObject } from 'react';
import { useCallback, useEffect, useRef } from 'react';
import { PiCaretRightBold } from 'react-icons/pi';
import { useDebouncedCallback } from 'use-debounce';
const offset: [number, number] = [0, 8];
type UseSubMenuReturn = {
parentMenuItemProps: Partial<MenuItemProps>;
menuProps: Partial<MenuProps>;
menuButtonProps: Partial<MenuButtonProps>;
menuListProps: Partial<MenuListProps> & { ref: RefObject<HTMLDivElement> };
};
/**
* A hook that provides the necessary props to create a sub-menu within a menu.
*
* The sub-menu should be wrapped inside a parent `MenuItem` component.
*
* Use SubMenuButtonContent to render a button with a label and a right caret icon.
*
* TODO(psyche): Add keyboard handling for sub-menu.
*
* @example
* ```tsx
* const SubMenuExample = () => {
* const subMenu = useSubMenu();
* return (
* <Menu>
* <MenuButton>Open Parent Menu</MenuButton>
* <MenuList>
* <MenuItem>Parent Item 1</MenuItem>
* <MenuItem>Parent Item 2</MenuItem>
* <MenuItem>Parent Item 3</MenuItem>
* <MenuItem {...subMenu.parentMenuItemProps} icon={<PiImageBold />}>
* <Menu {...subMenu.menuProps}>
* <MenuButton {...subMenu.menuButtonProps}>
* <SubMenuButtonContent label="Open Sub Menu" />
* </MenuButton>
* <MenuList {...subMenu.menuListProps}>
* <MenuItem>Sub Item 1</MenuItem>
* <MenuItem>Sub Item 2</MenuItem>
* <MenuItem>Sub Item 3</MenuItem>
* </MenuList>
* </Menu>
* </MenuItem>
* </MenuList>
* </Menu>
* );
* };
* ```
*/
export const useSubMenu = (): UseSubMenuReturn => {
const subMenu = useDisclosure(false);
const menuListRef = useRef<HTMLDivElement>(null);
const closeDebounced = useDebouncedCallback(subMenu.close, 300);
const openAndCancelPendingClose = useCallback(() => {
closeDebounced.cancel();
subMenu.open();
}, [closeDebounced, subMenu]);
const toggleAndCancelPendingClose = useCallback(() => {
if (subMenu.isOpen) {
subMenu.close();
return;
} else {
closeDebounced.cancel();
subMenu.toggle();
}
}, [closeDebounced, subMenu]);
const onBlurMenuList = useCallback<FocusEventHandler<HTMLDivElement>>(
(e) => {
// Don't trigger blur if focus is moving to a child element - e.g. from a sub-menu item to another sub-menu item
if (e.currentTarget.contains(e.relatedTarget)) {
closeDebounced.cancel();
return;
}
subMenu.close();
},
[closeDebounced, subMenu]
);
const onParentMenuItemPointerLeave = useCallback(
(e: PointerEvent<HTMLButtonElement>) => {
/**
* The pointerleave event is triggered when the pen or touch device is lifted, which would close the sub-menu.
* However, we want to keep the sub-menu open until the pen or touch device presses some other element. This
* will be handled in the useEffect below - just ignore the pointerleave event for pen and touch devices.
*/
if (e.pointerType === 'pen' || e.pointerType === 'touch') {
return;
}
subMenu.close();
},
[subMenu]
);
/**
* When using a mouse, the pointerleave events close the menu. But when using a pen or touch device, we need to close
* the sub-menu when the user taps outside of the menu list. So we need to listen for clicks outside of the menu list
* and close the menu accordingly.
*/
useEffect(() => {
const el = menuListRef.current;
if (!el) {
return;
}
const controller = new AbortController();
window.addEventListener(
'click',
(e) => {
if (menuListRef.current?.contains(e.target as Node)) {
return;
}
subMenu.close();
},
{ signal: controller.signal }
);
return () => {
controller.abort();
};
}, [subMenu]);
return {
parentMenuItemProps: {
onClick: toggleAndCancelPendingClose,
onPointerEnter: openAndCancelPendingClose,
onPointerLeave: onParentMenuItemPointerLeave,
closeOnSelect: false,
},
menuProps: {
isOpen: subMenu.isOpen,
onClose: subMenu.close,
placement: 'right',
offset: offset,
closeOnBlur: false,
},
menuButtonProps: {
as: Box,
width: 'full',
height: 'full',
},
menuListProps: {
ref: menuListRef,
onPointerEnter: openAndCancelPendingClose,
onPointerLeave: closeDebounced,
onBlur: onBlurMenuList,
},
};
};
export const SubMenuButtonContent = ({ label }: { label: string }) => {
return (
<Flex w="full" h="full" flexDir="row" justifyContent="space-between" alignItems="center">
<Text>{label}</Text>
<Icon as={PiCaretRightBold} />
</Flex>
);
};

View File

@@ -1,5 +1,6 @@
import { Button, Flex, Heading } from '@invoke-ai/ui-library';
import { useAppSelector } from 'app/store/storeHooks';
import { InformationalPopover } from 'common/components/InformationalPopover/InformationalPopover';
import {
useAddControlLayer,
useAddGlobalReferenceImage,
@@ -28,69 +29,80 @@ export const CanvasAddEntityButtons = memo(() => {
<Flex position="relative" flexDir="column" gap={4} top="20%">
<Flex flexDir="column" justifyContent="flex-start" gap={2}>
<Heading size="xs">{t('controlLayers.global')}</Heading>
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addGlobalReferenceImage}
>
{t('controlLayers.globalReferenceImage')}
</Button>
<InformationalPopover feature="globalReferenceImage">
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addGlobalReferenceImage}
>
{t('controlLayers.globalReferenceImage')}
</Button>
</InformationalPopover>
</Flex>
<Flex flexDir="column" gap={2}>
<Heading size="xs">{t('controlLayers.regional')}</Heading>
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addInpaintMask}
>
{t('controlLayers.inpaintMask')}
</Button>
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addRegionalGuidance}
isDisabled={isFLUX}
>
{t('controlLayers.regionalGuidance')}
</Button>
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addRegionalReferenceImage}
isDisabled={isFLUX}
>
{t('controlLayers.regionalReferenceImage')}
</Button>
<InformationalPopover feature="inpainting">
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addInpaintMask}
>
{t('controlLayers.inpaintMask')}
</Button>
</InformationalPopover>
<InformationalPopover feature="regionalGuidance">
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addRegionalGuidance}
isDisabled={isFLUX}
>
{t('controlLayers.regionalGuidance')}
</Button>
</InformationalPopover>
<InformationalPopover feature="regionalReferenceImage">
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addRegionalReferenceImage}
isDisabled={isFLUX}
>
{t('controlLayers.regionalReferenceImage')}
</Button>
</InformationalPopover>
</Flex>
<Flex flexDir="column" justifyContent="flex-start" gap={2}>
<Heading size="xs">{t('controlLayers.layer_other')}</Heading>
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addControlLayer}
>
{t('controlLayers.controlLayer')}
</Button>
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addRasterLayer}
>
{t('controlLayers.rasterLayer')}
</Button>
<InformationalPopover feature="controlNet">
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addControlLayer}
>
{t('controlLayers.controlLayer')}
</Button>
</InformationalPopover>
<InformationalPopover feature="rasterLayer">
<Button
size="sm"
variant="ghost"
justifyContent="flex-start"
leftIcon={<PiPlusBold />}
onClick={addRasterLayer}
>
{t('controlLayers.rasterLayer')}
</Button>
</InformationalPopover>
</Flex>
</Flex>
</Flex>

View File

@@ -13,7 +13,7 @@ export const CanvasAlertsPreserveMask = memo(() => {
}
return (
<Alert status="warning" borderRadius="base" fontSize="sm" shadow="md" w="fit-content" alignSelf="flex-end">
<Alert status="warning" borderRadius="base" fontSize="sm" shadow="md" w="fit-content">
<AlertIcon />
<AlertTitle>{t('controlLayers.settings.preserveMask.alert')}</AlertTitle>
</Alert>

View File

@@ -98,7 +98,7 @@ const CanvasAlertsSelectedEntityStatusContent = memo(({ entityIdentifier, adapte
}
return (
<Alert status={alert.status} borderRadius="base" fontSize="sm" shadow="md" w="fit-content" alignSelf="flex-end">
<Alert status={alert.status} borderRadius="base" fontSize="sm" shadow="md" w="fit-content">
<AlertIcon />
<AlertTitle>{alert.title}</AlertTitle>
</Alert>

View File

@@ -132,7 +132,6 @@ const AlertWrapper = ({
fontSize="sm"
shadow="md"
w="fit-content"
alignSelf="flex-end"
>
<Flex w="full" alignItems="center">
<AlertIcon />

View File

@@ -1,4 +1,5 @@
import { MenuGroup, MenuItem } from '@invoke-ai/ui-library';
import { Menu, MenuButton, MenuGroup, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { CanvasContextMenuItemsCropCanvasToBbox } from 'features/controlLayers/components/CanvasContextMenu/CanvasContextMenuItemsCropCanvasToBbox';
import { NewLayerIcon } from 'features/controlLayers/components/common/icons';
import {
@@ -16,6 +17,8 @@ import { PiFloppyDiskBold } from 'react-icons/pi';
export const CanvasContextMenuGlobalMenuItems = memo(() => {
const { t } = useTranslation();
const saveSubMenu = useSubMenu();
const newSubMenu = useSubMenu();
const isBusy = useCanvasIsBusy();
const saveCanvasToGallery = useSaveCanvasToGallery();
const saveBboxToGallery = useSaveBboxToGallery();
@@ -28,27 +31,41 @@ export const CanvasContextMenuGlobalMenuItems = memo(() => {
<>
<MenuGroup title={t('controlLayers.canvasContextMenu.canvasGroup')}>
<CanvasContextMenuItemsCropCanvasToBbox />
</MenuGroup>
<MenuGroup title={t('controlLayers.canvasContextMenu.saveToGalleryGroup')}>
<MenuItem icon={<PiFloppyDiskBold />} isDisabled={isBusy} onClick={saveCanvasToGallery}>
{t('controlLayers.canvasContextMenu.saveCanvasToGallery')}
<MenuItem {...saveSubMenu.parentMenuItemProps} icon={<PiFloppyDiskBold />}>
<Menu {...saveSubMenu.menuProps}>
<MenuButton {...saveSubMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.canvasContextMenu.saveToGalleryGroup')} />
</MenuButton>
<MenuList {...saveSubMenu.menuListProps}>
<MenuItem icon={<PiFloppyDiskBold />} isDisabled={isBusy} onClick={saveCanvasToGallery}>
{t('controlLayers.canvasContextMenu.saveCanvasToGallery')}
</MenuItem>
<MenuItem icon={<PiFloppyDiskBold />} isDisabled={isBusy} onClick={saveBboxToGallery}>
{t('controlLayers.canvasContextMenu.saveBboxToGallery')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
<MenuItem icon={<PiFloppyDiskBold />} isDisabled={isBusy} onClick={saveBboxToGallery}>
{t('controlLayers.canvasContextMenu.saveBboxToGallery')}
</MenuItem>
</MenuGroup>
<MenuGroup title={t('controlLayers.canvasContextMenu.bboxGroup')}>
<MenuItem icon={<NewLayerIcon />} isDisabled={isBusy} onClick={newGlobalReferenceImageFromBbox}>
{t('controlLayers.canvasContextMenu.newGlobalReferenceImage')}
</MenuItem>
<MenuItem icon={<NewLayerIcon />} isDisabled={isBusy} onClick={newRegionalReferenceImageFromBbox}>
{t('controlLayers.canvasContextMenu.newRegionalReferenceImage')}
</MenuItem>
<MenuItem icon={<NewLayerIcon />} isDisabled={isBusy} onClick={newControlLayerFromBbox}>
{t('controlLayers.canvasContextMenu.newControlLayer')}
</MenuItem>
<MenuItem icon={<NewLayerIcon />} isDisabled={isBusy} onClick={newRasterLayerFromBbox}>
{t('controlLayers.canvasContextMenu.newRasterLayer')}
<MenuItem {...newSubMenu.parentMenuItemProps} icon={<NewLayerIcon />}>
<Menu {...newSubMenu.menuProps}>
<MenuButton {...newSubMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.canvasContextMenu.bboxGroup')} />
</MenuButton>
<MenuList {...newSubMenu.menuListProps}>
<MenuItem icon={<NewLayerIcon />} isDisabled={isBusy} onClick={newGlobalReferenceImageFromBbox}>
{t('controlLayers.canvasContextMenu.newGlobalReferenceImage')}
</MenuItem>
<MenuItem icon={<NewLayerIcon />} isDisabled={isBusy} onClick={newRegionalReferenceImageFromBbox}>
{t('controlLayers.canvasContextMenu.newRegionalReferenceImage')}
</MenuItem>
<MenuItem icon={<NewLayerIcon />} isDisabled={isBusy} onClick={newControlLayerFromBbox}>
{t('controlLayers.canvasContextMenu.newControlLayer')}
</MenuItem>
<MenuItem icon={<NewLayerIcon />} isDisabled={isBusy} onClick={newRasterLayerFromBbox}>
{t('controlLayers.canvasContextMenu.newRasterLayer')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
</MenuGroup>
</>

View File

@@ -1,42 +1,43 @@
import { MenuGroup } from '@invoke-ai/ui-library';
import { useAppSelector } from 'app/store/storeHooks';
import { CanvasEntityMenuItemsCopyToClipboard } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCopyToClipboard';
import { CanvasEntityMenuItemsCropToBbox } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCropToBbox';
import { CanvasEntityMenuItemsDelete } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDelete';
import { CanvasEntityMenuItemsFilter } from 'features/controlLayers/components/common/CanvasEntityMenuItemsFilter';
import { CanvasEntityMenuItemsSave } from 'features/controlLayers/components/common/CanvasEntityMenuItemsSave';
import { CanvasEntityMenuItemsSegment } from 'features/controlLayers/components/common/CanvasEntityMenuItemsSegment';
import { CanvasEntityMenuItemsTransform } from 'features/controlLayers/components/common/CanvasEntityMenuItemsTransform';
import { ControlLayerMenuItems } from 'features/controlLayers/components/ControlLayer/ControlLayerMenuItems';
import { InpaintMaskMenuItems } from 'features/controlLayers/components/InpaintMask/InpaintMaskMenuItems';
import { IPAdapterMenuItems } from 'features/controlLayers/components/IPAdapter/IPAdapterMenuItems';
import { RasterLayerMenuItems } from 'features/controlLayers/components/RasterLayer/RasterLayerMenuItems';
import { RegionalGuidanceMenuItems } from 'features/controlLayers/components/RegionalGuidance/RegionalGuidanceMenuItems';
import {
EntityIdentifierContext,
useEntityIdentifierContext,
} from 'features/controlLayers/contexts/EntityIdentifierContext';
import { useEntityTitle } from 'features/controlLayers/hooks/useEntityTitle';
import { useEntityTypeString } from 'features/controlLayers/hooks/useEntityTypeString';
import { selectSelectedEntityIdentifier } from 'features/controlLayers/store/selectors';
import {
isFilterableEntityIdentifier,
isSaveableEntityIdentifier,
isSegmentableEntityIdentifier,
isTransformableEntityIdentifier,
} from 'features/controlLayers/store/types';
import type { PropsWithChildren } from 'react';
import { memo } from 'react';
import type { Equals } from 'tsafe';
import { assert } from 'tsafe';
const CanvasContextMenuSelectedEntityMenuItemsContent = memo(() => {
const entityIdentifier = useEntityIdentifierContext();
const title = useEntityTitle(entityIdentifier);
return (
<MenuGroup title={title}>
{isFilterableEntityIdentifier(entityIdentifier) && <CanvasEntityMenuItemsFilter />}
{isTransformableEntityIdentifier(entityIdentifier) && <CanvasEntityMenuItemsTransform />}
{isSegmentableEntityIdentifier(entityIdentifier) && <CanvasEntityMenuItemsSegment />}
{isSaveableEntityIdentifier(entityIdentifier) && <CanvasEntityMenuItemsCopyToClipboard />}
{isSaveableEntityIdentifier(entityIdentifier) && <CanvasEntityMenuItemsSave />}
{isTransformableEntityIdentifier(entityIdentifier) && <CanvasEntityMenuItemsCropToBbox />}
<CanvasEntityMenuItemsDelete />
</MenuGroup>
);
if (entityIdentifier.type === 'raster_layer') {
return <RasterLayerMenuItems />;
}
if (entityIdentifier.type === 'control_layer') {
return <ControlLayerMenuItems />;
}
if (entityIdentifier.type === 'inpaint_mask') {
return <InpaintMaskMenuItems />;
}
if (entityIdentifier.type === 'regional_guidance') {
return <RegionalGuidanceMenuItems />;
}
if (entityIdentifier.type === 'reference_image') {
return <IPAdapterMenuItems />;
}
assert<Equals<typeof entityIdentifier.type, never>>(false);
});
CanvasContextMenuSelectedEntityMenuItemsContent.displayName = 'CanvasContextMenuSelectedEntityMenuItemsContent';
export const CanvasContextMenuSelectedEntityMenuItems = memo(() => {
@@ -48,9 +49,20 @@ export const CanvasContextMenuSelectedEntityMenuItems = memo(() => {
return (
<EntityIdentifierContext.Provider value={selectedEntityIdentifier}>
<CanvasContextMenuSelectedEntityMenuItemsContent />
<CanvasContextMenuSelectedEntityMenuGroup>
<CanvasContextMenuSelectedEntityMenuItemsContent />
</CanvasContextMenuSelectedEntityMenuGroup>
</EntityIdentifierContext.Provider>
);
});
CanvasContextMenuSelectedEntityMenuItems.displayName = 'CanvasContextMenuSelectedEntityMenuItems';
const CanvasContextMenuSelectedEntityMenuGroup = memo((props: PropsWithChildren) => {
const entityIdentifier = useEntityIdentifierContext();
const title = useEntityTypeString(entityIdentifier.type);
return <MenuGroup title={title}>{props.children}</MenuGroup>;
});
CanvasContextMenuSelectedEntityMenuGroup.displayName = 'CanvasContextMenuSelectedEntityMenuGroup';

View File

@@ -62,6 +62,7 @@ export const CanvasDropArea = memo(() => {
data={addControlLayerFromImageDropData}
/>
</GridItem>
<GridItem position="relative">
<IAIDroppable
dropLabel={t('controlLayers.canvasContextMenu.newRegionalReferenceImage')}

View File

@@ -29,7 +29,7 @@ export const EntityListGlobalActionBarAddLayerMenu = memo(() => {
<Menu>
<MenuButton
as={IconButton}
size="sm"
minW={8}
variant="link"
alignSelf="stretch"
tooltip={t('controlLayers.addLayer')}

View File

@@ -4,6 +4,7 @@ import { EntityListSelectedEntityActionBarDuplicateButton } from 'features/contr
import { EntityListSelectedEntityActionBarFill } from 'features/controlLayers/components/CanvasEntityList/EntityListSelectedEntityActionBarFill';
import { EntityListSelectedEntityActionBarFilterButton } from 'features/controlLayers/components/CanvasEntityList/EntityListSelectedEntityActionBarFilterButton';
import { EntityListSelectedEntityActionBarOpacity } from 'features/controlLayers/components/CanvasEntityList/EntityListSelectedEntityActionBarOpacity';
import { EntityListSelectedEntityActionBarSelectObjectButton } from 'features/controlLayers/components/CanvasEntityList/EntityListSelectedEntityActionBarSelectObjectButton';
import { EntityListSelectedEntityActionBarTransformButton } from 'features/controlLayers/components/CanvasEntityList/EntityListSelectedEntityActionBarTransformButton';
import { memo } from 'react';
@@ -16,6 +17,7 @@ export const EntityListSelectedEntityActionBar = memo(() => {
<Spacer />
<EntityListSelectedEntityActionBarFill />
<Flex h="full">
<EntityListSelectedEntityActionBarSelectObjectButton />
<EntityListSelectedEntityActionBarFilterButton />
<EntityListSelectedEntityActionBarTransformButton />
<EntityListSelectedEntityActionBarSaveToAssetsButton />

View File

@@ -23,7 +23,7 @@ export const EntityListSelectedEntityActionBarDuplicateButton = memo(() => {
<IconButton
onClick={onClick}
isDisabled={!selectedEntityIdentifier || isBusy}
size="sm"
minW={8}
variant="link"
alignSelf="stretch"
aria-label={t('controlLayers.duplicate')}

View File

@@ -5,7 +5,7 @@ import { selectSelectedEntityIdentifier } from 'features/controlLayers/store/sel
import { isFilterableEntityIdentifier } from 'features/controlLayers/store/types';
import { memo } from 'react';
import { useTranslation } from 'react-i18next';
import { PiShootingStarBold } from 'react-icons/pi';
import { PiShootingStarFill } from 'react-icons/pi';
export const EntityListSelectedEntityActionBarFilterButton = memo(() => {
const { t } = useTranslation();
@@ -24,12 +24,12 @@ export const EntityListSelectedEntityActionBarFilterButton = memo(() => {
<IconButton
onClick={filter.start}
isDisabled={filter.isDisabled}
size="sm"
minW={8}
variant="link"
alignSelf="stretch"
aria-label={t('controlLayers.filter.filter')}
tooltip={t('controlLayers.filter.filter')}
icon={<PiShootingStarBold />}
icon={<PiShootingStarFill />}
/>
);
});

View File

@@ -31,7 +31,7 @@ export const EntityListSelectedEntityActionBarSaveToAssetsButton = memo(() => {
<IconButton
onClick={onClick}
isDisabled={!selectedEntityIdentifier || isBusy}
size="sm"
minW={8}
variant="link"
alignSelf="stretch"
aria-label={t('controlLayers.saveLayerToAssets')}

View File

@@ -0,0 +1,37 @@
import { IconButton } from '@invoke-ai/ui-library';
import { useAppSelector } from 'app/store/storeHooks';
import { useEntitySegmentAnything } from 'features/controlLayers/hooks/useEntitySegmentAnything';
import { selectSelectedEntityIdentifier } from 'features/controlLayers/store/selectors';
import { isSegmentableEntityIdentifier } from 'features/controlLayers/store/types';
import { memo } from 'react';
import { useTranslation } from 'react-i18next';
import { PiShapesFill } from 'react-icons/pi';
export const EntityListSelectedEntityActionBarSelectObjectButton = memo(() => {
const { t } = useTranslation();
const selectedEntityIdentifier = useAppSelector(selectSelectedEntityIdentifier);
const segment = useEntitySegmentAnything(selectedEntityIdentifier);
if (!selectedEntityIdentifier) {
return null;
}
if (!isSegmentableEntityIdentifier(selectedEntityIdentifier)) {
return null;
}
return (
<IconButton
onClick={segment.start}
isDisabled={segment.isDisabled}
minW={8}
variant="link"
alignSelf="stretch"
aria-label={t('controlLayers.selectObject.selectObject')}
tooltip={t('controlLayers.selectObject.selectObject')}
icon={<PiShapesFill />}
/>
);
});
EntityListSelectedEntityActionBarSelectObjectButton.displayName = 'EntityListSelectedEntityActionBarSelectObjectButton';

View File

@@ -24,7 +24,7 @@ export const EntityListSelectedEntityActionBarTransformButton = memo(() => {
<IconButton
onClick={transform.start}
isDisabled={transform.isDisabled}
size="sm"
minW={8}
variant="link"
alignSelf="stretch"
aria-label={t('controlLayers.transform.transform')}

View File

@@ -10,7 +10,7 @@ import { CanvasDropArea } from 'features/controlLayers/components/CanvasDropArea
import { Filter } from 'features/controlLayers/components/Filters/Filter';
import { CanvasHUD } from 'features/controlLayers/components/HUD/CanvasHUD';
import { InvokeCanvasComponent } from 'features/controlLayers/components/InvokeCanvasComponent';
import { SegmentAnything } from 'features/controlLayers/components/SegmentAnything/SegmentAnything';
import { SelectObject } from 'features/controlLayers/components/SelectObject/SelectObject';
import { StagingAreaIsStagingGate } from 'features/controlLayers/components/StagingArea/StagingAreaIsStagingGate';
import { StagingAreaToolbar } from 'features/controlLayers/components/StagingArea/StagingAreaToolbar';
import { CanvasToolbar } from 'features/controlLayers/components/Toolbar/CanvasToolbar';
@@ -25,8 +25,8 @@ const MenuContent = () => {
return (
<CanvasManagerProviderGate>
<MenuList>
<CanvasContextMenuGlobalMenuItems />
<CanvasContextMenuSelectedEntityMenuItems />
<CanvasContextMenuGlobalMenuItems />
</MenuList>
</CanvasManagerProviderGate>
);
@@ -71,12 +71,16 @@ export const CanvasMainPanelContent = memo(() => {
>
<InvokeCanvasComponent />
<CanvasManagerProviderGate>
{showHUD && (
<Flex position="absolute" top={1} insetInlineStart={1} pointerEvents="none">
<CanvasHUD />
</Flex>
)}
<Flex flexDir="column" position="absolute" top={1} insetInlineEnd={1} pointerEvents="none" gap={2}>
<Flex
position="absolute"
flexDir="column"
top={1}
insetInlineStart={1}
pointerEvents="none"
gap={2}
alignItems="flex-start"
>
{showHUD && <CanvasHUD />}
<CanvasAlertsSelectedEntityStatus />
<CanvasAlertsPreserveMask />
<CanvasAlertsSendingToGallery />
@@ -102,7 +106,7 @@ export const CanvasMainPanelContent = memo(() => {
<CanvasManagerProviderGate>
<Filter />
<Transform />
<SegmentAnything />
<SelectObject />
</CanvasManagerProviderGate>
</Flex>
<CanvasDropArea />

View File

@@ -21,7 +21,7 @@ import { selectCanvasSlice, selectEntityOrThrow } from 'features/controlLayers/s
import type { CanvasEntityIdentifier, ControlModeV2 } from 'features/controlLayers/store/types';
import { memo, useCallback, useMemo } from 'react';
import { useTranslation } from 'react-i18next';
import { PiBoundingBoxBold, PiShootingStarBold, PiUploadBold } from 'react-icons/pi';
import { PiBoundingBoxBold, PiShootingStarFill, PiUploadBold } from 'react-icons/pi';
import type { ControlNetModelConfig, PostUploadAction, T2IAdapterModelConfig } from 'services/api/types';
const useControlLayerControlAdapter = (entityIdentifier: CanvasEntityIdentifier<'control_layer'>) => {
@@ -93,7 +93,7 @@ export const ControlLayerControlAdapter = memo(() => {
variant="link"
aria-label={t('controlLayers.filter.filter')}
tooltip={t('controlLayers.filter.filter')}
icon={<PiShootingStarBold />}
icon={<PiShootingStarFill />}
/>
<IconButton
onClick={pullBboxIntoLayer}

View File

@@ -1,15 +1,15 @@
import { MenuDivider } from '@invoke-ai/ui-library';
import { IconMenuItemGroup } from 'common/components/IconMenuItem';
import { CanvasEntityMenuItemsArrange } from 'features/controlLayers/components/common/CanvasEntityMenuItemsArrange';
import { CanvasEntityMenuItemsCopyToClipboard } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCopyToClipboard';
import { CanvasEntityMenuItemsCropToBbox } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCropToBbox';
import { CanvasEntityMenuItemsDelete } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDelete';
import { CanvasEntityMenuItemsDuplicate } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDuplicate';
import { CanvasEntityMenuItemsFilter } from 'features/controlLayers/components/common/CanvasEntityMenuItemsFilter';
import { CanvasEntityMenuItemsSave } from 'features/controlLayers/components/common/CanvasEntityMenuItemsSave';
import { CanvasEntityMenuItemsSegment } from 'features/controlLayers/components/common/CanvasEntityMenuItemsSegment';
import { CanvasEntityMenuItemsSelectObject } from 'features/controlLayers/components/common/CanvasEntityMenuItemsSelectObject';
import { CanvasEntityMenuItemsTransform } from 'features/controlLayers/components/common/CanvasEntityMenuItemsTransform';
import { ControlLayerMenuItemsConvertControlToRaster } from 'features/controlLayers/components/ControlLayer/ControlLayerMenuItemsConvertControlToRaster';
import { ControlLayerMenuItemsConvertToSubMenu } from 'features/controlLayers/components/ControlLayer/ControlLayerMenuItemsConvertToSubMenu';
import { ControlLayerMenuItemsCopyToSubMenu } from 'features/controlLayers/components/ControlLayer/ControlLayerMenuItemsCopyToSubMenu';
import { ControlLayerMenuItemsTransparencyEffect } from 'features/controlLayers/components/ControlLayer/ControlLayerMenuItemsTransparencyEffect';
import { memo } from 'react';
@@ -24,12 +24,12 @@ export const ControlLayerMenuItems = memo(() => {
<MenuDivider />
<CanvasEntityMenuItemsTransform />
<CanvasEntityMenuItemsFilter />
<CanvasEntityMenuItemsSegment />
<ControlLayerMenuItemsConvertControlToRaster />
<CanvasEntityMenuItemsSelectObject />
<ControlLayerMenuItemsTransparencyEffect />
<MenuDivider />
<ControlLayerMenuItemsCopyToSubMenu />
<ControlLayerMenuItemsConvertToSubMenu />
<CanvasEntityMenuItemsCropToBbox />
<CanvasEntityMenuItemsCopyToClipboard />
<CanvasEntityMenuItemsSave />
</>
);

View File

@@ -1,27 +0,0 @@
import { MenuItem } from '@invoke-ai/ui-library';
import { useAppDispatch } from 'app/store/storeHooks';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import { controlLayerConvertedToRasterLayer } from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiLightningBold } from 'react-icons/pi';
export const ControlLayerMenuItemsConvertControlToRaster = memo(() => {
const { t } = useTranslation();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('control_layer');
const isInteractable = useIsEntityInteractable(entityIdentifier);
const convertControlLayerToRasterLayer = useCallback(() => {
dispatch(controlLayerConvertedToRasterLayer({ entityIdentifier }));
}, [dispatch, entityIdentifier]);
return (
<MenuItem onClick={convertControlLayerToRasterLayer} icon={<PiLightningBold />} isDisabled={!isInteractable}>
{t('controlLayers.convertToRasterLayer')}
</MenuItem>
);
});
ControlLayerMenuItemsConvertControlToRaster.displayName = 'ControlLayerMenuItemsConvertControlToRaster';

View File

@@ -0,0 +1,56 @@
import { Menu, MenuButton, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { useAppDispatch } from 'app/store/storeHooks';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import {
controlLayerConvertedToInpaintMask,
controlLayerConvertedToRasterLayer,
controlLayerConvertedToRegionalGuidance,
} from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiSwapBold } from 'react-icons/pi';
export const ControlLayerMenuItemsConvertToSubMenu = memo(() => {
const { t } = useTranslation();
const subMenu = useSubMenu();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('control_layer');
const isInteractable = useIsEntityInteractable(entityIdentifier);
const convertToInpaintMask = useCallback(() => {
dispatch(controlLayerConvertedToInpaintMask({ entityIdentifier, replace: true }));
}, [dispatch, entityIdentifier]);
const convertToRegionalGuidance = useCallback(() => {
dispatch(controlLayerConvertedToRegionalGuidance({ entityIdentifier, replace: true }));
}, [dispatch, entityIdentifier]);
const convertToRasterLayer = useCallback(() => {
dispatch(controlLayerConvertedToRasterLayer({ entityIdentifier, replace: true }));
}, [dispatch, entityIdentifier]);
return (
<MenuItem {...subMenu.parentMenuItemProps} icon={<PiSwapBold />}>
<Menu {...subMenu.menuProps}>
<MenuButton {...subMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.convertControlLayerTo')} />
</MenuButton>
<MenuList {...subMenu.menuListProps}>
<MenuItem onClick={convertToInpaintMask} icon={<PiSwapBold />} isDisabled={!isInteractable}>
{t('controlLayers.inpaintMask')}
</MenuItem>
<MenuItem onClick={convertToRegionalGuidance} icon={<PiSwapBold />} isDisabled={!isInteractable}>
{t('controlLayers.regionalGuidance')}
</MenuItem>
<MenuItem onClick={convertToRasterLayer} icon={<PiSwapBold />} isDisabled={!isInteractable}>
{t('controlLayers.rasterLayer')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
);
});
ControlLayerMenuItemsConvertToSubMenu.displayName = 'ControlLayerMenuItemsConvertToSubMenu';
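The new convert/copy sub-menus added in this change all share one composition pattern: useSubMenu supplies prop bundles that are spread onto the parent MenuItem, the nested Menu, its MenuButton, and the MenuList. Below is a minimal sketch of that wiring; someEntityConvertedToOther is a hypothetical action creator used only for illustration and is not part of the codebase.

import { Menu, MenuButton, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { useAppDispatch } from 'app/store/storeHooks';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { memo, useCallback } from 'react';
import { PiSwapBold } from 'react-icons/pi';

// Hypothetical action creator, for illustration only.
declare const someEntityConvertedToOther: (arg: { replace?: boolean }) => { type: string; payload: unknown };

export const ExampleConvertToSubMenu = memo(() => {
  const subMenu = useSubMenu();
  const dispatch = useAppDispatch();
  const convert = useCallback(() => {
    dispatch(someEntityConvertedToOther({ replace: true }));
  }, [dispatch]);
  return (
    // The parent menu item hosts a nested menu; every prop bundle comes from useSubMenu.
    <MenuItem {...subMenu.parentMenuItemProps} icon={<PiSwapBold />}>
      <Menu {...subMenu.menuProps}>
        <MenuButton {...subMenu.menuButtonProps}>
          <SubMenuButtonContent label="Convert To" />
        </MenuButton>
        <MenuList {...subMenu.menuListProps}>
          <MenuItem onClick={convert} icon={<PiSwapBold />}>
            Example target
          </MenuItem>
        </MenuList>
      </Menu>
    </MenuItem>
  );
});
ExampleConvertToSubMenu.displayName = 'ExampleConvertToSubMenu';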

View File

@@ -0,0 +1,58 @@
import { Menu, MenuButton, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { useAppDispatch } from 'app/store/storeHooks';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { CanvasEntityMenuItemsCopyToClipboard } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCopyToClipboard';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import {
controlLayerConvertedToInpaintMask,
controlLayerConvertedToRasterLayer,
controlLayerConvertedToRegionalGuidance,
} from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiCopyBold } from 'react-icons/pi';
export const ControlLayerMenuItemsCopyToSubMenu = memo(() => {
const { t } = useTranslation();
const subMenu = useSubMenu();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('control_layer');
const isInteractable = useIsEntityInteractable(entityIdentifier);
const copyToInpaintMask = useCallback(() => {
dispatch(controlLayerConvertedToInpaintMask({ entityIdentifier }));
}, [dispatch, entityIdentifier]);
const copyToRegionalGuidance = useCallback(() => {
dispatch(controlLayerConvertedToRegionalGuidance({ entityIdentifier }));
}, [dispatch, entityIdentifier]);
const copyToRasterLayer = useCallback(() => {
dispatch(controlLayerConvertedToRasterLayer({ entityIdentifier }));
}, [dispatch, entityIdentifier]);
return (
<MenuItem {...subMenu.parentMenuItemProps} icon={<PiCopyBold />}>
<Menu {...subMenu.menuProps}>
<MenuButton {...subMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.copyControlLayerTo')} />
</MenuButton>
<MenuList {...subMenu.menuListProps}>
<CanvasEntityMenuItemsCopyToClipboard />
<MenuItem onClick={copyToInpaintMask} icon={<PiCopyBold />} isDisabled={!isInteractable}>
{t('controlLayers.newInpaintMask')}
</MenuItem>
<MenuItem onClick={copyToRegionalGuidance} icon={<PiCopyBold />} isDisabled={!isInteractable}>
{t('controlLayers.newRegionalGuidance')}
</MenuItem>
<MenuItem onClick={copyToRasterLayer} icon={<PiCopyBold />} isDisabled={!isInteractable}>
{t('controlLayers.newRasterLayer')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
);
});
ControlLayerMenuItemsCopyToSubMenu.displayName = 'ControlLayerMenuItemsCopyToSubMenu';

View File

@@ -1,4 +1,15 @@
import { Button, ButtonGroup, Flex, Heading, Spacer } from '@invoke-ai/ui-library';
import {
Button,
ButtonGroup,
Flex,
Heading,
Menu,
MenuButton,
MenuItem,
MenuList,
Spacer,
Spinner,
} from '@invoke-ai/ui-library';
import { useStore } from '@nanostores/react';
import { useAppSelector } from 'app/store/storeHooks';
import { useFocusRegion, useIsRegionFocused } from 'common/hooks/focus';
@@ -15,7 +26,7 @@ import { IMAGE_FILTERS } from 'features/controlLayers/store/filters';
import { useRegisteredHotkeys } from 'features/system/components/HotkeysModal/useHotkeyData';
import { memo, useCallback, useMemo, useRef } from 'react';
import { useTranslation } from 'react-i18next';
import { PiArrowsCounterClockwiseBold, PiCheckBold, PiShootingStarBold, PiXBold } from 'react-icons/pi';
import { PiCaretDownBold } from 'react-icons/pi';
const FilterContent = memo(
({ adapter }: { adapter: CanvasEntityAdapterRasterLayer | CanvasEntityAdapterControlLayer }) => {
@@ -25,7 +36,7 @@ const FilterContent = memo(
const config = useStore(adapter.filterer.$filterConfig);
const isCanvasFocused = useIsRegionFocused('canvas');
const isProcessing = useStore(adapter.filterer.$isProcessing);
const hasProcessed = useStore(adapter.filterer.$hasProcessed);
const hasImageState = useStore(adapter.filterer.$hasImageState);
const autoProcess = useAppSelector(selectAutoProcess);
const onChangeFilterConfig = useCallback(
@@ -46,6 +57,22 @@ const FilterContent = memo(
return IMAGE_FILTERS[config.type].validateConfig?.(config as never) ?? true;
}, [config]);
const saveAsInpaintMask = useCallback(() => {
adapter.filterer.saveAs('inpaint_mask');
}, [adapter.filterer]);
const saveAsRegionalGuidance = useCallback(() => {
adapter.filterer.saveAs('regional_guidance');
}, [adapter.filterer]);
const saveAsRasterLayer = useCallback(() => {
adapter.filterer.saveAs('raster_layer');
}, [adapter.filterer]);
const saveAsControlLayer = useCallback(() => {
adapter.filterer.saveAs('control_layer');
}, [adapter.filterer]);
useRegisteredHotkeys({
id: 'applyFilter',
category: 'canvas',
@@ -89,40 +116,56 @@ const FilterContent = memo(
<ButtonGroup isAttached={false} size="sm" w="full">
<Button
variant="ghost"
leftIcon={<PiShootingStarBold />}
onClick={adapter.filterer.processImmediate}
isLoading={isProcessing}
loadingText={t('controlLayers.filter.process')}
isDisabled={!isValid || autoProcess}
isDisabled={isProcessing || !isValid || (autoProcess && hasImageState)}
>
{t('controlLayers.filter.process')}
{isProcessing && <Spinner ms={3} boxSize={5} color="base.600" />}
</Button>
<Spacer />
<Button
leftIcon={<PiArrowsCounterClockwiseBold />}
onClick={adapter.filterer.reset}
isLoading={isProcessing}
isDisabled={isProcessing}
loadingText={t('controlLayers.filter.reset')}
variant="ghost"
>
{t('controlLayers.filter.reset')}
</Button>
<Button
variant="ghost"
leftIcon={<PiCheckBold />}
onClick={adapter.filterer.apply}
isLoading={isProcessing}
loadingText={t('controlLayers.filter.apply')}
isDisabled={!isValid || !hasProcessed}
variant="ghost"
isDisabled={isProcessing || !isValid || !hasImageState}
>
{t('controlLayers.filter.apply')}
</Button>
<Button
variant="ghost"
leftIcon={<PiXBold />}
onClick={adapter.filterer.cancel}
loadingText={t('controlLayers.filter.cancel')}
>
<Menu>
<MenuButton
as={Button}
loadingText={t('controlLayers.selectObject.saveAs')}
variant="ghost"
isDisabled={isProcessing || !isValid || !hasImageState}
rightIcon={<PiCaretDownBold />}
>
{t('controlLayers.selectObject.saveAs')}
</MenuButton>
<MenuList>
<MenuItem isDisabled={isProcessing || !isValid || !hasImageState} onClick={saveAsInpaintMask}>
{t('controlLayers.newInpaintMask')}
</MenuItem>
<MenuItem isDisabled={isProcessing || !isValid || !hasImageState} onClick={saveAsRegionalGuidance}>
{t('controlLayers.newRegionalGuidance')}
</MenuItem>
<MenuItem isDisabled={isProcessing || !isValid || !hasImageState} onClick={saveAsControlLayer}>
{t('controlLayers.newControlLayer')}
</MenuItem>
<MenuItem isDisabled={isProcessing || !isValid || !hasImageState} onClick={saveAsRasterLayer}>
{t('controlLayers.newRasterLayer')}
</MenuItem>
</MenuList>
</Menu>
<Button variant="ghost" onClick={adapter.filterer.cancel} loadingText={t('controlLayers.filter.cancel')}>
{t('controlLayers.filter.cancel')}
</Button>
</ButtonGroup>

View File

@@ -0,0 +1,22 @@
import { MenuItem } from '@invoke-ai/ui-library';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { usePullBboxIntoGlobalReferenceImage } from 'features/controlLayers/hooks/saveCanvasHooks';
import { useCanvasIsBusy } from 'features/controlLayers/hooks/useCanvasIsBusy';
import { memo } from 'react';
import { useTranslation } from 'react-i18next';
import { PiBoundingBoxBold } from 'react-icons/pi';
export const IPAdapterMenuItemPullBbox = memo(() => {
const { t } = useTranslation();
const entityIdentifier = useEntityIdentifierContext('reference_image');
const pullBboxIntoIPAdapter = usePullBboxIntoGlobalReferenceImage(entityIdentifier);
const isBusy = useCanvasIsBusy();
return (
<MenuItem onClick={pullBboxIntoIPAdapter} icon={<PiBoundingBoxBold />} isDisabled={isBusy}>
{t('controlLayers.pullBboxIntoReferenceImage')}
</MenuItem>
);
});
IPAdapterMenuItemPullBbox.displayName = 'IPAdapterMenuItemPullBbox';

View File

@@ -1,16 +1,22 @@
import { MenuDivider } from '@invoke-ai/ui-library';
import { IconMenuItemGroup } from 'common/components/IconMenuItem';
import { CanvasEntityMenuItemsArrange } from 'features/controlLayers/components/common/CanvasEntityMenuItemsArrange';
import { CanvasEntityMenuItemsDelete } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDelete';
import { CanvasEntityMenuItemsDuplicate } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDuplicate';
import { IPAdapterMenuItemPullBbox } from 'features/controlLayers/components/IPAdapter/IPAdapterMenuItemPullBbox';
import { memo } from 'react';
export const IPAdapterMenuItems = memo(() => {
return (
<IconMenuItemGroup>
<CanvasEntityMenuItemsArrange />
<CanvasEntityMenuItemsDuplicate />
<CanvasEntityMenuItemsDelete asIcon />
</IconMenuItemGroup>
<>
<IconMenuItemGroup>
<CanvasEntityMenuItemsArrange />
<CanvasEntityMenuItemsDuplicate />
<CanvasEntityMenuItemsDelete asIcon />
</IconMenuItemGroup>
<MenuDivider />
<IPAdapterMenuItemPullBbox />
</>
);
});

View File

@@ -14,7 +14,7 @@ type Props = {
};
export const InpaintMask = memo(({ id }: Props) => {
const entityIdentifier = useMemo<CanvasEntityIdentifier>(() => ({ id, type: 'inpaint_mask' }), [id]);
const entityIdentifier = useMemo<CanvasEntityIdentifier<'inpaint_mask'>>(() => ({ id, type: 'inpaint_mask' }), [id]);
return (
<EntityIdentifierContext.Provider value={entityIdentifier}>
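This change (and the matching one in RegionalGuidance further down) narrows the memoized identifier from the broad union type to a specific member. A rough sketch of why that matters follows, using simplified stand-in types; the real CanvasEntityIdentifier lives in features/controlLayers/store/types and is more involved.

// Simplified stand-in types, for illustration only.
type CanvasEntityType = 'raster_layer' | 'control_layer' | 'inpaint_mask' | 'regional_guidance' | 'reference_image';
type CanvasEntityIdentifier<T extends CanvasEntityType = CanvasEntityType> = { id: string; type: T };

// A hypothetical hook that only accepts inpaint-mask identifiers.
declare function useInpaintMaskOnly(identifier: CanvasEntityIdentifier<'inpaint_mask'>): void;

const broad: CanvasEntityIdentifier = { id: 'abc', type: 'inpaint_mask' };
const narrow: CanvasEntityIdentifier<'inpaint_mask'> = { id: 'abc', type: 'inpaint_mask' };

// useInpaintMaskOnly(broad); // error: the broad union is not assignable to '"inpaint_mask"'
useInpaintMaskOnly(narrow);   // ok: the type parameter carries the narrowing downstream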

View File

@@ -5,6 +5,8 @@ import { CanvasEntityMenuItemsCropToBbox } from 'features/controlLayers/componen
import { CanvasEntityMenuItemsDelete } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDelete';
import { CanvasEntityMenuItemsDuplicate } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDuplicate';
import { CanvasEntityMenuItemsTransform } from 'features/controlLayers/components/common/CanvasEntityMenuItemsTransform';
import { InpaintMaskMenuItemsConvertToSubMenu } from 'features/controlLayers/components/InpaintMask/InpaintMaskMenuItemsConvertToSubMenu';
import { InpaintMaskMenuItemsCopyToSubMenu } from 'features/controlLayers/components/InpaintMask/InpaintMaskMenuItemsCopyToSubMenu';
import { memo } from 'react';
export const InpaintMaskMenuItems = memo(() => {
@@ -18,6 +20,8 @@ export const InpaintMaskMenuItems = memo(() => {
<MenuDivider />
<CanvasEntityMenuItemsTransform />
<MenuDivider />
<InpaintMaskMenuItemsCopyToSubMenu />
<InpaintMaskMenuItemsConvertToSubMenu />
<CanvasEntityMenuItemsCropToBbox />
</>
);

View File

@@ -0,0 +1,38 @@
import { Menu, MenuButton, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { useAppDispatch } from 'app/store/storeHooks';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import { inpaintMaskConvertedToRegionalGuidance } from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiSwapBold } from 'react-icons/pi';
export const InpaintMaskMenuItemsConvertToSubMenu = memo(() => {
const { t } = useTranslation();
const subMenu = useSubMenu();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('inpaint_mask');
const isInteractable = useIsEntityInteractable(entityIdentifier);
const convertToRegionalGuidance = useCallback(() => {
dispatch(inpaintMaskConvertedToRegionalGuidance({ entityIdentifier, replace: true }));
}, [dispatch, entityIdentifier]);
return (
<MenuItem {...subMenu.parentMenuItemProps} icon={<PiSwapBold />}>
<Menu {...subMenu.menuProps}>
<MenuButton {...subMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.convertInpaintMaskTo')} />
</MenuButton>
<MenuList {...subMenu.menuListProps}>
<MenuItem onClick={convertToRegionalGuidance} icon={<PiSwapBold />} isDisabled={!isInteractable}>
{t('controlLayers.regionalGuidance')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
);
});
InpaintMaskMenuItemsConvertToSubMenu.displayName = 'InpaintMaskMenuItemsConvertToSubMenu';

View File

@@ -0,0 +1,40 @@
import { Menu, MenuButton, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { useAppDispatch } from 'app/store/storeHooks';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { CanvasEntityMenuItemsCopyToClipboard } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCopyToClipboard';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import { inpaintMaskConvertedToRegionalGuidance } from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiCopyBold } from 'react-icons/pi';
export const InpaintMaskMenuItemsCopyToSubMenu = memo(() => {
const { t } = useTranslation();
const subMenu = useSubMenu();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('inpaint_mask');
const isInteractable = useIsEntityInteractable(entityIdentifier);
const copyToRegionalGuidance = useCallback(() => {
dispatch(inpaintMaskConvertedToRegionalGuidance({ entityIdentifier }));
}, [dispatch, entityIdentifier]);
return (
<MenuItem {...subMenu.parentMenuItemProps} icon={<PiCopyBold />}>
<Menu {...subMenu.menuProps}>
<MenuButton {...subMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.copyInpaintMaskTo')} />
</MenuButton>
<MenuList {...subMenu.menuListProps}>
<CanvasEntityMenuItemsCopyToClipboard />
<MenuItem onClick={copyToRegionalGuidance} icon={<PiCopyBold />} isDisabled={!isInteractable}>
{t('controlLayers.newRegionalGuidance')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
);
});
InpaintMaskMenuItemsCopyToSubMenu.displayName = 'InpaintMaskMenuItemsCopyToSubMenu';

View File

@@ -1,15 +1,15 @@
import { MenuDivider } from '@invoke-ai/ui-library';
import { IconMenuItemGroup } from 'common/components/IconMenuItem';
import { CanvasEntityMenuItemsArrange } from 'features/controlLayers/components/common/CanvasEntityMenuItemsArrange';
import { CanvasEntityMenuItemsCopyToClipboard } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCopyToClipboard';
import { CanvasEntityMenuItemsCropToBbox } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCropToBbox';
import { CanvasEntityMenuItemsDelete } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDelete';
import { CanvasEntityMenuItemsDuplicate } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDuplicate';
import { CanvasEntityMenuItemsFilter } from 'features/controlLayers/components/common/CanvasEntityMenuItemsFilter';
import { CanvasEntityMenuItemsSave } from 'features/controlLayers/components/common/CanvasEntityMenuItemsSave';
import { CanvasEntityMenuItemsSegment } from 'features/controlLayers/components/common/CanvasEntityMenuItemsSegment';
import { CanvasEntityMenuItemsSelectObject } from 'features/controlLayers/components/common/CanvasEntityMenuItemsSelectObject';
import { CanvasEntityMenuItemsTransform } from 'features/controlLayers/components/common/CanvasEntityMenuItemsTransform';
import { RasterLayerMenuItemsConvertRasterToControl } from 'features/controlLayers/components/RasterLayer/RasterLayerMenuItemsConvertRasterToControl';
import { RasterLayerMenuItemsConvertToSubMenu } from 'features/controlLayers/components/RasterLayer/RasterLayerMenuItemsConvertToSubMenu';
import { RasterLayerMenuItemsCopyToSubMenu } from 'features/controlLayers/components/RasterLayer/RasterLayerMenuItemsCopyToSubMenu';
import { memo } from 'react';
export const RasterLayerMenuItems = memo(() => {
@@ -23,11 +23,11 @@ export const RasterLayerMenuItems = memo(() => {
<MenuDivider />
<CanvasEntityMenuItemsTransform />
<CanvasEntityMenuItemsFilter />
<CanvasEntityMenuItemsSegment />
<RasterLayerMenuItemsConvertRasterToControl />
<CanvasEntityMenuItemsSelectObject />
<MenuDivider />
<RasterLayerMenuItemsCopyToSubMenu />
<RasterLayerMenuItemsConvertToSubMenu />
<CanvasEntityMenuItemsCropToBbox />
<CanvasEntityMenuItemsCopyToClipboard />
<CanvasEntityMenuItemsSave />
</>
);

View File

@@ -1,36 +0,0 @@
import { MenuItem } from '@invoke-ai/ui-library';
import { useAppDispatch, useAppSelector } from 'app/store/storeHooks';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { selectDefaultControlAdapter } from 'features/controlLayers/hooks/addLayerHooks';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import { rasterLayerConvertedToControlLayer } from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiLightningBold } from 'react-icons/pi';
export const RasterLayerMenuItemsConvertRasterToControl = memo(() => {
const { t } = useTranslation();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('raster_layer');
const defaultControlAdapter = useAppSelector(selectDefaultControlAdapter);
const isInteractable = useIsEntityInteractable(entityIdentifier);
const onClick = useCallback(() => {
dispatch(
rasterLayerConvertedToControlLayer({
entityIdentifier,
overrides: {
controlAdapter: defaultControlAdapter,
},
})
);
}, [defaultControlAdapter, dispatch, entityIdentifier]);
return (
<MenuItem onClick={onClick} icon={<PiLightningBold />} isDisabled={!isInteractable}>
{t('controlLayers.convertToControlLayer')}
</MenuItem>
);
});
RasterLayerMenuItemsConvertRasterToControl.displayName = 'RasterLayerMenuItemsConvertRasterToControl';

View File

@@ -0,0 +1,65 @@
import { Menu, MenuButton, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { useAppDispatch, useAppSelector } from 'app/store/storeHooks';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { selectDefaultControlAdapter } from 'features/controlLayers/hooks/addLayerHooks';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import {
rasterLayerConvertedToControlLayer,
rasterLayerConvertedToInpaintMask,
rasterLayerConvertedToRegionalGuidance,
} from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiSwapBold } from 'react-icons/pi';
export const RasterLayerMenuItemsConvertToSubMenu = memo(() => {
const { t } = useTranslation();
const subMenu = useSubMenu();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('raster_layer');
const defaultControlAdapter = useAppSelector(selectDefaultControlAdapter);
const isInteractable = useIsEntityInteractable(entityIdentifier);
const convertToInpaintMask = useCallback(() => {
dispatch(rasterLayerConvertedToInpaintMask({ entityIdentifier, replace: true }));
}, [dispatch, entityIdentifier]);
const convertToRegionalGuidance = useCallback(() => {
dispatch(rasterLayerConvertedToRegionalGuidance({ entityIdentifier, replace: true }));
}, [dispatch, entityIdentifier]);
const convertToControlLayer = useCallback(() => {
dispatch(
rasterLayerConvertedToControlLayer({
entityIdentifier,
replace: true,
overrides: { controlAdapter: defaultControlAdapter },
})
);
}, [defaultControlAdapter, dispatch, entityIdentifier]);
return (
<MenuItem {...subMenu.parentMenuItemProps} icon={<PiSwapBold />}>
<Menu {...subMenu.menuProps}>
<MenuButton {...subMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.convertRasterLayerTo')} />
</MenuButton>
<MenuList {...subMenu.menuListProps}>
<MenuItem onClick={convertToInpaintMask} icon={<PiSwapBold />} isDisabled={!isInteractable}>
{t('controlLayers.inpaintMask')}
</MenuItem>
<MenuItem onClick={convertToRegionalGuidance} icon={<PiSwapBold />} isDisabled={!isInteractable}>
{t('controlLayers.regionalGuidance')}
</MenuItem>
<MenuItem onClick={convertToControlLayer} icon={<PiSwapBold />} isDisabled={!isInteractable}>
{t('controlLayers.controlLayer')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
);
});
RasterLayerMenuItemsConvertToSubMenu.displayName = 'RasterLayerMenuItemsConvertToSubMenu';

View File

@@ -0,0 +1,66 @@
import { Menu, MenuButton, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { useAppDispatch, useAppSelector } from 'app/store/storeHooks';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { CanvasEntityMenuItemsCopyToClipboard } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCopyToClipboard';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { selectDefaultControlAdapter } from 'features/controlLayers/hooks/addLayerHooks';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import {
rasterLayerConvertedToControlLayer,
rasterLayerConvertedToInpaintMask,
rasterLayerConvertedToRegionalGuidance,
} from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiCopyBold } from 'react-icons/pi';
export const RasterLayerMenuItemsCopyToSubMenu = memo(() => {
const { t } = useTranslation();
const subMenu = useSubMenu();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('raster_layer');
const defaultControlAdapter = useAppSelector(selectDefaultControlAdapter);
const isInteractable = useIsEntityInteractable(entityIdentifier);
const copyToInpaintMask = useCallback(() => {
dispatch(rasterLayerConvertedToInpaintMask({ entityIdentifier }));
}, [dispatch, entityIdentifier]);
const copyToRegionalGuidance = useCallback(() => {
dispatch(rasterLayerConvertedToRegionalGuidance({ entityIdentifier }));
}, [dispatch, entityIdentifier]);
const copyToControlLayer = useCallback(() => {
dispatch(
rasterLayerConvertedToControlLayer({
entityIdentifier,
overrides: { controlAdapter: defaultControlAdapter },
})
);
}, [defaultControlAdapter, dispatch, entityIdentifier]);
return (
<MenuItem {...subMenu.parentMenuItemProps} icon={<PiCopyBold />}>
<Menu {...subMenu.menuProps}>
<MenuButton {...subMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.copyRasterLayerTo')} />
</MenuButton>
<MenuList {...subMenu.menuListProps}>
<CanvasEntityMenuItemsCopyToClipboard />
<MenuItem onClick={copyToInpaintMask} icon={<PiCopyBold />} isDisabled={!isInteractable}>
{t('controlLayers.newInpaintMask')}
</MenuItem>
<MenuItem onClick={copyToRegionalGuidance} icon={<PiCopyBold />} isDisabled={!isInteractable}>
{t('controlLayers.newRegionalGuidance')}
</MenuItem>
<MenuItem onClick={copyToControlLayer} icon={<PiCopyBold />} isDisabled={!isInteractable}>
{t('controlLayers.newControlLayer')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
);
});
RasterLayerMenuItemsCopyToSubMenu.displayName = 'RasterLayerMenuItemsCopyToSubMenu';

View File

@@ -16,7 +16,10 @@ type Props = {
};
export const RegionalGuidance = memo(({ id }: Props) => {
const entityIdentifier = useMemo<CanvasEntityIdentifier>(() => ({ id, type: 'regional_guidance' }), [id]);
const entityIdentifier = useMemo<CanvasEntityIdentifier<'regional_guidance'>>(
() => ({ id, type: 'regional_guidance' }),
[id]
);
return (
<EntityIdentifierContext.Provider value={entityIdentifier}>

View File

@@ -1,4 +1,5 @@
import { Flex, MenuDivider } from '@invoke-ai/ui-library';
import { MenuDivider } from '@invoke-ai/ui-library';
import { IconMenuItemGroup } from 'common/components/IconMenuItem';
import { CanvasEntityMenuItemsArrange } from 'features/controlLayers/components/common/CanvasEntityMenuItemsArrange';
import { CanvasEntityMenuItemsCropToBbox } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCropToBbox';
import { CanvasEntityMenuItemsDelete } from 'features/controlLayers/components/common/CanvasEntityMenuItemsDelete';
@@ -6,22 +7,26 @@ import { CanvasEntityMenuItemsDuplicate } from 'features/controlLayers/component
import { CanvasEntityMenuItemsTransform } from 'features/controlLayers/components/common/CanvasEntityMenuItemsTransform';
import { RegionalGuidanceMenuItemsAddPromptsAndIPAdapter } from 'features/controlLayers/components/RegionalGuidance/RegionalGuidanceMenuItemsAddPromptsAndIPAdapter';
import { RegionalGuidanceMenuItemsAutoNegative } from 'features/controlLayers/components/RegionalGuidance/RegionalGuidanceMenuItemsAutoNegative';
import { RegionalGuidanceMenuItemsConvertToSubMenu } from 'features/controlLayers/components/RegionalGuidance/RegionalGuidanceMenuItemsConvertToSubMenu';
import { RegionalGuidanceMenuItemsCopyToSubMenu } from 'features/controlLayers/components/RegionalGuidance/RegionalGuidanceMenuItemsCopyToSubMenu';
import { memo } from 'react';
export const RegionalGuidanceMenuItems = memo(() => {
return (
<>
<Flex gap={2}>
<IconMenuItemGroup>
<CanvasEntityMenuItemsArrange />
<CanvasEntityMenuItemsDuplicate />
<CanvasEntityMenuItemsDelete asIcon />
</Flex>
</IconMenuItemGroup>
<MenuDivider />
<RegionalGuidanceMenuItemsAddPromptsAndIPAdapter />
<MenuDivider />
<CanvasEntityMenuItemsTransform />
<RegionalGuidanceMenuItemsAutoNegative />
<MenuDivider />
<RegionalGuidanceMenuItemsCopyToSubMenu />
<RegionalGuidanceMenuItemsConvertToSubMenu />
<CanvasEntityMenuItemsCropToBbox />
</>
);

View File

@@ -0,0 +1,38 @@
import { Menu, MenuButton, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { useAppDispatch } from 'app/store/storeHooks';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import { rgConvertedToInpaintMask } from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiSwapBold } from 'react-icons/pi';
export const RegionalGuidanceMenuItemsConvertToSubMenu = memo(() => {
const { t } = useTranslation();
const subMenu = useSubMenu();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('regional_guidance');
const isInteractable = useIsEntityInteractable(entityIdentifier);
const convertToInpaintMask = useCallback(() => {
dispatch(rgConvertedToInpaintMask({ entityIdentifier, replace: true }));
}, [dispatch, entityIdentifier]);
return (
<MenuItem {...subMenu.parentMenuItemProps} icon={<PiSwapBold />}>
<Menu {...subMenu.menuProps}>
<MenuButton {...subMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.convertRegionalGuidanceTo')} />
</MenuButton>
<MenuList {...subMenu.menuListProps}>
<MenuItem onClick={convertToInpaintMask} icon={<PiSwapBold />} isDisabled={!isInteractable}>
{t('controlLayers.inpaintMask')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
);
});
RegionalGuidanceMenuItemsConvertToSubMenu.displayName = 'RegionalGuidanceMenuItemsConvertToSubMenu';

View File

@@ -0,0 +1,40 @@
import { Menu, MenuButton, MenuItem, MenuList } from '@invoke-ai/ui-library';
import { useAppDispatch } from 'app/store/storeHooks';
import { SubMenuButtonContent, useSubMenu } from 'common/hooks/useSubMenu';
import { CanvasEntityMenuItemsCopyToClipboard } from 'features/controlLayers/components/common/CanvasEntityMenuItemsCopyToClipboard';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import { rgConvertedToInpaintMask } from 'features/controlLayers/store/canvasSlice';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiCopyBold } from 'react-icons/pi';
export const RegionalGuidanceMenuItemsCopyToSubMenu = memo(() => {
const { t } = useTranslation();
const subMenu = useSubMenu();
const dispatch = useAppDispatch();
const entityIdentifier = useEntityIdentifierContext('regional_guidance');
const isInteractable = useIsEntityInteractable(entityIdentifier);
const copyToInpaintMask = useCallback(() => {
dispatch(rgConvertedToInpaintMask({ entityIdentifier }));
}, [dispatch, entityIdentifier]);
return (
<MenuItem {...subMenu.parentMenuItemProps} icon={<PiCopyBold />}>
<Menu {...subMenu.menuProps}>
<MenuButton {...subMenu.menuButtonProps}>
<SubMenuButtonContent label={t('controlLayers.copyRegionalGuidanceTo')} />
</MenuButton>
<MenuList {...subMenu.menuListProps}>
<CanvasEntityMenuItemsCopyToClipboard />
<MenuItem onClick={copyToInpaintMask} icon={<PiCopyBold />} isDisabled={!isInteractable}>
{t('controlLayers.newInpaintMask')}
</MenuItem>
</MenuList>
</Menu>
</MenuItem>
);
});
RegionalGuidanceMenuItemsCopyToSubMenu.displayName = 'RegionalGuidanceMenuItemsCopyToSubMenu';

View File

@@ -1,124 +0,0 @@
import { Button, ButtonGroup, Flex, Heading, Spacer } from '@invoke-ai/ui-library';
import { useStore } from '@nanostores/react';
import { useAppSelector } from 'app/store/storeHooks';
import { useFocusRegion, useIsRegionFocused } from 'common/hooks/focus';
import { CanvasAutoProcessSwitch } from 'features/controlLayers/components/CanvasAutoProcessSwitch';
import { CanvasOperationIsolatedLayerPreviewSwitch } from 'features/controlLayers/components/CanvasOperationIsolatedLayerPreviewSwitch';
import { SegmentAnythingPointType } from 'features/controlLayers/components/SegmentAnything/SegmentAnythingPointType';
import { useCanvasManager } from 'features/controlLayers/contexts/CanvasManagerProviderGate';
import type { CanvasEntityAdapterControlLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterControlLayer';
import type { CanvasEntityAdapterRasterLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterRasterLayer';
import { selectAutoProcess } from 'features/controlLayers/store/canvasSettingsSlice';
import { useRegisteredHotkeys } from 'features/system/components/HotkeysModal/useHotkeyData';
import { memo, useRef } from 'react';
import { useTranslation } from 'react-i18next';
import { PiArrowsCounterClockwiseBold, PiCheckBold, PiStarBold, PiXBold } from 'react-icons/pi';
const SegmentAnythingContent = memo(
({ adapter }: { adapter: CanvasEntityAdapterRasterLayer | CanvasEntityAdapterControlLayer }) => {
const { t } = useTranslation();
const ref = useRef<HTMLDivElement>(null);
useFocusRegion('canvas', ref, { focusOnMount: true });
const isCanvasFocused = useIsRegionFocused('canvas');
const isProcessing = useStore(adapter.segmentAnything.$isProcessing);
const hasPoints = useStore(adapter.segmentAnything.$hasPoints);
const autoProcess = useAppSelector(selectAutoProcess);
useRegisteredHotkeys({
id: 'applySegmentAnything',
category: 'canvas',
callback: adapter.segmentAnything.apply,
options: { enabled: !isProcessing && isCanvasFocused },
dependencies: [adapter.segmentAnything, isProcessing, isCanvasFocused],
});
useRegisteredHotkeys({
id: 'cancelSegmentAnything',
category: 'canvas',
callback: adapter.segmentAnything.cancel,
options: { enabled: !isProcessing && isCanvasFocused },
dependencies: [adapter.segmentAnything, isProcessing, isCanvasFocused],
});
return (
<Flex
ref={ref}
bg="base.800"
borderRadius="base"
p={4}
flexDir="column"
gap={4}
minW={420}
h="auto"
shadow="dark-lg"
transitionProperty="height"
transitionDuration="normal"
>
<Flex w="full" gap={4}>
<Heading size="md" color="base.300" userSelect="none">
{t('controlLayers.segment.autoMask')}
</Heading>
<Spacer />
<CanvasAutoProcessSwitch />
<CanvasOperationIsolatedLayerPreviewSwitch />
</Flex>
<SegmentAnythingPointType adapter={adapter} />
<ButtonGroup isAttached={false} size="sm" w="full">
<Button
leftIcon={<PiStarBold />}
onClick={adapter.segmentAnything.processImmediate}
isLoading={isProcessing}
loadingText={t('controlLayers.segment.process')}
variant="ghost"
isDisabled={!hasPoints || autoProcess}
>
{t('controlLayers.segment.process')}
</Button>
<Spacer />
<Button
leftIcon={<PiArrowsCounterClockwiseBold />}
onClick={adapter.segmentAnything.reset}
isLoading={isProcessing}
loadingText={t('controlLayers.segment.reset')}
variant="ghost"
>
{t('controlLayers.segment.reset')}
</Button>
<Button
leftIcon={<PiCheckBold />}
onClick={adapter.segmentAnything.apply}
isLoading={isProcessing}
loadingText={t('controlLayers.segment.apply')}
variant="ghost"
>
{t('controlLayers.segment.apply')}
</Button>
<Button
leftIcon={<PiXBold />}
onClick={adapter.segmentAnything.cancel}
isLoading={isProcessing}
loadingText={t('common.cancel')}
variant="ghost"
>
{t('controlLayers.segment.cancel')}
</Button>
</ButtonGroup>
</Flex>
);
}
);
SegmentAnythingContent.displayName = 'SegmentAnythingContent';
export const SegmentAnything = () => {
const canvasManager = useCanvasManager();
const adapter = useStore(canvasManager.stateApi.$segmentingAdapter);
if (!adapter) {
return null;
}
return <SegmentAnythingContent adapter={adapter} />;
};

View File

@@ -0,0 +1,223 @@
import {
Button,
ButtonGroup,
Flex,
Heading,
Icon,
ListItem,
Menu,
MenuButton,
MenuItem,
MenuList,
Spacer,
Spinner,
Text,
Tooltip,
UnorderedList,
} from '@invoke-ai/ui-library';
import { useStore } from '@nanostores/react';
import { useAppSelector } from 'app/store/storeHooks';
import { useFocusRegion, useIsRegionFocused } from 'common/hooks/focus';
import { CanvasAutoProcessSwitch } from 'features/controlLayers/components/CanvasAutoProcessSwitch';
import { CanvasOperationIsolatedLayerPreviewSwitch } from 'features/controlLayers/components/CanvasOperationIsolatedLayerPreviewSwitch';
import { SelectObjectInvert } from 'features/controlLayers/components/SelectObject/SelectObjectInvert';
import { SelectObjectPointType } from 'features/controlLayers/components/SelectObject/SelectObjectPointType';
import { useCanvasManager } from 'features/controlLayers/contexts/CanvasManagerProviderGate';
import type { CanvasEntityAdapterControlLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterControlLayer';
import type { CanvasEntityAdapterRasterLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterRasterLayer';
import { selectAutoProcess } from 'features/controlLayers/store/canvasSettingsSlice';
import { useRegisteredHotkeys } from 'features/system/components/HotkeysModal/useHotkeyData';
import type { PropsWithChildren } from 'react';
import { memo, useCallback, useRef } from 'react';
import { Trans, useTranslation } from 'react-i18next';
import { PiCaretDownBold, PiInfoBold } from 'react-icons/pi';
const SelectObjectContent = memo(
({ adapter }: { adapter: CanvasEntityAdapterRasterLayer | CanvasEntityAdapterControlLayer }) => {
const { t } = useTranslation();
const ref = useRef<HTMLDivElement>(null);
useFocusRegion('canvas', ref, { focusOnMount: true });
const isCanvasFocused = useIsRegionFocused('canvas');
const isProcessing = useStore(adapter.segmentAnything.$isProcessing);
const hasPoints = useStore(adapter.segmentAnything.$hasPoints);
const hasImageState = useStore(adapter.segmentAnything.$hasImageState);
const autoProcess = useAppSelector(selectAutoProcess);
const saveAsInpaintMask = useCallback(() => {
adapter.segmentAnything.saveAs('inpaint_mask');
}, [adapter.segmentAnything]);
const saveAsRegionalGuidance = useCallback(() => {
adapter.segmentAnything.saveAs('regional_guidance');
}, [adapter.segmentAnything]);
const saveAsRasterLayer = useCallback(() => {
adapter.segmentAnything.saveAs('raster_layer');
}, [adapter.segmentAnything]);
const saveAsControlLayer = useCallback(() => {
adapter.segmentAnything.saveAs('control_layer');
}, [adapter.segmentAnything]);
useRegisteredHotkeys({
id: 'applySegmentAnything',
category: 'canvas',
callback: adapter.segmentAnything.apply,
options: { enabled: !isProcessing && isCanvasFocused },
dependencies: [adapter.segmentAnything, isProcessing, isCanvasFocused],
});
useRegisteredHotkeys({
id: 'cancelSegmentAnything',
category: 'canvas',
callback: adapter.segmentAnything.cancel,
options: { enabled: !isProcessing && isCanvasFocused },
dependencies: [adapter.segmentAnything, isProcessing, isCanvasFocused],
});
return (
<Flex
ref={ref}
bg="base.800"
borderRadius="base"
p={4}
flexDir="column"
gap={4}
minW={420}
h="auto"
shadow="dark-lg"
transitionProperty="height"
transitionDuration="normal"
>
<Flex w="full" gap={4} alignItems="center">
<Flex gap={2}>
<Heading size="md" color="base.300" userSelect="none">
{t('controlLayers.selectObject.selectObject')}
</Heading>
<Tooltip label={<SelectObjectHelpTooltipContent />}>
<Flex alignItems="center">
<Icon as={PiInfoBold} color="base.500" />
</Flex>
</Tooltip>
</Flex>
<Spacer />
<CanvasAutoProcessSwitch />
<CanvasOperationIsolatedLayerPreviewSwitch />
</Flex>
<Flex w="full" justifyContent="space-between" py={2}>
<SelectObjectPointType adapter={adapter} />
<SelectObjectInvert adapter={adapter} />
</Flex>
<ButtonGroup isAttached={false} size="sm" w="full">
<Button
onClick={adapter.segmentAnything.processImmediate}
loadingText={t('controlLayers.selectObject.process')}
variant="ghost"
isDisabled={isProcessing || !hasPoints || (autoProcess && hasImageState)}
>
{t('controlLayers.selectObject.process')}
{isProcessing && <Spinner ms={3} boxSize={5} color="base.600" />}
</Button>
<Spacer />
<Button
onClick={adapter.segmentAnything.reset}
isDisabled={isProcessing || !hasPoints}
loadingText={t('controlLayers.selectObject.reset')}
variant="ghost"
>
{t('controlLayers.selectObject.reset')}
</Button>
<Button
onClick={adapter.segmentAnything.apply}
loadingText={t('controlLayers.selectObject.apply')}
variant="ghost"
isDisabled={isProcessing || !hasImageState}
>
{t('controlLayers.selectObject.apply')}
</Button>
<Menu>
<MenuButton
as={Button}
loadingText={t('controlLayers.selectObject.saveAs')}
variant="ghost"
isDisabled={isProcessing || !hasImageState}
rightIcon={<PiCaretDownBold />}
>
{t('controlLayers.selectObject.saveAs')}
</MenuButton>
<MenuList>
<MenuItem isDisabled={isProcessing || !hasImageState} onClick={saveAsInpaintMask}>
{t('controlLayers.newInpaintMask')}
</MenuItem>
<MenuItem isDisabled={isProcessing || !hasImageState} onClick={saveAsRegionalGuidance}>
{t('controlLayers.newRegionalGuidance')}
</MenuItem>
<MenuItem isDisabled={isProcessing || !hasImageState} onClick={saveAsControlLayer}>
{t('controlLayers.newControlLayer')}
</MenuItem>
<MenuItem isDisabled={isProcessing || !hasImageState} onClick={saveAsRasterLayer}>
{t('controlLayers.newRasterLayer')}
</MenuItem>
</MenuList>
</Menu>
<Button
onClick={adapter.segmentAnything.cancel}
isDisabled={isProcessing}
loadingText={t('common.cancel')}
variant="ghost"
>
{t('controlLayers.selectObject.cancel')}
</Button>
</ButtonGroup>
</Flex>
);
}
);
SelectObjectContent.displayName = 'SelectObjectContent';
export const SelectObject = memo(() => {
const canvasManager = useCanvasManager();
const adapter = useStore(canvasManager.stateApi.$segmentingAdapter);
if (!adapter) {
return null;
}
return <SelectObjectContent adapter={adapter} />;
});
SelectObject.displayName = 'SelectObject';
const Bold = (props: PropsWithChildren) => (
<Text as="span" fontWeight="semibold">
{props.children}
</Text>
);
const SelectObjectHelpTooltipContent = memo(() => {
const { t } = useTranslation();
return (
<Flex gap={3} flexDir="column">
<Text>
<Trans i18nKey="controlLayers.selectObject.help1" components={{ Bold: <Bold /> }} />
</Text>
<Text>
<Trans i18nKey="controlLayers.selectObject.help2" components={{ Bold: <Bold /> }} />
</Text>
<Text>
<Trans i18nKey="controlLayers.selectObject.help3" />
</Text>
<UnorderedList>
<ListItem>{t('controlLayers.selectObject.clickToAdd')}</ListItem>
<ListItem>{t('controlLayers.selectObject.dragToMove')}</ListItem>
<ListItem>{t('controlLayers.selectObject.clickToRemove')}</ListItem>
</UnorderedList>
</Flex>
);
});
SelectObjectHelpTooltipContent.displayName = 'SelectObjectHelpTooltipContent';

View File

@@ -0,0 +1,26 @@
import { FormControl, FormLabel, Switch } from '@invoke-ai/ui-library';
import { useStore } from '@nanostores/react';
import type { CanvasEntityAdapterControlLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterControlLayer';
import type { CanvasEntityAdapterRasterLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterRasterLayer';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
export const SelectObjectInvert = memo(
({ adapter }: { adapter: CanvasEntityAdapterRasterLayer | CanvasEntityAdapterControlLayer }) => {
const { t } = useTranslation();
const invert = useStore(adapter.segmentAnything.$invert);
const onChange = useCallback(() => {
adapter.segmentAnything.$invert.set(!adapter.segmentAnything.$invert.get());
}, [adapter.segmentAnything.$invert]);
return (
<FormControl w="min-content">
<FormLabel m={0}>{t('controlLayers.selectObject.invertSelection')}</FormLabel>
<Switch size="sm" isChecked={invert} onChange={onChange} />
</FormControl>
);
}
);
SelectObjectInvert.displayName = 'SelectObjectInvert';
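The invert switch above reads and writes a nanostores atom held on the adapter. A self-contained sketch of the same read/toggle pattern outside a React component is below, using the public nanostores API; the $invert atom here is a local stand-in, not the adapter's own.

import { atom } from 'nanostores';

// Local stand-in for adapter.segmentAnything.$invert.
const $invert = atom(false);

// Subscribe outside React; useStore($invert) is the in-component equivalent.
const unsubscribe = $invert.subscribe((value) => {
  console.log('invert is now', value);
});

// Toggle exactly as the onChange handler does: read the current value, write its negation.
$invert.set(!$invert.get());

unsubscribe();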

View File

@@ -6,7 +6,7 @@ import { SAM_POINT_LABEL_STRING_TO_NUMBER, zSAMPointLabelString } from 'features
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
export const SegmentAnythingPointType = memo(
export const SelectObjectPointType = memo(
({ adapter }: { adapter: CanvasEntityAdapterRasterLayer | CanvasEntityAdapterControlLayer }) => {
const { t } = useTranslation();
const pointType = useStore(adapter.segmentAnything.$pointTypeString);
@@ -21,18 +21,15 @@ export const SegmentAnythingPointType = memo(
);
return (
<FormControl w="full">
<FormLabel>{t('controlLayers.segment.pointType')}</FormLabel>
<FormControl w="min-content">
<FormLabel m={0}>{t('controlLayers.selectObject.pointType')}</FormLabel>
<RadioGroup value={pointType} onChange={onChange} w="full" size="md">
<Flex alignItems="center" w="full" gap={4} fontWeight="semibold" color="base.300">
<Radio value="foreground">
<Text>{t('controlLayers.segment.foreground')}</Text>
<Text>{t('controlLayers.selectObject.include')}</Text>
</Radio>
<Radio value="background">
<Text>{t('controlLayers.segment.background')}</Text>
</Radio>
<Radio value="neutral">
<Text>{t('controlLayers.segment.neutral')}</Text>
<Text>{t('controlLayers.selectObject.exclude')}</Text>
</Radio>
</Flex>
</RadioGroup>
@@ -41,4 +38,4 @@ export const SegmentAnythingPointType = memo(
}
);
SegmentAnythingPointType.displayName = 'SegmentAnythingPointType';
SelectObjectPointType.displayName = 'SelectObject';

@@ -1,4 +1,4 @@
import { Button, ButtonGroup, Flex, Heading, Spacer } from '@invoke-ai/ui-library';
import { Button, ButtonGroup, Flex, Heading, Spacer, Spinner } from '@invoke-ai/ui-library';
import { useStore } from '@nanostores/react';
import { useFocusRegion, useIsRegionFocused } from 'common/hooks/focus';
import { CanvasOperationIsolatedLayerPreviewSwitch } from 'features/controlLayers/components/CanvasOperationIsolatedLayerPreviewSwitch';
@@ -8,7 +8,6 @@ import type { CanvasEntityAdapter } from 'features/controlLayers/konva/CanvasEnt
import { useRegisteredHotkeys } from 'features/system/components/HotkeysModal/useHotkeyData';
import { memo, useRef } from 'react';
import { useTranslation } from 'react-i18next';
import { PiArrowsCounterClockwiseBold, PiCheckBold, PiXBold } from 'react-icons/pi';
const TransformContent = memo(({ adapter }: { adapter: CanvasEntityAdapter }) => {
const { t } = useTranslation();
@@ -62,30 +61,28 @@ const TransformContent = memo(({ adapter }: { adapter: CanvasEntityAdapter }) =>
<TransformFitToBboxButtons adapter={adapter} />
<ButtonGroup isAttached={false} size="sm" w="full">
<ButtonGroup isAttached={false} size="sm" w="full" alignItems="center">
{isProcessing && <Spinner ms={3} boxSize={5} color="base.600" />}
<Spacer />
<Button
leftIcon={<PiArrowsCounterClockwiseBold />}
onClick={adapter.transformer.resetTransform}
isLoading={isProcessing}
isDisabled={isProcessing}
loadingText={t('controlLayers.transform.reset')}
variant="ghost"
>
{t('controlLayers.transform.reset')}
</Button>
<Button
leftIcon={<PiCheckBold />}
onClick={adapter.transformer.applyTransform}
isLoading={isProcessing}
isDisabled={isProcessing}
loadingText={t('controlLayers.transform.apply')}
variant="ghost"
>
{t('controlLayers.transform.apply')}
</Button>
<Button
leftIcon={<PiXBold />}
onClick={adapter.transformer.stopTransform}
isLoading={isProcessing}
isDisabled={isProcessing}
loadingText={t('common.cancel')}
variant="ghost"
>

@@ -4,7 +4,6 @@ import { useStore } from '@nanostores/react';
import type { CanvasEntityAdapter } from 'features/controlLayers/konva/CanvasEntity/types';
import { memo, useCallback, useMemo, useState } from 'react';
import { useTranslation } from 'react-i18next';
import { PiArrowsOutBold } from 'react-icons/pi';
import type { Equals } from 'tsafe';
import { assert } from 'tsafe';
import { z } from 'zod';
@@ -60,10 +59,9 @@ export const TransformFitToBboxButtons = memo(({ adapter }: { adapter: CanvasEnt
<Combobox options={options} value={value} onChange={onChange} isSearchable={false} isClearable={false} />
</FormControl>
<Button
leftIcon={<PiArrowsOutBold />}
size="sm"
onClick={onClick}
isLoading={isProcessing}
isDisabled={isProcessing}
loadingText={t('controlLayers.transform.fitToBbox')}
variant="ghost"
>

@@ -1,9 +1,11 @@
import type { SystemStyleObject } from '@invoke-ai/ui-library';
import { Button, Collapse, Flex, Icon, Spacer, Text } from '@invoke-ai/ui-library';
import { InformationalPopover } from 'common/components/InformationalPopover/InformationalPopover';
import { useBoolean } from 'common/hooks/useBoolean';
import { CanvasEntityAddOfTypeButton } from 'features/controlLayers/components/common/CanvasEntityAddOfTypeButton';
import { CanvasEntityMergeVisibleButton } from 'features/controlLayers/components/common/CanvasEntityMergeVisibleButton';
import { CanvasEntityTypeIsHiddenToggle } from 'features/controlLayers/components/common/CanvasEntityTypeIsHiddenToggle';
import { useEntityTypeInformationalPopover } from 'features/controlLayers/hooks/useEntityTypeInformationalPopover';
import { useEntityTypeTitle } from 'features/controlLayers/hooks/useEntityTypeTitle';
import type { CanvasEntityIdentifier } from 'features/controlLayers/store/types';
import type { PropsWithChildren } from 'react';
@@ -21,6 +23,7 @@ const _hover: SystemStyleObject = {
export const CanvasEntityGroupList = memo(({ isSelected, type, children }: Props) => {
const title = useEntityTypeTitle(type);
const informationalPopoverFeature = useEntityTypeInformationalPopover(type);
const collapse = useBoolean(true);
const canMergeVisible = useMemo(() => type === 'raster_layer' || type === 'inpaint_mask', [type]);
const canHideAll = useMemo(() => type !== 'reference_image', [type]);
@@ -47,15 +50,30 @@ export const CanvasEntityGroupList = memo(({ isSelected, type, children }: Props
transitionProperty="common"
transitionDuration="fast"
/>
<Text
fontWeight="semibold"
color={isSelected ? 'base.200' : 'base.500'}
userSelect="none"
transitionProperty="common"
transitionDuration="fast"
>
{title}
</Text>
{informationalPopoverFeature ? (
<InformationalPopover feature={informationalPopoverFeature}>
<Text
fontWeight="semibold"
color={isSelected ? 'base.200' : 'base.500'}
userSelect="none"
transitionProperty="common"
transitionDuration="fast"
>
{title}
</Text>
</InformationalPopover>
) : (
<Text
fontWeight="semibold"
color={isSelected ? 'base.200' : 'base.500'}
userSelect="none"
transitionProperty="common"
transitionDuration="fast"
>
{title}
</Text>
)}
<Spacer />
</Flex>
{canMergeVisible && <CanvasEntityMergeVisibleButton type={type} />}

@@ -2,6 +2,7 @@ import { MenuItem } from '@invoke-ai/ui-library';
import { useEntityAdapterSafe } from 'features/controlLayers/contexts/EntityAdapterContext';
import { useEntityIdentifierContext } from 'features/controlLayers/contexts/EntityIdentifierContext';
import { useCopyLayerToClipboard } from 'features/controlLayers/hooks/useCopyLayerToClipboard';
import { useEntityIsEmpty } from 'features/controlLayers/hooks/useEntityIsEmpty';
import { useIsEntityInteractable } from 'features/controlLayers/hooks/useEntityIsInteractable';
import { memo, useCallback } from 'react';
import { useTranslation } from 'react-i18next';
@@ -12,6 +13,7 @@ export const CanvasEntityMenuItemsCopyToClipboard = memo(() => {
const entityIdentifier = useEntityIdentifierContext();
const adapter = useEntityAdapterSafe(entityIdentifier);
const isInteractable = useIsEntityInteractable(entityIdentifier);
const isEmpty = useEntityIsEmpty(entityIdentifier);
const copyLayerToClipboard = useCopyLayerToClipboard();
const onClick = useCallback(() => {
@@ -19,8 +21,8 @@ export const CanvasEntityMenuItemsCopyToClipboard = memo(() => {
}, [copyLayerToClipboard, adapter]);
return (
<MenuItem onClick={onClick} icon={<PiCopyBold />} isDisabled={!isInteractable}>
{t('controlLayers.copyToClipboard')}
<MenuItem onClick={onClick} icon={<PiCopyBold />} isDisabled={!isInteractable || isEmpty}>
{t('common.clipboard')}
</MenuItem>
);
});

@@ -3,7 +3,7 @@ import { useEntityIdentifierContext } from 'features/controlLayers/contexts/Enti
import { useEntityFilter } from 'features/controlLayers/hooks/useEntityFilter';
import { memo } from 'react';
import { useTranslation } from 'react-i18next';
import { PiShootingStarBold } from 'react-icons/pi';
import { PiShootingStarFill } from 'react-icons/pi';
export const CanvasEntityMenuItemsFilter = memo(() => {
const { t } = useTranslation();
@@ -11,7 +11,7 @@ export const CanvasEntityMenuItemsFilter = memo(() => {
const filter = useEntityFilter(entityIdentifier);
return (
<MenuItem onClick={filter.start} icon={<PiShootingStarBold />} isDisabled={filter.isDisabled}>
<MenuItem onClick={filter.start} icon={<PiShootingStarFill />} isDisabled={filter.isDisabled}>
{t('controlLayers.filter.filter')}
</MenuItem>
);

@@ -3,18 +3,18 @@ import { useEntityIdentifierContext } from 'features/controlLayers/contexts/Enti
import { useEntitySegmentAnything } from 'features/controlLayers/hooks/useEntitySegmentAnything';
import { memo } from 'react';
import { useTranslation } from 'react-i18next';
import { PiMaskHappyBold } from 'react-icons/pi';
import { PiShapesFill } from 'react-icons/pi';
export const CanvasEntityMenuItemsSegment = memo(() => {
export const CanvasEntityMenuItemsSelectObject = memo(() => {
const { t } = useTranslation();
const entityIdentifier = useEntityIdentifierContext();
const segmentAnything = useEntitySegmentAnything(entityIdentifier);
return (
<MenuItem onClick={segmentAnything.start} icon={<PiMaskHappyBold />} isDisabled={segmentAnything.isDisabled}>
{t('controlLayers.segment.autoMask')}
<MenuItem onClick={segmentAnything.start} icon={<PiShapesFill />} isDisabled={segmentAnything.isDisabled}>
{t('controlLayers.selectObject.selectObject')}
</MenuItem>
);
});
CanvasEntityMenuItemsSegment.displayName = 'CanvasEntityMenuItemsSegment';
CanvasEntityMenuItemsSelectObject.displayName = 'CanvasEntityMenuItemsSelectObject';

@@ -24,7 +24,9 @@ import {
selectEntityOrThrow,
} from 'features/controlLayers/store/selectors';
import type {
CanvasControlLayerState,
CanvasEntityIdentifier,
CanvasInpaintMaskState,
CanvasRasterLayerState,
CanvasRegionalGuidanceState,
ControlNetConfig,
@@ -44,6 +46,8 @@ import { useCallback } from 'react';
import { modelConfigsAdapterSelectors, selectModelConfigsQuery } from 'services/api/endpoints/models';
import type { ControlNetModelConfig, ImageDTO, IPAdapterModelConfig, T2IAdapterModelConfig } from 'services/api/types';
import { isControlNetOrT2IAdapterModelConfig, isIPAdapterModelConfig } from 'services/api/types';
import type { Equals } from 'tsafe';
import { assert } from 'tsafe';
export const selectDefaultControlAdapter = createSelector(
selectModelConfigsQuery,
@@ -124,6 +128,60 @@ export const useNewRasterLayerFromImage = () => {
return func;
};
export const useNewControlLayerFromImage = () => {
const dispatch = useAppDispatch();
const bboxRect = useAppSelector(selectBboxRect);
const func = useCallback(
(imageDTO: ImageDTO) => {
const imageObject = imageDTOToImageObject(imageDTO);
const overrides: Partial<CanvasControlLayerState> = {
position: { x: bboxRect.x, y: bboxRect.y },
objects: [imageObject],
};
dispatch(controlLayerAdded({ overrides, isSelected: true }));
},
[bboxRect.x, bboxRect.y, dispatch]
);
return func;
};
export const useNewInpaintMaskFromImage = () => {
const dispatch = useAppDispatch();
const bboxRect = useAppSelector(selectBboxRect);
const func = useCallback(
(imageDTO: ImageDTO) => {
const imageObject = imageDTOToImageObject(imageDTO);
const overrides: Partial<CanvasInpaintMaskState> = {
position: { x: bboxRect.x, y: bboxRect.y },
objects: [imageObject],
};
dispatch(inpaintMaskAdded({ overrides, isSelected: true }));
},
[bboxRect.x, bboxRect.y, dispatch]
);
return func;
};
export const useNewRegionalGuidanceFromImage = () => {
const dispatch = useAppDispatch();
const bboxRect = useAppSelector(selectBboxRect);
const func = useCallback(
(imageDTO: ImageDTO) => {
const imageObject = imageDTOToImageObject(imageDTO);
const overrides: Partial<CanvasRegionalGuidanceState> = {
position: { x: bboxRect.x, y: bboxRect.y },
objects: [imageObject],
};
dispatch(rgAdded({ overrides, isSelected: true }));
},
[bboxRect.x, bboxRect.y, dispatch]
);
return func;
};
/**
* Returns a function that adds a new canvas with the given image as the initial image, replicating the img2img flow:
* - Reset the canvas
@@ -138,18 +196,31 @@ export const useNewCanvasFromImage = () => {
const bboxRect = useAppSelector(selectBboxRect);
const base = useAppSelector(selectBboxModelBase);
const func = useCallback(
(imageDTO: ImageDTO) => {
(imageDTO: ImageDTO, type: CanvasRasterLayerState['type'] | CanvasControlLayerState['type']) => {
// Calculate the new bbox dimensions to fit the image's aspect ratio at the optimal size
const ratio = imageDTO.width / imageDTO.height;
const optimalDimension = getOptimalDimension(base);
const { width, height } = calculateNewSize(ratio, optimalDimension ** 2, base);
// The overrides need to include the layer's ID so we can transform the layer once it is initialized
const overrides = {
id: getPrefixedId('raster_layer'),
position: { x: bboxRect.x, y: bboxRect.y },
objects: [imageDTOToImageObject(imageDTO)],
} satisfies Partial<CanvasRasterLayerState>;
let overrides: Partial<CanvasRasterLayerState> | Partial<CanvasControlLayerState>;
if (type === 'raster_layer') {
overrides = {
id: getPrefixedId('raster_layer'),
position: { x: bboxRect.x, y: bboxRect.y },
objects: [imageDTOToImageObject(imageDTO)],
} satisfies Partial<CanvasRasterLayerState>;
} else if (type === 'control_layer') {
overrides = {
id: getPrefixedId('control_layer'),
position: { x: bboxRect.x, y: bboxRect.y },
objects: [imageDTOToImageObject(imageDTO)],
} satisfies Partial<CanvasControlLayerState>;
} else {
// Catch unhandled types
assert<Equals<typeof type, never>>(false);
}
CanvasEntityAdapterBase.registerInitCallback(async (adapter) => {
// Skip the callback if the adapter is not the one we are creating
@@ -166,7 +237,16 @@ export const useNewCanvasFromImage = () => {
dispatch(canvasReset());
// The `bboxChangedFromCanvas` reducer does no validation! Careful!
dispatch(bboxChangedFromCanvas({ x: 0, y: 0, width, height }));
dispatch(rasterLayerAdded({ overrides, isSelected: true }));
// The type casts are safe because the type is checked above
if (type === 'raster_layer') {
dispatch(rasterLayerAdded({ overrides: overrides as Partial<CanvasRasterLayerState>, isSelected: true }));
} else if (type === 'control_layer') {
dispatch(controlLayerAdded({ overrides: overrides as Partial<CanvasControlLayerState>, isSelected: true }));
} else {
// Catch unhandled types
assert<Equals<typeof type, never>>(false);
}
},
[base, bboxRect.x, bboxRect.y, dispatch]
);

@@ -1,3 +1,4 @@
import { logger } from 'app/logging/logger';
import type { CanvasEntityAdapterControlLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterControlLayer';
import type { CanvasEntityAdapterInpaintMask } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterInpaintMask';
import type { CanvasEntityAdapterRasterLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterRasterLayer';
@@ -7,6 +8,9 @@ import { copyBlobToClipboard } from 'features/system/util/copyBlobToClipboard';
import { toast } from 'features/toast/toast';
import { useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { serializeError } from 'serialize-error';
const log = logger('canvas');
export const useCopyLayerToClipboard = () => {
const { t } = useTranslation();
@@ -26,11 +30,13 @@ export const useCopyLayerToClipboard = () => {
const canvas = adapter.getCanvas();
const blob = await canvasToBlob(canvas);
copyBlobToClipboard(blob);
log.trace('Layer copied to clipboard');
toast({
status: 'info',
title: t('toast.layerCopiedToClipboard'),
});
} catch (error) {
log.error({ error: serializeError(error) }, 'Problem copying layer to clipboard');
toast({
status: 'error',
title: t('toast.problemCopyingLayer'),

@@ -5,11 +5,13 @@ import { useEntityAdapterSafe } from 'features/controlLayers/contexts/EntityAdap
import { useCanvasIsBusy } from 'features/controlLayers/hooks/useCanvasIsBusy';
import type { CanvasEntityIdentifier } from 'features/controlLayers/store/types';
import { isFilterableEntityIdentifier } from 'features/controlLayers/store/types';
import { useImageViewer } from 'features/gallery/components/ImageViewer/useImageViewer';
import { useCallback, useMemo } from 'react';
export const useEntityFilter = (entityIdentifier: CanvasEntityIdentifier | null) => {
const canvasManager = useCanvasManager();
const adapter = useEntityAdapterSafe(entityIdentifier);
const imageViewer = useImageViewer();
const isBusy = useCanvasIsBusy();
const isInteractable = useStore(adapter?.$isInteractable ?? $false);
const isEmpty = useStore(adapter?.$isEmpty ?? $false);
@@ -50,8 +52,9 @@ export const useEntityFilter = (entityIdentifier: CanvasEntityIdentifier | null)
if (!adapter) {
return;
}
imageViewer.close();
adapter.filterer.start();
}, [isDisabled, entityIdentifier, canvasManager]);
}, [isDisabled, entityIdentifier, canvasManager, imageViewer]);
return { isDisabled, start } as const;
};

@@ -0,0 +1,11 @@
import { useAppSelector } from 'app/store/storeHooks';
import { buildSelectHasObjects } from 'features/controlLayers/store/selectors';
import type { CanvasEntityIdentifier } from 'features/controlLayers/store/types';
import { useMemo } from 'react';
export const useEntityIsEmpty = (entityIdentifier: CanvasEntityIdentifier) => {
const selectHasObjects = useMemo(() => buildSelectHasObjects(entityIdentifier), [entityIdentifier]);
const hasObjects = useAppSelector(selectHasObjects);
return !hasObjects;
};

@@ -5,11 +5,13 @@ import { useEntityAdapterSafe } from 'features/controlLayers/contexts/EntityAdap
import { useCanvasIsBusy } from 'features/controlLayers/hooks/useCanvasIsBusy';
import type { CanvasEntityIdentifier } from 'features/controlLayers/store/types';
import { isSegmentableEntityIdentifier } from 'features/controlLayers/store/types';
import { useImageViewer } from 'features/gallery/components/ImageViewer/useImageViewer';
import { useCallback, useMemo } from 'react';
export const useEntitySegmentAnything = (entityIdentifier: CanvasEntityIdentifier | null) => {
const canvasManager = useCanvasManager();
const adapter = useEntityAdapterSafe(entityIdentifier);
const imageViewer = useImageViewer();
const isBusy = useCanvasIsBusy();
const isInteractable = useStore(adapter?.$isInteractable ?? $false);
const isEmpty = useStore(adapter?.$isEmpty ?? $false);
@@ -50,8 +52,9 @@ export const useEntitySegmentAnything = (entityIdentifier: CanvasEntityIdentifie
if (!adapter) {
return;
}
imageViewer.close();
adapter.segmentAnything.start();
}, [isDisabled, entityIdentifier, canvasManager]);
}, [isDisabled, entityIdentifier, canvasManager, imageViewer]);
return { isDisabled, start } as const;
};

@@ -5,11 +5,13 @@ import { useEntityAdapterSafe } from 'features/controlLayers/contexts/EntityAdap
import { useCanvasIsBusy } from 'features/controlLayers/hooks/useCanvasIsBusy';
import type { CanvasEntityIdentifier } from 'features/controlLayers/store/types';
import { isTransformableEntityIdentifier } from 'features/controlLayers/store/types';
import { useImageViewer } from 'features/gallery/components/ImageViewer/useImageViewer';
import { useCallback, useMemo } from 'react';
export const useEntityTransform = (entityIdentifier: CanvasEntityIdentifier | null) => {
const canvasManager = useCanvasManager();
const adapter = useEntityAdapterSafe(entityIdentifier);
const imageViewer = useImageViewer();
const isBusy = useCanvasIsBusy();
const isInteractable = useStore(adapter?.$isInteractable ?? $false);
const isEmpty = useStore(adapter?.$isEmpty ?? $false);
@@ -50,8 +52,9 @@ export const useEntityTransform = (entityIdentifier: CanvasEntityIdentifier | nu
if (!adapter) {
return;
}
imageViewer.close();
await adapter.transformer.startTransform();
}, [isDisabled, entityIdentifier, canvasManager]);
}, [isDisabled, entityIdentifier, canvasManager, imageViewer]);
const fitToBbox = useCallback(async () => {
if (isDisabled) {
@@ -67,10 +70,11 @@ export const useEntityTransform = (entityIdentifier: CanvasEntityIdentifier | nu
if (!adapter) {
return;
}
imageViewer.close();
await adapter.transformer.startTransform({ silent: true });
adapter.transformer.fitToBboxContain();
await adapter.transformer.applyTransform();
}, [canvasManager, entityIdentifier, isDisabled]);
}, [canvasManager, entityIdentifier, imageViewer, isDisabled]);
return { isDisabled, start, fitToBbox } as const;
};

@@ -0,0 +1,25 @@
import type { Feature } from 'common/components/InformationalPopover/constants';
import type { CanvasEntityIdentifier } from 'features/controlLayers/store/types';
import { useMemo } from 'react';
export const useEntityTypeInformationalPopover = (type: CanvasEntityIdentifier['type']): Feature | undefined => {
const feature = useMemo(() => {
switch (type) {
case 'control_layer':
return 'controlNet';
case 'inpaint_mask':
return 'inpainting';
case 'raster_layer':
return 'rasterLayer';
case 'regional_guidance':
return 'regionalGuidanceAndReferenceImage';
case 'reference_image':
return 'globalReferenceImage';
default:
return undefined;
}
}, [type]);
return feature;
};

@@ -1,27 +1,36 @@
import { deepClone } from 'common/util/deepClone';
import { withResult, withResultAsync } from 'common/util/result';
import type { CanvasEntityAdapterControlLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterControlLayer';
import type { CanvasEntityAdapterRasterLayer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityAdapterRasterLayer';
import type { CanvasManager } from 'features/controlLayers/konva/CanvasManager';
import { CanvasModuleBase } from 'features/controlLayers/konva/CanvasModuleBase';
import { getPrefixedId } from 'features/controlLayers/konva/util';
import { CanvasObjectImage } from 'features/controlLayers/konva/CanvasObject/CanvasObjectImage';
import { addCoords, getKonvaNodeDebugAttrs, getPrefixedId } from 'features/controlLayers/konva/util';
import { selectAutoProcess } from 'features/controlLayers/store/canvasSettingsSlice';
import type { FilterConfig } from 'features/controlLayers/store/filters';
import { getFilterForModel, IMAGE_FILTERS } from 'features/controlLayers/store/filters';
import type { CanvasImageState } from 'features/controlLayers/store/types';
import type { CanvasEntityType, CanvasImageState } from 'features/controlLayers/store/types';
import { imageDTOToImageObject } from 'features/controlLayers/store/util';
import Konva from 'konva';
import { debounce } from 'lodash-es';
import { atom } from 'nanostores';
import { atom, computed } from 'nanostores';
import type { Logger } from 'roarr';
import { serializeError } from 'serialize-error';
import { buildSelectModelConfig } from 'services/api/hooks/modelsByType';
import { isControlNetOrT2IAdapterModelConfig } from 'services/api/types';
import stableHash from 'stable-hash';
import type { Equals } from 'tsafe';
import { assert } from 'tsafe';
type CanvasEntityFiltererConfig = {
processDebounceMs: number;
/**
* The debounce time in milliseconds for processing the filter.
*/
PROCESS_DEBOUNCE_MS: number;
};
const DEFAULT_CONFIG: CanvasEntityFiltererConfig = {
processDebounceMs: 1000,
PROCESS_DEBOUNCE_MS: 1000,
};
export class CanvasEntityFilterer extends CanvasModuleBase {
@@ -32,20 +41,65 @@ export class CanvasEntityFilterer extends CanvasModuleBase {
readonly manager: CanvasManager;
readonly log: Logger;
imageState: CanvasImageState | null = null;
subscriptions = new Set<() => void>();
config: CanvasEntityFiltererConfig = DEFAULT_CONFIG;
subscriptions = new Set<() => void>();
/**
* The AbortController used to cancel the filter processing.
*/
abortController: AbortController | null = null;
/**
* Whether the module is currently filtering an image.
*/
$isFiltering = atom<boolean>(false);
$hasProcessed = atom<boolean>(false);
/**
* The hash of the last processed config. This is used to prevent re-processing the same config.
*/
$lastProcessedHash = atom<string>('');
/**
* Whether the module is currently processing the filter.
*/
$isProcessing = atom<boolean>(false);
/**
* The config for the filter.
*/
$filterConfig = atom<FilterConfig>(IMAGE_FILTERS.canny_edge_detection.buildDefaults());
/**
* The initial filter config, used to reset the filter config.
*/
$initialFilterConfig = atom<FilterConfig | null>(null);
/**
* The ephemeral image state of the filtered image.
*/
$imageState = atom<CanvasImageState | null>(null);
/**
* Whether the module has an image state. This is a computed value based on $imageState.
*/
$hasImageState = computed(this.$imageState, (imageState) => imageState !== null);
/**
* The filtered image object module, if it exists.
*/
imageModule: CanvasObjectImage | null = null;
/**
* The Konva nodes for the module.
*/
konva: {
/**
* The main Konva group node for the module. This is added to the parent layer on start, and removed on teardown.
*/
group: Konva.Group;
};
KONVA_GROUP_NAME = `${this.type}:group`;
constructor(parent: CanvasEntityAdapterRasterLayer | CanvasEntityAdapterControlLayer) {
super();
this.id = getPrefixedId(this.type);
@@ -55,9 +109,17 @@ export class CanvasEntityFilterer extends CanvasModuleBase {
this.log = this.manager.buildLogger(this);
this.log.debug('Creating filter module');
this.konva = {
group: new Konva.Group({ name: this.KONVA_GROUP_NAME }),
};
}
/**
* Adds event listeners needed while filtering the entity.
*/
subscribe = () => {
// As the filter config changes, process the filter
this.subscriptions.add(
this.$filterConfig.listen(() => {
if (this.manager.stateApi.getSettings().autoProcess && this.$isFiltering.get()) {
@@ -65,6 +127,7 @@ export class CanvasEntityFilterer extends CanvasModuleBase {
}
})
);
// When auto-process is enabled, process the filter
this.subscriptions.add(
this.manager.stateApi.createStoreSubscription(selectAutoProcess, (autoProcess) => {
if (autoProcess && this.$isFiltering.get()) {
@@ -74,11 +137,18 @@ export class CanvasEntityFilterer extends CanvasModuleBase {
);
};
/**
* Removes event listeners used while filtering the entity.
*/
unsubscribe = () => {
this.subscriptions.forEach((unsubscribe) => unsubscribe());
this.subscriptions.clear();
};
/**
* Starts the filter module.
* @param config The filter config to start with. If omitted, the default filter config is used.
*/
start = (config?: FilterConfig) => {
const filteringAdapter = this.manager.stateApi.$filteringAdapter.get();
if (filteringAdapter) {
@@ -88,30 +158,57 @@ export class CanvasEntityFilterer extends CanvasModuleBase {
this.log.trace('Initializing filter');
this.subscribe();
// Reset any previous state
this.resetEphemeralState();
this.$isFiltering.set(true);
// Update the konva group's position to match the parent entity
const pixelRect = this.parent.transformer.$pixelRect.get();
const position = addCoords(this.parent.state.position, pixelRect);
this.konva.group.setAttrs(position);
// Add the group to the parent layer
this.parent.konva.layer.add(this.konva.group);
if (config) {
// If a config is provided, use it
this.$filterConfig.set(config);
} else if (this.parent.type === 'control_layer_adapter' && this.parent.state.controlAdapter.model) {
this.$initialFilterConfig.set(config);
} else {
this.$filterConfig.set(this.createInitialFilterConfig());
}
this.$initialFilterConfig.set(this.$filterConfig.get());
this.subscribe();
this.manager.stateApi.$filteringAdapter.set(this.parent);
if (this.manager.stateApi.getSettings().autoProcess) {
this.processImmediate();
}
};
createInitialFilterConfig = (): FilterConfig => {
if (this.parent.type === 'control_layer_adapter' && this.parent.state.controlAdapter.model) {
// If the parent is a control layer adapter, we should check if the model has a default filter and set it if so
const selectModelConfig = buildSelectModelConfig(
this.parent.state.controlAdapter.model.key,
isControlNetOrT2IAdapterModelConfig
);
const modelConfig = this.manager.stateApi.runSelector(selectModelConfig);
// This always returns a filter
const filter = getFilterForModel(modelConfig);
this.$filterConfig.set(filter.buildDefaults());
return filter.buildDefaults();
} else {
// Otherwise, set the default filter
this.$filterConfig.set(IMAGE_FILTERS.canny_edge_detection.buildDefaults());
}
this.$isFiltering.set(true);
this.manager.stateApi.$filteringAdapter.set(this.parent);
if (this.manager.stateApi.getSettings().autoProcess) {
this.processImmediate();
// Otherwise, use the default filter
return IMAGE_FILTERS.canny_edge_detection.buildDefaults();
}
};
/**
* Processes the filter, updating the module's state and rendering the filtered image.
*/
processImmediate = async () => {
const config = this.$filterConfig.get();
const filterData = IMAGE_FILTERS[config.type];
@@ -123,6 +220,12 @@ export class CanvasEntityFilterer extends CanvasModuleBase {
return;
}
const hash = stableHash({ config });
if (hash === this.$lastProcessedHash.get()) {
this.log.trace('Already processed config');
return;
}
this.log.trace({ config }, 'Processing filter');
const rect = this.parent.transformer.getRelativeRect();
@@ -156,91 +259,181 @@ export class CanvasEntityFilterer extends CanvasModuleBase {
this.manager.stateApi.runGraphAndReturnImageOutput({
graph,
outputNodeId,
// The filter graph should always be prepended to the queue so it's processed ASAP.
prepend: true,
/**
* The filter node may need to download a large model. Currently, the models required by the filter nodes are
* downloaded just-in-time, as required by the filter. If we use a timeout here, we might get into a catch-22
* where the filter node is waiting for the model to download, but the download gets canceled if the filter
* node times out.
*
* (I suspect the model download will actually _not_ be canceled if the graph is canceled, but let's not chance it!)
*
* TODO(psyche): Figure out a better way to handle this. Probably need to download the models ahead of time.
*/
// timeout: 5000,
/**
* The filter node should be able to cancel the request if it's taking too long. This will cancel the graph's
* queue item and clear any event listeners on the request.
*/
signal: controller.signal,
})
);
// If there is an error, log it and bail out of this processing run
if (filterResult.isErr()) {
this.log.error({ error: serializeError(filterResult.error) }, 'Error processing filter');
this.log.error({ error: serializeError(filterResult.error) }, 'Error filtering');
this.$isProcessing.set(false);
// Clean up the abort controller as needed
if (!this.abortController.signal.aborted) {
this.abortController.abort();
}
this.abortController = null;
return;
}
this.log.trace({ imageDTO: filterResult.value }, 'Filter processed');
this.imageState = imageDTOToImageObject(filterResult.value);
this.log.trace({ imageDTO: filterResult.value }, 'Filtered');
await this.parent.bufferRenderer.setBuffer(this.imageState, true);
// Prepare the ephemeral image state
const imageState = imageDTOToImageObject(filterResult.value);
this.$imageState.set(imageState);
// Destroy any existing filtered image module and create a new one
if (this.imageModule) {
this.imageModule.destroy();
}
this.imageModule = new CanvasObjectImage(imageState, this);
// Force update the filtered image - after awaiting, the image will be rendered (in memory)
await this.imageModule.update(imageState, true);
this.konva.group.add(this.imageModule.konva.group);
// The processing is complete, so we can set the last processed hash and isProcessing to false
this.$lastProcessedHash.set(hash);
this.$isProcessing.set(false);
this.$hasProcessed.set(true);
// Clean up the abort controller as needed
if (!this.abortController.signal.aborted) {
this.abortController.abort();
}
this.abortController = null;
};
process = debounce(this.processImmediate, this.config.processDebounceMs);
/**
* Debounced version of processImmediate.
*/
process = debounce(this.processImmediate, this.config.PROCESS_DEBOUNCE_MS);
/**
* Applies the filter image to the entity, replacing the entity's objects with the filtered image.
*/
apply = () => {
const imageState = this.imageState;
if (!imageState) {
const filteredImageObjectState = this.$imageState.get();
if (!filteredImageObjectState) {
this.log.warn('No image state to apply filter to');
return;
}
this.log.trace('Applying filter');
this.parent.bufferRenderer.commitBuffer();
this.log.trace('Applying');
// Have the parent adopt the image module - this prevents a flash of the original layer content before the filtered
// image is rendered
if (this.imageModule) {
this.parent.renderer.adoptObjectRenderer(this.imageModule);
}
// Rasterize the entity, replacing the objects with the filtered image
const rect = this.parent.transformer.getRelativeRect();
this.manager.stateApi.rasterizeEntity({
entityIdentifier: this.parent.entityIdentifier,
imageObject: imageState,
imageObject: filteredImageObjectState,
position: {
x: Math.round(rect.x),
y: Math.round(rect.y),
},
replaceObjects: true,
});
this.imageState = null;
// Final cleanup and teardown, returning user to main canvas UI
this.resetEphemeralState();
this.teardown();
};
/**
* Saves the filtered image as a new entity of the given type.
* @param type The type of entity to save the filtered image as.
*/
saveAs = (type: Exclude<CanvasEntityType, 'reference_image'>) => {
const imageState = this.$imageState.get();
if (!imageState) {
this.log.warn('No image state to apply filter to');
return;
}
this.log.trace(`Saving as ${type}`);
const rect = this.parent.transformer.getRelativeRect();
const arg = {
overrides: {
objects: [imageState],
position: {
x: Math.round(rect.x),
y: Math.round(rect.y),
},
},
isSelected: true,
};
switch (type) {
case 'raster_layer':
this.manager.stateApi.addRasterLayer(arg);
break;
case 'control_layer':
this.manager.stateApi.addControlLayer(arg);
break;
case 'inpaint_mask':
this.manager.stateApi.addInpaintMask(arg);
break;
case 'regional_guidance':
this.manager.stateApi.addRegionalGuidance(arg);
break;
default:
assert<Equals<typeof type, never>>(false);
}
// Final cleanup and teardown, returning user to main canvas UI
this.resetEphemeralState();
this.teardown();
};
resetEphemeralState = () => {
// First we need to bail out of any processing
if (this.abortController && !this.abortController.signal.aborted) {
this.abortController.abort();
}
this.abortController = null;
// If the image module exists, and is a child of the group, destroy it. It might not be a child of the group if
// the user has applied the filter and the image has been adopted by the parent entity.
if (this.imageModule && this.imageModule.konva.group.parent === this.konva.group) {
this.imageModule.destroy();
this.imageModule = null;
}
const initialFilterConfig = this.$initialFilterConfig.get() ?? this.createInitialFilterConfig();
this.$filterConfig.set(initialFilterConfig);
this.$imageState.set(null);
this.$lastProcessedHash.set('');
this.$isProcessing.set(false);
};
teardown = () => {
this.$initialFilterConfig.set(null);
this.konva.group.remove();
this.unsubscribe();
this.$isFiltering.set(false);
this.$hasProcessed.set(false);
this.manager.stateApi.$filteringAdapter.set(null);
};
/**
* Resets the module (e.g. clear the filtered image and restore the initial filter config).
*
* Does not cancel or otherwise complete the filtering process.
*/
reset = () => {
this.log.trace('Resetting filter');
this.abortController?.abort();
this.abortController = null;
this.parent.bufferRenderer.clearBuffer();
this.parent.transformer.updatePosition();
this.parent.renderer.syncKonvaCache(true);
this.imageState = null;
this.$hasProcessed.set(false);
this.log.trace('Resetting');
this.resetEphemeralState();
};
cancel = () => {
this.log.trace('Cancelling filter');
this.reset();
this.unsubscribe();
this.$isProcessing.set(false);
this.$isFiltering.set(false);
this.$hasProcessed.set(false);
this.manager.stateApi.$filteringAdapter.set(null);
this.log.trace('Canceling');
this.resetEphemeralState();
this.teardown();
};
repr = () => {
@@ -248,11 +441,14 @@ export class CanvasEntityFilterer extends CanvasModuleBase {
id: this.id,
type: this.type,
path: this.path,
parent: this.parent.id,
config: this.config,
imageState: deepClone(this.$imageState.get()),
$isFiltering: this.$isFiltering.get(),
$hasProcessed: this.$hasProcessed.get(),
$lastProcessedHash: this.$lastProcessedHash.get(),
$isProcessing: this.$isProcessing.get(),
$filterConfig: this.$filterConfig.get(),
konva: { group: getKonvaNodeDebugAttrs(this.konva.group) },
};
};
@@ -263,5 +459,6 @@ export class CanvasEntityFilterer extends CanvasModuleBase {
}
this.abortController = null;
this.unsubscribe();
this.konva.group.destroy();
};
}

@@ -1,6 +1,7 @@
import { Mutex } from 'async-mutex';
import { deepClone } from 'common/util/deepClone';
import type { CanvasEntityBufferObjectRenderer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityBufferObjectRenderer';
import type { CanvasEntityFilterer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityFilterer';
import type { CanvasEntityObjectRenderer } from 'features/controlLayers/konva/CanvasEntity/CanvasEntityObjectRenderer';
import type { CanvasManager } from 'features/controlLayers/konva/CanvasManager';
import { CanvasModuleBase } from 'features/controlLayers/konva/CanvasModuleBase';
@@ -21,7 +22,8 @@ export class CanvasObjectImage extends CanvasModuleBase {
| CanvasEntityObjectRenderer
| CanvasEntityBufferObjectRenderer
| CanvasStagingAreaModule
| CanvasSegmentAnythingModule;
| CanvasSegmentAnythingModule
| CanvasEntityFilterer;
readonly manager: CanvasManager;
readonly log: Logger;
@@ -43,6 +45,7 @@ export class CanvasObjectImage extends CanvasModuleBase {
| CanvasEntityBufferObjectRenderer
| CanvasStagingAreaModule
| CanvasSegmentAnythingModule
| CanvasEntityFilterer
) {
super();
this.id = state.id;

@@ -6,15 +6,22 @@ import type { CanvasEntityAdapterRasterLayer } from 'features/controlLayers/konv
import type { CanvasManager } from 'features/controlLayers/konva/CanvasManager';
import { CanvasModuleBase } from 'features/controlLayers/konva/CanvasModuleBase';
import { CanvasObjectImage } from 'features/controlLayers/konva/CanvasObject/CanvasObjectImage';
import { addCoords, getKonvaNodeDebugAttrs, getPrefixedId, offsetCoord } from 'features/controlLayers/konva/util';
import {
addCoords,
getKonvaNodeDebugAttrs,
getPrefixedId,
offsetCoord,
roundCoord,
} from 'features/controlLayers/konva/util';
import { selectAutoProcess } from 'features/controlLayers/store/canvasSettingsSlice';
import type {
CanvasEntityType,
CanvasImageState,
Coordinate,
RgbaColor,
SAMPoint,
SAMPointLabel,
SAMPointLabelString,
SAMPointWithId,
} from 'features/controlLayers/store/types';
import { SAM_POINT_LABEL_NUMBER_TO_STRING } from 'features/controlLayers/store/types';
import { imageDTOToImageObject } from 'features/controlLayers/store/util';
@@ -27,6 +34,9 @@ import { atom, computed } from 'nanostores';
import type { Logger } from 'roarr';
import { serializeError } from 'serialize-error';
import type { ImageDTO } from 'services/api/types';
import stableHash from 'stable-hash';
import type { Equals } from 'tsafe';
import { assert } from 'tsafe';
type CanvasSegmentAnythingModuleConfig = {
/**
@@ -70,7 +80,7 @@ const DEFAULT_CONFIG: CanvasSegmentAnythingModuleConfig = {
SAM_POINT_FOREGROUND_COLOR: { r: 50, g: 255, b: 0, a: 1 }, // light green
SAM_POINT_BACKGROUND_COLOR: { r: 255, g: 0, b: 50, a: 1 }, // red-ish
SAM_POINT_NEUTRAL_COLOR: { r: 0, g: 225, b: 255, a: 1 }, // cyan
MASK_COLOR: { r: 0, g: 200, b: 200, a: 0.5 }, // cyan with 50% opacity
MASK_COLOR: { r: 0, g: 225, b: 255, a: 1 }, // cyan
PROCESS_DEBOUNCE_MS: 1000,
};
@@ -85,6 +95,7 @@ const DEFAULT_CONFIG: CanvasSegmentAnythingModuleConfig = {
type SAMPointState = {
id: string;
label: SAMPointLabel;
coord: Coordinate;
konva: {
circle: Konva.Circle;
};
@@ -103,7 +114,7 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
subscriptions = new Set<() => void>();
/**
* The AbortController used to cancel the filter processing.
* The AbortController used to cancel the segment processing.
*/
abortController: AbortController | null = null;
@@ -113,9 +124,9 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
$isSegmenting = atom<boolean>(false);
/**
* Whether the current set of points has been processed.
* The hash of the last processed points. This is used to prevent re-processing the same points.
*/
$hasProcessed = atom<boolean>(false);
$lastProcessedHash = atom<string>('');
/**
* Whether the module is currently processing the points.
@@ -144,10 +155,15 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
/**
* The ephemeral image state of the processed image. Only used while segmenting.
*/
imageState: CanvasImageState | null = null;
$imageState = atom<CanvasImageState | null>(null);
/**
* The current input points.
* Whether the module has an image state. This is a computed value based on $imageState.
*/
$hasImageState = computed(this.$imageState, (imageState) => imageState !== null);
/**
* The current input points. A listener is added to this atom to process the points when they change.
*/
$points = atom<SAMPointState[]>([]);
@@ -157,16 +173,21 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
$hasPoints = computed(this.$points, (points) => points.length > 0);
/**
* The masked image object, if it exists.
* Whether the module should invert the mask image.
*/
maskedImage: CanvasObjectImage | null = null;
$invert = atom<boolean>(false);
/**
* The masked image object module, if it exists.
*/
imageModule: CanvasObjectImage | null = null;
/**
* The Konva nodes for the module.
*/
konva: {
/**
* The main Konva group node for the module.
* The main Konva group node for the module. This is added to the parent layer on start, and removed on teardown.
*/
group: Konva.Group;
/**
@@ -187,6 +208,10 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
* It's rendered with a globalCompositeOperation of 'source-atop' to preview the mask as a semi-transparent overlay.
*/
compositingRect: Konva.Rect;
/**
* A tween for pulsing the mask group's opacity.
*/
maskTween: Konva.Tween | null;
};
KONVA_CIRCLE_NAME = `${this.type}:circle`;
@@ -209,7 +234,7 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
this.konva = {
group: new Konva.Group({ name: this.KONVA_GROUP_NAME }),
pointGroup: new Konva.Group({ name: this.KONVA_POINT_GROUP_NAME }),
maskGroup: new Konva.Group({ name: this.KONVA_MASK_GROUP_NAME }),
maskGroup: new Konva.Group({ name: this.KONVA_MASK_GROUP_NAME, opacity: 0.6 }),
compositingRect: new Konva.Rect({
name: this.KONVA_COMPOSITING_RECT_NAME,
fill: rgbaColorToString(this.config.MASK_COLOR),
@@ -219,6 +244,7 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
perfectDrawEnabled: false,
visible: false,
}),
maskTween: null,
};
// Points should always be rendered above the mask group
@@ -250,10 +276,12 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
createPoint(coord: Coordinate, label: SAMPointLabel): SAMPointState {
const id = getPrefixedId('sam_point');
const roundedCoord = roundCoord(coord);
const circle = new Konva.Circle({
name: this.KONVA_CIRCLE_NAME,
x: Math.round(coord.x),
y: Math.round(coord.y),
x: roundedCoord.x,
y: roundedCoord.y,
radius: this.manager.stage.unscale(this.config.SAM_POINT_RADIUS), // We will scale this as the stage scale changes
fill: rgbaColorToString(this.getSAMPointColor(label)),
stroke: rgbaColorToString(this.config.SAM_POINT_BORDER_COLOR),
@@ -270,14 +298,18 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
if (this.$isDraggingPoint.get()) {
return;
}
if (e.evt.button !== 0) {
return;
}
// This event should not bubble up to the parent, stage or any other nodes
e.cancelBubble = true;
circle.destroy();
this.$points.set(this.$points.get().filter((point) => point.id !== id));
if (this.$points.get().length === 0) {
const newPoints = this.$points.get().filter((point) => point.id !== id);
if (newPoints.length === 0) {
this.resetEphemeralState();
} else {
this.$hasProcessed.set(false);
this.$points.set(newPoints);
}
});
@@ -286,25 +318,28 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
});
circle.on('dragend', () => {
const roundedCoord = roundCoord(circle.position());
this.log.trace({ ...roundedCoord, label: SAM_POINT_LABEL_NUMBER_TO_STRING[label] }, 'Moved SAM point');
this.$isDraggingPoint.set(false);
// Point has changed!
this.$hasProcessed.set(false);
this.$points.notify();
this.log.trace(
{ x: Math.round(circle.x()), y: Math.round(circle.y()), label: SAM_POINT_LABEL_NUMBER_TO_STRING[label] },
'Moved SAM point'
);
const newPoints = this.$points.get().map((point) => {
if (point.id === id) {
return { ...point, coord: roundedCoord };
}
return point;
});
this.$points.set(newPoints);
});
this.konva.pointGroup.add(circle);
this.log.trace(
{ x: Math.round(circle.x()), y: Math.round(circle.y()), label: SAM_POINT_LABEL_NUMBER_TO_STRING[label] },
'Created SAM point'
);
this.log.trace({ ...roundedCoord, label: SAM_POINT_LABEL_NUMBER_TO_STRING[label] }, 'Created SAM point');
return {
id,
coord: roundedCoord,
label,
konva: { circle },
};
@@ -327,14 +362,14 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
/**
* Gets the SAM points in the format expected by the segment-anything API. The x and y values are rounded to integers.
*/
getSAMPoints = (): SAMPoint[] => {
const points: SAMPoint[] = [];
getSAMPoints = (): SAMPointWithId[] => {
const points: SAMPointWithId[] = [];
for (const { konva, label } of this.$points.get()) {
for (const { id, coord, label } of this.$points.get()) {
points.push({
// Pull out and round the x and y values from Konva
x: Math.round(konva.circle.x()),
y: Math.round(konva.circle.y()),
id,
x: coord.x,
y: coord.y,
label,
});
}
@@ -381,10 +416,8 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
// Create a SAM point at the normalized position
const point = this.createPoint(normalizedPoint, this.$pointType.get());
this.$points.set([...this.$points.get(), point]);
// Mark the module as having _not_ processed the points now that they have changed
this.$hasProcessed.set(false);
const newPoints = [...this.$points.get(), point];
this.$points.set(newPoints);
};
/**
@@ -421,6 +454,20 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
if (points.length === 0) {
return;
}
if (this.manager.stateApi.getSettings().autoProcess) {
this.process();
}
})
);
// When the invert flag changes, process if autoProcess is enabled
this.subscriptions.add(
this.$invert.listen(() => {
if (this.$points.get().length === 0) {
return;
}
if (this.manager.stateApi.getSettings().autoProcess) {
this.process();
}
@@ -433,7 +480,7 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
if (this.$points.get().length === 0) {
return;
}
if (autoProcess && !this.$hasProcessed.get()) {
if (autoProcess) {
this.process();
}
})
@@ -441,7 +488,7 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
};
/**
* Adds event listeners needed while segmenting the entity.
* Removes event listeners used while segmenting the entity.
*/
unsubscribe = () => {
this.subscriptions.forEach((unsubscribe) => unsubscribe());
@@ -500,6 +547,14 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
return;
}
const invert = this.$invert.get();
const hash = stableHash({ points, invert });
if (hash === this.$lastProcessedHash.get()) {
this.log.trace('Already processed points');
return;
}
this.$isProcessing.set(true);
this.log.trace({ points }, 'Segmenting');
@@ -521,7 +576,7 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
this.abortController = controller;
// Build the graph for segmenting the image, using the rasterized image DTO
const { graph, outputNodeId } = this.buildGraph(rasterizeResult.value);
const { graph, outputNodeId } = CanvasSegmentAnythingModule.buildGraph(rasterizeResult.value, points, invert);
// Run the graph and get the segmented image output
const segmentResult = await withResultAsync(() =>
@@ -548,38 +603,56 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
this.log.trace({ imageDTO: segmentResult.value }, 'Segmented');
// Prepare the ephemeral image state
this.imageState = imageDTOToImageObject(segmentResult.value);
const imageState = imageDTOToImageObject(segmentResult.value);
this.$imageState.set(imageState);
// Destroy any existing masked image and create a new one
if (this.maskedImage) {
this.maskedImage.destroy();
if (this.imageModule) {
this.imageModule.destroy();
}
this.maskedImage = new CanvasObjectImage(this.imageState, this);
if (this.konva.maskTween) {
this.konva.maskTween.destroy();
this.konva.maskTween = null;
}
this.imageModule = new CanvasObjectImage(imageState, this);
// Force update the masked image - after awaiting, the image will be rendered (in memory)
await this.maskedImage.update(this.imageState, true);
await this.imageModule.update(imageState, true);
// Update the compositing rect to match the image size
this.konva.compositingRect.setAttrs({
width: this.imageState.image.width,
height: this.imageState.image.height,
width: imageState.image.width,
height: imageState.image.height,
visible: true,
});
// Now we can add the masked image to the mask group. It will be rendered above the compositing rect, but should be
// under it, so we will move the compositing rect to the top
this.konva.maskGroup.add(this.maskedImage.konva.group);
this.konva.maskGroup.add(this.imageModule.konva.group);
this.konva.compositingRect.moveToTop();
// Cache the group to ensure the mask is rendered correctly w/ opacity
this.konva.maskGroup.cache();
// Create a pulsing tween
this.konva.maskTween = new Konva.Tween({
node: this.konva.maskGroup,
duration: 1,
opacity: 0.4, // oscillate between this value and pre-tween opacity
yoyo: true,
repeat: Infinity,
easing: Konva.Easings.EaseOut,
});
// Start the pulsing effect
this.konva.maskTween.play();
this.$lastProcessedHash.set(hash);
// We are done processing (still segmenting though!)
this.$isProcessing.set(false);
// The current points have been processed
this.$hasProcessed.set(true);
// Clean up the abort controller as needed
if (!this.abortController.signal.aborted) {
this.abortController.abort();
@@ -593,24 +666,17 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
process = debounce(this.processImmediate, this.config.PROCESS_DEBOUNCE_MS);
/**
* Applies the segmented image to the entity.
* Applies the segmented image to the entity, replacing the entity's objects with the masked image.
*/
apply = () => {
if (!this.$hasProcessed.get()) {
this.log.error('Cannot apply unprocessed points');
return;
}
const imageState = this.imageState;
const imageState = this.$imageState.get();
if (!imageState) {
this.log.error('No image state to apply');
return;
}
this.log.trace('Applying');
// Commit the buffer, which will move the buffer from the layer's buffer renderer to its main renderer
this.parent.bufferRenderer.commitBuffer();
// Rasterize the entity, this time replacing the objects with the masked image
// Rasterize the entity, replacing the objects with the masked image
const rect = this.parent.transformer.getRelativeRect();
this.manager.stateApi.rasterizeEntity({
entityIdentifier: this.parent.entityIdentifier,
@@ -627,6 +693,59 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
this.teardown();
};
/**
* Saves the segmented image as a new entity of the given type.
* @param type The type of entity to save the segmented image as.
*/
saveAs = (type: Exclude<CanvasEntityType, 'reference_image'>) => {
const imageState = this.$imageState.get();
if (!imageState) {
this.log.error('No image state to save as');
return;
}
this.log.trace(`Saving as ${type}`);
// Have the parent adopt the image module - this prevents a flash of the original layer content before the
// segmented image is rendered
if (this.imageModule) {
this.parent.renderer.adoptObjectRenderer(this.imageModule);
}
// Create the new entity with the masked image as its only object
const rect = this.parent.transformer.getRelativeRect();
const arg = {
overrides: {
objects: [imageState],
position: {
x: Math.round(rect.x),
y: Math.round(rect.y),
},
},
isSelected: true,
};
switch (type) {
case 'raster_layer':
this.manager.stateApi.addRasterLayer(arg);
break;
case 'control_layer':
this.manager.stateApi.addControlLayer(arg);
break;
case 'inpaint_mask':
this.manager.stateApi.addInpaintMask(arg);
break;
case 'regional_guidance':
this.manager.stateApi.addRegionalGuidance(arg);
break;
default:
assert<Equals<typeof type, never>>(false);
}
// Final cleanup and teardown, returning user to main canvas UI
this.resetEphemeralState();
this.teardown();
};
/**
* Resets the module (e.g. remove all points and the mask image).
*
@@ -683,30 +802,39 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
for (const point of this.$points.get()) {
point.konva.circle.destroy();
}
if (this.maskedImage) {
this.maskedImage.destroy();
// If the image module exists, and is a child of the group, destroy it. It might not be a child of the group if
// the user has applied the segmented image and the image has been adopted by the parent entity.
if (this.imageModule && this.imageModule.konva.group.parent === this.konva.group) {
this.imageModule.destroy();
this.imageModule = null;
}
if (this.konva.maskTween) {
this.konva.maskTween.destroy();
this.konva.maskTween = null;
}
// Empty internal module state
this.$points.set([]);
this.imageState = null;
this.$imageState.set(null);
this.$pointType.set(1);
this.$hasProcessed.set(false);
this.$invert.set(false);
this.$lastProcessedHash.set('');
this.$isProcessing.set(false);
// Reset non-ephemeral konva nodes
this.konva.compositingRect.visible(false);
this.konva.maskGroup.clearCache();
// Reset the parent module's buffer and forcibly sync the Konva cache
this.parent.bufferRenderer.clearBuffer();
this.parent.renderer.syncKonvaCache(true);
};
/**
* Builds a graph for segmenting an image with the given image DTO.
*/
buildGraph = ({ image_name }: ImageDTO): { graph: Graph; outputNodeId: string } => {
static buildGraph = (
{ image_name }: ImageDTO,
points: SAMPointWithId[],
invert: boolean
): { graph: Graph; outputNodeId: string } => {
const graph = new Graph(getPrefixedId('canvas_segment_anything'));
// TODO(psyche): When SAM2 is available in transformers, use it here
@@ -716,7 +844,7 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
type: 'segment_anything',
model: 'segment-anything-huge',
image: { image_name },
point_lists: [{ points: this.getSAMPoints() }],
point_lists: [{ points: points.map(({ x, y, label }) => ({ x, y, label })) }],
mask_filter: 'largest',
});
@@ -725,6 +853,7 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
id: getPrefixedId('apply_tensor_mask_to_image'),
type: 'apply_tensor_mask_to_image',
image: { image_name },
invert,
});
graph.addEdge(segmentAnything, 'mask', applyMask, 'mask');
@@ -759,11 +888,11 @@ export class CanvasSegmentAnythingModule extends CanvasModuleBase {
label,
circle: getKonvaNodeDebugAttrs(konva.circle),
})),
imageState: deepClone(this.imageState),
maskedImage: this.maskedImage?.repr(),
imageState: deepClone(this.$imageState.get()),
imageModule: this.imageModule?.repr(),
config: deepClone(this.config),
$isSegmenting: this.$isSegmenting.get(),
$hasProcessed: this.$hasProcessed.get(),
$lastProcessedHash: this.$lastProcessedHash.get(),
$isProcessing: this.$isProcessing.get(),
$pointType: this.$pointType.get(),
$pointTypeString: this.$pointTypeString.get(),

@@ -51,10 +51,16 @@ export class CanvasStagingAreaModule extends CanvasModuleBase {
/**
* Sync the $isStaging flag with the redux state. $isStaging is used by the manager to determine the global busy
* state of the canvas.
*
* We also set the $shouldShowStagedImage flag when we enter staging mode, so that the staged images are shown,
* even if the user disabled this in the last staging session.
*/
this.subscriptions.add(
this.manager.stateApi.createStoreSubscription(selectIsStaging, (isStaging) => {
this.manager.stateApi.createStoreSubscription(selectIsStaging, (isStaging, oldIsStaging) => {
this.$isStaging.set(isStaging);
if (isStaging && !oldIsStaging) {
this.$shouldShowStagedImage.set(true);
}
})
);
}

@@ -17,12 +17,16 @@ import {
} from 'features/controlLayers/store/canvasSettingsSlice';
import {
bboxChangedFromCanvas,
controlLayerAdded,
entityBrushLineAdded,
entityEraserLineAdded,
entityMoved,
entityRasterized,
entityRectAdded,
entityReset,
inpaintMaskAdded,
rasterLayerAdded,
rgAdded,
} from 'features/controlLayers/store/canvasSlice';
import { selectCanvasStagingAreaSlice } from 'features/controlLayers/store/canvasStagingAreaSlice';
import {
@@ -51,6 +55,7 @@ import { getImageDTO } from 'services/api/endpoints/images';
import { queueApi } from 'services/api/endpoints/queue';
import type { BatchConfig, ImageDTO, S } from 'services/api/types';
import { QueueError } from 'services/events/errors';
import type { Param0 } from 'tsafe';
import { assert } from 'tsafe';
import type { CanvasEntityAdapter } from './CanvasEntity/types';
@@ -160,6 +165,34 @@ export class CanvasStateApiModule extends CanvasModuleBase {
this.store.dispatch(entityRectAdded(arg));
};
/**
* Adds a raster layer to the canvas, pushing state to redux.
*/
addRasterLayer = (arg: Param0<typeof rasterLayerAdded>) => {
this.store.dispatch(rasterLayerAdded(arg));
};
/**
* Adds a control layer to the canvas, pushing state to redux.
*/
addControlLayer = (arg: Param0<typeof controlLayerAdded>) => {
this.store.dispatch(controlLayerAdded(arg));
};
/**
* Adds an inpaint mask to the canvas, pushing state to redux.
*/
addInpaintMask = (arg: Param0<typeof inpaintMaskAdded>) => {
this.store.dispatch(inpaintMaskAdded(arg));
};
/**
* Adds regional guidance to the canvas, pushing state to redux.
*/
addRegionalGuidance = (arg: Param0<typeof rgAdded>) => {
this.store.dispatch(rgAdded(arg));
};
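These thin wrappers let canvas modules create entities without importing the slice actions directly. A hypothetical call site; the payload shape (overrides, isSelected) is assumed from the action creators rather than shown in this diff:

// Hypothetical: push a generated image into a brand-new raster layer from a canvas module.
this.manager.stateApi.addRasterLayer({
  overrides: {
    objects: [imageObjectState], // an image object produced elsewhere
    position: { x: 0, y: 0 },
  },
  isSelected: true, // assumed flag
});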
/**
* Rasterizes an entity, pushing state to redux.
*/
@@ -260,6 +293,8 @@ export class CanvasStateApiModule extends CanvasModuleBase {
},
};
let didSucceed = false;
/**
* If a timeout is provided, we will cancel the graph if it takes too long - but we need a way to clear the timeout
* if the graph completes or errors before the timeout.
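The pattern described above, reduced to a minimal standalone sketch; the cancelGraph helper and the timeout option are assumed stand-ins for the module's actual internals:

// Minimal sketch: cancel on timeout unless the graph already succeeded, and clear the timer on success.
let didSucceed = false;
let timeoutId: number | null = null;

const _clearTimeout = () => {
  if (timeoutId !== null) {
    window.clearTimeout(timeoutId);
    timeoutId = null;
  }
};

if (timeout) {
  timeoutId = window.setTimeout(() => {
    if (didSucceed) {
      return; // already finished - nothing to cancel
    }
    cancelGraph();
  }, timeout);
}

const onSuccess = () => {
  didSucceed = true;
  _clearTimeout(); // prevent a stale timeout from firing a cancel later
};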
@@ -311,6 +346,8 @@ export class CanvasStateApiModule extends CanvasModuleBase {
return;
}
didSucceed = true;
// Ok!
resolve(getImageDTOResult.value);
};
@@ -401,6 +438,10 @@ export class CanvasStateApiModule extends CanvasModuleBase {
if (timeout) {
timeoutId = window.setTimeout(() => {
if (didSucceed) {
// If we already succeeded, we don't need to do anything
return;
}
this.log.trace('Graph canceled by timeout');
clearListeners();
cancelGraph();
@@ -410,6 +451,10 @@ export class CanvasStateApiModule extends CanvasModuleBase {
if (signal) {
signal.addEventListener('abort', () => {
if (didSucceed) {
// If we already succeeded, we don't need to do anything
return;
}
this.log.trace('Graph canceled by signal');
_clearTimeout();
clearListeners();

View File

@@ -216,12 +216,14 @@ export class CanvasEraserToolModule extends CanvasModuleBase {
*/
onStagePointerDown = async (e: KonvaEventObject<PointerEvent>) => {
const cursorPos = this.parent.$cursorPos.get();
const isPrimaryPointerDown = this.parent.$isPrimaryPointerDown.get();
const selectedEntity = this.manager.stateApi.getSelectedEntityAdapter();
if (!cursorPos || !selectedEntity) {
if (!cursorPos || !selectedEntity || !isPrimaryPointerDown) {
/**
* Can't do anything without:
* - A cursor position: the cursor is not on the stage
* - The primary pointer being down: the user is not drawing
* - A selected entity: there is no entity to draw on
*/
return;

View File

@@ -160,11 +160,16 @@ export class CanvasToolModule extends CanvasModuleBase {
const stage = this.manager.stage;
const tool = this.$tool.get();
const segmentingAdapter = this.manager.stateApi.$segmentingAdapter.get();
const transformingAdapter = this.manager.stateApi.$transformingAdapter.get();
if ((this.manager.stage.getIsDragging() || tool === 'view') && !segmentingAdapter) {
if (this.manager.stage.getIsDragging()) {
this.tools.view.syncCursorStyle();
} else if (tool === 'view') {
this.tools.view.syncCursorStyle();
} else if (segmentingAdapter) {
segmentingAdapter.segmentAnything.syncCursorStyle();
} else if (transformingAdapter) {
// The transformer handles cursor style via events
} else if (this.manager.stateApi.$isFiltering.get()) {
stage.setCursor('not-allowed');
} else if (this.manager.stagingArea.$isStaging.get()) {

View File

@@ -126,6 +126,13 @@ export const floorCoord = (coord: Coordinate): Coordinate => {
};
};
export const roundCoord = (coord: Coordinate): Coordinate => {
return {
x: Math.round(coord.x),
y: Math.round(coord.y),
};
};
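A quick illustration of the two coordinate helpers (values chosen arbitrarily):

floorCoord({ x: 3.7, y: -1.2 }); // => { x: 3, y: -2 }
roundCoord({ x: 3.7, y: -1.2 }); // => { x: 4, y: -1 }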
/**
* Snaps a position to the edge of the given rect if within a threshold of the edge
* @param pos The position to snap

View File

@@ -29,7 +29,7 @@ import { isMainModelBase, zModelIdentifierField } from 'features/nodes/types/com
import { ASPECT_RATIO_MAP } from 'features/parameters/components/Bbox/constants';
import { getGridSize, getIsSizeOptimal, getOptimalDimension } from 'features/parameters/util/optimalDimension';
import type { IRect } from 'konva/lib/types';
import { merge, omit } from 'lodash-es';
import { merge } from 'lodash-es';
import type { UndoableOptions } from 'redux-undo';
import type { ControlNetModelConfig, ImageDTO, IPAdapterModelConfig, T2IAdapterModelConfig } from 'services/api/types';
import { assert } from 'tsafe';
@@ -57,13 +57,13 @@ import type {
} from './types';
import { getEntityIdentifier, isRenderableEntity } from './types';
import {
converters,
getControlLayerState,
getInpaintMaskState,
getRasterLayerState,
getReferenceImageState,
getRegionalGuidanceState,
imageDTOToImageWithDims,
initialControlNet,
initialIPAdapter,
} from './util';
@@ -157,28 +157,25 @@ export const canvasSlice = createSlice({
reducer: (
state,
action: PayloadAction<
EntityIdentifierPayload<{ newId: string; overrides?: Partial<CanvasControlLayerState> }, 'raster_layer'>
EntityIdentifierPayload<
{ newId: string; overrides?: Partial<CanvasControlLayerState>; replace?: boolean },
'raster_layer'
>
>
) => {
const { entityIdentifier, newId, overrides } = action.payload;
const { entityIdentifier, newId, overrides, replace } = action.payload;
const layer = selectEntity(state, entityIdentifier);
if (!layer) {
return;
}
// Convert the raster layer to control layer
const controlLayerState: CanvasControlLayerState = {
...deepClone(layer),
id: newId,
type: 'control_layer',
controlAdapter: deepClone(initialControlNet),
withTransparencyEffect: true,
};
const controlLayerState = converters.rasterLayer.toControlLayer(newId, layer, overrides);
merge(controlLayerState, overrides);
// Remove the raster layer
state.rasterLayers.entities = state.rasterLayers.entities.filter((layer) => layer.id !== entityIdentifier.id);
if (replace) {
// Remove the raster layer
state.rasterLayers.entities = state.rasterLayers.entities.filter((layer) => layer.id !== entityIdentifier.id);
}
// Add the converted control layer
state.controlLayers.entities.push(controlLayerState);
@@ -186,11 +183,90 @@ export const canvasSlice = createSlice({
state.selectedEntityIdentifier = { type: controlLayerState.type, id: controlLayerState.id };
},
prepare: (
payload: EntityIdentifierPayload<{ overrides?: Partial<CanvasControlLayerState> } | undefined, 'raster_layer'>
payload: EntityIdentifierPayload<
{ overrides?: Partial<CanvasControlLayerState>; replace?: boolean } | undefined,
'raster_layer'
>
) => ({
payload: { ...payload, newId: getPrefixedId('control_layer') },
}),
},
rasterLayerConvertedToInpaintMask: {
reducer: (
state,
action: PayloadAction<
EntityIdentifierPayload<
{ newId: string; overrides?: Partial<CanvasInpaintMaskState>; replace?: boolean },
'raster_layer'
>
>
) => {
const { entityIdentifier, newId, overrides, replace } = action.payload;
const layer = selectEntity(state, entityIdentifier);
if (!layer) {
return;
}
// Convert the raster layer to inpaint mask
const inpaintMaskState = converters.rasterLayer.toInpaintMask(newId, layer, overrides);
if (replace) {
// Remove the raster layer
state.rasterLayers.entities = state.rasterLayers.entities.filter((layer) => layer.id !== entityIdentifier.id);
}
// Add the converted inpaint mask
state.inpaintMasks.entities.push(inpaintMaskState);
state.selectedEntityIdentifier = { type: inpaintMaskState.type, id: inpaintMaskState.id };
},
prepare: (
payload: EntityIdentifierPayload<
{ overrides?: Partial<CanvasInpaintMaskState>; replace?: boolean } | undefined,
'raster_layer'
>
) => ({
payload: { ...payload, newId: getPrefixedId('inpaint_mask') },
}),
},
rasterLayerConvertedToRegionalGuidance: {
reducer: (
state,
action: PayloadAction<
EntityIdentifierPayload<
{ newId: string; overrides?: Partial<CanvasRegionalGuidanceState>; replace?: boolean },
'raster_layer'
>
>
) => {
const { entityIdentifier, newId, overrides, replace } = action.payload;
const layer = selectEntity(state, entityIdentifier);
if (!layer) {
return;
}
// Convert the raster layer to regional guidance
const regionalGuidanceState = converters.rasterLayer.toRegionalGuidance(newId, layer, overrides);
if (replace) {
// Remove the raster layer
state.rasterLayers.entities = state.rasterLayers.entities.filter((layer) => layer.id !== entityIdentifier.id);
}
// Add the converted regional guidance
state.regionalGuidance.entities.push(regionalGuidanceState);
state.selectedEntityIdentifier = { type: regionalGuidanceState.type, id: regionalGuidanceState.id };
},
prepare: (
payload: EntityIdentifierPayload<
{ overrides?: Partial<CanvasRegionalGuidanceState>; replace?: boolean } | undefined,
'raster_layer'
>
) => ({
payload: { ...payload, newId: getPrefixedId('regional_guidance') },
}),
},
//#region Control layers
controlLayerAdded: {
reducer: (
@@ -217,32 +293,125 @@ export const canvasSlice = createSlice({
state.selectedEntityIdentifier = { type: 'control_layer', id: data.id };
},
controlLayerConvertedToRasterLayer: {
reducer: (state, action: PayloadAction<EntityIdentifierPayload<{ newId: string }, 'control_layer'>>) => {
const { entityIdentifier, newId } = action.payload;
reducer: (
state,
action: PayloadAction<
EntityIdentifierPayload<
{ newId: string; overrides?: Partial<CanvasRasterLayerState>; replace?: boolean },
'control_layer'
>
>
) => {
const { entityIdentifier, newId, overrides, replace } = action.payload;
const layer = selectEntity(state, entityIdentifier);
if (!layer) {
return;
}
// Convert the control layer to raster layer
const rasterLayerState: CanvasRasterLayerState = {
...omit(deepClone(layer), ['type', 'controlAdapter', 'withTransparencyEffect']),
id: newId,
type: 'raster_layer',
};
const rasterLayerState = converters.controlLayer.toRasterLayer(newId, layer, overrides);
// Remove the control layer
state.controlLayers.entities = state.controlLayers.entities.filter((layer) => layer.id !== entityIdentifier.id);
if (replace) {
// Remove the control layer
state.controlLayers.entities = state.controlLayers.entities.filter(
(layer) => layer.id !== entityIdentifier.id
);
}
// Add the new raster layer
state.rasterLayers.entities.push(rasterLayerState);
state.selectedEntityIdentifier = { type: rasterLayerState.type, id: rasterLayerState.id };
},
prepare: (payload: EntityIdentifierPayload<void, 'control_layer'>) => ({
prepare: (
payload: EntityIdentifierPayload<
{ overrides?: Partial<CanvasRasterLayerState>; replace?: boolean } | undefined,
'control_layer'
>
) => ({
payload: { ...payload, newId: getPrefixedId('raster_layer') },
}),
},
controlLayerConvertedToInpaintMask: {
reducer: (
state,
action: PayloadAction<
EntityIdentifierPayload<
{ newId: string; overrides?: Partial<CanvasInpaintMaskState>; replace?: boolean },
'control_layer'
>
>
) => {
const { entityIdentifier, newId, overrides, replace } = action.payload;
const layer = selectEntity(state, entityIdentifier);
if (!layer) {
return;
}
// Convert the control layer to inpaint mask
const inpaintMaskState = converters.controlLayer.toInpaintMask(newId, layer, overrides);
if (replace) {
// Remove the control layer
state.controlLayers.entities = state.controlLayers.entities.filter(
(layer) => layer.id !== entityIdentifier.id
);
}
// Add the new inpaint mask
state.inpaintMasks.entities.push(inpaintMaskState);
state.selectedEntityIdentifier = { type: inpaintMaskState.type, id: inpaintMaskState.id };
},
prepare: (
payload: EntityIdentifierPayload<
{ overrides?: Partial<CanvasInpaintMaskState>; replace?: boolean } | undefined,
'control_layer'
>
) => ({
payload: { ...payload, newId: getPrefixedId('inpaint_mask') },
}),
},
controlLayerConvertedToRegionalGuidance: {
reducer: (
state,
action: PayloadAction<
EntityIdentifierPayload<
{ newId: string; overrides?: Partial<CanvasRegionalGuidanceState>; replace?: boolean },
'control_layer'
>
>
) => {
const { entityIdentifier, newId, overrides, replace } = action.payload;
const layer = selectEntity(state, entityIdentifier);
if (!layer) {
return;
}
// Convert the control layer to regional guidance
const regionalGuidanceState = converters.controlLayer.toRegionalGuidance(newId, layer, overrides);
if (replace) {
// Remove the control layer
state.controlLayers.entities = state.controlLayers.entities.filter(
(layer) => layer.id !== entityIdentifier.id
);
}
// Add the new regional guidance
state.regionalGuidance.entities.push(regionalGuidanceState);
state.selectedEntityIdentifier = { type: regionalGuidanceState.type, id: regionalGuidanceState.id };
},
prepare: (
payload: EntityIdentifierPayload<
{ overrides?: Partial<CanvasRegionalGuidanceState>; replace?: boolean } | undefined,
'control_layer'
>
) => ({
payload: { ...payload, newId: getPrefixedId('regional_guidance') },
}),
},
controlLayerModelChanged: (
state,
action: PayloadAction<
@@ -447,6 +616,46 @@ export const canvasSlice = createSlice({
state.regionalGuidance.entities.push(data);
state.selectedEntityIdentifier = { type: 'regional_guidance', id: data.id };
},
rgConvertedToInpaintMask: {
reducer: (
state,
action: PayloadAction<
EntityIdentifierPayload<
{ newId: string; overrides?: Partial<CanvasInpaintMaskState>; replace?: boolean },
'regional_guidance'
>
>
) => {
const { entityIdentifier, newId, overrides, replace } = action.payload;
const layer = selectEntity(state, entityIdentifier);
if (!layer) {
return;
}
// Convert the regional guidance to inpaint mask
const inpaintMaskState = converters.regionalGuidance.toInpaintMask(newId, layer, overrides);
if (replace) {
// Remove the regional guidance
state.regionalGuidance.entities = state.regionalGuidance.entities.filter(
(layer) => layer.id !== entityIdentifier.id
);
}
// Add the new inpaint mask
state.inpaintMasks.entities.push(inpaintMaskState);
state.selectedEntityIdentifier = { type: inpaintMaskState.type, id: inpaintMaskState.id };
},
prepare: (
payload: EntityIdentifierPayload<
{ overrides?: Partial<CanvasInpaintMaskState>; replace?: boolean } | undefined,
'regional_guidance'
>
) => ({
payload: { ...payload, newId: getPrefixedId('inpaint_mask') },
}),
},
rgPositivePromptChanged: (
state,
action: PayloadAction<EntityIdentifierPayload<{ prompt: string | null }, 'regional_guidance'>>
@@ -644,6 +853,44 @@ export const canvasSlice = createSlice({
state.inpaintMasks.entities = [data];
state.selectedEntityIdentifier = { type: 'inpaint_mask', id: data.id };
},
inpaintMaskConvertedToRegionalGuidance: {
reducer: (
state,
action: PayloadAction<
EntityIdentifierPayload<
{ newId: string; overrides?: Partial<CanvasRegionalGuidanceState>; replace?: boolean },
'inpaint_mask'
>
>
) => {
const { entityIdentifier, newId, overrides, replace } = action.payload;
const layer = selectEntity(state, entityIdentifier);
if (!layer) {
return;
}
// Convert the inpaint mask to regional guidance
const regionalGuidanceState = converters.inpaintMask.toRegionalGuidance(newId, layer, overrides);
if (replace) {
// Remove the inpaint mask
state.inpaintMasks.entities = state.inpaintMasks.entities.filter((layer) => layer.id !== entityIdentifier.id);
}
// Add the new regional guidance
state.regionalGuidance.entities.push(regionalGuidanceState);
state.selectedEntityIdentifier = { type: regionalGuidanceState.type, id: regionalGuidanceState.id };
},
prepare: (
payload: EntityIdentifierPayload<
{ overrides?: Partial<CanvasRegionalGuidanceState>; replace?: boolean } | undefined,
'inpaint_mask'
>
) => ({
payload: { ...payload, newId: getPrefixedId('regional_guidance') },
}),
},
//#region BBox
bboxScaledWidthChanged: (state, action: PayloadAction<number>) => {
const gridSize = getGridSize(state.bbox.modelBase);
@@ -1210,10 +1457,14 @@ export const {
rasterLayerAdded,
// rasterLayerRecalled,
rasterLayerConvertedToControlLayer,
rasterLayerConvertedToInpaintMask,
rasterLayerConvertedToRegionalGuidance,
// Control layers
controlLayerAdded,
// controlLayerRecalled,
controlLayerConvertedToRasterLayer,
controlLayerConvertedToInpaintMask,
controlLayerConvertedToRegionalGuidance,
controlLayerModelChanged,
controlLayerControlModeChanged,
controlLayerWeightChanged,
@@ -1231,6 +1482,7 @@ export const {
// Regions
rgAdded,
// rgRecalled,
rgConvertedToInpaintMask,
rgPositivePromptChanged,
rgNegativePromptChanged,
rgAutoNegativeToggled,
@@ -1244,6 +1496,7 @@ export const {
rgIPAdapterCLIPVisionModelChanged,
// Inpaint mask
inpaintMaskAdded,
inpaintMaskConvertedToRegionalGuidance,
// inpaintMaskRecalled,
} = canvasSlice.actions;
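A sketch of dispatching the new conversion actions, including the replace flag; the dispatch context and entity ids are hypothetical, while the action names and payload shape follow the slice definitions above:

// Copy a raster layer into a new inpaint mask, keeping the original layer in place.
dispatch(
  rasterLayerConvertedToInpaintMask({
    entityIdentifier: { type: 'raster_layer', id: 'raster_layer_abc' }, // hypothetical id
    replace: false,
  })
);

// Convert in place: the source control layer is removed once the regional guidance entity is added.
dispatch(
  controlLayerConvertedToRegionalGuidance({
    entityIdentifier: { type: 'control_layer', id: 'control_layer_def' }, // hypothetical id
    replace: true,
  })
);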

View File

@@ -349,6 +349,27 @@ export const buildSelectIsSelected = (entityIdentifier: CanvasEntityIdentifier)
);
};
/**
* Builds a selector that selects whether the entity has any content (i.e. is not empty).
*
* Reference images are considered non-empty when the IP adapter has an image.
*
* Other entities are considered non-empty when they have at least one object.
*/
export const buildSelectHasObjects = (entityIdentifier: CanvasEntityIdentifier) => {
return createSelector(selectCanvasSlice, (canvas) => {
const entity = selectEntity(canvas, entityIdentifier);
if (!entity) {
return false;
}
if (entity.type === 'reference_image') {
return entity.ipAdapter.image !== null;
}
return entity.objects.length > 0;
});
};
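A sketch of how a component might consume this selector; the hook names follow a typical typed-Redux setup and are assumed here:

// Hypothetical component-side usage: memoize the selector per entity, then subscribe.
const selectHasObjects = useMemo(() => buildSelectHasObjects(entityIdentifier), [entityIdentifier]);
const hasObjects = useAppSelector(selectHasObjects);
// e.g. disable a "Convert to Inpaint Mask" menu item when the layer has nothing on it.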
export const selectWidth = createSelector(selectCanvasSlice, (canvas) => canvas.bbox.rect.width);
export const selectHeight = createSelector(selectCanvasSlice, (canvas) => canvas.bbox.rect.height);
export const selectAspectRatioID = createSelector(selectCanvasSlice, (canvas) => canvas.bbox.aspectRatio.id);

View File

@@ -131,7 +131,8 @@ const zSAMPoint = z.object({
y: z.number().int().gte(0),
label: zSAMPointLabel,
});
export type SAMPoint = z.infer<typeof zSAMPoint>;
type SAMPoint = z.infer<typeof zSAMPoint>;
export type SAMPointWithId = SAMPoint & { id: string };
const zRect = z.object({
x: z.number(),

View File

@@ -184,3 +184,153 @@ export const getInpaintMaskState = (
merge(entityState, overrides);
return entityState;
};
const convertRasterLayerToControlLayer = (
newId: string,
rasterLayerState: CanvasRasterLayerState,
overrides?: Partial<CanvasControlLayerState>
): CanvasControlLayerState => {
const { name, objects, position } = rasterLayerState;
const controlLayerState = getControlLayerState(newId, {
name,
objects,
position,
});
merge(controlLayerState, overrides);
return controlLayerState;
};
const convertRasterLayerToInpaintMask = (
newId: string,
rasterLayerState: CanvasRasterLayerState,
overrides?: Partial<CanvasInpaintMaskState>
): CanvasInpaintMaskState => {
const { name, objects, position } = rasterLayerState;
const inpaintMaskState = getInpaintMaskState(newId, {
name,
objects,
position,
});
merge(inpaintMaskState, overrides);
return inpaintMaskState;
};
const convertRasterLayerToRegionalGuidance = (
newId: string,
rasterLayerState: CanvasRasterLayerState,
overrides?: Partial<CanvasRegionalGuidanceState>
): CanvasRegionalGuidanceState => {
const { name, objects, position } = rasterLayerState;
const regionalGuidanceState = getRegionalGuidanceState(newId, {
name,
objects,
position,
});
merge(regionalGuidanceState, overrides);
return regionalGuidanceState;
};
const convertControlLayerToRasterLayer = (
newId: string,
controlLayerState: CanvasControlLayerState,
overrides?: Partial<CanvasRasterLayerState>
): CanvasRasterLayerState => {
const { name, objects, position } = controlLayerState;
const rasterLayerState = getRasterLayerState(newId, {
name,
objects,
position,
});
merge(rasterLayerState, overrides);
return rasterLayerState;
};
const convertControlLayerToInpaintMask = (
newId: string,
controlLayerState: CanvasControlLayerState,
overrides?: Partial<CanvasInpaintMaskState>
): CanvasInpaintMaskState => {
const { name, objects, position } = controlLayerState;
const inpaintMaskState = getInpaintMaskState(newId, {
name,
objects,
position,
});
merge(inpaintMaskState, overrides);
return inpaintMaskState;
};
const convertControlLayerToRegionalGuidance = (
newId: string,
controlLayerState: CanvasControlLayerState,
overrides?: Partial<CanvasRegionalGuidanceState>
): CanvasRegionalGuidanceState => {
const { name, objects, position } = controlLayerState;
const regionalGuidanceState = getRegionalGuidanceState(newId, {
name,
objects,
position,
});
merge(regionalGuidanceState, overrides);
return regionalGuidanceState;
};
const convertInpaintMaskToRegionalGuidance = (
newId: string,
inpaintMaskState: CanvasInpaintMaskState,
overrides?: Partial<CanvasRegionalGuidanceState>
): CanvasRegionalGuidanceState => {
const { name, objects, position } = inpaintMaskState;
const regionalGuidanceState = getRegionalGuidanceState(newId, {
name,
objects,
position,
});
merge(regionalGuidanceState, overrides);
return regionalGuidanceState;
};
const convertRegionalGuidanceToInpaintMask = (
newId: string,
regionalGuidanceState: CanvasRegionalGuidanceState,
overrides?: Partial<CanvasInpaintMaskState>
): CanvasInpaintMaskState => {
const { name, objects, position } = regionalGuidanceState;
const inpaintMaskState = getInpaintMaskState(newId, {
name,
objects,
position,
});
merge(inpaintMaskState, overrides);
return inpaintMaskState;
};
/**
* Supported conversions:
* - Raster Layer -> Control Layer
* - Raster Layer -> Inpaint Mask
* - Raster Layer -> Regional Guidance
* - Control Layer -> Raster Layer
* - Control Layer -> Inpaint Mask
* - Control Layer -> Regional Guidance
* - Inpaint Mask -> Regional Guidance
* - Regional Guidance -> Inpaint Mask
*/
export const converters = {
rasterLayer: {
toControlLayer: convertRasterLayerToControlLayer,
toInpaintMask: convertRasterLayerToInpaintMask,
toRegionalGuidance: convertRasterLayerToRegionalGuidance,
},
controlLayer: {
toRasterLayer: convertControlLayerToRasterLayer,
toInpaintMask: convertControlLayerToInpaintMask,
toRegionalGuidance: convertControlLayerToRegionalGuidance,
},
inpaintMask: {
toRegionalGuidance: convertInpaintMaskToRegionalGuidance,
},
regionalGuidance: {
toInpaintMask: convertRegionalGuidanceToInpaintMask,
},
};
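A minimal sketch of the converters map in use; the layer variable is assumed to be an existing CanvasRasterLayerState and the override is illustrative:

// Build a regional guidance entity from an existing raster layer, overriding only the name.
const regionalGuidanceState = converters.rasterLayer.toRegionalGuidance(
  getPrefixedId('regional_guidance'),
  layer,
  { name: 'Converted from raster layer' } // any Partial<CanvasRegionalGuidanceState> works here
);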

View File

@@ -42,6 +42,14 @@ export type AddControlLayerFromImageDropData = BaseDropData & {
actionType: 'ADD_CONTROL_LAYER_FROM_IMAGE';
};
type AddInpaintMaskFromImageDropData = BaseDropData & {
actionType: 'ADD_INPAINT_MASK_FROM_IMAGE';
};
type AddRegionalGuidanceFromImageDropData = BaseDropData & {
actionType: 'ADD_REGIONAL_GUIDANCE_FROM_IMAGE';
};
export type AddRegionalReferenceImageFromImageDropData = BaseDropData & {
actionType: 'ADD_REGIONAL_REFERENCE_IMAGE_FROM_IMAGE';
};
@@ -53,7 +61,7 @@ export type AddGlobalReferenceImageFromImageDropData = BaseDropData & {
export type ReplaceLayerImageDropData = BaseDropData & {
actionType: 'REPLACE_LAYER_WITH_IMAGE';
context: {
entityIdentifier: CanvasEntityIdentifier<'control_layer' | 'raster_layer'>;
entityIdentifier: CanvasEntityIdentifier<'control_layer' | 'raster_layer' | 'inpaint_mask' | 'regional_guidance'>;
};
};
@@ -98,7 +106,9 @@ export type TypesafeDroppableData =
| AddControlLayerFromImageDropData
| ReplaceLayerImageDropData
| AddRegionalReferenceImageFromImageDropData
| AddGlobalReferenceImageFromImageDropData;
| AddGlobalReferenceImageFromImageDropData
| AddInpaintMaskFromImageDropData
| AddRegionalGuidanceFromImageDropData;
type BaseDragData = {
id: string;

Some files were not shown because too many files have changed in this diff.