Compare commits

..

16 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Ryan Dick | 987393853c | Create CachedModelOnlyFullLoad class. | 2024-12-05 18:43:50 +00:00 |
| Ryan Dick | 91c5af1b95 | Move CachedModelWithPartialLoad into the main model_cache/ directory. | 2024-12-05 18:21:26 +00:00 |
| Ryan Dick | 5c67dd507a | Get rid of ModelLocker. It was an unnecessary layer of indirection. | 2024-12-05 16:59:40 +00:00 |
| Ryan Dick | 2ff928ec17 | Move lock(...) and unlock(...) logic from ModelLocker to the ModelCache and make a bunch of ModelCache properties/methods private. | 2024-12-05 16:11:40 +00:00 |
| Ryan Dick | 4327bbe77e | Pull get_model_cache_key(...) out of ModelCache. The ModelCache should not be concerned with implementation details like the submodel_type. | 2024-12-04 22:53:57 +00:00 |
| Ryan Dick | ad1c0d37ef | Rename model_cache_default.py -> model_cache.py. | 2024-12-04 22:45:30 +00:00 |
| Ryan Dick | 9708d87946 | Remove ModelCacheBase. | 2024-12-04 22:05:34 +00:00 |
| Ryan Dick | 3ad44f7850 | Move CacheStats to its own file. | 2024-12-04 21:56:50 +00:00 |
| Ryan Dick | 9a482981b2 | Move CacheRecord out to its own file. | 2024-12-04 21:53:19 +00:00 |
| Ryan Dick | 6b02362b12 | Rip out ModelLockerBase. | 2024-12-04 21:47:11 +00:00 |
| Ryan Dick | 8fec4ec91c | Tidy up CachedModel and improve unit test coverage. | 2024-12-04 20:28:31 +00:00 |
| Ryan Dick | 693e421970 | Alternative implementation with torch.nn.Linear module streaming. | 2024-12-03 22:32:15 +00:00 |
| Ryan Dick | dc14104bc8 | Add TorchFunctionAutocastContext | 2024-12-03 19:26:46 +00:00 |
| Ryan Dick | f286a1d1f3 | Remove debug logs. | 2024-12-03 18:04:55 +00:00 |
| Ryan Dick | 9dc86b2b71 | Add basic CachedModel class with features for partial load/unload. | 2024-12-03 17:12:22 +00:00 |
| Ryan Dick | 2cab689b79 | Naive TorchAutocastContext. | 2024-12-03 14:55:43 +00:00 |
113 changed files with 822 additions and 592228 deletions

View File

@@ -50,7 +50,7 @@ Applications are built on top of the invoke framework. They should construct `in
### Web UI
The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.tiangolo.com/) and [Socket.IO](https://socket.io/). The frontend code is found in `/invokeai/frontend` and the backend code is found in `/invokeai/app/api_app.py` and `/invokeai/app/api/`. The code is further organized as such:
The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.tiangolo.com/) and [Socket.IO](https://socket.io/). The frontend code is found in `/frontend` and the backend code is found in `/ldm/invoke/app/api_app.py` and `/ldm/invoke/app/api/`. The code is further organized as such:
| Component | Description |
| --- | --- |
@@ -62,7 +62,7 @@ The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.t
### CLI
The CLI is built automatically from invocation metadata, and also supports invocation piping and auto-linking. Code is available in `/invokeai/frontend/cli`.
The CLI is built automatically from invocation metadata, and also supports invocation piping and auto-linking. Code is available in `/ldm/invoke/app/cli_app.py`.
## Invoke
@@ -70,7 +70,7 @@ The Invoke framework provides the interface to the underlying AI systems and is
### Invoker
The invoker (`/invokeai/app/services/invoker.py`) is the primary interface through which applications interact with the framework. Its primary purpose is to create, manage, and invoke sessions. It also maintains two sets of services:
The invoker (`/ldm/invoke/app/services/invoker.py`) is the primary interface through which applications interact with the framework. Its primary purpose is to create, manage, and invoke sessions. It also maintains two sets of services:
- **invocation services**, which are used by invocations to interact with core functionality.
- **invoker services**, which are used by the invoker to manage sessions and manage the invocation queue.
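As a rough illustration of the two-way split listed above, the sketch below shows an invoker-shaped object holding the two service bundles separately. All names here are hypothetical placeholders, not the actual definitions in `invoker.py`:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class InvocationServices:
    """Hypothetical bundle of services that invocations use to reach core functionality."""
    images: Any
    models: Any

@dataclass
class InvokerServices:
    """Hypothetical bundle of services the invoker uses for sessions and the queue."""
    queue: Any
    session_manager: Any

class SketchInvoker:
    """Not the real Invoker; only shows how the two service sets are kept apart."""

    def __init__(self, services: InvocationServices, invoker_services: InvokerServices) -> None:
        self.services = services                  # handed to invocations when they execute
        self.invoker_services = invoker_services  # used internally to create/manage sessions

    def invoke(self, session_id: str) -> None:
        # The real invoker pulls the next ready invocation from the session graph
        # and pushes it onto the invocation queue; that logic is elided here.
        ...
```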
@@ -82,12 +82,12 @@ The session graph does not support looping. This is left as an application probl
### Invocations
Invocations represent individual units of execution, with inputs and outputs. All invocations are located in `/invokeai/app/invocations`, and are all automatically discovered and made available in the applications. These are the primary way to expose new functionality in Invoke.AI, and the [implementation guide](INVOCATIONS.md) explains how to add new invocations.
Invocations represent individual units of execution, with inputs and outputs. All invocations are located in `/ldm/invoke/app/invocations`, and are all automatically discovered and made available in the applications. These are the primary way to expose new functionality in Invoke.AI, and the [implementation guide](INVOCATIONS.md) explains how to add new invocations.
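Concretely, an invocation is a small class that declares its inputs as fields and returns a typed output from a single entry point. The sketch below is a made-up example of that shape only; real invocations derive from the project's base classes and are registered so the automatic discovery described above can find them (the implementation guide linked above covers the details):

```python
from dataclasses import dataclass

@dataclass
class ResizeImageOutput:
    """Output of the hypothetical invocation below."""
    image_name: str

@dataclass
class ResizeImageInvocation:
    """Hypothetical unit of execution: declared inputs, one typed output, one invoke()."""
    image_name: str
    width: int = 512
    height: int = 512

    def invoke(self, services) -> ResizeImageOutput:
        # `services` stands in for the invocation services covered in the
        # Services section (image storage, model loading, ...).
        image = services.images.get(self.image_name)
        resized = image.resize((self.width, self.height))
        return ResizeImageOutput(image_name=services.images.save(resized))
```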
### Services
Services provide invocations access to AI Core functionality and other necessary functionality (e.g. image storage). These are available in `/invokeai/app/services`. As a general rule, new services should provide an interface as an abstract base class, and may provide a lightweight local implementation by default in their module. The goal for all services should be to enable the usage of different implementations (e.g. using cloud storage for image storage), but should not load any module dependencies unless that implementation has been used (i.e. don't import anything that won't be used, especially if it's expensive to import).
Services provide invocations access to AI Core functionality and other necessary functionality (e.g. image storage). These are available in `/ldm/invoke/app/services`. As a general rule, new services should provide an interface as an abstract base class, and may provide a lightweight local implementation by default in their module. The goal for all services should be to enable the usage of different implementations (e.g. using cloud storage for image storage), but should not load any module dependencies unless that implementation has been used (i.e. don't import anything that won't be used, especially if it's expensive to import).
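A minimal sketch of that pattern, using an invented image-storage service, might look like the following; the deferred `boto3` import in the alternative backend is the kind of lazy dependency loading the paragraph above recommends:

```python
from abc import ABC, abstractmethod

class ImageStorageBase(ABC):
    """Hypothetical service interface, expressed as an abstract base class."""

    @abstractmethod
    def save(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, name: str) -> bytes: ...

class DiskImageStorage(ImageStorageBase):
    """Lightweight local default implementation, shipped in the same module."""

    def __init__(self, root: str) -> None:
        self._root = root

    def save(self, name: str, data: bytes) -> None:
        with open(f"{self._root}/{name}", "wb") as f:
            f.write(data)

    def get(self, name: str) -> bytes:
        with open(f"{self._root}/{name}", "rb") as f:
            return f.read()

class S3ImageStorage(ImageStorageBase):
    """Alternative cloud backend; its heavy dependency is imported only when used."""

    def __init__(self, bucket: str) -> None:
        import boto3  # deferred: nothing is paid for unless this backend is chosen
        self._client = boto3.client("s3")
        self._bucket = bucket

    def save(self, name: str, data: bytes) -> None:
        self._client.put_object(Bucket=self._bucket, Key=name, Body=data)

    def get(self, name: str) -> bytes:
        return self._client.get_object(Bucket=self._bucket, Key=name)["Body"].read()
```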
## AI Core
The AI Core is represented by the rest of the code base (i.e. the code outside of `/invokeai/app/`).
The AI Core is represented by the rest of the code base (i.e. the code outside of `/ldm/invoke/app/`).

View File

@@ -287,8 +287,8 @@ new Invocation ready to be used.
Once you've created a Node, the next step is to share it with the community! The
best way to do this is to submit a Pull Request to add the Node to the
[Community Nodes](../nodes/communityNodes.md) list. If you're not sure how to do that,
take a look at our [contributing nodes overview](../nodes/contributingNodes.md).
[Community Nodes](nodes/communityNodes) list. If you're not sure how to do that,
take a look at our [contributing nodes overview](contributingNodes).
## Advanced

View File

@@ -9,20 +9,20 @@ model. These are the:
configuration information. Among other things, the record service
tracks the type of the model, its provenance, and where it can be
found on disk.
* _ModelInstallServiceBase_ A service for installing models to
disk. It uses `DownloadQueueServiceBase` to download models and
their metadata, and `ModelRecordServiceBase` to store that
information. It is also responsible for managing the InvokeAI
`models` directory and its contents.
* _DownloadQueueServiceBase_
A multithreaded downloader responsible
for downloading models from a remote source to disk. The download
queue has special methods for downloading repo_id folders from
Hugging Face, as well as discriminating among model versions in
Civitai, but can be used for arbitrary content.
* _ModelLoadServiceBase_
Responsible for loading a model from disk
into RAM and VRAM and getting it ready for inference.
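To make the division of labor in the list above concrete, here is a compressed sketch of how an installer might sit on top of the download and record services. The class and method names are invented for illustration; the real interfaces are documented in the sections that follow:

```python
from abc import ABC, abstractmethod
from pathlib import Path

class RecordStoreSketch(ABC):
    """Stores configuration records: model type, provenance, on-disk location."""

    @abstractmethod
    def add(self, key: str, config: dict) -> None: ...

class DownloadQueueSketch(ABC):
    """Multithreaded downloader that fetches a remote source to a local path."""

    @abstractmethod
    def download(self, url: str, destination: Path) -> Path: ...

class InstallerSketch:
    """Uses the downloader to fetch a model and the record store to register it,
    keeping everything under the managed models directory."""

    def __init__(self, downloads: DownloadQueueSketch, records: RecordStoreSketch, models_dir: Path) -> None:
        self._downloads = downloads
        self._records = records
        self._models_dir = models_dir

    def install_from_url(self, url: str) -> str:
        path = self._downloads.download(url, self._models_dir / Path(url).name)
        key = path.name
        self._records.add(key, {"path": str(path), "source": url})
        return key
```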
@@ -207,9 +207,9 @@ for use in the InvokeAI web server. Its signature is:
```
def open(
cls,
config: InvokeAIAppConfig,
conn: Optional[sqlite3.Connection] = None,
cls,
config: InvokeAIAppConfig,
conn: Optional[sqlite3.Connection] = None,
lock: Optional[threading.Lock] = None
) -> Union[ModelRecordServiceSQL, ModelRecordServiceFile]:
```
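A possible call site, inferred only from the signature above; the import paths and the class that hosts `open()` are assumptions rather than verified API:

```python
import sqlite3
import threading

# Assumed import locations -- adjust to wherever the config and record
# service classes actually live in the installed version.
from invokeai.app.services.config import InvokeAIAppConfig
from invokeai.app.services.model_records import ModelRecordServiceBase

config = InvokeAIAppConfig()                     # however the app config is normally constructed
conn = sqlite3.connect("databases/invokeai.db")  # optional: reuse an existing connection
lock = threading.Lock()                          # optional: serialize access to that connection

record_store = ModelRecordServiceBase.open(config, conn=conn, lock=lock)
```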
@@ -363,7 +363,7 @@ functionality:
* Registering a model config record for a model already located on the
local filesystem, without moving it or changing its path.
* Installing a model already located on the local filesystem, by
moving it into the InvokeAI root directory under the
`models` folder (or wherever config parameter `models_dir`
@@ -371,21 +371,21 @@ functionality:
* Probing of models to determine their type, base type and other key
information.
* Interface with the InvokeAI event bus to provide status updates on
the download, installation and registration process.
* Downloading a model from an arbitrary URL and installing it in
`models_dir`.
* Special handling for HuggingFace repo_ids to recursively download
the contents of the repository, paying attention to alternative
variants such as fp16.
* Saving tags and other metadata about the model into the invokeai database
when fetching from a repo that provides that type of information
(currently only HuggingFace).
### Initializing the installer
A default installer is created at InvokeAI api startup time and stored
@@ -461,7 +461,7 @@ revision.
`config` is an optional dict of values that will override the
autoprobed values for model type, base, scheduler prediction type, and
so forth. See [Model configuration and
probing](#model-configuration-and-probing) for details.
probing](#Model-configuration-and-probing) for details.
`access_token` is an optional access token for accessing resources
that need authentication.
@@ -494,7 +494,7 @@ source8 = URLModelSource(url='https://civitai.com/api/download/models/63006', ac
for source in [source1, source2, source3, source4, source5, source6, source7]:
    install_job = installer.install_model(source)
source2job = installer.wait_for_installs(timeout=120)
for source in sources:
    job = source2job[source]
@@ -504,7 +504,7 @@ for source in sources:
        print(f"{source} installed as {model_key}")
    elif job.errored:
        print(f"{source}: {job.error_type}.\nStack trace:\n{job.error}")
```
As shown here, the `import_model()` method accepts a variety of
@@ -1364,7 +1364,6 @@ the in-memory loaded model:
|----------------|-----------------|------------------|
| `config` | AnyModelConfig | A copy of the model's configuration record for retrieving base type, etc. |
| `model` | AnyModel | The instantiated model (details below) |
| `locker` | ModelLockerBase | A context manager that mediates the movement of the model into VRAM |
### get_model_by_key(key, [submodel]) -> LoadedModel

View File

@@ -1,6 +1,6 @@
# InvokeAI Backend Tests
We use `pytest` to run the backend python tests. (See [pyproject.toml](https://github.com/invoke-ai/InvokeAI/blob/main/pyproject.toml) for the default `pytest` options.)
We use `pytest` to run the backend python tests. (See [pyproject.toml](/pyproject.toml) for the default `pytest` options.)
## Fast vs. Slow
All tests are categorized as either 'fast' (no test annotation) or 'slow' (annotated with the `@pytest.mark.slow` decorator).
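For example (a sketch, assuming the `slow` marker is registered and filtered out by the default options in `pyproject.toml`):

```python
import pytest

def test_prompt_splitting_fast():
    # No annotation: counts as 'fast' and runs in the default invocation.
    assert "a cat".split() == ["a", "cat"]

@pytest.mark.slow
def test_full_pipeline_slow():
    # Annotated 'slow': skipped by the default marker filter; include it with
    # an explicit marker expression such as `pytest tests -m ""`.
    ...
```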
@@ -33,7 +33,7 @@ pytest tests -m ""
## Test Organization
All backend tests are in the [`tests/`](https://github.com/invoke-ai/InvokeAI/tree/main/tests) directory. This directory mirrors the organization of the `invokeai/` directory. For example, tests for `invokeai/model_management/model_manager.py` would be found in `tests/model_management/test_model_manager.py`.
All backend tests are in the [`tests/`](/tests/) directory. This directory mirrors the organization of the `invokeai/` directory. For example, tests for `invokeai/model_management/model_manager.py` would be found in `tests/model_management/test_model_manager.py`.
TODO: The above statement is aspirational. A re-organization of legacy tests is required to make it true.

View File

@@ -2,7 +2,7 @@
## **What do I need to know to help?**
If you are looking to help with a code contribution, InvokeAI uses several different technologies under the hood: Python (Pydantic, FastAPI, diffusers) and Typescript (React, Redux Toolkit, ChakraUI, Mantine, Konva). Familiarity with StableDiffusion and image generation concepts is helpful, but not essential.
If you are looking to help with a code contribution, InvokeAI uses several different technologies under the hood: Python (Pydantic, FastAPI, diffusers) and Typescript (React, Redux Toolkit, ChakraUI, Mantine, Konva). Familiarity with StableDiffusion and image generation concepts is helpful, but not essential.
## **Get Started**
@@ -12,7 +12,7 @@ To get started, take a look at our [new contributors checklist](newContributorCh
Once you're setup, for more information, you can review the documentation specific to your area of interest:
* #### [InvokeAI Architecture](../ARCHITECTURE.md)
* #### [Frontend Documentation](../frontend/index.md)
* #### [Frontend Documentation](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web)
* #### [Node Documentation](../INVOCATIONS.md)
* #### [Local Development](../LOCAL_DEVELOPMENT.md)
@@ -20,15 +20,15 @@ Once you're setup, for more information, you can review the documentation specif
If you don't feel ready to make a code contribution yet, no problem! You can also help out in other ways, such as [documentation](documentation.md), [translation](translation.md) or helping support other users and triage issues as they're reported in GitHub.
There are two paths to making a development contribution:
There are two paths to making a development contribution:
1. Choosing an open issue to address. Open issues can be found in the [Issues](https://github.com/invoke-ai/InvokeAI/issues?q=is%3Aissue+is%3Aopen) section of the InvokeAI repository. These are tagged by the issue type (bug, enhancement, etc.) along with the “good first issues” tag denoting if they are suitable for first time contributors.
1. Additional items can be found on our [roadmap](https://github.com/orgs/invoke-ai/projects/7). The roadmap is organized in terms of priority, and contains features of varying size and complexity. If there is an inflight item you'd like to help with, reach out to the contributor assigned to the item to see how you can help.
1. Additional items can be found on our [roadmap](https://github.com/orgs/invoke-ai/projects/7). The roadmap is organized in terms of priority, and contains features of varying size and complexity. If there is an inflight item you'd like to help with, reach out to the contributor assigned to the item to see how you can help.
2. Opening a new issue or feature to add. **Please make sure you have searched through existing issues before creating new ones.**
*Regardless of what you choose, please post in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord before you start development in order to confirm that the issue or feature is aligned with the current direction of the project. We value our contributors' time and effort and want to ensure that no one's time is being misspent.*
## Best Practices:
## Best Practices:
* Keep your pull requests small. Smaller pull requests are more likely to be accepted and merged
* Comments! Commenting your code helps reviewers easily understand your contribution
* Use Python and Typescript's typing systems, and consider using an editor with [LSP](https://microsoft.github.io/language-server-protocol/) support to streamline development
@@ -38,7 +38,7 @@ There are two paths to making a development contribution:
If you need help, you can ask questions in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord.
For frontend related work, **@psychedelicious** is the best person to reach out to.
For frontend related work, **@psychedelicious** is the best person to reach out to.
For backend related work, please reach out to **@blessedcoolant**, **@lstein**, **@StAlKeR7779** or **@psychedelicious**.

View File

@@ -22,15 +22,15 @@ Before starting these steps, ensure you have your local environment [configured
2. Fork the [InvokeAI](https://github.com/invoke-ai/InvokeAI) repository to your GitHub profile. This means that you will have a copy of the repository under **your-GitHub-username/InvokeAI**.
3. Clone the repository to your local machine using:
```bash
git clone https://github.com/your-GitHub-username/InvokeAI.git
```
```bash
git clone https://github.com/your-GitHub-username/InvokeAI.git
```
If you're unfamiliar with using Git through the commandline, [GitHub Desktop](https://desktop.github.com) is an easy-to-use alternative with a UI. You can do all the same steps listed here, but through the interface.
4. Create a new branch for your fix using:
```bash
git checkout -b branch-name-here
```
```bash
git checkout -b branch-name-here
```
5. Make the appropriate changes for the issue you are trying to address or the feature that you want to add.
6. Add the file contents of the changed files to the "snapshot" git uses to manage the state of the project, also known as the index:

View File

@@ -27,9 +27,9 @@ If you just want to use Invoke, you should use the [installer][installer link].
5. Activate the venv (you'll need to do this every time you want to run the app):
```sh
source .venv/bin/activate
```
```sh
source .venv/bin/activate
```
6. Install the repo as an [editable install][editable install link]:
@@ -37,7 +37,7 @@ If you just want to use Invoke, you should use the [installer][installer link].
pip install -e ".[dev,test,xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu121
```
Refer to the [manual installation][manual install link] instructions for help determining the correct install options. `xformers` is optional, but `dev` and `test` are not.
Refer to the [manual installation][manual install link]] instructions for help determining the correct install options. `xformers` is optional, but `dev` and `test` are not.
7. Install the frontend dev toolchain:

View File

@@ -34,11 +34,11 @@ Please reach out to @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy)
## Contributors
This project is a combined effort of dedicated people from across the world. [Check out the list of all these amazing people](contributors.md). We thank them for their time, hard work and effort.
This project is a combined effort of dedicated people from across the world. [Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for their time, hard work and effort.
## Code of Conduct
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](../CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](https://github.com/invoke-ai/InvokeAI/blob/main/docs/CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
By making a contribution to this project, you certify that:

View File

@@ -37,7 +37,7 @@ from invokeai.backend.model_manager.config import (
ModelFormat,
ModelType,
)
from invokeai.backend.model_manager.load.model_cache.model_cache_base import CacheStats
from invokeai.backend.model_manager.load.model_cache.cache_stats import CacheStats
from invokeai.backend.model_manager.metadata.fetch.huggingface import HuggingFaceMetadataFetch
from invokeai.backend.model_manager.metadata.metadata_base import ModelMetadataWithFiles, UnknownMetadataException
from invokeai.backend.model_manager.search import ModelSearch

View File

@@ -20,7 +20,7 @@ from invokeai.app.services.invocation_stats.invocation_stats_common import (
NodeExecutionStatsSummary,
)
from invokeai.app.services.invoker import Invoker
from invokeai.backend.model_manager.load.model_cache import CacheStats
from invokeai.backend.model_manager.load.model_cache.cache_stats import CacheStats
# Size of 1GB in bytes.
GB = 2**30

View File

@@ -7,7 +7,7 @@ from typing import Callable, Optional
from invokeai.backend.model_manager import AnyModel, AnyModelConfig, SubModelType
from invokeai.backend.model_manager.load import LoadedModel, LoadedModelWithoutConfig
from invokeai.backend.model_manager.load.model_cache.model_cache_base import ModelCacheBase
from invokeai.backend.model_manager.load.model_cache.model_cache import ModelCache
class ModelLoadServiceBase(ABC):
@@ -24,7 +24,7 @@ class ModelLoadServiceBase(ABC):
@property
@abstractmethod
def ram_cache(self) -> ModelCacheBase[AnyModel]:
def ram_cache(self) -> ModelCache:
"""Return the RAM cache used by this loader."""
@abstractmethod

View File

@@ -18,7 +18,7 @@ from invokeai.backend.model_manager.load import (
ModelLoaderRegistry,
ModelLoaderRegistryBase,
)
from invokeai.backend.model_manager.load.model_cache.model_cache_base import ModelCacheBase
from invokeai.backend.model_manager.load.model_cache.model_cache import ModelCache
from invokeai.backend.model_manager.load.model_loaders.generic_diffusers import GenericDiffusersLoader
from invokeai.backend.util.devices import TorchDevice
from invokeai.backend.util.logging import InvokeAILogger
@@ -30,7 +30,7 @@ class ModelLoadService(ModelLoadServiceBase):
def __init__(
self,
app_config: InvokeAIAppConfig,
ram_cache: ModelCacheBase[AnyModel],
ram_cache: ModelCache,
registry: Optional[Type[ModelLoaderRegistryBase]] = ModelLoaderRegistry,
):
"""Initialize the model load service."""
@@ -45,7 +45,7 @@ class ModelLoadService(ModelLoadServiceBase):
self._invoker = invoker
@property
def ram_cache(self) -> ModelCacheBase[AnyModel]:
def ram_cache(self) -> ModelCache:
"""Return the RAM cache used by this loader."""
return self._ram_cache
@@ -78,9 +78,8 @@ class ModelLoadService(ModelLoadServiceBase):
self, model_path: Path, loader: Optional[Callable[[Path], AnyModel]] = None
) -> LoadedModelWithoutConfig:
cache_key = str(model_path)
ram_cache = self.ram_cache
try:
return LoadedModelWithoutConfig(_locker=ram_cache.get(key=cache_key))
return LoadedModelWithoutConfig(cache_record=self._ram_cache.get(key=cache_key), cache=self._ram_cache)
except IndexError:
pass
@@ -109,5 +108,5 @@ class ModelLoadService(ModelLoadServiceBase):
)
assert loader is not None
raw_model = loader(model_path)
ram_cache.put(key=cache_key, model=raw_model)
return LoadedModelWithoutConfig(_locker=ram_cache.get(key=cache_key))
self._ram_cache.put(key=cache_key, model=raw_model)
return LoadedModelWithoutConfig(cache_record=self._ram_cache.get(key=cache_key), cache=self._ram_cache)
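The two replaced lines above drop the old `_locker` indirection: the loaded-model wrapper now receives the cache record together with the cache itself, and, per the commit messages, the lock/unlock logic lives on the `ModelCache`. A rough illustration of that shape, with invented names and signatures, might be:

```python
from contextlib import contextmanager

class SketchLoadedModel:
    """Illustrative only -- not the real LoadedModelWithoutConfig/ModelCache API."""

    def __init__(self, cache_record, cache) -> None:
        self._cache_record = cache_record
        self._cache = cache

    @contextmanager
    def model_on_device(self):
        # Locking talks to the cache directly instead of going through a ModelLocker.
        self._cache.lock(self._cache_record.key)        # invented signature
        try:
            yield self._cache_record.model
        finally:
            self._cache.unlock(self._cache_record.key)  # invented signature
```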

View File

@@ -16,7 +16,8 @@ from invokeai.app.services.model_load.model_load_base import ModelLoadServiceBas
from invokeai.app.services.model_load.model_load_default import ModelLoadService
from invokeai.app.services.model_manager.model_manager_base import ModelManagerServiceBase
from invokeai.app.services.model_records.model_records_base import ModelRecordServiceBase
from invokeai.backend.model_manager.load import ModelCache, ModelLoaderRegistry
from invokeai.backend.model_manager.load.model_cache.model_cache import ModelCache
from invokeai.backend.model_manager.load.model_loader_registry import ModelLoaderRegistry
from invokeai.backend.util.devices import TorchDevice
from invokeai.backend.util.logging import InvokeAILogger

View File

@@ -378,9 +378,6 @@ class DefaultSessionProcessor(SessionProcessorBase):
self._poll_now()
async def _on_queue_item_status_changed(self, event: FastAPIEvent[QueueItemStatusChangedEvent]) -> None:
# Make sure the cancel event is for the currently processing queue item
if self._queue_item and self._queue_item.item_id is not event[1].item_id:
return
if self._queue_item and event[1].status in ["completed", "failed", "canceled"]:
# When the queue item is canceled via HTTP, the queue item status is set to `"canceled"` and this event is
# emitted. We need to respond to this event and stop graph execution. This is done by setting the cancel

View File

@@ -1,42 +0,0 @@
{
"_class_name": "ControlNetModel",
"_diffusers_version": "0.16.0.dev0",
"_name_or_path": "controlnet_v1_1/control_v11p_sd15_canny",
"act_fn": "silu",
"attention_head_dim": 8,
"block_out_channels": [
320,
640,
1280,
1280
],
"class_embed_type": null,
"conditioning_embedding_out_channels": [
16,
32,
96,
256
],
"controlnet_conditioning_channel_order": "rgb",
"cross_attention_dim": 768,
"down_block_types": [
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"DownBlock2D"
],
"downsample_padding": 1,
"flip_sin_to_cos": true,
"freq_shift": 0,
"in_channels": 4,
"layers_per_block": 2,
"mid_block_scale_factor": 1,
"norm_eps": 1e-05,
"norm_num_groups": 32,
"num_class_embeds": null,
"only_cross_attention": false,
"projection_class_embeddings_input_dim": null,
"resnet_time_scale_shift": "default",
"upcast_attention": false,
"use_linear_projection": false
}

View File

@@ -1,56 +0,0 @@
{
"_class_name": "ControlNetModel",
"_diffusers_version": "0.19.3",
"act_fn": "silu",
"addition_embed_type": "text_time",
"addition_embed_type_num_heads": 64,
"addition_time_embed_dim": 256,
"attention_head_dim": [
5,
10,
20
],
"block_out_channels": [
320,
640,
1280
],
"class_embed_type": null,
"conditioning_channels": 3,
"conditioning_embedding_out_channels": [
16,
32,
96,
256
],
"controlnet_conditioning_channel_order": "rgb",
"cross_attention_dim": 2048,
"down_block_types": [
"DownBlock2D",
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D"
],
"downsample_padding": 1,
"encoder_hid_dim": null,
"encoder_hid_dim_type": null,
"flip_sin_to_cos": true,
"freq_shift": 0,
"global_pool_conditions": false,
"in_channels": 4,
"layers_per_block": 2,
"mid_block_scale_factor": 1,
"norm_eps": 1e-05,
"norm_num_groups": 32,
"num_attention_heads": null,
"num_class_embeds": null,
"only_cross_attention": false,
"projection_class_embeddings_input_dim": 2816,
"resnet_time_scale_shift": "default",
"transformer_layers_per_block": [
1,
2,
10
],
"upcast_attention": null,
"use_linear_projection": true
}

View File

@@ -1,20 +0,0 @@
{
"crop_size": 224,
"do_center_crop": true,
"do_convert_rgb": true,
"do_normalize": true,
"do_resize": true,
"feature_extractor_type": "CLIPFeatureExtractor",
"image_mean": [
0.48145466,
0.4578275,
0.40821073
],
"image_std": [
0.26862954,
0.26130258,
0.27577711
],
"resample": 3,
"size": 224
}

View File

@@ -1,32 +0,0 @@
{
"_class_name": "StableDiffusionPipeline",
"_diffusers_version": "0.6.0",
"feature_extractor": [
"transformers",
"CLIPImageProcessor"
],
"safety_checker": [
"stable_diffusion",
"StableDiffusionSafetyChecker"
],
"scheduler": [
"diffusers",
"PNDMScheduler"
],
"text_encoder": [
"transformers",
"CLIPTextModel"
],
"tokenizer": [
"transformers",
"CLIPTokenizer"
],
"unet": [
"diffusers",
"UNet2DConditionModel"
],
"vae": [
"diffusers",
"AutoencoderKL"
]
}

View File

@@ -1,175 +0,0 @@
{
"_commit_hash": "4bb648a606ef040e7685bde262611766a5fdd67b",
"_name_or_path": "CompVis/stable-diffusion-safety-checker",
"architectures": [
"StableDiffusionSafetyChecker"
],
"initializer_factor": 1.0,
"logit_scale_init_value": 2.6592,
"model_type": "clip",
"projection_dim": 768,
"text_config": {
"_name_or_path": "",
"add_cross_attention": false,
"architectures": null,
"attention_dropout": 0.0,
"bad_words_ids": null,
"bos_token_id": 0,
"chunk_size_feed_forward": 0,
"cross_attention_hidden_size": null,
"decoder_start_token_id": null,
"diversity_penalty": 0.0,
"do_sample": false,
"dropout": 0.0,
"early_stopping": false,
"encoder_no_repeat_ngram_size": 0,
"eos_token_id": 2,
"exponential_decay_length_penalty": null,
"finetuning_task": null,
"forced_bos_token_id": null,
"forced_eos_token_id": null,
"hidden_act": "quick_gelu",
"hidden_size": 768,
"id2label": {
"0": "LABEL_0",
"1": "LABEL_1"
},
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 3072,
"is_decoder": false,
"is_encoder_decoder": false,
"label2id": {
"LABEL_0": 0,
"LABEL_1": 1
},
"layer_norm_eps": 1e-05,
"length_penalty": 1.0,
"max_length": 20,
"max_position_embeddings": 77,
"min_length": 0,
"model_type": "clip_text_model",
"no_repeat_ngram_size": 0,
"num_attention_heads": 12,
"num_beam_groups": 1,
"num_beams": 1,
"num_hidden_layers": 12,
"num_return_sequences": 1,
"output_attentions": false,
"output_hidden_states": false,
"output_scores": false,
"pad_token_id": 1,
"prefix": null,
"problem_type": null,
"pruned_heads": {},
"remove_invalid_values": false,
"repetition_penalty": 1.0,
"return_dict": true,
"return_dict_in_generate": false,
"sep_token_id": null,
"task_specific_params": null,
"temperature": 1.0,
"tf_legacy_loss": false,
"tie_encoder_decoder": false,
"tie_word_embeddings": true,
"tokenizer_class": null,
"top_k": 50,
"top_p": 1.0,
"torch_dtype": null,
"torchscript": false,
"transformers_version": "4.22.0.dev0",
"typical_p": 1.0,
"use_bfloat16": false,
"vocab_size": 49408
},
"text_config_dict": {
"hidden_size": 768,
"intermediate_size": 3072,
"num_attention_heads": 12,
"num_hidden_layers": 12
},
"torch_dtype": "float32",
"transformers_version": null,
"vision_config": {
"_name_or_path": "",
"add_cross_attention": false,
"architectures": null,
"attention_dropout": 0.0,
"bad_words_ids": null,
"bos_token_id": null,
"chunk_size_feed_forward": 0,
"cross_attention_hidden_size": null,
"decoder_start_token_id": null,
"diversity_penalty": 0.0,
"do_sample": false,
"dropout": 0.0,
"early_stopping": false,
"encoder_no_repeat_ngram_size": 0,
"eos_token_id": null,
"exponential_decay_length_penalty": null,
"finetuning_task": null,
"forced_bos_token_id": null,
"forced_eos_token_id": null,
"hidden_act": "quick_gelu",
"hidden_size": 1024,
"id2label": {
"0": "LABEL_0",
"1": "LABEL_1"
},
"image_size": 224,
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 4096,
"is_decoder": false,
"is_encoder_decoder": false,
"label2id": {
"LABEL_0": 0,
"LABEL_1": 1
},
"layer_norm_eps": 1e-05,
"length_penalty": 1.0,
"max_length": 20,
"min_length": 0,
"model_type": "clip_vision_model",
"no_repeat_ngram_size": 0,
"num_attention_heads": 16,
"num_beam_groups": 1,
"num_beams": 1,
"num_channels": 3,
"num_hidden_layers": 24,
"num_return_sequences": 1,
"output_attentions": false,
"output_hidden_states": false,
"output_scores": false,
"pad_token_id": null,
"patch_size": 14,
"prefix": null,
"problem_type": null,
"pruned_heads": {},
"remove_invalid_values": false,
"repetition_penalty": 1.0,
"return_dict": true,
"return_dict_in_generate": false,
"sep_token_id": null,
"task_specific_params": null,
"temperature": 1.0,
"tf_legacy_loss": false,
"tie_encoder_decoder": false,
"tie_word_embeddings": true,
"tokenizer_class": null,
"top_k": 50,
"top_p": 1.0,
"torch_dtype": null,
"torchscript": false,
"transformers_version": "4.22.0.dev0",
"typical_p": 1.0,
"use_bfloat16": false
},
"vision_config_dict": {
"hidden_size": 1024,
"intermediate_size": 4096,
"num_attention_heads": 16,
"num_hidden_layers": 24,
"patch_size": 14
}
}

View File

@@ -1,13 +0,0 @@
{
"_class_name": "PNDMScheduler",
"_diffusers_version": "0.6.0",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"num_train_timesteps": 1000,
"set_alpha_to_one": false,
"skip_prk_steps": true,
"steps_offset": 1,
"trained_betas": null,
"clip_sample": false
}

View File

@@ -1,25 +0,0 @@
{
"_name_or_path": "openai/clip-vit-large-patch14",
"architectures": [
"CLIPTextModel"
],
"attention_dropout": 0.0,
"bos_token_id": 0,
"dropout": 0.0,
"eos_token_id": 2,
"hidden_act": "quick_gelu",
"hidden_size": 768,
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 77,
"model_type": "clip_text_model",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 1,
"projection_dim": 768,
"torch_dtype": "float32",
"transformers_version": "4.22.0.dev0",
"vocab_size": 49408
}

View File

@@ -1,24 +0,0 @@
{
"bos_token": {
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": "<|endoftext|>",
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,34 +0,0 @@
{
"add_prefix_space": false,
"bos_token": {
"__type": "AddedToken",
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"do_lower_case": true,
"eos_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"errors": "replace",
"model_max_length": 77,
"name_or_path": "openai/clip-vit-large-patch14",
"pad_token": "<|endoftext|>",
"special_tokens_map_file": "./special_tokens_map.json",
"tokenizer_class": "CLIPTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,36 +0,0 @@
{
"_class_name": "UNet2DConditionModel",
"_diffusers_version": "0.6.0",
"act_fn": "silu",
"attention_head_dim": 8,
"block_out_channels": [
320,
640,
1280,
1280
],
"center_input_sample": false,
"cross_attention_dim": 768,
"down_block_types": [
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"DownBlock2D"
],
"downsample_padding": 1,
"flip_sin_to_cos": true,
"freq_shift": 0,
"in_channels": 4,
"layers_per_block": 2,
"mid_block_scale_factor": 1,
"norm_eps": 1e-05,
"norm_num_groups": 32,
"out_channels": 4,
"sample_size": 64,
"up_block_types": [
"UpBlock2D",
"CrossAttnUpBlock2D",
"CrossAttnUpBlock2D",
"CrossAttnUpBlock2D"
]
}

View File

@@ -1,29 +0,0 @@
{
"_class_name": "AutoencoderKL",
"_diffusers_version": "0.6.0",
"act_fn": "silu",
"block_out_channels": [
128,
256,
512,
512
],
"down_block_types": [
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D"
],
"in_channels": 3,
"latent_channels": 4,
"layers_per_block": 2,
"norm_num_groups": 32,
"out_channels": 3,
"sample_size": 512,
"up_block_types": [
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D"
]
}

View File

@@ -1,28 +0,0 @@
{
"crop_size": {
"height": 224,
"width": 224
},
"do_center_crop": true,
"do_convert_rgb": true,
"do_normalize": true,
"do_rescale": true,
"do_resize": true,
"feature_extractor_type": "CLIPFeatureExtractor",
"image_mean": [
0.48145466,
0.4578275,
0.40821073
],
"image_processor_type": "CLIPFeatureExtractor",
"image_std": [
0.26862954,
0.26130258,
0.27577711
],
"resample": 3,
"rescale_factor": 0.00392156862745098,
"size": {
"shortest_edge": 224
}
}

View File

@@ -1,33 +0,0 @@
{
"_class_name": "StableDiffusionPipeline",
"_diffusers_version": "0.18.0.dev0",
"feature_extractor": [
"transformers",
"CLIPFeatureExtractor"
],
"requires_safety_checker": true,
"safety_checker": [
"stable_diffusion",
"StableDiffusionSafetyChecker"
],
"scheduler": [
"diffusers",
"DPMSolverMultistepScheduler"
],
"text_encoder": [
"transformers",
"CLIPTextModel"
],
"tokenizer": [
"transformers",
"CLIPTokenizer"
],
"unet": [
"diffusers",
"UNet2DConditionModel"
],
"vae": [
"diffusers",
"AutoencoderKL"
]
}

View File

@@ -1,168 +0,0 @@
{
"_commit_hash": "cb41f3a270d63d454d385fc2e4f571c487c253c5",
"_name_or_path": "CompVis/stable-diffusion-safety-checker",
"architectures": [
"StableDiffusionSafetyChecker"
],
"initializer_factor": 1.0,
"logit_scale_init_value": 2.6592,
"model_type": "clip",
"projection_dim": 768,
"text_config": {
"_name_or_path": "",
"add_cross_attention": false,
"architectures": null,
"attention_dropout": 0.0,
"bad_words_ids": null,
"begin_suppress_tokens": null,
"bos_token_id": 0,
"chunk_size_feed_forward": 0,
"cross_attention_hidden_size": null,
"decoder_start_token_id": null,
"diversity_penalty": 0.0,
"do_sample": false,
"dropout": 0.0,
"early_stopping": false,
"encoder_no_repeat_ngram_size": 0,
"eos_token_id": 2,
"exponential_decay_length_penalty": null,
"finetuning_task": null,
"forced_bos_token_id": null,
"forced_eos_token_id": null,
"hidden_act": "quick_gelu",
"hidden_size": 768,
"id2label": {
"0": "LABEL_0",
"1": "LABEL_1"
},
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 3072,
"is_decoder": false,
"is_encoder_decoder": false,
"label2id": {
"LABEL_0": 0,
"LABEL_1": 1
},
"layer_norm_eps": 1e-05,
"length_penalty": 1.0,
"max_length": 20,
"max_position_embeddings": 77,
"min_length": 0,
"model_type": "clip_text_model",
"no_repeat_ngram_size": 0,
"num_attention_heads": 12,
"num_beam_groups": 1,
"num_beams": 1,
"num_hidden_layers": 12,
"num_return_sequences": 1,
"output_attentions": false,
"output_hidden_states": false,
"output_scores": false,
"pad_token_id": 1,
"prefix": null,
"problem_type": null,
"projection_dim": 512,
"pruned_heads": {},
"remove_invalid_values": false,
"repetition_penalty": 1.0,
"return_dict": true,
"return_dict_in_generate": false,
"sep_token_id": null,
"suppress_tokens": null,
"task_specific_params": null,
"temperature": 1.0,
"tf_legacy_loss": false,
"tie_encoder_decoder": false,
"tie_word_embeddings": true,
"tokenizer_class": null,
"top_k": 50,
"top_p": 1.0,
"torch_dtype": null,
"torchscript": false,
"transformers_version": "4.30.2",
"typical_p": 1.0,
"use_bfloat16": false,
"vocab_size": 49408
},
"torch_dtype": "float16",
"transformers_version": null,
"vision_config": {
"_name_or_path": "",
"add_cross_attention": false,
"architectures": null,
"attention_dropout": 0.0,
"bad_words_ids": null,
"begin_suppress_tokens": null,
"bos_token_id": null,
"chunk_size_feed_forward": 0,
"cross_attention_hidden_size": null,
"decoder_start_token_id": null,
"diversity_penalty": 0.0,
"do_sample": false,
"dropout": 0.0,
"early_stopping": false,
"encoder_no_repeat_ngram_size": 0,
"eos_token_id": null,
"exponential_decay_length_penalty": null,
"finetuning_task": null,
"forced_bos_token_id": null,
"forced_eos_token_id": null,
"hidden_act": "quick_gelu",
"hidden_size": 1024,
"id2label": {
"0": "LABEL_0",
"1": "LABEL_1"
},
"image_size": 224,
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 4096,
"is_decoder": false,
"is_encoder_decoder": false,
"label2id": {
"LABEL_0": 0,
"LABEL_1": 1
},
"layer_norm_eps": 1e-05,
"length_penalty": 1.0,
"max_length": 20,
"min_length": 0,
"model_type": "clip_vision_model",
"no_repeat_ngram_size": 0,
"num_attention_heads": 16,
"num_beam_groups": 1,
"num_beams": 1,
"num_channels": 3,
"num_hidden_layers": 24,
"num_return_sequences": 1,
"output_attentions": false,
"output_hidden_states": false,
"output_scores": false,
"pad_token_id": null,
"patch_size": 14,
"prefix": null,
"problem_type": null,
"projection_dim": 512,
"pruned_heads": {},
"remove_invalid_values": false,
"repetition_penalty": 1.0,
"return_dict": true,
"return_dict_in_generate": false,
"sep_token_id": null,
"suppress_tokens": null,
"task_specific_params": null,
"temperature": 1.0,
"tf_legacy_loss": false,
"tie_encoder_decoder": false,
"tie_word_embeddings": true,
"tokenizer_class": null,
"top_k": 50,
"top_p": 1.0,
"torch_dtype": null,
"torchscript": false,
"transformers_version": "4.30.2",
"typical_p": 1.0,
"use_bfloat16": false
}
}

View File

@@ -1,26 +0,0 @@
{
"_class_name": "DPMSolverMultistepScheduler",
"_diffusers_version": "0.18.0.dev0",
"algorithm_type": "dpmsolver++",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"clip_sample": false,
"clip_sample_range": 1.0,
"dynamic_thresholding_ratio": 0.995,
"lambda_min_clipped": -Infinity,
"lower_order_final": true,
"num_train_timesteps": 1000,
"prediction_type": "v_prediction",
"rescale_betas_zero_snr": false,
"sample_max_value": 1.0,
"set_alpha_to_one": false,
"solver_order": 2,
"solver_type": "midpoint",
"steps_offset": 1,
"thresholding": false,
"timestep_spacing": "leading",
"trained_betas": null,
"use_karras_sigmas": false,
"variance_type": null
}

View File

@@ -1,25 +0,0 @@
{
"_name_or_path": "openai/clip-vit-large-patch14",
"architectures": [
"CLIPTextModel"
],
"attention_dropout": 0.0,
"bos_token_id": 0,
"dropout": 0.0,
"eos_token_id": 2,
"hidden_act": "quick_gelu",
"hidden_size": 768,
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 77,
"model_type": "clip_text_model",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 1,
"projection_dim": 768,
"torch_dtype": "float16",
"transformers_version": "4.30.2",
"vocab_size": 49408
}

View File

@@ -1,24 +0,0 @@
{
"bos_token": {
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": "<|endoftext|>",
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,33 +0,0 @@
{
"add_prefix_space": false,
"bos_token": {
"__type": "AddedToken",
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": true,
"do_lower_case": true,
"eos_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"errors": "replace",
"model_max_length": 77,
"pad_token": "<|endoftext|>",
"tokenizer_class": "CLIPTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,62 +0,0 @@
{
"_class_name": "UNet2DConditionModel",
"_diffusers_version": "0.18.0.dev0",
"act_fn": "silu",
"addition_embed_type": null,
"addition_embed_type_num_heads": 64,
"attention_head_dim": 8,
"block_out_channels": [
320,
640,
1280,
1280
],
"center_input_sample": false,
"class_embed_type": null,
"class_embeddings_concat": false,
"conv_in_kernel": 3,
"conv_out_kernel": 3,
"cross_attention_dim": 768,
"cross_attention_norm": null,
"down_block_types": [
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"DownBlock2D"
],
"downsample_padding": 1,
"dual_cross_attention": false,
"encoder_hid_dim": null,
"encoder_hid_dim_type": null,
"flip_sin_to_cos": true,
"freq_shift": 0,
"in_channels": 4,
"layers_per_block": 2,
"mid_block_only_cross_attention": null,
"mid_block_scale_factor": 1,
"mid_block_type": "UNetMidBlock2DCrossAttn",
"norm_eps": 1e-05,
"norm_num_groups": 32,
"num_attention_heads": null,
"num_class_embeds": null,
"only_cross_attention": false,
"out_channels": 4,
"projection_class_embeddings_input_dim": null,
"resnet_out_scale_factor": 1.0,
"resnet_skip_time_act": false,
"resnet_time_scale_shift": "default",
"sample_size": 96,
"time_cond_proj_dim": null,
"time_embedding_act_fn": null,
"time_embedding_dim": null,
"time_embedding_type": "positional",
"timestep_post_act": null,
"up_block_types": [
"UpBlock2D",
"CrossAttnUpBlock2D",
"CrossAttnUpBlock2D",
"CrossAttnUpBlock2D"
],
"upcast_attention": null,
"use_linear_projection": false
}

View File

@@ -1,30 +0,0 @@
{
"_class_name": "AutoencoderKL",
"_diffusers_version": "0.18.0.dev0",
"act_fn": "silu",
"block_out_channels": [
128,
256,
512,
512
],
"down_block_types": [
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D"
],
"in_channels": 3,
"latent_channels": 4,
"layers_per_block": 2,
"norm_num_groups": 32,
"out_channels": 3,
"sample_size": 768,
"scaling_factor": 0.18215,
"up_block_types": [
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D"
]
}

View File

@@ -1,20 +0,0 @@
{
"crop_size": 224,
"do_center_crop": true,
"do_convert_rgb": true,
"do_normalize": true,
"do_resize": true,
"feature_extractor_type": "CLIPFeatureExtractor",
"image_mean": [
0.48145466,
0.4578275,
0.40821073
],
"image_std": [
0.26862954,
0.26130258,
0.27577711
],
"resample": 3,
"size": 224
}

View File

@@ -1,33 +0,0 @@
{
"_class_name": "StableDiffusionPipeline",
"_diffusers_version": "0.8.0",
"feature_extractor": [
"transformers",
"CLIPImageProcessor"
],
"requires_safety_checker": false,
"safety_checker": [
null,
null
],
"scheduler": [
"diffusers",
"DDIMScheduler"
],
"text_encoder": [
"transformers",
"CLIPTextModel"
],
"tokenizer": [
"transformers",
"CLIPTokenizer"
],
"unet": [
"diffusers",
"UNet2DConditionModel"
],
"vae": [
"diffusers",
"AutoencoderKL"
]
}

View File

@@ -1,14 +0,0 @@
{
"_class_name": "DDIMScheduler",
"_diffusers_version": "0.8.0",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"clip_sample": false,
"num_train_timesteps": 1000,
"prediction_type": "v_prediction",
"set_alpha_to_one": false,
"skip_prk_steps": true,
"steps_offset": 1,
"trained_betas": null
}

View File

@@ -1,25 +0,0 @@
{
"_name_or_path": "hf-models/stable-diffusion-v2-768x768/text_encoder",
"architectures": [
"CLIPTextModel"
],
"attention_dropout": 0.0,
"bos_token_id": 0,
"dropout": 0.0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_size": 1024,
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 4096,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 77,
"model_type": "clip_text_model",
"num_attention_heads": 16,
"num_hidden_layers": 23,
"pad_token_id": 1,
"projection_dim": 512,
"torch_dtype": "float32",
"transformers_version": "4.25.0.dev0",
"vocab_size": 49408
}

View File

@@ -1,24 +0,0 @@
{
"bos_token": {
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": "!",
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,34 +0,0 @@
{
"add_prefix_space": false,
"bos_token": {
"__type": "AddedToken",
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"do_lower_case": true,
"eos_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"errors": "replace",
"model_max_length": 77,
"name_or_path": "hf-models/stable-diffusion-v2-768x768/tokenizer",
"pad_token": "<|endoftext|>",
"special_tokens_map_file": "./special_tokens_map.json",
"tokenizer_class": "CLIPTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,46 +0,0 @@
{
"_class_name": "UNet2DConditionModel",
"_diffusers_version": "0.10.0.dev0",
"act_fn": "silu",
"attention_head_dim": [
5,
10,
20,
20
],
"block_out_channels": [
320,
640,
1280,
1280
],
"center_input_sample": false,
"cross_attention_dim": 1024,
"down_block_types": [
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"DownBlock2D"
],
"downsample_padding": 1,
"dual_cross_attention": false,
"flip_sin_to_cos": true,
"freq_shift": 0,
"in_channels": 4,
"layers_per_block": 2,
"mid_block_scale_factor": 1,
"norm_eps": 1e-05,
"norm_num_groups": 32,
"num_class_embeds": null,
"only_cross_attention": false,
"out_channels": 4,
"sample_size": 96,
"up_block_types": [
"UpBlock2D",
"CrossAttnUpBlock2D",
"CrossAttnUpBlock2D",
"CrossAttnUpBlock2D"
],
"use_linear_projection": true,
"upcast_attention": true
}

View File

@@ -1,30 +0,0 @@
{
"_class_name": "AutoencoderKL",
"_diffusers_version": "0.8.0",
"_name_or_path": "hf-models/stable-diffusion-v2-768x768/vae",
"act_fn": "silu",
"block_out_channels": [
128,
256,
512,
512
],
"down_block_types": [
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D"
],
"in_channels": 3,
"latent_channels": 4,
"layers_per_block": 2,
"norm_num_groups": 32,
"out_channels": 3,
"sample_size": 768,
"up_block_types": [
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D"
]
}

View File

@@ -1,34 +0,0 @@
{
"_class_name": "StableDiffusionXLPipeline",
"_diffusers_version": "0.19.0.dev0",
"force_zeros_for_empty_prompt": true,
"add_watermarker": null,
"scheduler": [
"diffusers",
"EulerDiscreteScheduler"
],
"text_encoder": [
"transformers",
"CLIPTextModel"
],
"text_encoder_2": [
"transformers",
"CLIPTextModelWithProjection"
],
"tokenizer": [
"transformers",
"CLIPTokenizer"
],
"tokenizer_2": [
"transformers",
"CLIPTokenizer"
],
"unet": [
"diffusers",
"UNet2DConditionModel"
],
"vae": [
"diffusers",
"AutoencoderKL"
]
}

View File

@@ -1,18 +0,0 @@
{
"_class_name": "EulerDiscreteScheduler",
"_diffusers_version": "0.19.0.dev0",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"clip_sample": false,
"interpolation_type": "linear",
"num_train_timesteps": 1000,
"prediction_type": "epsilon",
"sample_max_value": 1.0,
"set_alpha_to_one": false,
"skip_prk_steps": true,
"steps_offset": 1,
"timestep_spacing": "leading",
"trained_betas": null,
"use_karras_sigmas": false
}

View File

@@ -1,24 +0,0 @@
{
"architectures": [
"CLIPTextModel"
],
"attention_dropout": 0.0,
"bos_token_id": 0,
"dropout": 0.0,
"eos_token_id": 2,
"hidden_act": "quick_gelu",
"hidden_size": 768,
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 77,
"model_type": "clip_text_model",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 1,
"projection_dim": 768,
"torch_dtype": "float16",
"transformers_version": "4.32.0.dev0",
"vocab_size": 49408
}

View File

@@ -1,24 +0,0 @@
{
"architectures": [
"CLIPTextModelWithProjection"
],
"attention_dropout": 0.0,
"bos_token_id": 0,
"dropout": 0.0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_size": 1280,
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 5120,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 77,
"model_type": "clip_text_model",
"num_attention_heads": 20,
"num_hidden_layers": 32,
"pad_token_id": 1,
"projection_dim": 1280,
"torch_dtype": "float16",
"transformers_version": "4.32.0.dev0",
"vocab_size": 49408
}

View File

@@ -1,24 +0,0 @@
{
"bos_token": {
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": "<|endoftext|>",
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,33 +0,0 @@
{
"add_prefix_space": false,
"bos_token": {
"__type": "AddedToken",
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": true,
"do_lower_case": true,
"eos_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"errors": "replace",
"model_max_length": 77,
"pad_token": "<|endoftext|>",
"tokenizer_class": "CLIPTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,24 +0,0 @@
{
"bos_token": {
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": "!",
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,33 +0,0 @@
{
"add_prefix_space": false,
"bos_token": {
"__type": "AddedToken",
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": true,
"do_lower_case": true,
"eos_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"errors": "replace",
"model_max_length": 77,
"pad_token": "!",
"tokenizer_class": "CLIPTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,69 +0,0 @@
{
"_class_name": "UNet2DConditionModel",
"_diffusers_version": "0.19.0.dev0",
"act_fn": "silu",
"addition_embed_type": "text_time",
"addition_embed_type_num_heads": 64,
"addition_time_embed_dim": 256,
"attention_head_dim": [
5,
10,
20
],
"block_out_channels": [
320,
640,
1280
],
"center_input_sample": false,
"class_embed_type": null,
"class_embeddings_concat": false,
"conv_in_kernel": 3,
"conv_out_kernel": 3,
"cross_attention_dim": 2048,
"cross_attention_norm": null,
"down_block_types": [
"DownBlock2D",
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D"
],
"downsample_padding": 1,
"dual_cross_attention": false,
"encoder_hid_dim": null,
"encoder_hid_dim_type": null,
"flip_sin_to_cos": true,
"freq_shift": 0,
"in_channels": 4,
"layers_per_block": 2,
"mid_block_only_cross_attention": null,
"mid_block_scale_factor": 1,
"mid_block_type": "UNetMidBlock2DCrossAttn",
"norm_eps": 1e-05,
"norm_num_groups": 32,
"num_attention_heads": null,
"num_class_embeds": null,
"only_cross_attention": false,
"out_channels": 4,
"projection_class_embeddings_input_dim": 2816,
"resnet_out_scale_factor": 1.0,
"resnet_skip_time_act": false,
"resnet_time_scale_shift": "default",
"sample_size": 128,
"time_cond_proj_dim": null,
"time_embedding_act_fn": null,
"time_embedding_dim": null,
"time_embedding_type": "positional",
"timestep_post_act": null,
"transformer_layers_per_block": [
1,
2,
10
],
"up_block_types": [
"CrossAttnUpBlock2D",
"CrossAttnUpBlock2D",
"UpBlock2D"
],
"upcast_attention": null,
"use_linear_projection": true
}

View File

@@ -1,32 +0,0 @@
{
"_class_name": "AutoencoderKL",
"_diffusers_version": "0.20.0.dev0",
"_name_or_path": "../sdxl-vae/",
"act_fn": "silu",
"block_out_channels": [
128,
256,
512,
512
],
"down_block_types": [
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D"
],
"force_upcast": true,
"in_channels": 3,
"latent_channels": 4,
"layers_per_block": 2,
"norm_num_groups": 32,
"out_channels": 3,
"sample_size": 1024,
"scaling_factor": 0.13025,
"up_block_types": [
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D"
]
}

View File

@@ -1,31 +0,0 @@
{
"_class_name": "AutoencoderKL",
"_diffusers_version": "0.19.0.dev0",
"act_fn": "silu",
"block_out_channels": [
128,
256,
512,
512
],
"down_block_types": [
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D"
],
"force_upcast": true,
"in_channels": 3,
"latent_channels": 4,
"layers_per_block": 2,
"norm_num_groups": 32,
"out_channels": 3,
"sample_size": 1024,
"scaling_factor": 0.13025,
"up_block_types": [
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D"
]
}

View File

@@ -1,31 +0,0 @@
{
"_class_name": "AutoencoderKL",
"_diffusers_version": "0.19.0.dev0",
"act_fn": "silu",
"block_out_channels": [
128,
256,
512,
512
],
"down_block_types": [
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D"
],
"force_upcast": true,
"in_channels": 3,
"latent_channels": 4,
"layers_per_block": 2,
"norm_num_groups": 32,
"out_channels": 3,
"sample_size": 1024,
"scaling_factor": 0.13025,
"up_block_types": [
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D"
]
}

View File

@@ -1,31 +0,0 @@
{
"_class_name": "AutoencoderKL",
"_diffusers_version": "0.19.0.dev0",
"act_fn": "silu",
"block_out_channels": [
128,
256,
512,
512
],
"down_block_types": [
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D"
],
"force_upcast": true,
"in_channels": 3,
"latent_channels": 4,
"layers_per_block": 2,
"norm_num_groups": 32,
"out_channels": 3,
"sample_size": 1024,
"scaling_factor": 0.13025,
"up_block_types": [
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D"
]
}

View File

@@ -1,35 +0,0 @@
{
"_class_name": "StableDiffusionXLImg2ImgPipeline",
"_diffusers_version": "0.19.0.dev0",
"force_zeros_for_empty_prompt": false,
"add_watermarker": null,
"requires_aesthetics_score": true,
"scheduler": [
"diffusers",
"EulerDiscreteScheduler"
],
"text_encoder": [
null,
null
],
"text_encoder_2": [
"transformers",
"CLIPTextModelWithProjection"
],
"tokenizer": [
null,
null
],
"tokenizer_2": [
"transformers",
"CLIPTokenizer"
],
"unet": [
"diffusers",
"UNet2DConditionModel"
],
"vae": [
"diffusers",
"AutoencoderKL"
]
}

View File

@@ -1,18 +0,0 @@
{
"_class_name": "EulerDiscreteScheduler",
"_diffusers_version": "0.19.0.dev0",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"clip_sample": false,
"interpolation_type": "linear",
"num_train_timesteps": 1000,
"prediction_type": "epsilon",
"sample_max_value": 1.0,
"set_alpha_to_one": false,
"skip_prk_steps": true,
"steps_offset": 1,
"timestep_spacing": "leading",
"trained_betas": null,
"use_karras_sigmas": false
}

View File

@@ -1,24 +0,0 @@
{
"architectures": [
"CLIPTextModelWithProjection"
],
"attention_dropout": 0.0,
"bos_token_id": 0,
"dropout": 0.0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_size": 1280,
"initializer_factor": 1.0,
"initializer_range": 0.02,
"intermediate_size": 5120,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 77,
"model_type": "clip_text_model",
"num_attention_heads": 20,
"num_hidden_layers": 32,
"pad_token_id": 1,
"projection_dim": 1280,
"torch_dtype": "float16",
"transformers_version": "4.32.0.dev0",
"vocab_size": 49408
}

View File

@@ -1,24 +0,0 @@
{
"bos_token": {
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": "!",
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,33 +0,0 @@
{
"add_prefix_space": false,
"bos_token": {
"__type": "AddedToken",
"content": "<|startoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": true,
"do_lower_case": true,
"eos_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"errors": "replace",
"model_max_length": 77,
"pad_token": "!",
"tokenizer_class": "CLIPTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<|endoftext|>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

View File

@@ -1,69 +0,0 @@
{
"_class_name": "UNet2DConditionModel",
"_diffusers_version": "0.19.0.dev0",
"act_fn": "silu",
"addition_embed_type": "text_time",
"addition_embed_type_num_heads": 64,
"addition_time_embed_dim": 256,
"attention_head_dim": [
6,
12,
24,
24
],
"block_out_channels": [
384,
768,
1536,
1536
],
"center_input_sample": false,
"class_embed_type": null,
"class_embeddings_concat": false,
"conv_in_kernel": 3,
"conv_out_kernel": 3,
"cross_attention_dim": 1280,
"cross_attention_norm": null,
"down_block_types": [
"DownBlock2D",
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"DownBlock2D"
],
"downsample_padding": 1,
"dual_cross_attention": false,
"encoder_hid_dim": null,
"encoder_hid_dim_type": null,
"flip_sin_to_cos": true,
"freq_shift": 0,
"in_channels": 4,
"layers_per_block": 2,
"mid_block_only_cross_attention": null,
"mid_block_scale_factor": 1,
"mid_block_type": "UNetMidBlock2DCrossAttn",
"norm_eps": 1e-05,
"norm_num_groups": 32,
"num_attention_heads": null,
"num_class_embeds": null,
"only_cross_attention": false,
"out_channels": 4,
"projection_class_embeddings_input_dim": 2560,
"resnet_out_scale_factor": 1.0,
"resnet_skip_time_act": false,
"resnet_time_scale_shift": "default",
"sample_size": 128,
"time_cond_proj_dim": null,
"time_embedding_act_fn": null,
"time_embedding_dim": null,
"time_embedding_type": "positional",
"timestep_post_act": null,
"transformer_layers_per_block": 4,
"up_block_types": [
"UpBlock2D",
"CrossAttnUpBlock2D",
"CrossAttnUpBlock2D",
"UpBlock2D"
],
"upcast_attention": null,
"use_linear_projection": true
}

View File

@@ -1,32 +0,0 @@
{
"_class_name": "AutoencoderKL",
"_diffusers_version": "0.20.0.dev0",
"_name_or_path": "../sdxl-vae/",
"act_fn": "silu",
"block_out_channels": [
128,
256,
512,
512
],
"down_block_types": [
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D"
],
"force_upcast": true,
"in_channels": 3,
"latent_channels": 4,
"layers_per_block": 2,
"norm_num_groups": 32,
"out_channels": 3,
"sample_size": 1024,
"scaling_factor": 0.13025,
"up_block_types": [
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D"
]
}

View File

@@ -1,31 +0,0 @@
{
"_class_name": "AutoencoderKL",
"_diffusers_version": "0.19.0.dev0",
"act_fn": "silu",
"block_out_channels": [
128,
256,
512,
512
],
"down_block_types": [
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D",
"DownEncoderBlock2D"
],
"force_upcast": true,
"in_channels": 3,
"latent_channels": 4,
"layers_per_block": 2,
"norm_num_groups": 32,
"out_channels": 3,
"sample_size": 1024,
"scaling_factor": 0.13025,
"up_block_types": [
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D",
"UpDecoderBlock2D"
]
}

View File

@@ -0,0 +1,105 @@
import torch

from invokeai.backend.model_cache_v2.torch_module_overrides import CustomLinear, inject_custom_layers_into_module


class CachedModelV2:
    """A wrapper around a PyTorch model to handle partial loads and unloads between the CPU and the compute device.

    Note: "VRAM" is used throughout this class to refer to the memory on the compute device. It could be CUDA memory,
    MPS memory, etc.
    """

    def __init__(self, model: torch.nn.Module, compute_device: torch.device):
        print("CachedModelV2.__init__")
        self._model = model
        inject_custom_layers_into_module(self._model)
        self._compute_device = compute_device

        # Memoized values.
        self._total_size_cache = None
        self._cur_vram_bytes_cache = None

    @property
    def model(self) -> torch.nn.Module:
        return self._model

    def total_bytes(self) -> int:
        if self._total_size_cache is None:
            self._total_size_cache = sum(p.numel() * p.element_size() for p in self._model.parameters())
        return self._total_size_cache

    def cur_vram_bytes(self) -> int:
        """Return the size (in bytes) of the weights that are currently in VRAM."""
        if self._cur_vram_bytes_cache is None:
            self._cur_vram_bytes_cache = sum(
                p.numel() * p.element_size()
                for p in self._model.parameters()
                if p.device.type == self._compute_device.type
            )
        return self._cur_vram_bytes_cache

    def full_load_to_vram(self):
        """Load all weights into VRAM."""
        raise NotImplementedError("Not implemented")
        # When implemented, the memoized VRAM size should be refreshed:
        # self._cur_vram_bytes_cache = self.total_bytes()

    def partial_load_to_vram(self, vram_bytes_to_load: int) -> int:
        """Load more weights into VRAM without exceeding vram_bytes_to_load.

        Returns:
            The number of bytes loaded into VRAM.
        """
        vram_bytes_loaded = 0

        def to_vram(m: torch.nn.Module):
            nonlocal vram_bytes_loaded

            if not isinstance(m, CustomLinear):
                # We don't handle offload of this type of module.
                return

            m_device = m.weight.device
            m_bytes = sum(p.numel() * p.element_size() for p in m.parameters())

            # Skip modules that are already on the compute device.
            if m_device.type == self._compute_device.type:
                return

            # Check the size of the parameter.
            if vram_bytes_loaded + m_bytes > vram_bytes_to_load:
                # TODO(ryand): Should we just break here? If we couldn't fit this parameter into VRAM, is it really
                # worth continuing to search for a smaller parameter that would fit?
                return

            vram_bytes_loaded += m_bytes
            m.to(self._compute_device)

        self._model.apply(to_vram)

        self._cur_vram_bytes_cache = None
        return vram_bytes_loaded

    def partial_unload_from_vram(self, vram_bytes_to_free: int) -> int:
        """Unload weights from VRAM until vram_bytes_to_free bytes are freed. Or the entire model is unloaded."""
        vram_bytes_freed = 0

        def from_vram(m: torch.nn.Module):
            nonlocal vram_bytes_freed

            if vram_bytes_freed >= vram_bytes_to_free:
                return

            if not isinstance(m, CustomLinear):
                # We don't handle offload of this type of module.
                return

            m_device = m.weight.device
            m_bytes = sum(p.numel() * p.element_size() for p in m.parameters())

            if m_device.type != self._compute_device.type:
                return

            vram_bytes_freed += m_bytes
            m.to("cpu")

        self._model.apply(from_vram)

        self._cur_vram_bytes_cache = None
        return vram_bytes_freed
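
For illustration, a minimal sketch of how the partial-load API above might be exercised, assuming the CachedModelV2 class (and its CustomLinear injection) is in scope. The toy model, device fallback, and byte budget are illustrative only.

```python
import torch

# Assumes CachedModelV2 (defined above) is in scope. On a CPU-only machine the
# compute device falls back to "cpu" and no weights are actually moved.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.Linear(1024, 1024))
cached = CachedModelV2(model, compute_device=device)

budget = 6 * 2**20  # ~6 MB: enough for one fp32 Linear(1024, 1024), but not both
loaded = cached.partial_load_to_vram(budget)
print(f"{loaded} / {cached.total_bytes()} bytes loaded; in VRAM: {cached.cur_vram_bytes()}")

freed = cached.partial_unload_from_vram(loaded)
print(f"{freed} bytes freed")
```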

View File

@@ -0,0 +1,18 @@
import torch
from torch.utils._python_dispatch import TorchDispatchMode


def cast_to_device_and_run(func, args, kwargs, to_device: torch.device):
    args_on_device = [a.to(to_device) if isinstance(a, torch.Tensor) else a for a in args]
    kwargs_on_device = {k: v.to(to_device) if isinstance(v, torch.Tensor) else v for k, v in kwargs.items()}
    return func(*args_on_device, **kwargs_on_device)


class TorchAutocastContext(TorchDispatchMode):
    def __init__(self, to_device: torch.device):
        super().__init__()
        self._to_device = to_device

    def __torch_dispatch__(self, func, types, args, kwargs=None):
        # print(f"Dispatch Log: {func}(*{args}, **{kwargs})")
        # print(f"Dispatch Log: {types}")
        # kwargs may be passed as None by the dispatcher, so guard before iterating it.
        return cast_to_device_and_run(func, args, kwargs or {}, self._to_device)
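
A hedged usage sketch for the dispatch-mode autocast above: tensor arguments that live on the CPU are moved to the target device when each op is dispatched. Assumes TorchAutocastContext (defined above) is in scope; the device choice is illustrative.

```python
import torch

# Assumes TorchAutocastContext (defined above) is in scope.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

weight = torch.randn(4, 4)            # stays in CPU RAM
x = torch.randn(4, 4, device=device)  # lives on the compute device

with TorchAutocastContext(device):
    # __torch_dispatch__ intercepts the matmul and casts both operands to
    # `device` before the kernel runs, so the mixed-device call succeeds.
    y = x @ weight

print(y.device)
```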

View File

@@ -0,0 +1,16 @@
import torch
from torch.overrides import TorchFunctionMode


def cast_to_device_and_run(func, args, kwargs, to_device: torch.device):
    args_on_device = [a.to(to_device) if isinstance(a, torch.Tensor) else a for a in args]
    kwargs_on_device = {k: v.to(to_device) if isinstance(v, torch.Tensor) else v for k, v in kwargs.items()}
    return func(*args_on_device, **kwargs_on_device)


class TorchFunctionAutocastContext(TorchFunctionMode):
    def __init__(self, to_device: torch.device):
        super().__init__()
        self._to_device = to_device

    def __torch_function__(self, func, types, args, kwargs=None):
        return cast_to_device_and_run(func, args, kwargs or {}, self._to_device)
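
Similarly, a hedged sketch of the function-mode variant above, which underpins the torch.nn.Linear module streaming approach: F.linear is intercepted before it runs, so CPU-resident weights can serve a GPU-resident input. Assumes TorchFunctionAutocastContext (defined above) is in scope.

```python
import torch

# Assumes TorchFunctionAutocastContext (defined above) is in scope.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

layer = torch.nn.Linear(8, 4)         # parameters stay in CPU RAM
x = torch.randn(2, 8, device=device)

with TorchFunctionAutocastContext(device):
    # __torch_function__ intercepts F.linear inside Linear.forward and casts
    # the CPU weight and bias to `device` just before the op runs.
    y = layer(x)

print(y.device)
```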

View File

@@ -0,0 +1,26 @@
from typing import TypeVar

import torch

T = TypeVar("T", torch.Tensor, None)


def cast_to_device(t: T, to_device: torch.device, non_blocking: bool = True) -> T:
    if t is None:
        return t
    return t.to(to_device, non_blocking=non_blocking)


def inject_custom_layers_into_module(model: torch.nn.Module):
    def inject_custom_layers(module: torch.nn.Module):
        if isinstance(module, torch.nn.Linear):
            module.__class__ = CustomLinear

    model.apply(inject_custom_layers)


class CustomLinear(torch.nn.Linear):
    def forward(self, input: torch.Tensor) -> torch.Tensor:
        weight = cast_to_device(self.weight, input.device)
        bias = cast_to_device(self.bias, input.device)
        return torch.nn.functional.linear(input, weight, bias)
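
A hedged sketch of the layer-injection path above: after inject_custom_layers_into_module rewrites each nn.Linear into a CustomLinear in place, the forward pass casts CPU-resident weights to the input's device on demand. The toy model and device fallback are illustrative.

```python
import torch

# Assumes inject_custom_layers_into_module and CustomLinear (defined above) are in scope.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU(), torch.nn.Linear(8, 2))
inject_custom_layers_into_module(model)   # nn.Linear -> CustomLinear, in place
assert isinstance(model[0], CustomLinear)

x = torch.randn(1, 8, device=device)
y = model(x)                              # CPU weights are cast to x.device per call
print(y.device)
```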

View File

@@ -8,7 +8,7 @@ from pathlib import Path
from invokeai.backend.model_manager.load.load_base import LoadedModel, LoadedModelWithoutConfig, ModelLoaderBase
from invokeai.backend.model_manager.load.load_default import ModelLoader
from invokeai.backend.model_manager.load.model_cache.model_cache_default import ModelCache
from invokeai.backend.model_manager.load.model_cache.model_cache import ModelCache
from invokeai.backend.model_manager.load.model_loader_registry import ModelLoaderRegistry, ModelLoaderRegistryBase
# This registers the subclasses that implement loaders of specific model types

View File

@@ -5,7 +5,6 @@ Base class for model loading in InvokeAI.
from abc import ABC, abstractmethod
from contextlib import contextmanager
from dataclasses import dataclass
from logging import Logger
from pathlib import Path
from typing import Any, Dict, Generator, Optional, Tuple
@@ -18,19 +17,17 @@ from invokeai.backend.model_manager.config import (
AnyModelConfig,
SubModelType,
)
from invokeai.backend.model_manager.load.model_cache.model_cache_base import ModelCacheBase, ModelLockerBase
from invokeai.backend.model_manager.load.model_cache.cache_record import CacheRecord
from invokeai.backend.model_manager.load.model_cache.model_cache import ModelCache
@dataclass
class LoadedModelWithoutConfig:
"""
Context manager object that mediates transfer from RAM<->VRAM.
"""Context manager object that mediates transfer from RAM<->VRAM.
This is a context manager object that has two distinct APIs:
1. Older API (deprecated):
Use the LoadedModel object directly as a context manager.
It will move the model into VRAM (on CUDA devices), and
Use the LoadedModel object directly as a context manager. It will move the model into VRAM (on CUDA devices), and
return the model in a form suitable for passing to torch.
Example:
```
@@ -40,13 +37,9 @@ class LoadedModelWithoutConfig:
```
2. Newer API (recommended):
Call the LoadedModel's `model_on_device()` method in a
context. It returns a tuple consisting of a copy of
the model's state dict in CPU RAM followed by a copy
of the model in VRAM. The state dict is provided to allow
LoRAs and other model patchers to return the model to
its unpatched state without expensive copy and restore
operations.
Call the LoadedModel's `model_on_device()` method in a context. It returns a tuple consisting of a copy of the
model's state dict in CPU RAM followed by a copy of the model in VRAM. The state dict is provided to allow LoRAs and
other model patchers to return the model to its unpatched state without expensive copy and restore operations.
Example:
```
@@ -55,43 +48,42 @@ class LoadedModelWithoutConfig:
image = vae.decode(latents)[0]
```
The state_dict should be treated as a read-only object and
never modified. Also be aware that some loadable models do
not have a state_dict, in which case this value will be None.
The state_dict should be treated as a read-only object and never modified. Also be aware that some loadable models
do not have a state_dict, in which case this value will be None.
"""
_locker: ModelLockerBase
def __init__(self, cache_record: CacheRecord, cache: ModelCache):
self._cache_record = cache_record
self._cache = cache
def __enter__(self) -> AnyModel:
"""Context entry."""
self._locker.lock()
self._cache.lock(self._cache_record.key)
return self.model
def __exit__(self, *args: Any, **kwargs: Any) -> None:
"""Context exit."""
self._locker.unlock()
self._cache.unlock(self._cache_record.key)
@contextmanager
def model_on_device(self) -> Generator[Tuple[Optional[Dict[str, torch.Tensor]], AnyModel], None, None]:
"""Return a tuple consisting of the model's state dict (if it exists) and the locked model on execution device."""
locked_model = self._locker.lock()
self._cache.lock(self._cache_record.key)
try:
state_dict = self._locker.get_state_dict()
yield (state_dict, locked_model)
yield (self._cache_record.state_dict, self._cache_record.model)
finally:
self._locker.unlock()
self._cache.unlock(self._cache_record.key)
@property
def model(self) -> AnyModel:
"""Return the model without locking it."""
return self._locker.model
return self._cache_record.model
@dataclass
class LoadedModel(LoadedModelWithoutConfig):
"""Context manager object that mediates transfer from RAM<->VRAM."""
config: Optional[AnyModelConfig] = None
def __init__(self, config: Optional[AnyModelConfig], cache_record: CacheRecord, cache: ModelCache):
super().__init__(cache_record=cache_record, cache=cache)
self.config = config
# TODO(MM2):
@@ -110,7 +102,7 @@ class ModelLoaderBase(ABC):
self,
app_config: InvokeAIAppConfig,
logger: Logger,
ram_cache: ModelCacheBase[AnyModel],
ram_cache: ModelCache,
):
"""Initialize the loader."""
pass
@@ -138,6 +130,6 @@ class ModelLoaderBase(ABC):
@property
@abstractmethod
def ram_cache(self) -> ModelCacheBase[AnyModel]:
def ram_cache(self) -> ModelCache:
"""Return the ram cache associated with this loader."""
pass
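
The example bodies in the LoadedModelWithoutConfig docstring above are elided by the hunk boundaries; the following is a hedged sketch of the two calling patterns it describes. `loaded_model` and `latents` are illustrative names for a LoadedModel returned by a loader and the tensors being decoded.

```python
# 1. Older API (deprecated): the LoadedModel itself is the context manager and
#    locks the model into VRAM for the duration of the block.
with loaded_model as vae:
    image = vae.decode(latents)[0]

# 2. Newer API (recommended): model_on_device() also yields the read-only CPU
#    state dict, which patchers (e.g. LoRA) can use to restore the model
#    without an expensive copy-and-restore.
with loaded_model.model_on_device() as (state_dict, vae):
    image = vae.decode(latents)[0]
```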

View File

@@ -14,7 +14,8 @@ from invokeai.backend.model_manager import (
)
from invokeai.backend.model_manager.config import DiffusersConfigBase
from invokeai.backend.model_manager.load.load_base import LoadedModel, ModelLoaderBase
from invokeai.backend.model_manager.load.model_cache.model_cache_base import ModelCacheBase, ModelLockerBase
from invokeai.backend.model_manager.load.model_cache.cache_record import CacheRecord
from invokeai.backend.model_manager.load.model_cache.model_cache import ModelCache, get_model_cache_key
from invokeai.backend.model_manager.load.model_util import calc_model_size_by_fs
from invokeai.backend.model_manager.load.optimizations import skip_torch_weight_init
from invokeai.backend.util.devices import TorchDevice
@@ -28,7 +29,7 @@ class ModelLoader(ModelLoaderBase):
self,
app_config: InvokeAIAppConfig,
logger: Logger,
ram_cache: ModelCacheBase[AnyModel],
ram_cache: ModelCache,
):
"""Initialize the loader."""
self._app_config = app_config
@@ -54,11 +55,11 @@ class ModelLoader(ModelLoaderBase):
raise InvalidModelConfigException(f"Files for model '{model_config.name}' not found at {model_path}")
with skip_torch_weight_init():
locker = self._load_and_cache(model_config, submodel_type)
return LoadedModel(config=model_config, _locker=locker)
cache_record = self._load_and_cache(model_config, submodel_type)
return LoadedModel(config=model_config, cache_record=cache_record, cache=self._ram_cache)
@property
def ram_cache(self) -> ModelCacheBase[AnyModel]:
def ram_cache(self) -> ModelCache:
"""Return the ram cache associated with this loader."""
return self._ram_cache
@@ -66,10 +67,10 @@ class ModelLoader(ModelLoaderBase):
model_base = self._app_config.models_path
return (model_base / config.path).resolve()
def _load_and_cache(self, config: AnyModelConfig, submodel_type: Optional[SubModelType] = None) -> ModelLockerBase:
def _load_and_cache(self, config: AnyModelConfig, submodel_type: Optional[SubModelType] = None) -> CacheRecord:
stats_name = ":".join([config.base, config.type, config.name, (submodel_type or "")])
try:
return self._ram_cache.get(config.key, submodel_type, stats_name=stats_name)
return self._ram_cache.get(key=get_model_cache_key(config.key, submodel_type), stats_name=stats_name)
except IndexError:
pass
@@ -78,16 +79,11 @@ class ModelLoader(ModelLoaderBase):
loaded_model = self._load_model(config, submodel_type)
self._ram_cache.put(
config.key,
submodel_type=submodel_type,
get_model_cache_key(config.key, submodel_type),
model=loaded_model,
)
return self._ram_cache.get(
key=config.key,
submodel_type=submodel_type,
stats_name=stats_name,
)
return self._ram_cache.get(key=get_model_cache_key(config.key, submodel_type), stats_name=stats_name)
def get_size_fs(
self, config: AnyModelConfig, model_path: Path, submodel_type: Optional[SubModelType] = None

View File

@@ -1,6 +0,0 @@
"""Init file for ModelCache."""
from .model_cache_base import ModelCacheBase, CacheStats # noqa F401
from .model_cache_default import ModelCache # noqa F401
_all__ = ["ModelCacheBase", "ModelCache", "CacheStats"]

View File

@@ -0,0 +1,47 @@
from dataclasses import dataclass
from typing import Any, Dict, Optional

import torch


@dataclass
class CacheRecord:
    """
    Elements of the cache:

    key: Unique key for each model, same as used in the models database.
    model: Model in memory.
    state_dict: A read-only copy of the model's state dict in RAM. It will be
        used as a template for creating a copy in the VRAM.
    size: Size of the model
    loaded: True if the model's state dict is currently in VRAM

    Before a model is executed, the state_dict template is copied into VRAM,
    and then injected into the model. When the model is finished, the VRAM
    copy of the state dict is deleted, and the RAM version is reinjected
    into the model.

    The state_dict should be treated as a read-only attribute. Do not attempt
    to patch or otherwise modify it. Instead, patch the copy of the state_dict
    after it is loaded into the execution device (e.g. CUDA) using the `LoadedModel`
    context manager call `model_on_device()`.
    """

    key: str
    model: Any
    device: torch.device
    state_dict: Optional[Dict[str, torch.Tensor]]
    size: int
    loaded: bool = False

    _locks: int = 0

    def lock(self) -> None:
        self._locks += 1

    def unlock(self) -> None:
        self._locks -= 1
        assert self._locks >= 0

    @property
    def is_locked(self) -> bool:
        return self._locks > 0
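
A minimal sketch of the reference-counted locking that CacheRecord implements, assuming the dataclass above is in scope; the field values below are illustrative stand-ins (a real record is created by ModelCache.put()).

```python
import torch

# Field values are illustrative; a real record is created by ModelCache.put().
record = CacheRecord(key="abc123:vae", model=object(), device=torch.device("cpu"), state_dict=None, size=0)

record.lock()
record.lock()
assert record.is_locked       # held by two users
record.unlock()
assert record.is_locked       # still held by one
record.unlock()
assert not record.is_locked   # now eligible for offload / eviction
```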

View File

@@ -0,0 +1,15 @@
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CacheStats(object):
    """Collect statistics on cache performance."""

    hits: int = 0  # cache hits
    misses: int = 0  # cache misses
    high_watermark: int = 0  # amount of cache used
    in_cache: int = 0  # number of models in cache
    cleared: int = 0  # number of models cleared to make space
    cache_size: int = 0  # total size of cache
    loaded_model_sizes: Dict[str, int] = field(default_factory=dict)

View File

@@ -0,0 +1,69 @@
from typing import Any

import torch


class CachedModelOnlyFullLoad:
    """A wrapper around a PyTorch model to handle full loads and unloads between the CPU and the compute device.

    Note: "VRAM" is used throughout this class to refer to the memory on the compute device. It could be CUDA memory,
    MPS memory, etc.
    """

    def __init__(self, model: torch.nn.Module | Any, compute_device: torch.device, total_bytes: int):
        """Initialize a CachedModelOnlyFullLoad.

        Args:
            model (torch.nn.Module | Any): The model to wrap. Should be on the CPU.
            compute_device (torch.device): The compute device to move the model to.
            total_bytes (int): The total size (in bytes) of all the weights in the model.
        """
        # model is often a torch.nn.Module, but could be any model type. Throughout this class, we handle both cases.
        self._model = model
        self._compute_device = compute_device
        self._total_bytes = total_bytes
        self._is_in_vram = False

    @property
    def model(self) -> torch.nn.Module:
        return self._model

    def total_bytes(self) -> int:
        """Get the total size (in bytes) of all the weights in the model."""
        return self._total_bytes

    def is_in_vram(self) -> bool:
        """Return true if the model is currently in VRAM."""
        return self._is_in_vram

    def full_load_to_vram(self) -> int:
        """Load all weights into VRAM (if supported by the model).

        Returns:
            The number of bytes loaded into VRAM.
        """
        if self._is_in_vram:
            # Already in VRAM.
            return 0

        if not hasattr(self._model, "to"):
            # Model doesn't support moving to a device.
            return 0

        self._model.to(self._compute_device)
        self._is_in_vram = True
        return self._total_bytes

    def full_unload_from_vram(self) -> int:
        """Unload all weights from VRAM.

        Returns:
            The number of bytes unloaded from VRAM.
        """
        if not self._is_in_vram:
            # Already in RAM.
            return 0

        self._model.to("cpu")
        self._is_in_vram = False
        return self._total_bytes
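
A hedged sketch of how the full-load wrapper above might be driven; the toy model and byte count are illustrative, and on a CPU-only machine the `.to()` calls are effectively no-ops.

```python
import torch

# Assumes CachedModelOnlyFullLoad (defined above) is in scope.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(16, 16)
total = sum(p.numel() * p.element_size() for p in model.parameters())

cached = CachedModelOnlyFullLoad(model, compute_device=device, total_bytes=total)
assert not cached.is_in_vram()

loaded = cached.full_load_to_vram()      # returns `total` on first load, 0 if already loaded
freed = cached.full_unload_from_vram()   # moves the model back to the CPU
print(loaded, freed, cached.is_in_vram())
```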

View File

@@ -0,0 +1,84 @@
import torch

from invokeai.backend.util.calc_tensor_size import calc_tensor_size


class CachedModelWithPartialLoad:
    """A wrapper around a PyTorch model to handle partial loads and unloads between the CPU and the compute device.

    Note: "VRAM" is used throughout this class to refer to the memory on the compute device. It could be CUDA memory,
    MPS memory, etc.
    """

    def __init__(self, model: torch.nn.Module, compute_device: torch.device):
        self._model = model
        self._compute_device = compute_device
        # TODO(ryand): Add memoization for total_bytes and cur_vram_bytes?

    @property
    def model(self) -> torch.nn.Module:
        return self._model

    def total_bytes(self) -> int:
        """Get the total size (in bytes) of all the weights in the model."""
        return sum(calc_tensor_size(p) for p in self._model.parameters())

    def cur_vram_bytes(self) -> int:
        """Get the size (in bytes) of the weights that are currently in VRAM."""
        return sum(calc_tensor_size(p) for p in self._model.parameters() if p.device.type == self._compute_device.type)

    def partial_load_to_vram(self, vram_bytes_to_load: int) -> int:
        """Load more weights into VRAM without exceeding vram_bytes_to_load.

        Returns:
            The number of bytes loaded into VRAM.
        """
        vram_bytes_loaded = 0

        # TODO(ryand): Should we use self._model.apply(...) instead and move modules around instead of moving tensors?
        # This way we don't have to use the private _apply() method.
        def to_vram(t: torch.Tensor):
            nonlocal vram_bytes_loaded

            # Skip parameters that are already on the compute device.
            if t.device.type == self._compute_device.type:
                return t

            # Check the size of the parameter.
            param_size = calc_tensor_size(t)
            if vram_bytes_loaded + param_size > vram_bytes_to_load:
                # TODO(ryand): Should we just break here? If we couldn't fit this parameter into VRAM, is it really
                # worth continuing to search for a smaller parameter that would fit?
                return t

            vram_bytes_loaded += param_size
            return t.to(self._compute_device)

        self._model._apply(to_vram)
        return vram_bytes_loaded

    def partial_unload_from_vram(self, vram_bytes_to_free: int) -> int:
        """Unload weights from VRAM until vram_bytes_to_free bytes are freed. Or the entire model is unloaded.

        Returns:
            The number of bytes unloaded from VRAM.
        """
        vram_bytes_freed = 0

        def from_vram(t: torch.Tensor):
            nonlocal vram_bytes_freed

            if vram_bytes_freed >= vram_bytes_to_free:
                return t

            if t.device.type != self._compute_device.type:
                return t

            vram_bytes_freed += calc_tensor_size(t)
            return t.to("cpu")

        self._model._apply(from_vram)
        return vram_bytes_freed
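
A hedged sketch of budget-driven partial loading with the wrapper above; the byte budget and toy model are illustrative, and on a CPU-only machine nothing is moved.

```python
import torch

# Assumes CachedModelWithPartialLoad (defined above) is in scope.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.Linear(512, 512))
cached = CachedModelWithPartialLoad(model, compute_device=device)

budget = cached.total_bytes() // 2              # roughly half of the weights
loaded = cached.partial_load_to_vram(budget)
print(f"{loaded}/{cached.total_bytes()} bytes loaded; cur_vram_bytes={cached.cur_vram_bytes()}")

cached.partial_unload_from_vram(loaded)         # free what was just loaded
```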

View File

@@ -1,11 +1,9 @@
# Copyright (c) 2024 Lincoln D. Stein and the InvokeAI Development team
# TODO: Add Stalker's proper name to copyright
""" """
import gc
import math
import time
from contextlib import suppress
from logging import Logger
from typing import Dict, List, Optional
@@ -13,13 +11,8 @@ import torch
from invokeai.backend.model_manager import AnyModel, SubModelType
from invokeai.backend.model_manager.load.memory_snapshot import MemorySnapshot, get_pretty_snapshot_diff
from invokeai.backend.model_manager.load.model_cache.model_cache_base import (
CacheRecord,
CacheStats,
ModelCacheBase,
ModelLockerBase,
)
from invokeai.backend.model_manager.load.model_cache.model_locker import ModelLocker
from invokeai.backend.model_manager.load.model_cache.cache_record import CacheRecord
from invokeai.backend.model_manager.load.model_cache.cache_stats import CacheStats
from invokeai.backend.model_manager.load.model_util import calc_model_size_by_data
from invokeai.backend.util.devices import TorchDevice
from invokeai.backend.util.logging import InvokeAILogger
@@ -31,7 +24,14 @@ GB = 2**30
MB = 2**20
class ModelCache(ModelCacheBase[AnyModel]):
def get_model_cache_key(model_key: str, submodel_type: Optional[SubModelType] = None) -> str:
if submodel_type:
return f"{model_key}:{submodel_type.value}"
else:
return model_key
class ModelCache:
"""A cache for managing models in memory.
The cache is based on two levels of model storage:
@@ -70,7 +70,6 @@ class ModelCache(ModelCacheBase[AnyModel]):
max_vram_cache_size: float,
execution_device: torch.device = torch.device("cuda"),
storage_device: torch.device = torch.device("cpu"),
precision: torch.dtype = torch.float16,
lazy_offloading: bool = True,
log_memory_usage: bool = False,
logger: Optional[Logger] = None,
@@ -82,7 +81,6 @@ class ModelCache(ModelCacheBase[AnyModel]):
:param max_vram_cache_size: Maximum size of the execution_device cache in GBs.
:param execution_device: Torch device to load active model into [torch.device('cuda')]
:param storage_device: Torch device to save inactive model in [torch.device('cpu')]
:param precision: Precision for loaded models [torch.float16]
:param lazy_offloading: Keep model in VRAM until another model needs to be loaded
:param log_memory_usage: If True, a memory snapshot will be captured before and after every model cache
operation, and the result will be logged (at debug level). There is a time cost to capturing the memory
@@ -100,29 +98,9 @@ class ModelCache(ModelCacheBase[AnyModel]):
self._log_memory_usage = log_memory_usage
self._stats: Optional[CacheStats] = None
self._cached_models: Dict[str, CacheRecord[AnyModel]] = {}
self._cached_models: Dict[str, CacheRecord] = {}
self._cache_stack: List[str] = []
@property
def logger(self) -> Logger:
"""Return the logger used by the cache."""
return self._logger
@property
def lazy_offloading(self) -> bool:
"""Return true if the cache is configured to lazily offload models in VRAM."""
return self._lazy_offloading
@property
def storage_device(self) -> torch.device:
"""Return the storage device (e.g. "CPU" for RAM)."""
return self._storage_device
@property
def execution_device(self) -> torch.device:
"""Return the exection device (e.g. "cuda" for VRAM)."""
return self._execution_device
@property
def max_cache_size(self) -> float:
"""Return the cap on cache size."""
@@ -153,49 +131,26 @@ class ModelCache(ModelCacheBase[AnyModel]):
"""Set the CacheStats object for collectin cache statistics."""
self._stats = stats
def cache_size(self) -> int:
"""Get the total size of the models currently cached."""
total = 0
for cache_record in self._cached_models.values():
total += cache_record.size
return total
def put(
self,
key: str,
model: AnyModel,
submodel_type: Optional[SubModelType] = None,
) -> None:
"""Store model under key and optional submodel_type."""
key = self._make_cache_key(key, submodel_type)
def put(self, key: str, model: AnyModel) -> None:
if key in self._cached_models:
return
size = calc_model_size_by_data(self.logger, model)
size = calc_model_size_by_data(self._logger, model)
self.make_room(size)
running_on_cpu = self.execution_device == torch.device("cpu")
running_on_cpu = self._execution_device == torch.device("cpu")
state_dict = model.state_dict() if isinstance(model, torch.nn.Module) and not running_on_cpu else None
cache_record = CacheRecord(key=key, model=model, device=self.storage_device, state_dict=state_dict, size=size)
cache_record = CacheRecord(key=key, model=model, device=self._storage_device, state_dict=state_dict, size=size)
self._cached_models[key] = cache_record
self._cache_stack.append(key)
def get(
self,
key: str,
submodel_type: Optional[SubModelType] = None,
stats_name: Optional[str] = None,
) -> ModelLockerBase:
"""
Retrieve model using key and optional submodel_type.
def get(self, key: str, stats_name: Optional[str] = None) -> CacheRecord:
"""Retrieve a model from the cache.
:param key: Opaque model key
:param submodel_type: Type of the submodel to fetch
:param stats_name: A human-readable id for the model for the purposes of
stats reporting.
:param key: Model key
:param stats_name: A human-readable id for the model for the purposes of stats reporting.
This may raise an IndexError if the model is not in the cache.
Raises IndexError if the model is not in the cache.
"""
key = self._make_cache_key(key, submodel_type)
if key in self._cached_models:
if self.stats:
self.stats.hits += 1
@@ -210,20 +165,52 @@ class ModelCache(ModelCacheBase[AnyModel]):
if self.stats:
stats_name = stats_name or key
self.stats.cache_size = int(self._max_cache_size * GB)
self.stats.high_watermark = max(self.stats.high_watermark, self.cache_size())
self.stats.high_watermark = max(self.stats.high_watermark, self._get_cache_size())
self.stats.in_cache = len(self._cached_models)
self.stats.loaded_model_sizes[stats_name] = max(
self.stats.loaded_model_sizes.get(stats_name, 0), cache_entry.size
)
# this moves the entry to the top (right end) of the stack
with suppress(Exception):
self._cache_stack.remove(key)
self._cache_stack = [k for k in self._cache_stack if k != key]
self._cache_stack.append(key)
return ModelLocker(
cache=self,
cache_entry=cache_entry,
)
return cache_entry
def lock(self, key: str) -> None:
"""Lock a model for use and move it into VRAM."""
cache_entry = self._cached_models[key]
cache_entry.lock()
try:
if self._lazy_offloading:
self._offload_unlocked_models(cache_entry.size)
self._move_model_to_device(cache_entry, self._execution_device)
cache_entry.loaded = True
self._logger.debug(f"Locking {cache_entry.key} in {self._execution_device}")
self._print_cuda_stats()
except torch.cuda.OutOfMemoryError:
self._logger.warning("Insufficient GPU memory to load model. Aborting")
cache_entry.unlock()
raise
except Exception:
cache_entry.unlock()
raise
def unlock(self, key: str) -> None:
"""Unlock a model."""
cache_entry = self._cached_models[key]
cache_entry.unlock()
if not self._lazy_offloading:
self._offload_unlocked_models(0)
self._print_cuda_stats()
def _get_cache_size(self) -> int:
"""Get the total size of the models currently cached."""
total = 0
for cache_record in self._cached_models.values():
total += cache_record.size
return total
def _capture_memory_snapshot(self) -> Optional[MemorySnapshot]:
if self._log_memory_usage:
@@ -236,30 +223,30 @@ class ModelCache(ModelCacheBase[AnyModel]):
else:
return model_key
def offload_unlocked_models(self, size_required: int) -> None:
def _offload_unlocked_models(self, size_required: int) -> None:
"""Offload models from the execution_device to make room for size_required.
:param size_required: The amount of space to clear in the execution_device cache, in bytes.
"""
reserved = self._max_vram_cache_size * GB
vram_in_use = torch.cuda.memory_allocated() + size_required
self.logger.debug(f"{(vram_in_use/GB):.2f}GB VRAM needed for models; max allowed={(reserved/GB):.2f}GB")
self._logger.debug(f"{(vram_in_use/GB):.2f}GB VRAM needed for models; max allowed={(reserved/GB):.2f}GB")
for _, cache_entry in sorted(self._cached_models.items(), key=lambda x: x[1].size):
if vram_in_use <= reserved:
break
if not cache_entry.loaded:
continue
if not cache_entry.locked:
self.move_model_to_device(cache_entry, self.storage_device)
if not cache_entry.is_locked:
self._move_model_to_device(cache_entry, self._storage_device)
cache_entry.loaded = False
vram_in_use = torch.cuda.memory_allocated() + size_required
self.logger.debug(
self._logger.debug(
f"Removing {cache_entry.key} from VRAM to free {(cache_entry.size/GB):.2f}GB; vram free = {(torch.cuda.memory_allocated()/GB):.2f}GB"
)
TorchDevice.empty_cache()
def move_model_to_device(self, cache_entry: CacheRecord[AnyModel], target_device: torch.device) -> None:
def _move_model_to_device(self, cache_entry: CacheRecord, target_device: torch.device) -> None:
"""Move model into the indicated device.
:param cache_entry: The CacheRecord for the model
@@ -267,7 +254,7 @@ class ModelCache(ModelCacheBase[AnyModel]):
May raise a torch.cuda.OutOfMemoryError
"""
self.logger.debug(f"Called to move {cache_entry.key} to {target_device}")
self._logger.debug(f"Called to move {cache_entry.key} to {target_device}")
source_device = cache_entry.device
# Note: We compare device types only so that 'cuda' == 'cuda:0'.
@@ -294,7 +281,7 @@ class ModelCache(ModelCacheBase[AnyModel]):
try:
if cache_entry.state_dict is not None:
assert hasattr(cache_entry.model, "load_state_dict")
if target_device == self.storage_device:
if target_device == self._storage_device:
cache_entry.model.load_state_dict(cache_entry.state_dict, assign=True)
else:
new_dict: Dict[str, torch.Tensor] = {}
@@ -309,7 +296,7 @@ class ModelCache(ModelCacheBase[AnyModel]):
snapshot_after = self._capture_memory_snapshot()
end_model_to_time = time.time()
self.logger.debug(
self._logger.debug(
f"Moved model '{cache_entry.key}' from {source_device} to"
f" {target_device} in {(end_model_to_time-start_model_to_time):.2f}s."
f"Estimated model size: {(cache_entry.size/GB):.3f} GB."
@@ -331,7 +318,7 @@ class ModelCache(ModelCacheBase[AnyModel]):
rel_tol=0.1,
abs_tol=10 * MB,
):
self.logger.debug(
self._logger.debug(
f"Moving model '{cache_entry.key}' from {source_device} to"
f" {target_device} caused an unexpected change in VRAM usage. The model's"
" estimated size may be incorrect. Estimated model size:"
@@ -339,24 +326,24 @@ class ModelCache(ModelCacheBase[AnyModel]):
f"{get_pretty_snapshot_diff(snapshot_before, snapshot_after)}"
)
def print_cuda_stats(self) -> None:
def _print_cuda_stats(self) -> None:
"""Log CUDA diagnostics."""
vram = "%4.2fG" % (torch.cuda.memory_allocated() / GB)
ram = "%4.2fG" % (self.cache_size() / GB)
ram = "%4.2fG" % (self._get_cache_size() / GB)
in_ram_models = 0
in_vram_models = 0
locked_in_vram_models = 0
for cache_record in self._cached_models.values():
if hasattr(cache_record.model, "device"):
if cache_record.model.device == self.storage_device:
if cache_record.model.device == self._storage_device:
in_ram_models += 1
else:
in_vram_models += 1
if cache_record.locked:
if cache_record.is_locked:
locked_in_vram_models += 1
self.logger.debug(
self._logger.debug(
f"Current VRAM/RAM usage: {vram}/{ram}; models_in_ram/models_in_vram(locked) ="
f" {in_ram_models}/{in_vram_models}({locked_in_vram_models})"
)
@@ -369,16 +356,16 @@ class ModelCache(ModelCacheBase[AnyModel]):
garbage-collected.
"""
bytes_needed = size
maximum_size = self.max_cache_size * GB # stored in GB, convert to bytes
current_size = self.cache_size()
maximum_size = self._max_cache_size * GB # stored in GB, convert to bytes
current_size = self._get_cache_size()
if current_size + bytes_needed > maximum_size:
self.logger.debug(
self._logger.debug(
f"Max cache size exceeded: {(current_size/GB):.2f}/{self.max_cache_size:.2f} GB, need an additional"
f" {(bytes_needed/GB):.2f} GB"
)
self.logger.debug(f"Before making_room: cached_models={len(self._cached_models)}")
self._logger.debug(f"Before making_room: cached_models={len(self._cached_models)}")
pos = 0
models_cleared = 0
@@ -386,12 +373,12 @@ class ModelCache(ModelCacheBase[AnyModel]):
model_key = self._cache_stack[pos]
cache_entry = self._cached_models[model_key]
device = cache_entry.model.device if hasattr(cache_entry.model, "device") else None
self.logger.debug(
self._logger.debug(
f"Model: {model_key}, locks: {cache_entry._locks}, device: {device}, loaded: {cache_entry.loaded}"
)
if not cache_entry.locked:
self.logger.debug(
if not cache_entry.is_locked:
self._logger.debug(
f"Removing {model_key} from RAM cache to free at least {(size/GB):.2f} GB (-{(cache_entry.size/GB):.2f} GB)"
)
current_size -= cache_entry.size
@@ -419,8 +406,8 @@ class ModelCache(ModelCacheBase[AnyModel]):
gc.collect()
TorchDevice.empty_cache()
self.logger.debug(f"After making room: cached_models={len(self._cached_models)}")
self._logger.debug(f"After making room: cached_models={len(self._cached_models)}")
def _delete_cache_entry(self, cache_entry: CacheRecord[AnyModel]) -> None:
def _delete_cache_entry(self, cache_entry: CacheRecord) -> None:
self._cache_stack.remove(cache_entry.key)
del self._cached_models[cache_entry.key]
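
Pulling the pieces above together, a hedged sketch of the caller-side flow after this refactor, mirroring ModelLoader._load_and_cache and the LoadedModel changes shown earlier. `cache` is assumed to be an already-constructed ModelCache instance; the key, stats name, and stand-in model are illustrative.

```python
import torch

# `cache` is assumed to be an existing ModelCache; get_model_cache_key is defined above.
key = get_model_cache_key("my-model-key")        # no submodel type -> key is unchanged

try:
    record = cache.get(key, stats_name="sdxl:main:vae")
except IndexError:
    cache.put(key, model=torch.nn.Linear(8, 8))  # stand-in for a freshly loaded model
    record = cache.get(key, stats_name="sdxl:main:vae")

cache.lock(key)           # moves the model to the execution device and pins it
try:
    model = record.model  # run inference with the locked model here
finally:
    cache.unlock(key)     # allows lazy offloading to reclaim VRAM later
```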

View File

@@ -1,221 +0,0 @@
# Copyright (c) 2024 Lincoln D. Stein and the InvokeAI Development team
# TODO: Add Stalker's proper name to copyright
"""
Manage a RAM cache of diffusion/transformer models for fast switching.
They are moved between GPU VRAM and CPU RAM as necessary. If the cache
grows larger than a preset maximum, then the least recently used
model will be cleared and (re)loaded from disk when next needed.
"""
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from logging import Logger
from typing import Dict, Generic, Optional, TypeVar
import torch
from invokeai.backend.model_manager.config import AnyModel, SubModelType
class ModelLockerBase(ABC):
"""Base class for the model locker used by the loader."""
@abstractmethod
def lock(self) -> AnyModel:
"""Lock the contained model and move it into VRAM."""
pass
@abstractmethod
def unlock(self) -> None:
"""Unlock the contained model, and remove it from VRAM."""
pass
@abstractmethod
def get_state_dict(self) -> Optional[Dict[str, torch.Tensor]]:
"""Return the state dict (if any) for the cached model."""
pass
@property
@abstractmethod
def model(self) -> AnyModel:
"""Return the model."""
pass
T = TypeVar("T")
@dataclass
class CacheRecord(Generic[T]):
"""
Elements of the cache:
key: Unique key for each model, same as used in the models database.
model: Model in memory.
state_dict: A read-only copy of the model's state dict in RAM. It will be
used as a template for creating a copy in the VRAM.
size: Size of the model
loaded: True if the model's state dict is currently in VRAM
Before a model is executed, the state_dict template is copied into VRAM,
and then injected into the model. When the model is finished, the VRAM
copy of the state dict is deleted, and the RAM version is reinjected
into the model.
The state_dict should be treated as a read-only attribute. Do not attempt
to patch or otherwise modify it. Instead, patch the copy of the state_dict
after it is loaded into the execution device (e.g. CUDA) using the `LoadedModel`
context manager call `model_on_device()`.
"""
key: str
model: T
device: torch.device
state_dict: Optional[Dict[str, torch.Tensor]]
size: int
loaded: bool = False
_locks: int = 0
def lock(self) -> None:
"""Lock this record."""
self._locks += 1
def unlock(self) -> None:
"""Unlock this record."""
self._locks -= 1
assert self._locks >= 0
@property
def locked(self) -> bool:
"""Return true if record is locked."""
return self._locks > 0
@dataclass
class CacheStats(object):
"""Collect statistics on cache performance."""
hits: int = 0 # cache hits
misses: int = 0 # cache misses
high_watermark: int = 0 # amount of cache used
in_cache: int = 0 # number of models in cache
cleared: int = 0 # number of models cleared to make space
cache_size: int = 0 # total size of cache
loaded_model_sizes: Dict[str, int] = field(default_factory=dict)
class ModelCacheBase(ABC, Generic[T]):
"""Virtual base class for RAM model cache."""
@property
@abstractmethod
def storage_device(self) -> torch.device:
"""Return the storage device (e.g. "CPU" for RAM)."""
pass
@property
@abstractmethod
def execution_device(self) -> torch.device:
"""Return the exection device (e.g. "cuda" for VRAM)."""
pass
@property
@abstractmethod
def lazy_offloading(self) -> bool:
"""Return true if the cache is configured to lazily offload models in VRAM."""
pass
@property
@abstractmethod
def max_cache_size(self) -> float:
"""Return the maximum size the RAM cache can grow to."""
pass
@max_cache_size.setter
@abstractmethod
def max_cache_size(self, value: float) -> None:
"""Set the cap on vram cache size."""
@property
@abstractmethod
def max_vram_cache_size(self) -> float:
"""Return the maximum size the VRAM cache can grow to."""
pass
@max_vram_cache_size.setter
@abstractmethod
def max_vram_cache_size(self, value: float) -> float:
"""Set the maximum size the VRAM cache can grow to."""
pass
@abstractmethod
def offload_unlocked_models(self, size_required: int) -> None:
"""Offload from VRAM any models not actively in use."""
pass
@abstractmethod
def move_model_to_device(self, cache_entry: CacheRecord[AnyModel], target_device: torch.device) -> None:
"""Move model into the indicated device."""
pass
@property
@abstractmethod
def stats(self) -> Optional[CacheStats]:
"""Return collected CacheStats object."""
pass
@stats.setter
@abstractmethod
def stats(self, stats: CacheStats) -> None:
"""Set the CacheStats object for collectin cache statistics."""
pass
@property
@abstractmethod
def logger(self) -> Logger:
"""Return the logger used by the cache."""
pass
@abstractmethod
def make_room(self, size: int) -> None:
"""Make enough room in the cache to accommodate a new model of indicated size."""
pass
@abstractmethod
def put(
self,
key: str,
model: T,
submodel_type: Optional[SubModelType] = None,
) -> None:
"""Store model under key and optional submodel_type."""
pass
@abstractmethod
def get(
self,
key: str,
submodel_type: Optional[SubModelType] = None,
stats_name: Optional[str] = None,
) -> ModelLockerBase:
"""
Retrieve model using key and optional submodel_type.
:param key: Opaque model key
:param submodel_type: Type of the submodel to fetch
:param stats_name: A human-readable id for the model for the purposes of
stats reporting.
This may raise an IndexError if the model is not in the cache.
"""
pass
@abstractmethod
def cache_size(self) -> int:
"""Get the total size of the models currently cached."""
pass
@abstractmethod
def print_cuda_stats(self) -> None:
"""Log debugging information on CUDA usage."""
pass

View File

@@ -1,64 +0,0 @@
"""
Base class and implementation of a class that moves models in and out of VRAM.
"""
from typing import Dict, Optional
import torch
from invokeai.backend.model_manager import AnyModel
from invokeai.backend.model_manager.load.model_cache.model_cache_base import (
CacheRecord,
ModelCacheBase,
ModelLockerBase,
)
class ModelLocker(ModelLockerBase):
"""Internal class that mediates movement in and out of GPU."""
def __init__(self, cache: ModelCacheBase[AnyModel], cache_entry: CacheRecord[AnyModel]):
"""
Initialize the model locker.
:param cache: The ModelCache object
:param cache_entry: The entry in the model cache
"""
self._cache = cache
self._cache_entry = cache_entry
@property
def model(self) -> AnyModel:
"""Return the model without moving it around."""
return self._cache_entry.model
def get_state_dict(self) -> Optional[Dict[str, torch.Tensor]]:
"""Return the state dict (if any) for the cached model."""
return self._cache_entry.state_dict
def lock(self) -> AnyModel:
"""Move the model into the execution device (GPU) and lock it."""
self._cache_entry.lock()
try:
if self._cache.lazy_offloading:
self._cache.offload_unlocked_models(self._cache_entry.size)
self._cache.move_model_to_device(self._cache_entry, self._cache.execution_device)
self._cache_entry.loaded = True
self._cache.logger.debug(f"Locking {self._cache_entry.key} in {self._cache.execution_device}")
self._cache.print_cuda_stats()
except torch.cuda.OutOfMemoryError:
self._cache.logger.warning("Insufficient GPU memory to load model. Aborting")
self._cache_entry.unlock()
raise
except Exception:
self._cache_entry.unlock()
raise
return self.model
def unlock(self) -> None:
"""Call upon exit from context."""
self._cache_entry.unlock()
if not self._cache.lazy_offloading:
self._cache.offload_unlocked_models(0)
self._cache.print_cuda_stats()

View File

@@ -1,12 +1,10 @@
# Copyright (c) 2024, Lincoln D. Stein and the InvokeAI Development Team
"""Class for ControlNet model loading in InvokeAI."""
from pathlib import Path
from typing import Optional
from diffusers import ControlNetModel
import invokeai.backend.assets.model_base_conf_files as conf_file_cache
from invokeai.backend.model_manager import (
AnyModel,
AnyModelConfig,
@@ -48,20 +46,9 @@ class ControlNetLoader(GenericDiffusersLoader):
config: AnyModelConfig,
submodel_type: Optional[SubModelType] = None,
) -> AnyModel:
config_dirs = {
BaseModelType.StableDiffusion1: "controlnet_sd15",
BaseModelType.StableDiffusionXL: "controlnet_sdxl",
}
try:
config_dir = config_dirs[config.base]
except KeyError:
raise Exception(f"No configuration template known for controlnet model with base={config.base}")
if isinstance(config, ControlNetCheckpointConfig):
return ControlNetModel.from_single_file(
config.path,
config=Path(conf_file_cache.__path__[0], config_dir).as_posix(),
local_files_only=True,
torch_dtype=self._torch_dtype,
)
else:

View File

@@ -26,7 +26,7 @@ from invokeai.backend.model_manager import (
SubModelType,
)
from invokeai.backend.model_manager.load.load_default import ModelLoader
from invokeai.backend.model_manager.load.model_cache.model_cache_base import ModelCacheBase
from invokeai.backend.model_manager.load.model_cache.model_cache import ModelCache
from invokeai.backend.model_manager.load.model_loader_registry import ModelLoaderRegistry
@@ -40,7 +40,7 @@ class LoRALoader(ModelLoader):
self,
app_config: InvokeAIAppConfig,
logger: Logger,
ram_cache: ModelCacheBase[AnyModel],
ram_cache: ModelCache,
):
"""Initialize the loader."""
super().__init__(app_config, logger, ram_cache)

View File

@@ -11,7 +11,6 @@ from diffusers import (
StableDiffusionXLPipeline,
)
import invokeai.backend.assets.model_base_conf_files as conf_file_cache
from invokeai.backend.model_manager import (
AnyModel,
AnyModelConfig,
@@ -19,7 +18,6 @@ from invokeai.backend.model_manager import (
ModelFormat,
ModelType,
ModelVariantType,
SchedulerPredictionType,
SubModelType,
)
from invokeai.backend.model_manager.config import (
@@ -27,6 +25,7 @@ from invokeai.backend.model_manager.config import (
DiffusersConfigBase,
MainCheckpointConfig,
)
from invokeai.backend.model_manager.load.model_cache.model_cache import get_model_cache_key
from invokeai.backend.model_manager.load.model_loader_registry import ModelLoaderRegistry
from invokeai.backend.model_manager.load.model_loaders.generic_diffusers import GenericDiffusersLoader
from invokeai.backend.util.silence_warnings import SilenceWarnings
@@ -108,33 +107,11 @@ class StableDiffusionDiffusersModel(GenericDiffusersLoader):
ModelVariantType.Normal: StableDiffusionXLPipeline,
},
}
config_dirs = {
BaseModelType.StableDiffusion1: {
SchedulerPredictionType.Epsilon: "stable-diffusion-1.5-epsilon",
SchedulerPredictionType.VPrediction: "stable-diffusion-1.5-v_prediction",
},
BaseModelType.StableDiffusion2: {
SchedulerPredictionType.VPrediction: "stable-diffusion-2.0-v_prediction",
},
BaseModelType.StableDiffusionXL: {
SchedulerPredictionType.Epsilon: "stable-diffusion-xl-base-1.0",
},
BaseModelType.StableDiffusionXLRefiner: {
SchedulerPredictionType.Epsilon: "stable-diffusion-xl-refiner-1.0",
},
}
assert isinstance(config, MainCheckpointConfig)
try:
load_class = load_classes[config.base][config.variant]
except KeyError as e:
raise Exception(f"No diffusers pipeline known for base={config.base}, variant={config.variant}") from e
try:
config_dir = config_dirs[config.base][config.prediction_type]
except KeyError as e:
raise Exception(
f"No configuration template known for base={config.base}, prediction_type={config.prediction_type}"
) from e
# Without SilenceWarnings we get log messages like this:
# site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
@@ -144,17 +121,8 @@ class StableDiffusionDiffusersModel(GenericDiffusersLoader):
# Some weights of the model checkpoint were not used when initializing CLIPTextModelWithProjection:
# ['text_model.embeddings.position_ids']
original_config_file = self._app_config.legacy_conf_path / config.config_path
with SilenceWarnings():
pipeline = load_class.from_single_file(
config.path,
config=Path(conf_file_cache.__path__[0], config_dir).as_posix(),
original_config=original_config_file,
torch_dtype=self._torch_dtype,
local_files_only=True,
kwargs={"load_safety_checker": False},
)
pipeline = load_class.from_single_file(config.path, torch_dtype=self._torch_dtype)
if not submodel_type:
return pipeline
@@ -165,5 +133,5 @@ class StableDiffusionDiffusersModel(GenericDiffusersLoader):
if subtype == submodel_type:
continue
if submodel := getattr(pipeline, subtype.value, None):
self._ram_cache.put(config.key, submodel_type=subtype, model=submodel)
self._ram_cache.put(get_model_cache_key(config.key, subtype), model=submodel)
return getattr(pipeline, submodel_type.value)

View File

@@ -684,7 +684,6 @@ class ControlNetCheckpointProbe(CheckpointProbeBase):
"controlnet_mid_block.bias",
"input_blocks.2.1.transformer_blocks.0.attn2.to_k.weight",
"down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight",
"input_blocks.7.0.emb_layers.1.down",
):
if key_name not in checkpoint:
continue

View File

@@ -52,15 +52,16 @@ def read_checkpoint_meta(path: Union[str, Path], scan: bool = True) -> Dict[str,
except Exception:
# TODO: create issue for support "meta"?
checkpoint = safetensors.torch.load_file(path, device="cpu")
elif str(path).endswith(".gguf"):
# The GGUF reader used here uses numpy memmap, so these tensors are not loaded into memory during this function
checkpoint = gguf_sd_loader(Path(path), compute_dtype=torch.float32)
else:
if scan:
scan_result = scan_file_path(path)
if scan_result.infected_files != 0 or scan_result.scan_err:
raise Exception(f'The model file "{path}" is potentially infected by malware. Aborting import.')
checkpoint = torch.load(path, map_location=torch.device("meta"))
if str(path).endswith(".gguf"):
# The GGUF reader used here uses numpy memmap, so these tensors are not loaded into memory during this function
checkpoint = gguf_sd_loader(Path(path), compute_dtype=torch.float32)
else:
checkpoint = torch.load(path, map_location=torch.device("meta"))
return checkpoint

View File

@@ -1,3 +1,3 @@
# Invoke UI
<https://invoke-ai.github.io/InvokeAI/contributing/frontend/>
<https://invoke-ai.github.io/InvokeAI/contributing/frontend/OVERVIEW/>

View File

@@ -642,6 +642,12 @@
"remixImage": "Remix des Bilds erstellen",
"imageActions": "Weitere Bildaktionen",
"invoke": {
"layer": {
"t2iAdapterIncompatibleBboxWidth": "$t(parameters.invoke.layer.t2iAdapterRequiresDimensionsToBeMultipleOf) {{multiple}}, Bbox-Breite ist {{width}}",
"t2iAdapterIncompatibleScaledBboxWidth": "$t(parameters.invoke.layer.t2iAdapterRequiresDimensionsToBeMultipleOf) {{multiple}}, Skalierte Bbox-Breite ist {{width}}",
"t2iAdapterIncompatibleScaledBboxHeight": "$t(parameters.invoke.layer.t2iAdapterRequiresDimensionsToBeMultipleOf) {{multiple}}, Skalierte Bbox-Höhe ist {{height}}",
"t2iAdapterIncompatibleBboxHeight": "$t(parameters.invoke.layer.t2iAdapterRequiresDimensionsToBeMultipleOf) {{multiple}}, Bbox-Höhe ist {{height}}"
},
"fluxModelIncompatibleScaledBboxWidth": "$t(parameters.invoke.fluxRequiresDimensionsToBeMultipleOf16), Skalierte Bbox-Breite ist {{width}}",
"fluxModelIncompatibleScaledBboxHeight": "$t(parameters.invoke.fluxRequiresDimensionsToBeMultipleOf16), Skalierte Bbox-Höhe ist {{height}}",
"fluxModelIncompatibleBboxWidth": "$t(parameters.invoke.fluxRequiresDimensionsToBeMultipleOf16), Bbox-Breite ist {{width}}",

View File

@@ -2133,8 +2133,8 @@
"whatsNew": {
"whatsNewInInvoke": "What's New in Invoke",
"items": [
"<StrongComponent>FLUX Regional Guidance (beta)</StrongComponent>: Our beta release of FLUX Regional Guidance is live for regional prompt control.",
"<StrongComponent>Various UX Improvements</StrongComponent>: A number of small UX and Quality of Life improvements throughout the app."
"<StrongComponent>Workflows</StrongComponent>: Run a workflow for a collection of images using the new <StrongComponent>Image Batch</StrongComponent> node.",
"<StrongComponent>FLUX</StrongComponent>: Support for XLabs IP Adapter v2."
],
"readReleaseNotes": "Read Release Notes",
"watchRecentReleaseVideos": "Watch Recent Release Videos",

Some files were not shown because too many files have changed in this diff Show More