Compare commits

3 Commits

44 changed files with 371 additions and 565 deletions

View File

@@ -18,22 +18,6 @@ Note that any releases marked as _pre-release_ are in a beta state. You may expe
The Model Manager tab in the UI provides a few ways to install models, including using your already-downloaded models. You'll see a popup directing you there on first startup. For more information, see the [model install docs].
## Missing models after updating to v4
If you find some models are missing after updating to v4, it's likely they weren't correctly registered before the update and didn't get picked up in the migration.
You can use the `Scan Folder` tab in the Model Manager UI to fix this. The models will either be in the old, now-unused `autoimport` folder, or your `models` folder.
- Find and copy your install's old `autoimport` folder path, inside the main install folder.
- Go to the Model Manager and click `Scan Folder`.
- Paste the path and scan.
- IMPORTANT: Uncheck `Inplace install`.
- Click `Install All` to install all found models, or just install the models you want.
Next, find and copy your install's `models` folder path (this could be your custom models folder path, or the `models` folder inside the main install folder).
Follow the same steps to scan and import the missing models.
## Slow generation
- Check the [system requirements] to ensure that your system is capable of generating images.

View File

@@ -44,7 +44,7 @@ The installation process is simple, with a few prompts:
- Select the version to install. Unless you have a specific reason to install a specific version, select the default (the latest version).
- Select location for the install. Be sure you have enough space in this folder for the base application, as described in the [installation requirements].
- Select a GPU device.
- Select a GPU device. If you are unsure, you can let the installer figure it out.
!!! info "Slow Installation"

View File

@@ -6,7 +6,11 @@
## Introduction
InvokeAI is distributed as a python package on PyPI, installable with `pip`. There are a few things that are handled by the installer and launcher that you'll need to manage manually, described in this guide.
!!! tip "Conda"
As of InvokeAI v2.3.0 installation using the `conda` package manager is no longer being supported. It will likely still work, but we are not testing this installation method.
InvokeAI is distributed as a python package on PyPI, installable with `pip`. There are a few things that are handled by the installer that you'll need to manage manually, described in this guide.
### Requirements
@@ -36,11 +40,11 @@ Before you start, go through the [installation requirements].
1. Enter the root (invokeai) directory and create a virtual Python environment within it named `.venv`.
!!! warning "Virtual Environment Location"
!!! info "Virtual Environment Location"
While you may create the virtual environment anywhere in the file system, we recommend that you create it within the root directory as shown here. This allows the application to automatically detect its data directories.
If you choose a different location for the venv, then you _must_ set the `INVOKEAI_ROOT` environment variable or specify the root directory using the `--root` CLI arg.
If you choose a different location for the venv, then you must set the `INVOKEAI_ROOT` environment variable or pass the directory using the `--root` CLI arg.
```terminal
cd $INVOKEAI_ROOT
@@ -77,23 +81,31 @@ Before you start, go through the [installation requirements].
python3 -m pip install --upgrade pip
```
1. Install the InvokeAI Package. The base command is `pip install InvokeAI --use-pep517`, but you may need to change this depending on your system and the desired features.
1. Install the InvokeAI Package. The `--extra-index-url` option is used to select the correct `torch` backend:
- You may need to provide an [extra index URL]. Select your platform configuration using [this tool on the PyTorch website]. Copy the `--extra-index-url` string from this and append it to your install command.
=== "CUDA (NVidia)"
!!! example "Install with an extra index URL"
```bash
pip install "InvokeAI[xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu121
```
```bash
pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu121
```
=== "ROCm (AMD)"
- If you have a CUDA GPU and want to install with `xformers`, you need to add an option to the package name. Note that `xformers` is not necessary. PyTorch includes an implementation of the SDP attention algorithm with the same performance.
```bash
pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/rocm5.6
```
!!! example "Install with `xformers`"
=== "CPU (Intel Macs & non-GPU systems)"
```bash
pip install "InvokeAI[xformers]" --use-pep517
```
```bash
pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/cpu
```
=== "MPS (Apple Silicon)"
```bash
pip install InvokeAI --use-pep517
```
1. Deactivate and reactivate your runtime directory so that the invokeai-specific commands become available in the environment:
@@ -114,6 +126,37 @@ Before you start, go through the [installation requirements].
Run `invokeai-web` to start the UI. You must activate the virtual environment before running the app.
!!! warning
If the virtual environment you selected is NOT inside `INVOKEAI_ROOT`, then you must specify the path to the root directory by adding
`--root_dir \path\to\invokeai`.
If the virtual environment is _not_ inside the root directory, then you _must_ specify the path to the root directory with `--root_dir \path\to\invokeai` or the `INVOKEAI_ROOT` environment variable.
!!! tip
You can permanently set the location of the runtime directory
by setting the environment variable `INVOKEAI_ROOT` to the
path of the directory. As mentioned previously, this is
recommended if your virtual environment is located outside of
your runtime directory.
## Unsupported Conda Install
Congratulations, you found the "secret" Conda installation instructions. If you really **really** want to use Conda with InvokeAI, you can do so using this unsupported recipe:
```sh
mkdir ~/invokeai
conda create -n invokeai python=3.11
conda activate invokeai
# Adjust this as described above for the appropriate torch backend
pip install InvokeAI[xformers] --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu121
invokeai-web --root ~/invokeai
```
The `pip install` command shown in this recipe is for Linux/Windows
systems with an NVIDIA GPU. See step (6) above for the command to use
with other platforms/GPU combinations. If you don't wish to pass the
`--root` argument to `invokeai` with each launch, you may set the
environment variable `INVOKEAI_ROOT` to point to the installation directory.
Note that if you run into problems with the Conda installation, the InvokeAI
staff will **not** be able to help you out. Caveat Emptor!
[installation requirements]: INSTALL_REQUIREMENTS.md

View File

@@ -3,7 +3,6 @@
InvokeAI installer script
"""
import locale
import os
import platform
import re
@@ -317,9 +316,7 @@ def upgrade_pip(venv_path: Path) -> str | None:
python = str(venv_path.expanduser().resolve() / python)
try:
result = subprocess.check_output([python, "-m", "pip", "install", "--upgrade", "pip"]).decode(
encoding=locale.getpreferredencoding()
)
result = subprocess.check_output([python, "-m", "pip", "install", "--upgrade", "pip"]).decode()
except subprocess.CalledProcessError as e:
print(e)
result = None
@@ -407,29 +404,22 @@ def get_torch_source() -> Tuple[str | None, str | None]:
# device can be one of: "cuda", "rocm", "cpu", "cuda_and_dml", "autodetect"
device = select_gpu()
# The correct extra index URLs for torch are inconsistent, see https://pytorch.org/get-started/locally/#start-locally
url = None
optional_modules: str | None = None
optional_modules = "[onnx]"
if OS == "Linux":
if device.value == "rocm":
url = "https://download.pytorch.org/whl/rocm5.6"
elif device.value == "cpu":
url = "https://download.pytorch.org/whl/cpu"
elif device.value == "cuda":
# CUDA uses the default PyPi index
optional_modules = "[xformers,onnx-cuda]"
elif OS == "Windows":
if device.value == "cuda":
url = "https://download.pytorch.org/whl/cu121"
optional_modules = "[xformers,onnx-cuda]"
elif device.value == "cpu":
# CPU uses the default PyPi index, no optional modules
pass
elif OS == "Darwin":
# macOS uses the default PyPi index, no optional modules
pass
if device.value == "cuda_and_dml":
url = "https://download.pytorch.org/whl/cu121"
optional_modules = "[xformers,onnx-directml]"
# Fall back to defaults
# in all other cases, Torch wheels should be coming from PyPi as of Torch 1.13
return (url, optional_modules)
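
As a rough illustration of how the `(url, optional_modules)` pair returned by `get_torch_source()` might be consumed, the sketch below assembles a `pip` argument list. The helper name `build_pip_args` is hypothetical and is not part of the installer. The flags mirror the `pip install ... --use-pep517 --extra-index-url ...` commands shown in the manual install guide.

```python
from typing import List, Optional


def build_pip_args(url: Optional[str], optional_modules: Optional[str]) -> List[str]:
    """Hypothetical helper: turn a torch index URL and an extras string into pip arguments."""
    # Extras such as "[xformers,onnx-cuda]" are appended directly to the package name.
    package = "InvokeAI" + (optional_modules or "")
    args = ["pip", "install", package, "--use-pep517"]
    # A URL of None means the default PyPI index is used, so no extra flag is added.
    if url is not None:
        args += ["--extra-index-url", url]
    return args
```

Returning a list rather than a single string keeps the arguments safe to hand to `subprocess` without shell quoting.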

View File

@@ -207,8 +207,10 @@ def dest_path(dest: Optional[str | Path] = None) -> Path | None:
class GpuType(Enum):
CUDA = "cuda"
CUDA_AND_DML = "cuda_and_dml"
ROCM = "rocm"
CPU = "cpu"
AUTODETECT = "autodetect"
def select_gpu() -> GpuType:
@@ -224,6 +226,10 @@ def select_gpu() -> GpuType:
"an [gold1 b]NVIDIA[/] GPU (using CUDA™)",
GpuType.CUDA,
)
nvidia_with_dml = (
"an [gold1 b]NVIDIA[/] GPU (using CUDA™, and DirectML™ for ONNX) -- ALPHA",
GpuType.CUDA_AND_DML,
)
amd = (
"an [gold1 b]AMD[/] GPU (using ROCm™)",
GpuType.ROCM,
@@ -232,19 +238,27 @@ def select_gpu() -> GpuType:
"Do not install any GPU support, use CPU for generation (slow)",
GpuType.CPU,
)
autodetect = (
"I'm not sure what to choose",
GpuType.AUTODETECT,
)
options = []
if OS == "Windows":
options = [nvidia, cpu]
options = [nvidia, nvidia_with_dml, cpu]
if OS == "Linux":
options = [nvidia, amd, cpu]
elif OS == "Darwin":
options = [cpu]
# future CoreML?
if len(options) == 1:
print(f'Your platform [gold1]{OS}-{ARCH}[/] only supports the "{options[0][1]}" driver. Proceeding with that.')
return options[0][1]
# "I don't know" is always added as the last option
options.append(autodetect) # type: ignore
options = {str(i): opt for i, opt in enumerate(options, 1)}
console.rule(":space_invader: GPU (Graphics Card) selection :space_invader:")
@@ -278,6 +292,11 @@ def select_gpu() -> GpuType:
),
)
if options[choice][1] is GpuType.AUTODETECT:
console.print(
"No problem. We will install CUDA support first :crossed_fingers: If Invoke does not detect a GPU, please re-run the installer and select one of the other GPU types."
)
return options[choice][1]

View File

@@ -219,13 +219,28 @@ async def scan_for_models(
non_core_model_paths = [p for p in found_model_paths if not p.is_relative_to(core_models_path)]
installed_models = ApiDependencies.invoker.services.model_manager.store.search_by_attr()
resolved_installed_model_paths: list[str] = []
installed_model_sources: list[str] = []
# This call lists all installed models.
for model in installed_models:
path = pathlib.Path(model.path)
# If the model has a source, we need to add it to the list of installed sources.
if model.source:
installed_model_sources.append(model.source)
# If the path is not absolute, that means it is in the app models directory, and we need to join it with
# the models path before resolving.
if not path.is_absolute():
resolved_installed_model_paths.append(str(pathlib.Path(models_path, path).resolve()))
continue
resolved_installed_model_paths.append(str(path.resolve()))
scan_results: list[FoundModel] = []
# Check if the model is installed by comparing paths, appending to the scan result.
# Check if the model is installed by comparing the resolved paths, appending to the scan result.
for p in non_core_model_paths:
path = str(p)
is_installed = any(str(models_path / m.path) == path for m in installed_models)
is_installed = path in resolved_installed_model_paths or path in installed_model_sources
found_model = FoundModel(path=path, is_installed=is_installed)
scan_results.append(found_model)
except Exception as e:
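
The installed-model check in this hunk can be read in isolation as the following minimal sketch. It assumes installed models are given as plain `(path, source)` tuples instead of model config objects, and the function name `mark_installed` is hypothetical. Relative paths are treated as living under the models directory, as the comments in the hunk describe.

```python
from pathlib import Path
from typing import Iterable, List, Optional, Tuple


def mark_installed(
    found_paths: Iterable[Path],
    installed: Iterable[Tuple[str, Optional[str]]],
    models_path: Path,
) -> List[Tuple[str, bool]]:
    """Hypothetical helper: flag each scanned path as installed or not."""
    resolved_paths = set()
    sources = set()
    for model_path, source in installed:
        if source:
            sources.add(source)
        path = Path(model_path)
        # A relative path is assumed to be inside the app models directory.
        if not path.is_absolute():
            path = models_path / path
        resolved_paths.add(str(path.resolve()))

    results = []
    for found in found_paths:
        path = str(found)
        # A model counts as installed if either its resolved path or its source matches.
        results.append((path, path in resolved_paths or path in sources))
    return results
```

Collecting the resolved paths once up front avoids re-resolving every installed model for each scanned file.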

View File

@@ -1,22 +1,21 @@
from builtins import float
from typing import List, Literal, Union
from typing import List, Union
from pydantic import BaseModel, Field, field_validator, model_validator
from typing_extensions import Self
from invokeai.app.invocations.baseinvocation import BaseInvocation, BaseInvocationOutput, invocation, invocation_output
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, OutputField, UIType
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.invocations.primitives import ImageField
from invokeai.app.invocations.util import validate_begin_end_step, validate_weights
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager.config import (
AnyModelConfig,
BaseModelType,
IPAdapterCheckpointConfig,
IPAdapterInvokeAIConfig,
ModelType,
)
from invokeai.backend.model_manager.config import AnyModelConfig, BaseModelType, IPAdapterConfig, ModelType
class IPAdapterField(BaseModel):
@@ -49,15 +48,12 @@ class IPAdapterOutput(BaseInvocationOutput):
ip_adapter: IPAdapterField = OutputField(description=FieldDescriptions.ip_adapter, title="IP-Adapter")
CLIP_VISION_MODEL_MAP = {"ViT-H": "ip_adapter_sd_image_encoder", "ViT-G": "ip_adapter_sdxl_image_encoder"}
@invocation("ip_adapter", title="IP-Adapter", tags=["ip_adapter", "control"], category="ip_adapter", version="1.2.2")
class IPAdapterInvocation(BaseInvocation):
"""Collects IP-Adapter info to pass to other nodes."""
# Inputs
image: Union[ImageField, List[ImageField]] = InputField(description="The IP-Adapter image prompt(s).", ui_order=1)
image: Union[ImageField, List[ImageField]] = InputField(description="The IP-Adapter image prompt(s).")
ip_adapter_model: ModelIdentifierField = InputField(
description="The IP-Adapter model.",
title="IP-Adapter Model",
@@ -65,11 +61,7 @@ class IPAdapterInvocation(BaseInvocation):
ui_order=-1,
ui_type=UIType.IPAdapterModel,
)
clip_vision_model: Literal["auto", "ViT-H", "ViT-G"] = InputField(
description="CLIP Vision model to use. Overrides model settings. Mandatory for checkpoint models.",
default="auto",
ui_order=2,
)
weight: Union[float, List[float]] = InputField(
default=1, description="The weight given to the IP-Adapter", title="Weight"
)
@@ -94,21 +86,10 @@ class IPAdapterInvocation(BaseInvocation):
def invoke(self, context: InvocationContext) -> IPAdapterOutput:
# Lookup the CLIP Vision encoder that is intended to be used with the IP-Adapter model.
ip_adapter_info = context.models.get_config(self.ip_adapter_model.key)
assert isinstance(ip_adapter_info, (IPAdapterInvokeAIConfig, IPAdapterCheckpointConfig))
if self.clip_vision_model == "auto":
if isinstance(ip_adapter_info, IPAdapterInvokeAIConfig):
image_encoder_model_id = ip_adapter_info.image_encoder_model_id
image_encoder_model_name = image_encoder_model_id.split("/")[-1].strip()
else:
raise RuntimeError(
"You need to set the appropriate CLIP Vision model for checkpoint IP Adapter models."
)
else:
image_encoder_model_name = CLIP_VISION_MODEL_MAP[self.clip_vision_model]
assert isinstance(ip_adapter_info, IPAdapterConfig)
image_encoder_model_id = ip_adapter_info.image_encoder_model_id
image_encoder_model_name = image_encoder_model_id.split("/")[-1].strip()
image_encoder_model = self._get_image_encoder(context, image_encoder_model_name)
return IPAdapterOutput(
ip_adapter=IPAdapterField(
image=self.image,
@@ -121,25 +102,19 @@ class IPAdapterInvocation(BaseInvocation):
)
def _get_image_encoder(self, context: InvocationContext, image_encoder_model_name: str) -> AnyModelConfig:
image_encoder_models = context.models.search_by_attrs(
name=image_encoder_model_name, base=BaseModelType.Any, type=ModelType.CLIPVision
)
if not len(image_encoder_models) > 0:
context.logger.warning(
f"The image encoder required by this IP Adapter ({image_encoder_model_name}) is not installed. \
Downloading and installing now. This may take a while."
)
installer = context._services.model_manager.install
job = installer.heuristic_import(f"InvokeAI/{image_encoder_model_name}")
installer.wait_for_job(job, timeout=600) # Wait for up to 10 minutes
found = False
while not found:
image_encoder_models = context.models.search_by_attrs(
name=image_encoder_model_name, base=BaseModelType.Any, type=ModelType.CLIPVision
)
if len(image_encoder_models) == 0:
context.logger.error("Error while fetching CLIP Vision Image Encoder")
assert len(image_encoder_models) == 1
found = len(image_encoder_models) > 0
if not found:
context.logger.warning(
f"The image encoder required by this IP Adapter ({image_encoder_model_name}) is not installed."
)
context.logger.warning("Downloading and installing now. This may take a while.")
installer = context._services.model_manager.install
job = installer.heuristic_import(f"InvokeAI/{image_encoder_model_name}")
installer.wait_for_job(job, timeout=600) # wait up to 10 minutes - then raise a TimeoutException
assert len(image_encoder_models) == 1
return image_encoder_models[0]
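
The lookup in `_get_image_encoder` follows a search, install if missing, then search again pattern. The sketch below shows that pattern with two hypothetical callables, `search` and `install`, standing in for `context.models.search_by_attrs(...)` and the installer calls. It makes no claim about the real service APIs beyond what the hunk shows.

```python
from typing import Callable, List


def find_or_install(search: Callable[[], List[str]], install: Callable[[], None]) -> str:
    """Hypothetical helper: return the single match, installing it first if it is missing."""
    matches = search()
    if not matches:
        # `install` is assumed to block until the job finishes, or to raise on timeout.
        install()
        # Search again so the newly installed model is picked up.
        matches = search()
    if len(matches) != 1:
        raise RuntimeError(f"expected exactly one match, found {len(matches)}")
    return matches[0]
```

Raising an explicit error instead of using `assert` keeps the check active when Python runs with assertions disabled.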

View File

@@ -43,7 +43,11 @@ from invokeai.app.invocations.fields import (
WithMetadata,
)
from invokeai.app.invocations.ip_adapter import IPAdapterField
from invokeai.app.invocations.primitives import DenoiseMaskOutput, ImageOutput, LatentsOutput
from invokeai.app.invocations.primitives import (
DenoiseMaskOutput,
ImageOutput,
LatentsOutput,
)
from invokeai.app.invocations.t2i_adapter import T2IAdapterField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.controlnet_utils import prepare_control_image
@@ -64,7 +68,12 @@ from ...backend.stable_diffusion.diffusers_pipeline import (
)
from ...backend.stable_diffusion.schedulers import SCHEDULER_MAP
from ...backend.util.devices import choose_precision, choose_torch_device
from .baseinvocation import BaseInvocation, BaseInvocationOutput, invocation, invocation_output
from .baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
invocation,
invocation_output,
)
from .controlnet_image_processors import ControlField
from .model import ModelIdentifierField, UNetField, VAEField

View File

@@ -2,8 +2,16 @@ from typing import Any, Literal, Optional, Union
from pydantic import BaseModel, ConfigDict, Field
from invokeai.app.invocations.baseinvocation import BaseInvocation, BaseInvocationOutput, invocation, invocation_output
from invokeai.app.invocations.controlnet_image_processors import CONTROLNET_MODE_VALUES, CONTROLNET_RESIZE_VALUES
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
invocation,
invocation_output,
)
from invokeai.app.invocations.controlnet_image_processors import (
CONTROLNET_MODE_VALUES,
CONTROLNET_RESIZE_VALUES,
)
from invokeai.app.invocations.fields import (
FieldDescriptions,
ImageField,
@@ -35,7 +43,6 @@ class IPAdapterMetadataField(BaseModel):
image: ImageField = Field(description="The IP-Adapter image prompt.")
ip_adapter_model: ModelIdentifierField = Field(description="The IP-Adapter model.")
clip_vision_model: Literal["ViT-H", "ViT-G"] = Field(description="The CLIP Vision model")
weight: Union[float, list[float]] = Field(description="The weight given to the IP-Adapter")
begin_step_percent: float = Field(description="When the IP-Adapter is first applied (% of total steps)")
end_step_percent: float = Field(description="When the IP-Adapter is last applied (% of total steps)")

View File

@@ -3,7 +3,6 @@
from __future__ import annotations
import locale
import os
import re
import shutil
@@ -318,10 +317,11 @@ class InvokeAIAppConfig(BaseSettings):
@staticmethod
def find_root() -> Path:
"""Choose the runtime root directory when not specified on command line or init file."""
venv = Path(os.environ.get("VIRTUAL_ENV") or ".")
if os.environ.get("INVOKEAI_ROOT"):
root = Path(os.environ["INVOKEAI_ROOT"])
elif venv := os.environ.get("VIRTUAL_ENV", None):
root = Path(venv).parent.resolve()
elif any((venv.parent / x).exists() for x in [INIT_FILE, LEGACY_INIT_FILE]):
root = (venv.parent).resolve()
else:
root = Path("~/invokeai").expanduser().resolve()
return root
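
One of the lookup orders visible in this hunk (environment variable first, then the parent of the active virtual environment, then `~/invokeai`) can be sketched as a pure function. The name `find_root_sketch` and the `env` parameter are illustrative assumptions. The real method reads `os.environ` directly, and the other side of the diff additionally checks for init files.

```python
from pathlib import Path
from typing import Mapping


def find_root_sketch(env: Mapping[str, str]) -> Path:
    """Hypothetical helper: pick a root directory from an environment mapping."""
    # 1. An explicit INVOKEAI_ROOT always wins.
    if env.get("INVOKEAI_ROOT"):
        return Path(env["INVOKEAI_ROOT"])
    # 2. Otherwise assume the venv lives inside the root directory.
    if env.get("VIRTUAL_ENV"):
        return Path(env["VIRTUAL_ENV"]).parent.resolve()
    # 3. Fall back to ~/invokeai.
    return Path("~/invokeai").expanduser().resolve()
```

Passing the environment in as a mapping makes the precedence easy to exercise without touching the real process environment.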
@@ -402,7 +402,7 @@ def load_and_migrate_config(config_path: Path) -> InvokeAIAppConfig:
An instance of `InvokeAIAppConfig` with the loaded and migrated settings.
"""
assert config_path.suffix == ".yaml"
with open(config_path, "rt", encoding=locale.getpreferredencoding()) as file:
with open(config_path) as file:
loaded_config_dict = yaml.safe_load(file)
assert isinstance(loaded_config_dict, dict)

View File

@@ -1,6 +1,5 @@
"""Model installation class."""
import locale
import os
import re
import signal
@@ -324,8 +323,7 @@ class ModelInstallService(ModelInstallServiceBase):
legacy_models_yaml_path = Path(self._app_config.root_path, legacy_models_yaml_path)
if legacy_models_yaml_path.exists():
with open(legacy_models_yaml_path, "rt", encoding=locale.getpreferredencoding()) as file:
legacy_models_yaml = yaml.safe_load(file)
legacy_models_yaml = yaml.safe_load(legacy_models_yaml_path.read_text())
yaml_metadata = legacy_models_yaml.pop("__metadata__")
yaml_version = yaml_metadata.get("version")
@@ -566,7 +564,7 @@ class ModelInstallService(ModelInstallServiceBase):
# The model is not in the models directory - we don't need to move it.
return model
new_path = models_dir / model.base.value / model.type.value / old_path.name
new_path = (models_dir / model.base.value / model.type.value / model.name).with_suffix(old_path.suffix)
if old_path == new_path or new_path.exists() and old_path == new_path.resolve():
return model

View File

@@ -1,11 +1,8 @@
# copied from https://github.com/tencent-ailab/IP-Adapter (Apache License 2.0)
# and modified as needed
import pathlib
from typing import List, Optional, TypedDict, Union
from typing import Optional, Union
import safetensors
import safetensors.torch
import torch
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection
@@ -16,17 +13,10 @@ from ..raw_model import RawModel
from .resampler import Resampler
class IPAdapterStateDict(TypedDict):
ip_adapter: dict[str, torch.Tensor]
image_proj: dict[str, torch.Tensor]
class ImageProjModel(torch.nn.Module):
"""Image Projection Model"""
def __init__(
self, cross_attention_dim: int = 1024, clip_embeddings_dim: int = 1024, clip_extra_context_tokens: int = 4
):
def __init__(self, cross_attention_dim=1024, clip_embeddings_dim=1024, clip_extra_context_tokens=4):
super().__init__()
self.cross_attention_dim = cross_attention_dim
@@ -35,7 +25,7 @@ class ImageProjModel(torch.nn.Module):
self.norm = torch.nn.LayerNorm(cross_attention_dim)
@classmethod
def from_state_dict(cls, state_dict: dict[str, torch.Tensor], clip_extra_context_tokens: int = 4):
def from_state_dict(cls, state_dict: dict[torch.Tensor], clip_extra_context_tokens=4):
"""Initialize an ImageProjModel from a state_dict.
The cross_attention_dim and clip_embeddings_dim are inferred from the shape of the tensors in the state_dict.
@@ -55,7 +45,7 @@ class ImageProjModel(torch.nn.Module):
model.load_state_dict(state_dict)
return model
def forward(self, image_embeds: torch.Tensor):
def forward(self, image_embeds):
embeds = image_embeds
clip_extra_context_tokens = self.proj(embeds).reshape(
-1, self.clip_extra_context_tokens, self.cross_attention_dim
@@ -67,7 +57,7 @@ class ImageProjModel(torch.nn.Module):
class MLPProjModel(torch.nn.Module):
"""SD model with image prompt"""
def __init__(self, cross_attention_dim: int = 1024, clip_embeddings_dim: int = 1024):
def __init__(self, cross_attention_dim=1024, clip_embeddings_dim=1024):
super().__init__()
self.proj = torch.nn.Sequential(
@@ -78,7 +68,7 @@ class MLPProjModel(torch.nn.Module):
)
@classmethod
def from_state_dict(cls, state_dict: dict[str, torch.Tensor]):
def from_state_dict(cls, state_dict: dict[torch.Tensor]):
"""Initialize an MLPProjModel from a state_dict.
The cross_attention_dim and clip_embeddings_dim are inferred from the shape of the tensors in the state_dict.
@@ -97,7 +87,7 @@ class MLPProjModel(torch.nn.Module):
model.load_state_dict(state_dict)
return model
def forward(self, image_embeds: torch.Tensor):
def forward(self, image_embeds):
clip_extra_context_tokens = self.proj(image_embeds)
return clip_extra_context_tokens
@@ -107,7 +97,7 @@ class IPAdapter(RawModel):
def __init__(
self,
state_dict: IPAdapterStateDict,
state_dict: dict[str, torch.Tensor],
device: torch.device,
dtype: torch.dtype = torch.float16,
num_tokens: int = 4,
@@ -139,27 +129,24 @@ class IPAdapter(RawModel):
return calc_model_size_by_data(self._image_proj_model) + calc_model_size_by_data(self.attn_weights)
def _init_image_proj_model(
self, state_dict: dict[str, torch.Tensor]
) -> Union[ImageProjModel, Resampler, MLPProjModel]:
def _init_image_proj_model(self, state_dict):
return ImageProjModel.from_state_dict(state_dict, self._num_tokens).to(self.device, dtype=self.dtype)
@torch.inference_mode()
def get_image_embeds(self, pil_image: List[Image.Image], image_encoder: CLIPVisionModelWithProjection):
def get_image_embeds(self, pil_image, image_encoder: CLIPVisionModelWithProjection):
if isinstance(pil_image, Image.Image):
pil_image = [pil_image]
clip_image = self._clip_image_processor(images=pil_image, return_tensors="pt").pixel_values
clip_image_embeds = image_encoder(clip_image.to(self.device, dtype=self.dtype)).image_embeds
try:
image_prompt_embeds = self._image_proj_model(clip_image_embeds)
uncond_image_prompt_embeds = self._image_proj_model(torch.zeros_like(clip_image_embeds))
return image_prompt_embeds, uncond_image_prompt_embeds
except RuntimeError as e:
raise RuntimeError("Selected CLIP Vision Model is incompatible with the current IP Adapter") from e
image_prompt_embeds = self._image_proj_model(clip_image_embeds)
uncond_image_prompt_embeds = self._image_proj_model(torch.zeros_like(clip_image_embeds))
return image_prompt_embeds, uncond_image_prompt_embeds
class IPAdapterPlus(IPAdapter):
"""IP-Adapter with fine-grained features"""
def _init_image_proj_model(self, state_dict: dict[str, torch.Tensor]) -> Union[Resampler, MLPProjModel]:
def _init_image_proj_model(self, state_dict):
return Resampler.from_state_dict(
state_dict=state_dict,
depth=4,
@@ -170,32 +157,31 @@ class IPAdapterPlus(IPAdapter):
).to(self.device, dtype=self.dtype)
@torch.inference_mode()
def get_image_embeds(self, pil_image: List[Image.Image], image_encoder: CLIPVisionModelWithProjection):
def get_image_embeds(self, pil_image, image_encoder: CLIPVisionModelWithProjection):
if isinstance(pil_image, Image.Image):
pil_image = [pil_image]
clip_image = self._clip_image_processor(images=pil_image, return_tensors="pt").pixel_values
clip_image = clip_image.to(self.device, dtype=self.dtype)
clip_image_embeds = image_encoder(clip_image, output_hidden_states=True).hidden_states[-2]
image_prompt_embeds = self._image_proj_model(clip_image_embeds)
uncond_clip_image_embeds = image_encoder(torch.zeros_like(clip_image), output_hidden_states=True).hidden_states[
-2
]
try:
image_prompt_embeds = self._image_proj_model(clip_image_embeds)
uncond_image_prompt_embeds = self._image_proj_model(uncond_clip_image_embeds)
return image_prompt_embeds, uncond_image_prompt_embeds
except RuntimeError as e:
raise RuntimeError("Selected CLIP Vision Model is incompatible with the current IP Adapter") from e
uncond_image_prompt_embeds = self._image_proj_model(uncond_clip_image_embeds)
return image_prompt_embeds, uncond_image_prompt_embeds
class IPAdapterFull(IPAdapterPlus):
"""IP-Adapter Plus with full features."""
def _init_image_proj_model(self, state_dict: dict[str, torch.Tensor]):
def _init_image_proj_model(self, state_dict: dict[torch.Tensor]):
return MLPProjModel.from_state_dict(state_dict).to(self.device, dtype=self.dtype)
class IPAdapterPlusXL(IPAdapterPlus):
"""IP-Adapter Plus for SDXL."""
def _init_image_proj_model(self, state_dict: dict[str, torch.Tensor]):
def _init_image_proj_model(self, state_dict):
return Resampler.from_state_dict(
state_dict=state_dict,
depth=4,
@@ -206,48 +192,24 @@ class IPAdapterPlusXL(IPAdapterPlus):
).to(self.device, dtype=self.dtype)
def load_ip_adapter_tensors(ip_adapter_ckpt_path: pathlib.Path, device: str) -> IPAdapterStateDict:
state_dict: IPAdapterStateDict = {"ip_adapter": {}, "image_proj": {}}
if ip_adapter_ckpt_path.suffix == ".safetensors":
model = safetensors.torch.load_file(ip_adapter_ckpt_path, device=device)
for key in model.keys():
if key.startswith("image_proj."):
state_dict["image_proj"][key.replace("image_proj.", "")] = model[key]
elif key.startswith("ip_adapter."):
state_dict["ip_adapter"][key.replace("ip_adapter.", "")] = model[key]
else:
raise RuntimeError(f"Encountered unexpected IP Adapter state dict key: '{key}'.")
else:
ip_adapter_diffusers_checkpoint_path = ip_adapter_ckpt_path / "ip_adapter.bin"
state_dict = torch.load(ip_adapter_diffusers_checkpoint_path, map_location="cpu")
return state_dict
def build_ip_adapter(
ip_adapter_ckpt_path: pathlib.Path, device: torch.device, dtype: torch.dtype = torch.float16
) -> Union[IPAdapter, IPAdapterPlus, IPAdapterPlusXL, IPAdapterPlus]:
state_dict = load_ip_adapter_tensors(ip_adapter_ckpt_path, device.type)
ip_adapter_ckpt_path: str, device: torch.device, dtype: torch.dtype = torch.float16
) -> Union[IPAdapter, IPAdapterPlus]:
state_dict = torch.load(ip_adapter_ckpt_path, map_location="cpu")
# IPAdapter (with ImageProjModel)
if "proj.weight" in state_dict["image_proj"]:
if "proj.weight" in state_dict["image_proj"]: # IPAdapter (with ImageProjModel).
return IPAdapter(state_dict, device=device, dtype=dtype)
# IPAdapterPlus or IPAdapterPlusXL (with Resampler)
elif "proj_in.weight" in state_dict["image_proj"]:
elif "proj_in.weight" in state_dict["image_proj"]: # IPAdapterPlus or IPAdapterPlusXL (with Resampler).
cross_attention_dim = state_dict["ip_adapter"]["1.to_k_ip.weight"].shape[-1]
if cross_attention_dim == 768:
return IPAdapterPlus(state_dict, device=device, dtype=dtype) # SD1 IP-Adapter Plus
# SD1 IP-Adapter Plus
return IPAdapterPlus(state_dict, device=device, dtype=dtype)
elif cross_attention_dim == 2048:
return IPAdapterPlusXL(state_dict, device=device, dtype=dtype) # SDXL IP-Adapter Plus
# SDXL IP-Adapter Plus
return IPAdapterPlusXL(state_dict, device=device, dtype=dtype)
else:
raise Exception(f"Unsupported IP-Adapter Plus cross-attention dimension: {cross_attention_dim}.")
# IPAdapterFull (with MLPProjModel)
elif "proj.0.weight" in state_dict["image_proj"]:
elif "proj.0.weight" in state_dict["image_proj"]: # IPAdapterFull (with MLPProjModel).
return IPAdapterFull(state_dict, device=device, dtype=dtype)
# Unrecognized IP Adapter Architectures
else:
raise ValueError(f"'{ip_adapter_ckpt_path}' has an unrecognized IP-Adapter model architecture.")
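
The dispatch in `build_ip_adapter` depends only on which keys appear under `image_proj` and, for the Resampler variants, on the cross-attention dimension. The sketch below isolates that decision so it can be followed without loading tensors. The function name `classify_ip_adapter` is hypothetical, and it returns class names as strings instead of constructing models.

```python
from typing import Collection


def classify_ip_adapter(image_proj_keys: Collection[str], cross_attention_dim: int) -> str:
    """Hypothetical helper: name the IP-Adapter variant implied by the state dict keys."""
    if "proj.weight" in image_proj_keys:
        return "IPAdapter"  # uses ImageProjModel
    if "proj_in.weight" in image_proj_keys:
        # Resampler-based variants are told apart by the cross-attention dimension.
        if cross_attention_dim == 768:
            return "IPAdapterPlus"  # SD1
        if cross_attention_dim == 2048:
            return "IPAdapterPlusXL"  # SDXL
        raise ValueError(f"Unsupported IP-Adapter Plus cross-attention dimension: {cross_attention_dim}.")
    if "proj.0.weight" in image_proj_keys:
        return "IPAdapterFull"  # uses MLPProjModel
    raise ValueError("Unrecognized IP-Adapter model architecture.")
```

In the source, the dimension is read from `state_dict["ip_adapter"]["1.to_k_ip.weight"].shape[-1]`. Here it is passed in as a plain integer.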

View File

@@ -9,8 +9,8 @@ import torch.nn as nn
# FFN
def FeedForward(dim: int, mult: int = 4):
inner_dim = dim * mult
def FeedForward(dim, mult=4):
inner_dim = int(dim * mult)
return nn.Sequential(
nn.LayerNorm(dim),
nn.Linear(dim, inner_dim, bias=False),
@@ -19,8 +19,8 @@ def FeedForward(dim: int, mult: int = 4):
)
def reshape_tensor(x: torch.Tensor, heads: int):
bs, length, _ = x.shape
def reshape_tensor(x, heads):
bs, length, width = x.shape
# (bs, length, width) --> (bs, length, n_heads, dim_per_head)
x = x.view(bs, length, heads, -1)
# (bs, length, n_heads, dim_per_head) --> (bs, n_heads, length, dim_per_head)
@@ -31,7 +31,7 @@ def reshape_tensor(x: torch.Tensor, heads: int):
class PerceiverAttention(nn.Module):
def __init__(self, *, dim: int, dim_head: int = 64, heads: int = 8):
def __init__(self, *, dim, dim_head=64, heads=8):
super().__init__()
self.scale = dim_head**-0.5
self.dim_head = dim_head
@@ -45,7 +45,7 @@ class PerceiverAttention(nn.Module):
self.to_kv = nn.Linear(dim, inner_dim * 2, bias=False)
self.to_out = nn.Linear(inner_dim, dim, bias=False)
def forward(self, x: torch.Tensor, latents: torch.Tensor):
def forward(self, x, latents):
"""
Args:
x (torch.Tensor): image features
@@ -80,14 +80,14 @@ class PerceiverAttention(nn.Module):
class Resampler(nn.Module):
def __init__(
self,
dim: int = 1024,
depth: int = 8,
dim_head: int = 64,
heads: int = 16,
num_queries: int = 8,
embedding_dim: int = 768,
output_dim: int = 1024,
ff_mult: int = 4,
dim=1024,
depth=8,
dim_head=64,
heads=16,
num_queries=8,
embedding_dim=768,
output_dim=1024,
ff_mult=4,
):
super().__init__()
@@ -110,15 +110,7 @@ class Resampler(nn.Module):
)
@classmethod
def from_state_dict(
cls,
state_dict: dict[str, torch.Tensor],
depth: int = 8,
dim_head: int = 64,
heads: int = 16,
num_queries: int = 8,
ff_mult: int = 4,
):
def from_state_dict(cls, state_dict: dict[str, torch.Tensor], depth=8, dim_head=64, heads=16, num_queries=8, ff_mult=4):
"""A convenience function that initializes a Resampler from a state_dict.
Some of the shape parameters are inferred from the state_dict (e.g. dim, embedding_dim, etc.). At the time of
@@ -153,7 +145,7 @@ class Resampler(nn.Module):
model.load_state_dict(state_dict)
return model
def forward(self, x: torch.Tensor):
def forward(self, x):
latents = self.latents.repeat(x.size(0), 1, 1)
x = self.proj_in(x)
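
The shape comments in `reshape_tensor` above describe a view followed by a transpose. As a minimal sketch of that bookkeeping only, the pure-Python helper below tracks the shapes without touching any tensors. The name `split_heads_shape` is hypothetical and does not appear in the repository, and the divisibility check is an added assumption.

```python
def split_heads_shape(shape: tuple[int, int, int], heads: int):
    """Return the shapes produced by the view and transpose steps in reshape_tensor.

    Hypothetical helper for illustration; it only does shape arithmetic.
    """
    bs, length, width = shape
    if width % heads != 0:
        # Assumption: the width has to split evenly across the heads.
        raise ValueError(f"width {width} is not divisible by heads {heads}")
    dim_per_head = width // heads
    # (bs, length, width) --> (bs, length, n_heads, dim_per_head)
    viewed = (bs, length, heads, dim_per_head)
    # (bs, length, n_heads, dim_per_head) --> (bs, n_heads, length, dim_per_head)
    transposed = (bs, heads, length, dim_per_head)
    return viewed, transposed


viewed, transposed = split_heads_shape((2, 8, 1024), heads=16)
print(viewed, transposed)
```

With the `Resampler` defaults shown in the diff (`dim_head=64`, `heads=16`), the inner attention width is 64 * 16 = 1024, which is why that width is used in the usage line.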

View File

@@ -3,7 +3,7 @@
import bisect
from pathlib import Path
from typing import Dict, List, Optional, Tuple, Union
from typing import Dict, List, Optional, Tuple, Type, Union
import torch
from safetensors.torch import load_file
@@ -457,6 +457,55 @@ class LoRAModelRaw(RawModel): # (torch.nn.Module):
return new_state_dict
@classmethod
def _keys_match(cls, keys: set[str], required_keys: set[str], optional_keys: set[str]) -> bool:
"""Check if the set of keys matches the required and optional keys."""
if len(required_keys - keys) > 0:
# missing required keys.
return False
non_required_keys = keys - required_keys
for k in non_required_keys:
if k not in optional_keys:
# unexpected key
return False
return True
@classmethod
def get_layer_type_from_state_dict_keys(cls, peft_layer_keys: set[str]) -> Type[AnyLoRALayer]:
"""Infer the parameter-efficient finetuning model type from the state dict keys."""
common_optional_keys = {"alpha", "bias_indices", "bias_values", "bias_size"}
if cls._keys_match(
peft_layer_keys,
required_keys={"lora_down.weight", "lora_up.weight"},
optional_keys=common_optional_keys | {"lora_mid.weight"},
):
return LoRALayer
if cls._keys_match(
peft_layer_keys,
required_keys={"hada_w1_b", "hada_w1_a", "hada_w2_b", "hada_w2_a"},
optional_keys=common_optional_keys | {"hada_t1", "hada_t2"},
):
return LoHALayer
if cls._keys_match(
peft_layer_keys,
required_keys=set(),
optional_keys=common_optional_keys
| {"lokr_w1", "lokr_w1_a", "lokr_w1_b", "lokr_w2", "lokr_w2_a", "lokr_w2_b", "lokr_t2"},
):
return LoKRLayer
if cls._keys_match(peft_layer_keys, required_keys={"diff"}, optional_keys=common_optional_keys):
return FullLayer
if cls._keys_match(peft_layer_keys, required_keys={"weight", "on_input"}, optional_keys=common_optional_keys):
return IA3Layer
raise ValueError(f"Unsupported PEFT model type with keys: {peft_layer_keys}")
@classmethod
def from_checkpoint(
cls,
@@ -486,37 +535,21 @@ class LoRAModelRaw(RawModel): # (torch.nn.Module):
if base_model == BaseModelType.StableDiffusionXL:
state_dict = cls._convert_sdxl_keys_to_diffusers_format(state_dict)
# We assume that all layers have the same PEFT layer type. This saves time by not having to infer the type for
# each layer.
first_module_key = next(iter(state_dict))
peft_layer_keys = set(state_dict[first_module_key].keys())
layer_cls = cls.get_layer_type_from_state_dict_keys(peft_layer_keys)
for layer_key, values in state_dict.items():
# lora and locon
if "lora_down.weight" in values:
layer: AnyLoRALayer = LoRALayer(layer_key, values)
# loha
elif "hada_w1_b" in values:
layer = LoHALayer(layer_key, values)
# lokr
elif "lokr_w1_b" in values or "lokr_w1" in values:
layer = LoKRLayer(layer_key, values)
# diff
elif "diff" in values:
layer = FullLayer(layer_key, values)
# ia3
elif "weight" in values and "on_input" in values:
layer = IA3Layer(layer_key, values)
else:
print(f">> Encountered unknown lora layer module in {model.name}: {layer_key} - {list(values.keys())}")
raise Exception("Unknown lora format!")
# lower memory consumption by removing already parsed layer values
state_dict[layer_key].clear()
layer = layer_cls(layer_key, values)
# TODO(ryand): This .to() call causes an implicit CUDA sync point in a tight loop. This is very slow (even
# slower than loading the weights from disk). We should ideally only be copying the weights once - right
# before they are used. Or, if we want to do this here, then setting non_blocking = True would probably
# help.
layer.to(device=device, dtype=dtype)
model.layers[layer_key] = layer
return model
@staticmethod
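
The required/optional key matching in `_keys_match` and `get_layer_type_from_state_dict_keys` above can be tried out in isolation. The sketch below mirrors the key sets and the check order from the diff, but returns string labels as stand-ins for the real layer classes (`LoRALayer`, `LoHALayer`, and so on), since those classes are not shown here. The names `LAYER_RULES`, `keys_match` and `infer_layer_type` are hypothetical.

```python
COMMON_OPTIONAL_KEYS = {"alpha", "bias_indices", "bias_values", "bias_size"}

# (label, required keys, extra optional keys), in the same order as the checks in the diff.
# The labels are plain strings standing in for the layer classes.
LAYER_RULES = [
    ("LoRALayer", {"lora_down.weight", "lora_up.weight"}, {"lora_mid.weight"}),
    ("LoHALayer", {"hada_w1_b", "hada_w1_a", "hada_w2_b", "hada_w2_a"}, {"hada_t1", "hada_t2"}),
    (
        "LoKRLayer",
        set(),
        {"lokr_w1", "lokr_w1_a", "lokr_w1_b", "lokr_w2", "lokr_w2_a", "lokr_w2_b", "lokr_t2"},
    ),
    ("FullLayer", {"diff"}, set()),
    ("IA3Layer", {"weight", "on_input"}, set()),
]


def keys_match(keys: set[str], required_keys: set[str], optional_keys: set[str]) -> bool:
    """True if every required key is present and every other key is optional."""
    if required_keys - keys:
        return False  # missing required keys
    return (keys - required_keys) <= optional_keys  # no unexpected keys


def infer_layer_type(keys: set[str]) -> str:
    """Return the label of the first rule that matches the given state-dict keys."""
    for label, required, extra_optional in LAYER_RULES:
        if keys_match(keys, required, COMMON_OPTIONAL_KEYS | extra_optional):
            return label
    raise ValueError(f"Unsupported PEFT model type with keys: {keys}")


print(infer_layer_type({"lora_down.weight", "lora_up.weight", "alpha"}))
print(infer_layer_type({"diff"}))
```

One consequence of the LoKR rule having no required keys is that a key set containing only common optional keys, such as `{"alpha"}`, matches it. Because `from_checkpoint` in this version infers the type from the first module only, a checkpoint that mixes layer types would get the first module's type applied to every layer.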

View File

@@ -323,13 +323,10 @@ class MainDiffusersConfig(DiffusersConfigBase, MainConfigBase):
return Tag(f"{ModelType.Main.value}.{ModelFormat.Diffusers.value}")
class IPAdapterBaseConfig(ModelConfigBase):
class IPAdapterConfig(ModelConfigBase):
"""Model config for IP Adapter format models."""
type: Literal[ModelType.IPAdapter] = ModelType.IPAdapter
class IPAdapterInvokeAIConfig(IPAdapterBaseConfig):
"""Model config for IP Adapter diffusers format models."""
image_encoder_model_id: str
format: Literal[ModelFormat.InvokeAI]
@@ -338,16 +335,6 @@ class IPAdapterInvokeAIConfig(IPAdapterBaseConfig):
return Tag(f"{ModelType.IPAdapter.value}.{ModelFormat.InvokeAI.value}")
class IPAdapterCheckpointConfig(IPAdapterBaseConfig):
"""Model config for IP Adapter checkpoint format models."""
format: Literal[ModelFormat.Checkpoint]
@staticmethod
def get_tag() -> Tag:
return Tag(f"{ModelType.IPAdapter.value}.{ModelFormat.Checkpoint.value}")
class CLIPVisionDiffusersConfig(DiffusersConfigBase):
"""Model config for CLIPVision."""
@@ -403,8 +390,7 @@ AnyModelConfig = Annotated[
Annotated[LoRADiffusersConfig, LoRADiffusersConfig.get_tag()],
Annotated[TextualInversionFileConfig, TextualInversionFileConfig.get_tag()],
Annotated[TextualInversionFolderConfig, TextualInversionFolderConfig.get_tag()],
Annotated[IPAdapterInvokeAIConfig, IPAdapterInvokeAIConfig.get_tag()],
Annotated[IPAdapterCheckpointConfig, IPAdapterCheckpointConfig.get_tag()],
Annotated[IPAdapterConfig, IPAdapterConfig.get_tag()],
Annotated[T2IAdapterConfig, T2IAdapterConfig.get_tag()],
Annotated[CLIPVisionDiffusersConfig, CLIPVisionDiffusersConfig.get_tag()],
],

View File

@@ -429,8 +429,4 @@ class ModelCache(ModelCacheBase[AnyModel]):
)
free_mem, _ = torch.cuda.mem_get_info(torch.device(vram_device))
if needed_size > free_mem:
needed_gb = round(needed_size / GIG, 2)
free_gb = round(free_mem / GIG, 2)
raise torch.cuda.OutOfMemoryError(
f"Insufficient VRAM to load model, requested {needed_gb}GB but only had {free_gb}GB free"
)
raise torch.cuda.OutOfMemoryError
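
The hunk above contrasts a bare `raise torch.cuda.OutOfMemoryError` with a variant that reports the requested and free sizes. A minimal sketch of the descriptive variant, without a GPU dependency: the built-in `MemoryError` stands in for `torch.cuda.OutOfMemoryError`, `GIG = 2**30` is an assumed value (the diff references `GIG` without showing its definition), and `check_free_vram` is a hypothetical name.

```python
GIG = 2**30  # assumed value; the diff uses GIG but does not show where it is defined


def check_free_vram(needed_size: int, free_mem: int) -> None:
    """Raise with a readable message when the model will not fit in free memory.

    MemoryError is a stand-in for torch.cuda.OutOfMemoryError.
    """
    if needed_size > free_mem:
        needed_gb = round(needed_size / GIG, 2)
        free_gb = round(free_mem / GIG, 2)
        raise MemoryError(
            f"Insufficient VRAM to load model, requested {needed_gb}GB but only had {free_gb}GB free"
        )


try:
    check_free_vram(needed_size=3 * GIG, free_mem=1 * GIG)
except MemoryError as e:
    print(e)
```

The message version costs two divisions on the failure path only, and gives whoever reads the log the numbers needed to decide whether to free memory or pick a smaller model.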

View File

@@ -7,13 +7,19 @@ from typing import Optional
import torch
from invokeai.backend.ip_adapter.ip_adapter import build_ip_adapter
from invokeai.backend.model_manager import AnyModel, AnyModelConfig, BaseModelType, ModelFormat, ModelType, SubModelType
from invokeai.backend.model_manager import (
AnyModel,
AnyModelConfig,
BaseModelType,
ModelFormat,
ModelType,
SubModelType,
)
from invokeai.backend.model_manager.load import ModelLoader, ModelLoaderRegistry
from invokeai.backend.raw_model import RawModel
@ModelLoaderRegistry.register(base=BaseModelType.Any, type=ModelType.IPAdapter, format=ModelFormat.InvokeAI)
@ModelLoaderRegistry.register(base=BaseModelType.Any, type=ModelType.IPAdapter, format=ModelFormat.Checkpoint)
class IPAdapterInvokeAILoader(ModelLoader):
"""Class to load IP Adapter diffusers models."""
@@ -26,7 +32,7 @@ class IPAdapterInvokeAILoader(ModelLoader):
raise ValueError("There are no submodels in an IP-Adapter model.")
model_path = Path(config.path)
model: RawModel = build_ip_adapter(
ip_adapter_ckpt_path=model_path,
ip_adapter_ckpt_path=str(model_path / "ip_adapter.bin"),
device=torch.device("cpu"),
dtype=self._torch_dtype,
)

View File

@@ -230,10 +230,9 @@ class ModelProbe(object):
return ModelType.LoRA
elif any(key.startswith(v) for v in {"controlnet", "control_model", "input_blocks"}):
return ModelType.ControlNet
elif any(key.startswith(v) for v in {"image_proj.", "ip_adapter."}):
return ModelType.IPAdapter
elif key in {"emb_params", "string_to_param"}:
return ModelType.TextualInversion
else:
# diffusers-ti
if len(ckpt) < 10 and all(isinstance(v, torch.Tensor) for v in ckpt.values()):
@@ -324,7 +323,7 @@ class ModelProbe(object):
with SilenceWarnings():
if model_path.suffix.endswith((".ckpt", ".pt", ".pth", ".bin")):
cls._scan_model(model_path.name, model_path)
model = torch.load(model_path, map_location="cpu")
model = torch.load(model_path)
assert isinstance(model, dict)
return model
else:
@@ -528,25 +527,8 @@ class ControlNetCheckpointProbe(CheckpointProbeBase):
class IPAdapterCheckpointProbe(CheckpointProbeBase):
"""Class for probing IP Adapters"""
def get_base_type(self) -> BaseModelType:
checkpoint = self.checkpoint
for key in checkpoint.keys():
if not key.startswith(("image_proj.", "ip_adapter.")):
continue
cross_attention_dim = checkpoint["ip_adapter.1.to_k_ip.weight"].shape[-1]
if cross_attention_dim == 768:
return BaseModelType.StableDiffusion1
elif cross_attention_dim == 1024:
return BaseModelType.StableDiffusion2
elif cross_attention_dim == 2048:
return BaseModelType.StableDiffusionXL
else:
raise InvalidModelConfigException(
f"IP-Adapter had unexpected cross-attention dimension: {cross_attention_dim}."
)
raise InvalidModelConfigException(f"{self.model_path}: Unable to determine base type")
raise NotImplementedError()
class CLIPVisionCheckpointProbe(CheckpointProbeBase):
@@ -786,7 +768,7 @@ class T2IAdapterFolderProbe(FolderProbeBase):
)
# Register probe classes
############## register probe classes ######
ModelProbe.register_probe("diffusers", ModelType.Main, PipelineFolderProbe)
ModelProbe.register_probe("diffusers", ModelType.VAE, VaeFolderProbe)
ModelProbe.register_probe("diffusers", ModelType.LoRA, LoRAFolderProbe)
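
The `IPAdapterCheckpointProbe.get_base_type` variant in this file's diff picks the base model from the cross-attention dimension of `ip_adapter.1.to_k_ip.weight` (768, 1024 or 2048). The lookup can be sketched as a plain mapping. `BaseType` below is a hypothetical stand-in for `BaseModelType`: the `"sd-1"` and `"sd-2"` values appear elsewhere in this diff, while `"sdxl"` is an assumption, and `ValueError` stands in for `InvalidModelConfigException`.

```python
from enum import Enum


class BaseType(Enum):
    """Hypothetical stand-in for BaseModelType; the "sdxl" value is an assumption."""

    StableDiffusion1 = "sd-1"
    StableDiffusion2 = "sd-2"
    StableDiffusionXL = "sdxl"


# Cross-attention dimension --> base model, as listed in the probe code above.
CROSS_ATTENTION_DIM_TO_BASE = {
    768: BaseType.StableDiffusion1,
    1024: BaseType.StableDiffusion2,
    2048: BaseType.StableDiffusionXL,
}


def base_type_from_cross_attention_dim(cross_attention_dim: int) -> BaseType:
    """Map a cross-attention dimension to a base type, rejecting unknown sizes."""
    try:
        return CROSS_ATTENTION_DIM_TO_BASE[cross_attention_dim]
    except KeyError:
        # ValueError stands in for InvalidModelConfigException.
        raise ValueError(
            f"IP-Adapter had unexpected cross-attention dimension: {cross_attention_dim}."
        ) from None


print(base_type_from_cross_attention_dim(2048))
```

In the real probe the dimension would come from the last axis of the weight tensor's shape, so the lookup only needs the checkpoint's state dict, not a loaded model.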

View File

@@ -217,7 +217,6 @@
"saveControlImage": "Save Control Image",
"scribble": "scribble",
"selectModel": "Select a model",
"selectCLIPVisionModel": "Select a CLIP Vision model",
"setControlImageDimensions": "Set Control Image Dimensions To W/H",
"showAdvanced": "Show Advanced",
"small": "Small",
@@ -656,7 +655,6 @@
"install": "Install",
"installAll": "Install All",
"installRepo": "Install Repo",
"ipAdapters": "IP Adapters",
"load": "Load",
"localOnly": "local only",
"manual": "Manual",

View File

@@ -43,7 +43,6 @@ export const addModelInstallEventListener = (startAppListening: AppStartListenin
})
);
dispatch(api.util.invalidateTags([{ type: 'ModelConfig', id: LIST_TAG }]));
dispatch(api.util.invalidateTags([{ type: 'ModelScanFolderResults', id: LIST_TAG }]));
},
});

View File

@@ -1,18 +1,12 @@
import type { ComboboxOnChange, ComboboxOption } from '@invoke-ai/ui-library';
import { Combobox, Flex, FormControl, Tooltip } from '@invoke-ai/ui-library';
import { Combobox, FormControl, Tooltip } from '@invoke-ai/ui-library';
import { createMemoizedSelector } from 'app/store/createMemoizedSelector';
import { useAppDispatch, useAppSelector } from 'app/store/storeHooks';
import { useGroupedModelCombobox } from 'common/hooks/useGroupedModelCombobox';
import { useControlAdapterCLIPVisionModel } from 'features/controlAdapters/hooks/useControlAdapterCLIPVisionModel';
import { useControlAdapterIsEnabled } from 'features/controlAdapters/hooks/useControlAdapterIsEnabled';
import { useControlAdapterModel } from 'features/controlAdapters/hooks/useControlAdapterModel';
import { useControlAdapterModels } from 'features/controlAdapters/hooks/useControlAdapterModels';
import { useControlAdapterType } from 'features/controlAdapters/hooks/useControlAdapterType';
import {
controlAdapterCLIPVisionModelChanged,
controlAdapterModelChanged,
} from 'features/controlAdapters/store/controlAdaptersSlice';
import type { CLIPVisionModel } from 'features/controlAdapters/store/types';
import { controlAdapterModelChanged } from 'features/controlAdapters/store/controlAdaptersSlice';
import { selectGenerationSlice } from 'features/parameters/store/generationSlice';
import { memo, useCallback, useMemo } from 'react';
import { useTranslation } from 'react-i18next';
@@ -35,7 +29,6 @@ const ParamControlAdapterModel = ({ id }: ParamControlAdapterModelProps) => {
const { modelConfig } = useControlAdapterModel(id);
const dispatch = useAppDispatch();
const currentBaseModel = useAppSelector((s) => s.generation.model?.base);
const currentCLIPVisionModel = useControlAdapterCLIPVisionModel(id);
const mainModel = useAppSelector(selectMainModel);
const { t } = useTranslation();
@@ -56,16 +49,6 @@ const ParamControlAdapterModel = ({ id }: ParamControlAdapterModelProps) => {
[dispatch, id]
);
const onCLIPVisionModelChange = useCallback<ComboboxOnChange>(
(v) => {
if (!v?.value) {
return;
}
dispatch(controlAdapterCLIPVisionModelChanged({ id, clipVisionModel: v.value as CLIPVisionModel }));
},
[dispatch, id]
);
const selectedModel = useMemo(
() => (modelConfig && controlAdapterType ? { ...modelConfig, model_type: controlAdapterType } : null),
[controlAdapterType, modelConfig]
@@ -88,51 +71,18 @@ const ParamControlAdapterModel = ({ id }: ParamControlAdapterModelProps) => {
isLoading,
});
const clipVisionOptions = useMemo<ComboboxOption[]>(
() => [
{ label: 'ViT-H', value: 'ViT-H' },
{ label: 'ViT-G', value: 'ViT-G' },
],
[]
);
const clipVisionModel = useMemo(
() => clipVisionOptions.find((o) => o.value === currentCLIPVisionModel),
[clipVisionOptions, currentCLIPVisionModel]
);
return (
<Flex sx={{ gap: 2 }}>
<Tooltip label={value?.description}>
<FormControl
isDisabled={!isEnabled}
isInvalid={!value || mainModel?.base !== modelConfig?.base}
sx={{ width: '100%' }}
>
<Combobox
options={options}
placeholder={t('controlnet.selectModel')}
value={value}
onChange={onChange}
noOptionsMessage={noOptionsMessage}
/>
</FormControl>
</Tooltip>
{modelConfig?.type === 'ip_adapter' && modelConfig.format === 'checkpoint' && (
<FormControl
isDisabled={!isEnabled}
isInvalid={!value || mainModel?.base !== modelConfig?.base}
sx={{ width: 'max-content', minWidth: 28 }}
>
<Combobox
options={clipVisionOptions}
placeholder={t('controlnet.selectCLIPVisionModel')}
value={clipVisionModel}
onChange={onCLIPVisionModelChange}
/>
</FormControl>
)}
</Flex>
<Tooltip label={value?.description}>
<FormControl isDisabled={!isEnabled} isInvalid={!value || mainModel?.base !== modelConfig?.base}>
<Combobox
options={options}
placeholder={t('controlnet.selectModel')}
value={value}
onChange={onChange}
noOptionsMessage={noOptionsMessage}
/>
</FormControl>
</Tooltip>
);
};

View File

@@ -1,24 +0,0 @@
import { createMemoizedSelector } from 'app/store/createMemoizedSelector';
import { useAppSelector } from 'app/store/storeHooks';
import {
selectControlAdapterById,
selectControlAdaptersSlice,
} from 'features/controlAdapters/store/controlAdaptersSlice';
import { useMemo } from 'react';
export const useControlAdapterCLIPVisionModel = (id: string) => {
const selector = useMemo(
() =>
createMemoizedSelector(selectControlAdaptersSlice, (controlAdapters) => {
const cn = selectControlAdapterById(controlAdapters, id);
if (cn && cn?.type === 'ip_adapter') {
return cn.clipVisionModel;
}
}),
[id]
);
const clipVisionModel = useAppSelector(selector);
return clipVisionModel;
};

View File

@@ -14,7 +14,6 @@ import { v4 as uuidv4 } from 'uuid';
import { controlAdapterImageProcessed } from './actions';
import { CONTROLNET_PROCESSORS } from './constants';
import type {
CLIPVisionModel,
ControlAdapterConfig,
ControlAdapterProcessorType,
ControlAdaptersState,
@@ -245,13 +244,6 @@ export const controlAdaptersSlice = createSlice({
}
caAdapter.updateOne(state, { id, changes: { controlMode } });
},
controlAdapterCLIPVisionModelChanged: (
state,
action: PayloadAction<{ id: string; clipVisionModel: CLIPVisionModel }>
) => {
const { id, clipVisionModel } = action.payload;
caAdapter.updateOne(state, { id, changes: { clipVisionModel } });
},
controlAdapterResizeModeChanged: (
state,
action: PayloadAction<{
@@ -389,7 +381,6 @@ export const {
controlAdapterProcessedImageChanged,
controlAdapterIsEnabledChanged,
controlAdapterModelChanged,
controlAdapterCLIPVisionModelChanged,
controlAdapterWeightChanged,
controlAdapterBeginStepPctChanged,
controlAdapterEndStepPctChanged,

View File

@@ -243,15 +243,12 @@ export type T2IAdapterConfig = {
shouldAutoConfig: boolean;
};
export type CLIPVisionModel = 'ViT-H' | 'ViT-G';
export type IPAdapterConfig = {
type: 'ip_adapter';
id: string;
isEnabled: boolean;
controlImage: string | null;
model: ParameterIPAdapterModel | null;
clipVisionModel: CLIPVisionModel;
weight: number;
beginStepPct: number;
endStepPct: number;

View File

@@ -46,7 +46,6 @@ export const initialIPAdapter: Omit<IPAdapterConfig, 'id'> = {
isEnabled: true,
controlImage: null,
model: null,
clipVisionModel: 'ViT-H',
weight: 1,
beginStepPct: 0,
endStepPct: 1,

View File

@@ -372,7 +372,6 @@ const parseIPAdapter: MetadataParseFunc<IPAdapterConfigMetadata> = async (metada
type: 'ip_adapter',
isEnabled: true,
model: zModelIdentifierField.parse(ipAdapterModel),
clipVisionModel: 'ViT-H',
controlImage: image?.image_name ?? null,
weight: weight ?? initialIPAdapter.weight,
beginStepPct: begin_step_percent ?? initialIPAdapter.beginStepPct,

View File

@@ -87,10 +87,6 @@ export const ModelInstallQueueItem = (props: ModelListItemProps) => {
}, [installJob.source]);
const progressValue = useMemo(() => {
if (installJob.status === 'completed' || installJob.status === 'error' || installJob.status === 'cancelled') {
return 100;
}
if (isNil(installJob.bytes) || isNil(installJob.total_bytes)) {
return null;
}
@@ -100,7 +96,7 @@ export const ModelInstallQueueItem = (props: ModelListItemProps) => {
}
return (installJob.bytes / installJob.total_bytes) * 100;
}, [installJob.bytes, installJob.status, installJob.total_bytes]);
}, [installJob.bytes, installJob.total_bytes]);
return (
<Flex gap={3} w="full" alignItems="center">

View File

@@ -1,19 +1,48 @@
import { Badge, Box, Flex, IconButton, Text } from '@invoke-ai/ui-library';
import { useAppDispatch } from 'app/store/storeHooks';
import { addToast } from 'features/system/store/systemSlice';
import { makeToast } from 'features/system/util/makeToast';
import { useCallback } from 'react';
import { useTranslation } from 'react-i18next';
import { PiPlusBold } from 'react-icons/pi';
import type { ScanFolderResponse } from 'services/api/endpoints/models';
import { useInstallModelMutation } from 'services/api/endpoints/models';
type Props = {
result: ScanFolderResponse[number];
installModel: (source: string) => void;
};
export const ScanModelResultItem = ({ result, installModel }: Props) => {
export const ScanModelResultItem = ({ result }: Props) => {
const { t } = useTranslation();
const dispatch = useAppDispatch();
const handleInstall = useCallback(() => {
installModel(result.path);
}, [installModel, result]);
const [installModel] = useInstallModelMutation();
const handleQuickAdd = useCallback(() => {
installModel({ source: result.path })
.unwrap()
.then((_) => {
dispatch(
addToast(
makeToast({
title: t('toast.modelAddedSimple'),
status: 'success',
})
)
);
})
.catch((error) => {
if (error) {
dispatch(
addToast(
makeToast({
title: `${error.data.detail} `,
status: 'error',
})
)
);
}
});
}, [installModel, result, dispatch, t]);
return (
<Flex alignItems="center" justifyContent="space-between" w="100%" gap={3}>
@@ -25,7 +54,7 @@ export const ScanModelResultItem = ({ result, installModel }: Props) => {
{result.is_installed ? (
<Badge>{t('common.installed')}</Badge>
) : (
<IconButton aria-label={t('modelManager.install')} icon={<PiPlusBold />} onClick={handleInstall} size="sm" />
<IconButton aria-label={t('modelManager.install')} icon={<PiPlusBold />} onClick={handleQuickAdd} size="sm" />
)}
</Box>
</Flex>

View File

@@ -1,10 +1,7 @@
import {
Button,
Checkbox,
Divider,
Flex,
FormControl,
FormLabel,
Heading,
IconButton,
Input,
@@ -15,7 +12,7 @@ import { useAppDispatch } from 'app/store/storeHooks';
import ScrollableContent from 'common/components/OverlayScrollbars/ScrollableContent';
import { addToast } from 'features/system/store/systemSlice';
import { makeToast } from 'features/system/util/makeToast';
import type { ChangeEvent, ChangeEventHandler } from 'react';
import type { ChangeEventHandler } from 'react';
import { useCallback, useMemo, useState } from 'react';
import { useTranslation } from 'react-i18next';
import { PiXBold } from 'react-icons/pi';
@@ -31,7 +28,7 @@ export const ScanModelsResults = ({ results }: ScanModelResultsProps) => {
const { t } = useTranslation();
const [searchTerm, setSearchTerm] = useState('');
const dispatch = useAppDispatch();
const [inplace, setInplace] = useState(true);
const [installModel] = useInstallModelMutation();
const filteredResults = useMemo(() => {
@@ -45,10 +42,6 @@ export const ScanModelsResults = ({ results }: ScanModelResultsProps) => {
setSearchTerm(e.target.value.trim());
}, []);
const onChangeInplace = useCallback((e: ChangeEvent<HTMLInputElement>) => {
setInplace(e.target.checked);
}, []);
const clearSearch = useCallback(() => {
setSearchTerm('');
}, []);
@@ -58,7 +51,7 @@ export const ScanModelsResults = ({ results }: ScanModelResultsProps) => {
if (result.is_installed) {
continue;
}
installModel({ source: result.path, inplace })
installModel({ source: result.path })
.unwrap()
.then((_) => {
dispatch(
@@ -83,37 +76,7 @@ export const ScanModelsResults = ({ results }: ScanModelResultsProps) => {
}
});
}
}, [filteredResults, installModel, inplace, dispatch, t]);
const handleInstallOne = useCallback(
(source: string) => {
installModel({ source, inplace })
.unwrap()
.then((_) => {
dispatch(
addToast(
makeToast({
title: t('toast.modelAddedSimple'),
status: 'success',
})
)
);
})
.catch((error) => {
if (error) {
dispatch(
addToast(
makeToast({
title: `${error.data.detail} `,
status: 'error',
})
)
);
}
});
},
[installModel, inplace, dispatch, t]
);
}, [installModel, filteredResults, dispatch, t]);
return (
<>
@@ -122,10 +85,6 @@ export const ScanModelsResults = ({ results }: ScanModelResultsProps) => {
<Flex justifyContent="space-between" alignItems="center">
<Heading size="sm">{t('modelManager.scanResults')}</Heading>
<Flex alignItems="center" gap={3}>
<FormControl w="min-content">
<FormLabel m={0}>{t('modelManager.inplaceInstall')}</FormLabel>
<Checkbox isChecked={inplace} onChange={onChangeInplace} size="md" />
</FormControl>
<Button size="sm" onClick={handleAddAll} isDisabled={filteredResults.length === 0}>
{t('modelManager.installAll')}
</Button>
@@ -157,7 +116,7 @@ export const ScanModelsResults = ({ results }: ScanModelResultsProps) => {
<ScrollableContent>
<Flex flexDir="column" gap={3}>
{filteredResults.map((result) => (
<ScanModelResultItem key={result.path} result={result} installModel={handleInstallOne} />
<ScanModelResultItem key={result.path} result={result} />
))}
</Flex>
</ScrollableContent>

View File

@@ -90,13 +90,11 @@ const ModelListItem = (props: ModelListItemProps) => {
cursor="pointer"
onClick={handleSelectModel}
>
<Flex gap={2} w="full" h="full" minW={0}>
<Flex gap={2} w="full" h="full">
<ModelImage image_url={model.cover_image} />
<Flex gap={1} alignItems="flex-start" flexDir="column" w="full" minW={0}>
<Flex gap={1} alignItems="flex-start" flexDir="column" w="full">
<Flex gap={2} w="full" alignItems="flex-start">
<Text fontWeight="semibold" noOfLines={1} wordBreak="break-all">
{model.name}
</Text>
<Text fontWeight="semibold">{model.name}</Text>
<Spacer />
</Flex>
<Text variant="subtext" noOfLines={1}>

View File

@@ -87,9 +87,9 @@ export const Model = () => {
<Flex flexDir="column" gap={4}>
<Flex alignItems="flex-start" gap={4}>
<ModelImageUpload model_key={selectedModelKey} model_image={data.cover_image} />
<Flex flexDir="column" gap={1} flexGrow={1} minW={0}>
<Flex flexDir="column" gap={1} flexGrow={1}>
<Flex gap={2}>
<Heading as="h2" fontSize="lg" noOfLines={1} wordBreak="break-all">
<Heading as="h2" fontSize="lg">
{data.name}
</Heading>
<Spacer />
@@ -114,7 +114,7 @@ export const Model = () => {
)}
</Flex>
{data.source && (
<Text variant="subtext" noOfLines={1} wordBreak="break-all">
<Text variant="subtext">
{t('modelManager.source')}: {data?.source}
</Text>
)}

View File

@@ -9,9 +9,7 @@ export const ModelAttrView = ({ label, value }: Props) => {
return (
<FormControl flexDir="column" alignItems="flex-start" gap={0}>
<FormLabel>{label}</FormLabel>
<Text fontSize="md" noOfLines={1} wordBreak="break-all">
{value || '-'}
</Text>
<Text fontSize="md">{value || '-'}</Text>
</FormControl>
);
};

View File

@@ -53,7 +53,7 @@ export const ModelView = () => {
</>
)}
{data.type === 'ip_adapter' && data.format === 'invokeai' && (
{data.type === 'ip_adapter' && (
<Flex gap={2}>
<ModelAttrView label={t('modelManager.imageEncoderModelId')} value={data.image_encoder_model_id} />
</Flex>

View File

@@ -48,7 +48,7 @@ export const addIPAdapterToLinearGraph = async (
if (!ipAdapter.model) {
return;
}
const { id, weight, model, clipVisionModel, beginStepPct, endStepPct, controlImage } = ipAdapter;
const { id, weight, model, beginStepPct, endStepPct, controlImage } = ipAdapter;
assert(controlImage, 'IP Adapter image is required');
@@ -58,7 +58,6 @@ export const addIPAdapterToLinearGraph = async (
is_intermediate: true,
weight: weight,
ip_adapter_model: model,
clip_vision_model: clipVisionModel,
begin_step_percent: beginStepPct,
end_step_percent: endStepPct,
image: {
@@ -84,7 +83,7 @@ export const addIPAdapterToLinearGraph = async (
};
const buildIPAdapterMetadata = (ipAdapter: IPAdapterConfig): S['IPAdapterMetadataField'] => {
const { controlImage, beginStepPct, endStepPct, model, clipVisionModel, weight } = ipAdapter;
const { controlImage, beginStepPct, endStepPct, model, weight } = ipAdapter;
assert(model, 'IP Adapter model is required');
@@ -100,7 +99,6 @@ const buildIPAdapterMetadata = (ipAdapter: IPAdapterConfig): S['IPAdapterMetadat
return {
ip_adapter_model: model,
clip_vision_model: clipVisionModel,
weight,
begin_step_percent: beginStepPct,
end_step_percent: endStepPct,

View File

@@ -61,7 +61,7 @@ export const AdvancedSettingsAccordion = memo(() => {
return (
<StandaloneAccordion label={t('accordions.advanced.title')} badges={badges} isOpen={isOpen} onToggle={onToggle}>
<Flex gap={4} alignItems="center" p={4} flexDir="column" data-testid="advanced-settings-accordion">
<Flex gap={4} alignItems="center" p={4} flexDir="column">
<Flex gap={4} w="full">
<ParamVAEModelSelect />
<ParamVAEPrecision />

View File

@@ -77,7 +77,7 @@ export const ControlSettingsAccordion: React.FC = memo(() => {
return (
<StandaloneAccordion label={t('accordions.control.title')} badges={badges} isOpen={isOpen} onToggle={onToggle}>
<Flex gap={2} p={4} flexDir="column" data-testid="control-accordion">
<Flex gap={2} p={4} flexDir="column">
<ButtonGroup size="sm" w="full" justifyContent="space-between" variant="ghost" isAttached={false}>
<Button
tooltip={t('controlnet.addControlNet')}

View File

@@ -53,7 +53,7 @@ export const GenerationSettingsAccordion = memo(() => {
isOpen={isOpenAccordion}
onToggle={onToggleAccordion}
>
<Box px={4} pt={4} data-testid="generation-accordion">
<Box px={4} pt={4}>
<Flex gap={4} flexDir="column">
<Flex gap={4} alignItems="center">
<ParamMainModelSelect />

View File

@@ -83,7 +83,7 @@ export const ImageSettingsAccordion = memo(() => {
isOpen={isOpenAccordion}
onToggle={onToggleAccordion}
>
<Flex px={4} pt={4} w="full" h="full" flexDir="column" data-testid="image-settings-accordion">
<Flex px={4} pt={4} w="full" h="full" flexDir="column">
{activeTabName === 'unifiedCanvas' ? <ImageSizeCanvas /> : <ImageSizeLinear />}
<Expander label={t('accordions.advanced.options')} isOpen={isOpenExpander} onToggle={onToggleExpander}>
<Flex gap={4} pb={4} flexDir="column">

View File

@@ -195,7 +195,6 @@ export const modelsApi = api.injectEndpoints({
url: buildModelsUrl(`scan_folder?${folderQueryStr}`),
};
},
providesTags: [{ type: 'ModelScanFolderResults', id: LIST_TAG }],
}),
getHuggingFaceModels: build.query<GetHuggingFaceModelsResponse, string>({
query: (hugging_face_repo) => {

View File

@@ -29,7 +29,6 @@ const tagTypes = [
'InvocationCacheStatus',
'ModelConfig',
'ModelInstalls',
'ModelScanFolderResults',
'T2IAdapterModel',
'MainModel',
'VaeModel',

File diff suppressed because one or more lines are too long

View File

@@ -46,7 +46,7 @@ export type LoRAModelConfig = S['LoRADiffusersConfig'] | S['LoRALyCORISConfig'];
// TODO(MM2): Can we rename this from Vae -> VAE
export type VAEModelConfig = S['VAECheckpointConfig'] | S['VAEDiffusersConfig'];
export type ControlNetModelConfig = S['ControlNetDiffusersConfig'] | S['ControlNetCheckpointConfig'];
export type IPAdapterModelConfig = S['IPAdapterInvokeAIConfig'] | S['IPAdapterCheckpointConfig'];
export type IPAdapterModelConfig = S['IPAdapterConfig'];
export type T2IAdapterModelConfig = S['T2IAdapterConfig'];
type TextualInversionModelConfig = S['TextualInversionFileConfig'] | S['TextualInversionFolderConfig'];
type DiffusersModelConfig = S['MainDiffusersConfig'];

View File

@@ -1 +1 @@
__version__ = "4.0.2"
__version__ = "4.0.1"

View File

@@ -87,11 +87,9 @@ def test_rename(
key = mm2_installer.install_path(embedding_file)
model_record = store.get_model(key)
assert model_record.path.endswith("sd-1/embedding/test_embedding.safetensors")
store.update_model(key, ModelRecordChanges(name="new model name", base=BaseModelType("sd-2")))
store.update_model(key, ModelRecordChanges(name="new_name.safetensors", base=BaseModelType("sd-2")))
new_model_record = mm2_installer.sync_model_path(key)
# Renaming the model record shouldn't rename the file
assert new_model_record.name == "new model name"
assert new_model_record.path.endswith("sd-2/embedding/test_embedding.safetensors")
assert new_model_record.path.endswith("sd-2/embedding/new_name.safetensors")
@pytest.mark.parametrize(