Compare commits

...

1773 Commits

Author SHA1 Message Date
psychedelicious
2576bb00ab Revert "Implementing support for Non-Standard LoRA Format (#7985)"
This reverts commit 1f63b60021.
2025-05-06 10:53:41 +10:00
psychedelicious
862e2a3e49 chore(ui): typegen 2025-05-05 16:09:13 -04:00
Mary Hipp
d22fd32b05 typegen 2025-05-05 16:09:13 -04:00
Mary Hipp
391e5b7f8c update schema 2025-05-05 16:09:13 -04:00
Mary Hipp
c9d2a5f59a display credit column in queue list if shouldShowCredits is true 2025-05-05 16:09:13 -04:00
Kent Keirsey
1f63b60021 Implementing support for Non-Standard LoRA Format (#7985)
* integrate loRA

* idk anymore tbh

* enable fused matrix for quantized models

* integrate loRA

* idk anymore tbh

* enable fused matrix for quantized models

* ruff fix

---------

Co-authored-by: Sam <bhaskarmdutt@gmail.com>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2025-05-05 09:40:38 -04:00
psychedelicious
a499b9f54e chore: bump version to v5.11.0rc2 2025-05-05 23:32:27 +10:00
psychedelicious
104505ea02 chore(ui): lint 2025-05-05 23:25:29 +10:00
psychedelicious
ee4002607c feat(ui): add UI to reset hf token 2025-05-05 23:25:29 +10:00
psychedelicious
fd20582cdd chore(ui): typegen 2025-05-05 23:25:29 +10:00
psychedelicious
43b0d07517 feat(api): add route to reset hf token 2025-05-05 23:25:29 +10:00
blessedcoolant
f83592a052 fix: deprecation warning in get_iso_timestamp 2025-05-05 11:45:30 +10:00
Mary Hipp
b3ee906749 add prompt validation to imagen3 graph 2025-05-01 13:02:13 -04:00
psychedelicious
5d69e9068a feat(ui): add ability to globally disable hotkeys
This will both hide the hotkey from the hotkey modal and override any other enabled status it has.
2025-05-01 10:50:34 -04:00
psychedelicious
a79136b058 fix(ui): always add selectModelsTab hotkey data to prevent unhandled exception while registering the hotkey handler 2025-05-01 10:50:34 -04:00
psychedelicious
944af4d4a9 feat(ui): show unsupported gen mode toasts as warnings instead of errors 2025-05-01 23:25:01 +10:00
psychedelicious
5e001be73a tidy(ui): remove excessive nav to mm buttons 2025-05-01 23:22:19 +10:00
psychedelicious
576a644b3a tidy(ui): modelpicker component 2025-05-01 23:22:19 +10:00
psychedelicious
703557c8a6 feat(ui): cleanup 2025-05-01 23:22:19 +10:00
psychedelicious
d59a53b3f9 feat(ui): simplify picker types 2025-05-01 23:22:19 +10:00
psychedelicious
7b8f78c2d9 fix(ui): focus bug w/ popover 2025-05-01 23:22:19 +10:00
psychedelicious
31ab9be79a feat(ui): iterate on picker 2025-05-01 23:22:19 +10:00
psychedelicious
5011fab85d fix(ui): restore FLUX Dev info popover to main model picker 2025-05-01 10:59:51 +10:00
psychedelicious
92bdb9fdcc chore(ui): remove unused exports 2025-05-01 10:59:51 +10:00
Mary Hipp
548e766c0b feat(ui): ability to disable generating with API models 2025-05-01 10:59:51 +10:00
Mary Hipp
ff897f74a1 send the list of reference images reversed to chatGPT so it matches displayed order 2025-04-30 15:56:38 -04:00
psychedelicious
3d29c996ed feat(ui): support img2img for chatgpt 4o w/ ref images 2025-04-30 13:39:05 +10:00
psychedelicious
42d57d1225 fix(ui): ref image layout 2025-04-30 13:39:05 +10:00
psychedelicious
193fa9395a fix(ui): match ref image model to main model when creating global ref image 2025-04-30 13:39:05 +10:00
psychedelicious
56cd839d5b feat(ui): support for ref images for chatgpt on canvas 2025-04-30 13:39:05 +10:00
ubansi
7b446ee40d docs: fix Contribute node import error
When I followed the Contribute Node documentation, I encountered an import error.
This commit fixes the error, which will help reduce debugging time for all future contributors.
2025-04-29 21:03:00 -04:00
Mary Hipp Rogers
17027c4070 Maryhipp/chatgpt UI (#7969)
* add GPTimage1 as allowed base model

* fix for non-disabled inpaint layers

* lots of boilerplate for adding gpt-image base model and disabling things along with imagen

* handle gpt-image dimensions

* build graph for gpt-image

* lint

* feat(ui): make chatgpt model naming consistent

* feat(ui): graph builder naming

* feat(ui): disable img2img for imagen3

* feat(ui): more naming

* feat(ui): support presigned url prefetch

* feat(ui): disable neg prompt for chatgpt

* docs(ui): update docstring

* feat(ui): fix graph building issues for chatgpt

* fix(ui): node ids for chatgpt/imagen

* chore(ui): typegen

---------

Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2025-04-29 09:38:03 -04:00
psychedelicious
13d44f47ce chore(ui): prettier 2025-04-29 09:12:49 +10:00
psychedelicious
550fbdeb1c fix(ui): more types fixes 2025-04-29 09:12:49 +10:00
psychedelicious
a01cd7c497 fix(ui): add chatgpt-4o to zod schemas that need to match autogenerated types 2025-04-29 09:12:49 +10:00
Mary Hipp
c54afd600c typegen 2025-04-29 09:12:49 +10:00
Mary Hipp
4f911a0ea8 typegen 2025-04-29 09:12:49 +10:00
Mary Hipp
fb91f48722 change base model for chatGPT 4o 2025-04-29 09:12:49 +10:00
psychedelicious
69db60a614 fix(ui): toast typo 2025-04-29 06:56:36 +10:00
Mary Hipp
c6d7f951aa typegen 2025-04-28 15:39:11 -04:00
Mary Hipp
04c005284c add gpt-image to possible base model types 2025-04-28 15:39:11 -04:00
psychedelicious
2d7f9697bf chore(ui): lint 2025-04-28 13:31:26 -04:00
psychedelicious
ae530492a2 chore(ui): typegen 2025-04-28 13:31:26 -04:00
psychedelicious
87ed1e3b6d feat(ui): do not allow imagen3 nodes in published workflows 2025-04-28 13:31:26 -04:00
psychedelicious
cc54466db9 fix(nodes): default value for UIConfigBase.tags 2025-04-28 13:31:26 -04:00
psychedelicious
cbdafe7e38 feat(nodes): allow node clobbering 2025-04-28 13:31:26 -04:00
psychedelicious
112cb76174 fix: random seed for edit mode imagen 2025-04-28 13:31:26 -04:00
psychedelicious
e56d41ab99 feat: rip out enhance prompt as toggleable option, imagen always randomizes seed 2025-04-28 13:31:26 -04:00
psychedelicious
273dfd86ab fix(ui): upscale builder 2025-04-28 13:31:26 -04:00
psychedelicious
871271fde5 feat(ui): rough out imagen3 support for canvas 2025-04-28 13:31:26 -04:00
psychedelicious
14944872c4 feat(mm): add model taxonomy for API models & Imagen3 as base model type 2025-04-28 13:31:26 -04:00
psychedelicious
07bcf3c446 feat(ui): port bbox select to native select 2025-04-28 13:31:26 -04:00
psychedelicious
8ed5585285 feat(nodes): move output metadata to BaseInvocationOutput 2025-04-28 09:19:43 -04:00
psychedelicious
5ce226a467 chore(ui): typegen 2025-04-28 09:19:43 -04:00
Mary Hipp
c64f20a72b remove output_metadata from schema 2025-04-28 09:19:43 -04:00
Mary Hipp
0c9c10a03a update schema 2025-04-28 09:19:43 -04:00
Mary Hipp
4a0df6b865 add optional output_metadata to baseinvocation 2025-04-28 09:19:43 -04:00
psychedelicious
ba165572bf chore: bump version to v5.11.0rc1 2025-04-28 10:10:50 +10:00
psychedelicious
c3d6a10603 fix(ui): handle minor breaking typing change from serialize-error 2025-04-28 09:53:08 +10:00
psychedelicious
4efc86299d fix(ui): type error in SettingsUpsellMenuItem 2025-04-28 09:53:08 +10:00
psychedelicious
e8c7cf63fd fix(ui): type error in canvas worker 2025-04-28 09:53:08 +10:00
psychedelicious
698b034190 chore(ui): bump deps 2025-04-28 09:53:08 +10:00
psychedelicious
3988128c40 feat(ui): add _all_ image outputs to gallery (including collections) 2025-04-28 09:49:04 +10:00
psychedelicious
c768f47365 fix(ui): dnd autoscroll in scrollable containers 2025-04-28 09:46:38 +10:00
psychedelicious
19a63abc54 fix(ui): hide file size on model picker when it is zero 2025-04-23 17:45:09 +10:00
psychedelicious
75ec36bf9a chore(ui): lint 2025-04-23 17:45:09 +10:00
psychedelicious
d802f8e7fb feat(ui): disable search when no options 2025-04-23 17:45:09 +10:00
psychedelicious
6873e0308d feat(ui): custom fallback for model picker when no models installed 2025-04-23 17:45:09 +10:00
psychedelicious
66eb73088e feat(ui): rename user-provided extra ctx for picker from ctx to extra to be less confusing 2025-04-23 17:45:09 +10:00
psychedelicious
ed81a13eb4 docs(ui): add some comments for picker 2025-04-23 17:45:09 +10:00
psychedelicious
fbc1aae52d feat(ui): more flexible fallbacks for model picker 2025-04-23 17:45:09 +10:00
psychedelicious
ba42c3e63f feat(ui): tooltip for compact/full model picker view 2025-04-23 17:45:09 +10:00
psychedelicious
b24e820aa0 fix(ui): flash of "select a model" when changing model 2025-04-23 17:45:09 +10:00
psychedelicious
e8f6b3b77a feat(ui): split out mainmodelpicker component 2025-04-23 17:45:09 +10:00
psychedelicious
8f13518c97 feat(ui): add clear search button to model combobox 2025-04-23 17:45:09 +10:00
psychedelicious
6afbc12074 feat(ui): when no model bases selected, show all models 2025-04-23 17:45:09 +10:00
psychedelicious
6b0a56ceb9 chore(ui): lint 2025-04-23 17:45:09 +10:00
psychedelicious
ca92497e52 feat(ui): remove description from model picker for now 2025-04-23 17:45:09 +10:00
psychedelicious
97d45ceaf2 feat(ui): model picker filter buttons 2025-04-23 17:45:09 +10:00
psychedelicious
aeb3841a6f feat(ui): wip model picker 2025-04-23 17:45:09 +10:00
psychedelicious
c14d33d3c1 tweak(ui): remove bg on ModelImage fallback 2025-04-23 17:45:09 +10:00
psychedelicious
676e59e072 chore(ui): bump react-resizable-panels to latest
This resolves a bug where SVG elements were ignored when checking whether the cursor is over a resize handle
2025-04-23 17:45:09 +10:00
psychedelicious
e7dcb6a03f feat(ui): wip model picker 2025-04-23 17:45:09 +10:00
psychedelicious
fb95b7cc2b feat(ui): wip model picker 2025-04-23 17:45:09 +10:00
psychedelicious
015dc3ac0d feat(ui): wip model picker 2025-04-23 17:45:09 +10:00
psychedelicious
9d8a71b362 feat(ui): genericizing picker 2025-04-23 17:45:09 +10:00
psychedelicious
2eb212f393 feat(ui): onSelectId -> onSelectById 2025-04-23 17:45:09 +10:00
psychedelicious
34b268c15c feat(ui): use context for stable picker state 2025-04-23 17:45:09 +10:00
psychedelicious
9a203a64dc feat(ui): render picker in portal 2025-04-23 17:45:09 +10:00
psychedelicious
d80004e056 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
de32ed23a7 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
5aed2b315d feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
48db6cfc4f feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
aa7c5c281a feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
87aeb7f889 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
3b3d6e413a feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
b6432f2de3 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
9d0a28ccae feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
c3bf0a3277 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
b516610c1e feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
677e717cd7 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
c52584e057 feat(ui): simplify ScrollableContent 2025-04-23 17:45:09 +10:00
psychedelicious
b6767441db feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
8745dbe67d feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
a565d9473e feat(ui): add useStateImperative 2025-04-23 17:45:09 +10:00
psychedelicious
4dbf07c3e0 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
f6eb4d9a6b feat(ui): toast on select for demo purposes 2025-04-23 17:45:09 +10:00
psychedelicious
5037967b82 feat(ui): just make the damn thing myself 2025-04-23 17:45:09 +10:00
psychedelicious
4930ba48ce feat(ui): just make the damn thing myself 2025-04-23 17:45:09 +10:00
psychedelicious
40d2092256 feat(ui): reworked model selection ui (WIP) 2025-04-23 17:45:09 +10:00
psychedelicious
d2e9237740 feat(ui): reworked model selection ui (WIP) 2025-04-23 17:45:09 +10:00
psychedelicious
b191b706c1 feat(ui): reworked model selection ui (WIP) 2025-04-23 17:45:09 +10:00
psychedelicious
4d0f760ec8 chore(ui): bump cmdk to latest 2025-04-23 17:45:09 +10:00
psychedelicious
65cda5365a feat(ui): remove go to mm button from node fields 2025-04-23 17:45:09 +10:00
psychedelicious
1f2d1d086f feat(ui): add <NavigateToModelManagerButton /> to model comboboxes everywhere 2025-04-23 17:45:09 +10:00
psychedelicious
418f3c3f19 feat(ui): abstract out workflow editor model combobox, ensure consistent ui for all model fields 2025-04-23 17:45:09 +10:00
psychedelicious
72173e284c fix(ui): useModelCombobox should use null for no value instead of undefined
This fixes an issue where the refiner combobox doesn't clear itself visually when clicking the little X icon to clear the selection.
2025-04-23 17:45:09 +10:00
psychedelicious
9cc13556aa feat(ui): accept callback to override navigate to model manager functionality
If a callback is provided, `<NavigateToModelManagerButton />` will render even if `disabledTabs` includes "models", and it will run the callback instead of switching tabs within the studio.

The button's tooltip is now just "Manage Models" and its icon is the same as the model manager tab's icon ([CUBE!](https://www.youtube.com/watch?v=4aGDCE6Nrz0)).
2025-04-23 17:45:09 +10:00
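A minimal sketch of the override behaviour described in the commit above. The prop and helper names (`onClickOverride`, `disabledTabs`, `switchToModelManagerTab`) are assumptions for illustration, not the actual component's API:

```tsx
import { useCallback } from 'react';

type Props = {
  // Hypothetical: the host app may supply a callback to override the default tab switch.
  onClickOverride?: () => void;
  // Hypothetical: tabs disabled by the host app's config.
  disabledTabs: string[];
  // Hypothetical: default behaviour switches to the model manager tab in the studio.
  switchToModelManagerTab: () => void;
};

export const NavigateToModelManagerButton = ({ onClickOverride, disabledTabs, switchToModelManagerTab }: Props) => {
  const onClick = useCallback(() => {
    if (onClickOverride) {
      // With a callback, run it instead of switching tabs within the studio.
      onClickOverride();
      return;
    }
    switchToModelManagerTab();
  }, [onClickOverride, switchToModelManagerTab]);

  // Without a callback, hide the button when the models tab is disabled.
  if (!onClickOverride && disabledTabs.includes('models')) {
    return null;
  }

  return (
    <button title="Manage Models" onClick={onClick}>
      Manage Models
    </button>
  );
};
```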
psychedelicious
298444f2bc chore: bump version to v5.10.1 2025-04-19 00:05:02 +10:00
psychedelicious
deb1984289 fix(mm): disable new model probe API
There is a subtle change in behaviour with the new model probe API.

Previously, checks for model types were done in a specific order. For example, we did all main model checks before LoRA checks.

With the new API, the order of checks has changed. Check ordering is as follows:
- New API checks are run first, then legacy API checks.
- New API checks are categorized by their speed. When we run new API checks, we sort them from fastest to slowest and run them in that order. This is a performance optimization.

Currently, LoRA and LLaVA models are the only model types with the new API. Checks for them are thus run first.

LoRA checks involve checking the state dict for presence of keys with specific prefixes. We expect these keys to only exist in LoRAs.

It turns out that main models may have some of these keys.

For example, this model has keys that match the LoRA prefix `lora_te_`: https://civitai.com/models/134442/helloyoung25d

Under the old probe, we'd do the main model checks first and correctly identify this as a main model. But with the new setup, we do the LoRA check first, and those pass. So we import this model as a LoRA.

Thankfully, the old probe still exists. For now, the new probe is fully disabled. It was only called in one spot.

I've also added the example affected model as a test case for the model probe. Right now, this causes the test to fail, and I've marked the test as xfail. CI will pass.

Once we enable the new API again, the xfail will pass, and CI will fail, and we'll be reminded to update the test.
2025-04-18 22:44:10 +10:00
psychedelicious
814406d98a feat(mm): siglip model loading supports partial loading
In the previous commit, the LLaVA model was updated to support partial loading.

In this commit, the SigLIP model is updated in the same way.

This model is used for FLUX Redux. It's <4GB and only ever run in isolation, so it won't benefit from partial loading for the vast majority of users. Regardless, I think it is best if we make _all_ models work with partial loading.

PS: I also fixed the initial load dtype issue, described in the prev commit. It's probably a non-issue for this model, but we may as well fix it.
2025-04-18 10:12:03 +10:00
psychedelicious
c054501103 feat(mm): llava model loading supports partial loading; fix OOM crash on initial load
The model manager has two types of model cache entries:
- `CachedModelOnlyFullLoad`: The model may only ever be loaded and unloaded as a single object.
- `CachedModelWithPartialLoad`: The model may be partially loaded and unloaded.

Partial loading is enabled by overriding certain torch layer classes, adding the ability to autocast the layer to a device on-the-fly. See `CustomLinear` for an example.

So, to take advantage of partial loading and be cached as a `CachedModelWithPartialLoad`, the model must inherit from `torch.nn.Module`.

The LLaVA classes provided by `transformers` do inherit from `torch.nn.Module`, but we wrap those classes in a separate class called `LlavaOnevisionModel`. The wrapper encapsulates both the LLaVA model and its "processor" - a lightweight class that prepares model inputs like text and images.

While it is more elegant to encapsulate both model and processor classes in a single entity, this prevents the model cache from enabling partial loading for the chunky vLLM model.

Fixing this involved a few changes.
- Update the `LlavaOnevisionModelLoader` class to operate on the vLLM model directly, instead of the `LlavaOnevisionModel` wrapper class.
- Instantiate the processor directly in the node. The processor is lightweight and does its business on the CPU. We don't need to worry about caching in the model manager.
- Remove caching support code from the `LlavaOnevisionModel` wrapper class. It's not needed, because we do not cache this class. The class now only handles running the models provided to it.
- Rename `LlavaOnevisionModel` to `LlavaOnevisionPipeline` to better represent its purpose.

These changes have a bonus effect of fixing an OOM crash when initially loading the models. This was most apparent when loading LLaVA 7B, which is pretty chunky.

The initial load is onto CPU RAM. In the old version of the loaders, we ignored the loader's target dtype for the initial load. Instead, we loaded the model at `transformers`'s "default" dtype of fp32.

LLaVA 7B is fp16 and weighs ~17GB. Loading as fp32 means we need double that amount (~34GB) of CPU RAM. Many users only have 32GB RAM, so this causes a _CPU_ OOM - which is a hard crash of the whole process.

With the updated loaders, the initial load logic now uses the target dtype for the initial load. LLaVA now needs the expected ~17GB RAM for its initial load.

PS: If we didn't make the accompanying partial loading changes, we still could have solved this OOM. We'd just need to pass the initial load dtype to the wrapper class and have it load on that dtype. But we may as well fix both issues.

PPS: There are other models whose model classes are wrappers around a torch module class, and thus cannot be partially loaded. However, these models are typically fairly small and/or are run only on their own, so they don't benefit as much from partial loading. It's the really big models (like LLaVA 7B) that benefit most from the partial loading.
2025-04-18 10:12:03 +10:00
psychedelicious
c1d819c7e5 feat(nodes): add get_absolute_path method to context.models API
Given a model config or path (presumably to a model), returns the absolute path to the model.

Check the next few commits for use-case.
2025-04-18 10:12:03 +10:00
psychedelicious
2a8e91f94d feat(ui): wrap JSON in dataviewer 2025-04-17 22:55:04 +10:00
psychedelicious
64f3e56039 chore: bump version to v5.10.0 2025-04-17 15:08:26 +10:00
Hosted Weblate
819afab230 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

translationBot(ui): update translation files

Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-04-17 11:28:02 +10:00
Linos
9fff064c55 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1887 of 1887 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1887 of 1887 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-04-17 11:28:02 +10:00
Riccardo Giovanetti
1aa8d94378 translationBot(ui): update translation (Italian)
Currently translated at 98.0% (1851 of 1887 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-04-17 11:28:02 +10:00
RyoKoba
d78bdde2c3 translationBot(ui): update translation (Japanese)
Currently translated at 56.6% (1069 of 1887 strings)

translationBot(ui): update translation (Japanese)

Currently translated at 50.8% (960 of 1887 strings)

translationBot(ui): update translation (Japanese)

Currently translated at 48.4% (912 of 1882 strings)

Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
2025-04-17 11:28:02 +10:00
psychedelicious
7b663b3432 fix(ui): scrolling in builder
I am at a loss as to the cause of this bug. The styles I needed to change to fix it haven't been changed in a couple of months, but they do seem to fix it.

Closes #7910
2025-04-17 11:24:54 +10:00
psychedelicious
9c4159915a feat(ui): add guardrails to prevent entity types being missed in useIsEntityTypeEnabled 2025-04-17 11:21:16 +10:00
psychedelicious
dbb5830027 fix(ui): useIsEntityTypeEnabled should use useMemo not useCallback
Typo/bug introduced in #7770
2025-04-17 11:21:16 +10:00
psychedelicious
4fc4dbb656 fix(ui): ensure query subs are reset in case of error 2025-04-17 11:13:41 +10:00
psychedelicious
d4f6d09cc9 fix(ui): never subscribe to dynamic prompts queries
If the request errors, we would never get to unsubscribe. The request would forever be marked as having a subscriber and never be cleared from memory.
2025-04-17 10:36:09 +10:00
psychedelicious
44e44602d3 feat(ui): remove keepUnusedDataFor for dynamic prompts
This query can have potentially large responses. Keeping them around for 24 hours is essentially a hardcoded memory leak. Use RTKQ's default of 60 seconds.
2025-04-17 10:36:09 +10:00
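A minimal sketch of the two dynamic prompts fixes above, assuming a hypothetical `processDynamicPrompts` RTK Query endpoint (the real slice, endpoint, URL, and argument shapes differ): the query keeps RTKQ's 60-second `keepUnusedDataFor` default, and the one-shot request is dispatched with `subscribe: false` and released in a `finally` block so an error can never leave a dangling subscription.

```ts
import { createApi, fetchBaseQuery } from '@reduxjs/toolkit/query/react';

// Hypothetical endpoint - the real slice and endpoint names differ.
const dynamicPromptsApi = createApi({
  reducerPath: 'dynamicPromptsApi',
  baseQuery: fetchBaseQuery({ baseUrl: '/api/v1' }),
  endpoints: (build) => ({
    processDynamicPrompts: build.query<string[], { prompt: string; max_prompts: number }>({
      query: (arg) => ({ url: 'utilities/dynamicprompts', method: 'POST', body: arg }),
      // No keepUnusedDataFor override: the RTKQ default of 60 seconds avoids
      // holding potentially large responses for 24 hours.
    }),
  }),
});

// One-shot request. `subscribe: false` means the cache entry never gains a
// subscriber, and unsubscribing in `finally` releases it even on error.
// dispatch is typed loosely for the sketch.
export const processPromptsOnce = async (dispatch: any, prompt: string) => {
  const req = dispatch(
    dynamicPromptsApi.endpoints.processDynamicPrompts.initiate({ prompt, max_prompts: 100 }, { subscribe: false })
  );
  try {
    return await req.unwrap();
  } finally {
    req.unsubscribe();
  }
};
```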
psychedelicious
36066c5f26 fix(ui): ensure dynamic prompts updates on any change to any dependent state
When users generate on the canvas or upscaling tabs, we parse prompts through dynamic prompts before invoking. Whenever the prompt or other settings change, we run dynamic prompts.

Previously, we used a redux listener to react to changes to dynamic prompts' dependent state, keeping the processed dynamic prompts synced. For example, when the user changed the prompt field, we re-processed the dynamic prompts.

This requires that all redux actions that change the dependent state be added to the listener matcher. It's easy to forget actions, though, which can result in the dynamic prompts state being stale.

For example, when resetting canvas state, we dispatch an action that resets the whole params slice, but this wasn't in the matcher. As a result, when resetting canvas, the dynamic prompts aren't updated. If the user then clicks Invoke (with an empty prompt), the last dynamic prompts state will be used.

For example:
- Generate w/ prompt "frog", get frog
- Click new canvas session
- Generate without any prompt, still get frog

To resolve this, the logic that keeps the dynamic prompts synced is moved from the listener to a hook. The way the logic is triggered is improved - it's now triggered in a useEffect, which is run when the dependent state changes. This way, it doesn't matter _how_ the dependent state changes - the changes will always be "seen", and the dynamic prompts will update.
2025-04-17 10:36:09 +10:00
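A sketch of the hook-based approach described above, with a hypothetical state shape, selectors, and action (the real InvokeAI code differs). The point is that the effect depends on the derived values themselves, so any action that changes them (including a whole-slice reset) re-runs dynamic prompt processing, with no action matcher to keep in sync:

```ts
import { useEffect } from 'react';
import { useDispatch, useSelector } from 'react-redux';

// Hypothetical state shape, selectors, and action - the real ones differ.
type RootState = { params: { positivePrompt: string; maxPrompts: number } };
const selectPositivePrompt = (s: RootState) => s.params.positivePrompt;
const selectMaxPrompts = (s: RootState) => s.params.maxPrompts;
const processDynamicPrompts = (arg: { prompt: string; maxPrompts: number }) => ({
  type: 'dynamicPrompts/process' as const,
  payload: arg,
});

export const useSyncDynamicPrompts = () => {
  const dispatch = useDispatch();
  const prompt = useSelector(selectPositivePrompt);
  const maxPrompts = useSelector(selectMaxPrompts);

  useEffect(() => {
    // Re-runs whenever any dependent value changes, regardless of which action changed it.
    dispatch(processDynamicPrompts({ prompt, maxPrompts }));
  }, [dispatch, prompt, maxPrompts]);
};
```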
psychedelicious
361c6eed4b docs: update manual install docs w/ correct pytorch indices for v5.10.0 and later 2025-04-17 10:32:41 +10:00
psychedelicious
bb154fd40f docs: update dev env docs with correct pytorch pypi index 2025-04-17 10:32:41 +10:00
psychedelicious
cbee6e6faf fix(app): remove accidentally committed tensor cache size
I had set this to zero for testing during the torch 2.6.0 upgrade and neglected to remove it.
2025-04-17 10:12:47 +10:00
psychedelicious
6a822a52b8 chore(ui): update whats new copy 2025-04-16 07:17:52 +10:00
psychedelicious
d10dc28fc2 chore: bump version to v5.10.0rc1 2025-04-16 07:17:52 +10:00
psychedelicious
20eea18c41 chore(ui): typegen 2025-04-16 06:28:22 +10:00
skunkworxdark
566282bff0 Update metadata_linked.py
added metadata_to_string_collection, metadata_to_integer_collection, metadata_to_float_collection, metadata_to_bool_collection
2025-04-16 06:28:22 +10:00
psychedelicious
e7e874f7c3 fix(ui): increase padding when fitting layers to stage 2025-04-15 07:47:39 +10:00
Eugene Brodsky
95445c1163 chore: update pre-commit syntax; add check for uv.lock needing an update 2025-04-15 07:41:32 +10:00
psychedelicious
557e0cb3e6 chore(ui): knip 2025-04-15 07:13:25 +10:00
psychedelicious
a12bf07fb3 feat(ui): add node publish denylist 2025-04-15 07:13:25 +10:00
psychedelicious
a5bc21cf50 feat(nodes): extract LaMa model url to constant 2025-04-15 07:13:25 +10:00
psychedelicious
03ca23bec2 chore: update lockfile 2025-04-15 07:06:23 +10:00
psychedelicious
e15194a45d Revert "ci: change pyproject.toml to trigger uv lock check (it should fail)"
This reverts commit b802933190.
2025-04-15 07:06:23 +10:00
psychedelicious
e71ea309e7 ci: change pyproject.toml to trigger uv lock check (it should fail) 2025-04-15 07:06:23 +10:00
psychedelicious
2513756c25 ci: fix name of uv lock checks job 2025-04-15 07:06:23 +10:00
psychedelicious
875670f713 ci: add comment to uv-lock-checks.yml 2025-04-15 07:06:23 +10:00
psychedelicious
153b148362 ci: add check for uv lockfile consistency with pyproject.toml 2025-04-15 07:06:23 +10:00
psychedelicious
7b84f8c5e8 fix(ui): do not disable image context canvas actions based on selected base model
These actions should be accessible at any time.
2025-04-10 10:50:13 +10:00
psychedelicious
0280c9b4b9 fix(ui): generation_mode metadata not set correctly 2025-04-10 10:50:13 +10:00
psychedelicious
ae8d1f26d6 fix(app): import CogView4Transformer2DModel from the module that exports it 2025-04-10 10:50:13 +10:00
psychedelicious
170ea4fb75 fix(app): add CogView4ConditioningInfo to ObjectSerializerDisk's safe_globals
needed for torch w/ weights_only=True
2025-04-10 10:50:13 +10:00
psychedelicious
e5b0f8b985 feat(app): remove cogview4 inpaint workflow
This doesn't make sense to have as a default workflow given the trickiness of producing alpha masks.
2025-04-10 10:50:13 +10:00
psychedelicious
3f656072cf feat(app): update cogview4 t2i workflow w/ form 2025-04-10 10:50:13 +10:00
psychedelicious
1d4aa93f5e chore(ui): typegen 2025-04-10 10:50:13 +10:00
psychedelicious
b182060201 chore(ui): lint 2025-04-10 10:50:13 +10:00
psychedelicious
2b2f64b232 refactor(ui): simplify useIsEntityTypeEnabled 2025-04-10 10:50:13 +10:00
psychedelicious
df32974378 fix(ui): add checks for cogview4's dimension restrictions 2025-04-10 10:50:13 +10:00
psychedelicious
ad582c8cc5 feat(nodes): rename CogView4 nodes to match naming format 2025-04-10 10:50:13 +10:00
psychedelicious
47273135ca feat(ui): add cogview4 and inpainting tags to library 2025-04-10 10:50:13 +10:00
psychedelicious
c99e65bdab feat(app): add cogview4 default workflows 2025-04-10 10:50:13 +10:00
Mary Hipp
92b726d731 update available params for cogview4 2025-04-10 10:50:13 +10:00
Mary Hipp
8837932bad create hook for managing entity type enabledness for given base model and update usage 2025-04-10 10:50:13 +10:00
Mary Hipp
9846229e52 build graph for cogview4 2025-04-10 10:50:13 +10:00
maryhipp
305c5761d0 add generation modes for cogview linear 2025-04-10 10:50:13 +10:00
Ryan Dick
3ba399779f Fix lint error. 2025-04-10 10:50:13 +10:00
Ryan Dick
46316e43f0 typegen 2025-04-10 10:50:13 +10:00
Ryan Dick
d86cd66994 Add CogView4 VAE approximation for progress images. 2025-04-10 10:50:13 +10:00
Ryan Dick
13850271ab Add inpainting to CogView4DenoiseInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
7e894ffe83 Consolidate InpaintExtension implementations for SD3 and FLUX. 2025-04-10 10:50:13 +10:00
Ryan Dick
0939030324 Support cfg_scale list in CogView4Denoise. 2025-04-10 10:50:13 +10:00
Ryan Dick
30f19dc37a Update CogView4Denoise to support image-to-image. 2025-04-10 10:50:13 +10:00
Ryan Dick
ace5e748f4 Simplify CogView4 timesteps schedule generation in preparation for timestep schedule slipping. 2025-04-10 10:50:13 +10:00
Ryan Dick
4fae8ad163 Add CogView4ImageToLatentsInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
5e75bc570a Fix bug in CogView4 noise schedule handling that was resulting in low-quality images. 2025-04-10 10:50:13 +10:00
Ryan Dick
3166b5d2ea Switch to sequential CFG for CogView4 (for now, until I sort out the padding). 2025-04-10 10:50:13 +10:00
Ryan Dick
321c2d358c Add CogView4 model loader. And various other fixes to get a CogView4 workflow running (though quality is still below expectations). 2025-04-10 10:50:13 +10:00
Ryan Dick
0338983895 Update CogView4 starter model entry with approximate bundle size. 2025-04-10 10:50:13 +10:00
Ryan Dick
f4e00ab261 Add CogView4 to frontend. 2025-04-10 10:50:13 +10:00
Ryan Dick
e1133bc53f Fix typo in BaseModelType.CogView4. 2025-04-10 10:50:13 +10:00
Ryan Dick
e1ccbd5c29 typegen 2025-04-10 10:50:13 +10:00
Ryan Dick
cf76a0b575 Add CogView4ModelLoaderInvocation. (Not wired up with frontend yet.) 2025-04-10 10:50:13 +10:00
Ryan Dick
67bfd63c73 Require the cogview4 height/width are multiples of 32. This requirement is documented here: https://huggingface.co/THUDM/CogView4-6B. I haven't tracked down the underlying source of this requirement. 2025-04-10 10:50:13 +10:00
Ryan Dick
cdad8a4fd1 Add CogView4LatentsToImageInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
5d9797945b Completed first pass of CogView4Denoise. 2025-04-10 10:50:13 +10:00
Ryan Dick
78159c3200 Simplify CogView4 timestep schedule initialization. 2025-04-10 10:50:13 +10:00
Ryan Dick
1320c4fa13 WIP - CogView4DenoiseInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
883297c809 Bump diffusers to dev version with CogView4 support. 2025-04-10 10:50:13 +10:00
Ryan Dick
bac05a7885 Add CogView4TextEncoderInvocation 2025-04-10 10:50:13 +10:00
Ryan Dick
e2c4ea8e89 Add CogView4 model probing. 2025-04-10 10:50:13 +10:00
psychedelicious
851e23d6b4 feat(ui): move size to be next to model name 2025-04-10 09:53:03 +10:00
psychedelicious
7c8c9694ce feat(ui): use filesize package to format model file size 2025-04-10 09:53:03 +10:00
Kevin Turner
52a8ad1c18 chore: rename model.size to model.file_size
to disambiguate from RAM size or pixel size
2025-04-10 09:53:03 +10:00
Kevin Turner
e537020c11 chore: cursed whitespace fight 2025-04-10 09:53:03 +10:00
Kevin Turner
c50d1d6127 test: add size field to model metadata 2025-04-10 09:53:03 +10:00
Kevin Turner
53292b3592 fix: localization for file size units 2025-04-10 09:53:03 +10:00
Kevin Turner
bcfc61b2d7 feat: show model size in model list 2025-04-10 09:53:03 +10:00
Kevin Turner
9d869fc9ce chore: typegen 2025-04-10 09:53:03 +10:00
Kevin Turner
f09aacf992 fix: ModelProbe.probe needs to return a size field 2025-04-10 09:53:03 +10:00
Kevin Turner
98260a8efc test: add size field to test model configs 2025-04-10 09:53:03 +10:00
Kevin Turner
9590e8ff39 feat: expose model storage size 2025-04-10 09:53:03 +10:00
psychedelicious
a23d90187b feat(ui): allow send-image-to-canvas to work when canvas is uninitialized
Add `useCanvasIsBusySafe()` hook. This is like `useCanvasIsBusy()`, but when the canvas is not initialized, it gracefully falls back to false instead of raising.

Because app tabs are lazy-loaded, the canvas is not initialized until the user visits that tab. If the page loads up on the workflows tab, the canvas will be uninitialized until the user clicks on it.

This graceful fallback behaviour allows actions like sending an image to canvas to work even when the canvas is not yet initialized. These actions are exposed in the image context menu, and previously were hidden when the canvas was not initialized. We can now show these actions and use them even when the canvas is uninitialized.

- Add `useCanvasIsBusySafe()` hook
- Use the new hook in the image context menu for send to canvas actions
- Do not use `<CanvasManagerProviderGate />` in the image context menu (this was hiding the actions when canvas was uninitialized)
2025-04-10 06:44:44 +10:00
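A sketch of a "safe" busy hook along the lines described above, assuming a hypothetical `CanvasManagerContext` and manager shape (the real provider/gate components and manager API differ): when the context is empty because the canvas tab has never mounted, the hook returns `false` instead of throwing.

```ts
import { createContext, useContext, useMemo, useSyncExternalStore } from 'react';

// Hypothetical manager shape and context - the real ones differ.
type CanvasManager = {
  subscribe: (listener: () => void) => () => void;
  getIsBusy: () => boolean;
};

export const CanvasManagerContext = createContext<CanvasManager | null>(null);

export const useCanvasIsBusySafe = (): boolean => {
  const manager = useContext(CanvasManagerContext);
  // Fall back to a no-op subscription and `false` when the canvas is uninitialized.
  const subscribe = useMemo(() => (manager ? manager.subscribe : () => () => {}), [manager]);
  const getIsBusy = useMemo(() => (manager ? manager.getIsBusy : () => false), [manager]);
  return useSyncExternalStore(subscribe, getIsBusy);
};
```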
psychedelicious
f655a85154 fix(ui): canvas dnd drop indicator color 2025-04-10 06:42:01 +10:00
psychedelicious
f45b494805 tidy(ui): remove extraneous calls to HTMLElement.remove()
these will be auto-gc'd when there are no more references
2025-04-09 14:00:20 +10:00
psychedelicious
d1776e0b63 feat(ui): safer use of drawImage
When calling `ctx.drawImage()`, if the image to be drawn has a width or height of 0, the call will raise.

In this change, I have carefully reviewed the call hierarchy for all of our own code that calls this method and ensured that each call has error handling.

Well, with one exception - I'm not sure how to handle errors in `invokeai/frontend/web/src/common/hooks/useClientSideUpload.ts`. But this should never be an issue in that hook - it's a Canvas problem.
2025-04-09 14:00:20 +10:00
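A minimal sketch of the kind of guard described above; the helper name and error handling are illustrative, and the real call sites handle errors individually:

```ts
export const drawImageSafe = (
  ctx: CanvasRenderingContext2D,
  image: HTMLImageElement | HTMLCanvasElement | ImageBitmap,
  dx: number,
  dy: number
): boolean => {
  // drawImage raises if the source has a width or height of 0
  // (e.g. an image that has not finished loading).
  if (image.width === 0 || image.height === 0) {
    return false;
  }
  try {
    ctx.drawImage(image, dx, dy);
    return true;
  } catch {
    // Let the caller decide how to log/surface the failure.
    return false;
  }
};
```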
psychedelicious
646887e3c9 feat(ui): save canvas/bbox to gallery saves basic metadata
- Positive prompt
- Negative prompt
- Seed
- Model (if set)

The rest is a bit complicated to derive as it comes from the graph building process.
2025-04-09 08:52:38 +10:00
Riccardo Giovanetti
e7e25a0c37 translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1849 of 1873 strings)

translationBot(ui): update translation (Italian)

Currently translated at 97.8% (1833 of 1873 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-04-08 11:01:37 +10:00
Linos
589b849e64 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1873 of 1873 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1871 of 1871 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 99.2% (1857 of 1871 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1840 of 1840 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-04-08 11:01:37 +10:00
psychedelicious
aedbc9f778 chore: prep for v5.10.0a1 2025-04-08 10:59:08 +10:00
psychedelicious
a0cf9e2e80 tweak(ui): ip adapter settings layout 2025-04-08 10:33:45 +10:00
psychedelicious
5c8f1c5666 fix(ui): use flux redux influence on regional guidance 2025-04-08 10:33:45 +10:00
psychedelicious
fd37117221 chore(ui): lint 2025-04-08 10:33:45 +10:00
psychedelicious
5956f96e57 feat(ui): add flux redux image influence to canvas 2025-04-08 10:33:45 +10:00
psychedelicious
49622c37ed fix(nodes): logic bug in flux redux node 2025-04-08 10:33:45 +10:00
psychedelicious
50387c8f64 chore(ui): typegen 2025-04-08 10:33:45 +10:00
skunkworxdark
e1538af219 Update flux_redux.py
Add down sampling and weight to redux node
2025-04-08 10:33:45 +10:00
psychedelicious
e5a0010a72 fix(ui): normalize alpha value to 0-1 when picking color on canvas 2025-04-08 08:20:49 +10:00
psychedelicious
b75d1b2473 refactor(ui): move update node logic from listener to hook 2025-04-08 08:18:17 +10:00
psychedelicious
b91bb9ba9f fix(ui): remove debug logger middleware 2025-04-08 08:18:17 +10:00
psychedelicious
a7c818bcae fix(ui): rebase import issue 2025-04-08 08:18:17 +10:00
psychedelicious
a54b255718 chore(ui): lint 2025-04-08 08:18:17 +10:00
psychedelicious
3e04baa684 feat(ui): improved undo/redo history grouping for selections and position changes 2025-04-08 08:18:17 +10:00
psychedelicious
d23db705dd feat(ui): improved undo/redo history grouping 2025-04-08 08:18:17 +10:00
psychedelicious
96a481530d refactor(ui): merge the workflow and nodes slices
This allows undo/redo history to apply to node editor and workflow details/form.
2025-04-08 08:18:17 +10:00
psychedelicious
a0b515979a Revert "correctly set is_published when loading a workflow"
This reverts commit e4b07894fd55b3a24fc006882585b6d55fe329c3.
2025-04-08 07:05:12 +10:00
Mary Hipp
2da8ac216b add mutation for unpublishing 2025-04-08 07:05:12 +10:00
Mary Hipp
1558fe9a37 correctly set is_published when loading a workflow 2025-04-08 07:05:12 +10:00
Mary Hipp
ded080ae04 show cancel icon and not retry icon on validation run queue items 2025-04-08 07:05:12 +10:00
psychedelicious
982603e051 fix(ui): use getDefaultForm when resetting form 2025-04-08 06:54:43 +10:00
psychedelicious
a23b5c3408 refactor(ui): make workflow published status server-side state
Whether a workflow is published or not shouldn't be something stored on the client. It's properly server-side state.

This change removes the `is_published` flag from redux and updates all references to the flag to use the getWorkflow query.

It also updates the socket event listener that handles session complete events. When a validation run completes, we invalidate the tags for the getWorkflow query. We need to do a bit of juggling to avoid a race condition (documented in the code). Works well though.
2025-04-08 06:54:43 +10:00
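A sketch of the server-side-state approach described above, with hypothetical endpoint, tag, and handler names (the real RTK Query slice and socket listener differ): `is_published` comes from the `getWorkflow` query response, and completing a validation run invalidates that query's tag so the status is refetched from the server.

```ts
import { createApi, fetchBaseQuery } from '@reduxjs/toolkit/query/react';

// Hypothetical API slice - real endpoint, tag, and response shapes differ.
export const workflowsApi = createApi({
  reducerPath: 'workflowsApi',
  baseQuery: fetchBaseQuery({ baseUrl: '/api/v1' }),
  tagTypes: ['Workflow'],
  endpoints: (build) => ({
    getWorkflow: build.query<{ id: string; is_published: boolean }, string>({
      query: (id) => `workflows/${id}`,
      providesTags: (_result, _error, id) => [{ type: 'Workflow' as const, id }],
    }),
  }),
});

// Hypothetical socket handler: when a validation run completes, invalidate the
// workflow's tag so its published status is refetched.
export const onValidationRunComplete = (dispatch: (action: unknown) => unknown, workflowId: string) => {
  dispatch(workflowsApi.util.invalidateTags([{ type: 'Workflow', id: workflowId }]));
};
```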
psychedelicious
c9f93b3746 refactor(ui): workflow unsaved changes tracking
Previously, we maintained an `isTouched` flag in redux state to indicate if a workflow had unsaved changes. We manually updated this whenever we changed something on the workflow.

This was tedious and error-prone. It also didn't handle undo/redo, so if you made a change to a node and undid it, we'd still think the workflow had unsaved changes.

Moving forward, we use a simpler and more robust strategy by hashing the server's version of the workflow and comparing it to the client's version of the workflow.

The hashing uses `stable-hash`, which is both fast and, well, stable. Most importantly, the ordering of keys in hashed objects does not change the resultant hash.

- Remove `isTouched` state entirely.
- Extract the logic that builds the "preview" workflow object from redux state into its own hook. This "preview" workflow is what we send to the server when saving a workflow. This "preview" workflow is effectively the client version of the workflow.
- Add `useDoesWorkflowHaveUnsavedChanges()` hook, which compares the hash of the client workflow and server workflow (if it exists).
- Add `useIsWorkflowUntouched()` hook, which compares the hash of the client workflow and the initial workflow that you get when you click new workflow.
- Remove `reactflow` workaround in the nodes slice undo/redo filter. When we set the nodes state while loading a workflow, `reactflow` emits a nodes size/placement change event. This triggered our `isTouched` flag logic and marked the workflow as unsaved right from the get-go. With the new strategy to track touched status, this workaround can be removed.
- Update all logic that tracked the old `isTouched` flag to use the new hooks.
2025-04-08 06:54:43 +10:00
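A sketch of the hash-comparison strategy described above. To keep the example self-contained it uses a tiny sorted-key JSON stringify as a stand-in for the `stable-hash` package, and the workflow shape and helper are assumptions:

```ts
// Stand-in for `stable-hash`: stringify with sorted object keys so key order
// does not change the result. The real package is faster and more robust.
const stableHash = (value: unknown): string =>
  JSON.stringify(value, (_key, v) =>
    v && typeof v === 'object' && !Array.isArray(v)
      ? Object.keys(v)
          .sort()
          .reduce<Record<string, unknown>>((acc, k) => {
            acc[k] = (v as Record<string, unknown>)[k];
            return acc;
          }, {})
      : v
  );

// Hypothetical workflow shape.
type Workflow = { name: string; nodes: unknown[]; edges: unknown[]; form: unknown };

// No isTouched flag: unsaved changes are simply "client hash differs from server hash".
export const doesWorkflowHaveUnsavedChanges = (
  clientWorkflow: Workflow,
  serverWorkflow: Workflow | undefined
): boolean => {
  if (!serverWorkflow) {
    // Never saved to the server yet.
    return true;
  }
  return stableHash(clientWorkflow) !== stableHash(serverWorkflow);
};
```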
psychedelicious
e381024cc0 fix(ui): remove debug logger middleware from store setup
Accidentally left in from prev change
2025-04-08 06:54:43 +10:00
psychedelicious
bb65884040 refactor(ui): workflow form root element is a constant
Previously, the workflow form's root element id was random. Every time we reset the workflow editor, the root id changed. This makes it difficult to check if the workflow editor is untouched (in its default state).

Now the root element's id is simply "root". I can't imagine any way that this would break anything.
2025-04-08 06:54:43 +10:00
psychedelicious
920339dbeb refactor(ui): split out the modal isolator component 2025-04-08 06:54:43 +10:00
psychedelicious
0f618bdbcb refactor(ui): split out the hook isolator component 2025-04-08 06:54:43 +10:00
psychedelicious
8294e2cdea feat(mm): support size calculation for onnx models 2025-04-07 11:37:55 +10:00
psychedelicious
7da43be4b7 docs: fix incorrect filename 2025-04-07 10:57:32 +10:00
psychedelicious
8561e9e540 docs: remove legacy scripts documentation 2025-04-07 10:57:32 +10:00
psychedelicious
b0d5e7e3d8 feat(app): restore "Using torch device" message on startup 2025-04-07 10:56:26 +10:00
Eugene Brodsky
ab2d203d5e fix(build): re-add sentencepiece which is apparently needed by gguf, but is not defined as its dependency 2025-04-04 16:26:20 -04:00
Eugene Brodsky
eae5c54091 fix(docker): another pip install is needed in docker build after copying sources 2025-04-04 16:26:20 -04:00
Mary Hipp
ee2b486e8b fix badge for validation run 2025-04-04 11:38:40 -04:00
psychedelicious
a2c7050832 docs: update README.md 2025-04-04 18:42:13 +11:00
psychedelicious
cd090eb76f build: fix path in build script 2025-04-04 18:42:13 +11:00
psychedelicious
3348755e6e ci: fix name of build wheel workflow 2025-04-04 18:42:13 +11:00
psychedelicious
d6dbdaacd1 chore: bump version to v5.10.0dev4 2025-04-04 18:42:13 +11:00
psychedelicious
1c6fa1ad18 ci: update workflows to use revised build scripts 2025-04-04 18:42:13 +11:00
psychedelicious
39bed90eda build: remove installer & convert installer build script to only build the wheel 2025-04-04 18:42:13 +11:00
psychedelicious
c0e48193a7 chore: bump version to v5.10.0dev3 2025-04-04 18:42:13 +11:00
psychedelicious
41677394c0 chore: update uv.lock 2025-04-04 18:42:13 +11:00
psychedelicious
405cfd46e7 build: remove pin on spandrel dependency 2025-04-04 18:42:13 +11:00
psychedelicious
9cc9a5c8b0 build: add comment about torchsde to pyproject 2025-04-04 18:42:13 +11:00
psychedelicious
ddc0461882 build: remove pin on gguf dependency
This allows it to pull in sentencepiece on its own. In 0.10.0, it didn't have this package listed as a dependency, but in recent releases it does. So we are able to remove sentencepiece as an explicit dep.
2025-04-04 18:42:13 +11:00
psychedelicious
0f09091a26 build: remove unused clip_anytorch dependency 2025-04-04 18:42:13 +11:00
psychedelicious
dedb77b6f2 build: remove unused pytorch-lightning dependency 2025-04-04 18:42:13 +11:00
psychedelicious
89f8dbee6c build: remove unused pyreadline3 dependency 2025-04-04 18:42:13 +11:00
psychedelicious
8b0dc8ce84 build: remove unused pyperclip dependency 2025-04-04 18:42:13 +11:00
psychedelicious
018121e407 build: remove unused pympler dependency 2025-04-04 18:42:13 +11:00
psychedelicious
095025b637 build: remove unused scikit-image dependency 2025-04-04 18:42:13 +11:00
psychedelicious
ed8487659e build: remove unused npyscreen dependency 2025-04-04 18:42:13 +11:00
psychedelicious
3745d2be0c build: remove unused torchmetrics dependency 2025-04-04 18:42:13 +11:00
psychedelicious
b5206e204f build: remove unused datasets dependency 2025-04-04 18:42:13 +11:00
psychedelicious
b237ccbdd8 build: remove unused click dependency 2025-04-04 18:42:13 +11:00
psychedelicious
224ebc72ae build: remove unused omegaconf dependency 2025-04-04 18:42:13 +11:00
psychedelicious
05c3d47be9 build: remove unused facexlib dependency 2025-04-04 18:42:13 +11:00
psychedelicious
a4d709c169 build: remove unused timm dependency 2025-04-04 18:42:13 +11:00
psychedelicious
5a8e95c700 chore(ui): typegen 2025-04-04 18:42:13 +11:00
psychedelicious
e630f364df chore: update uv.lock 2025-04-04 18:42:13 +11:00
psychedelicious
9c287038e4 build: remove unused matplotlib dep 2025-04-04 18:42:13 +11:00
psychedelicious
8d32ede082 tidy(nodes): remove matplotlib dependency
It was only used for a single color conversion function. Replaced with cv2 code, tested functionality to confirm it works the same.
2025-04-04 18:42:13 +11:00
psychedelicious
bab0b6d069 build: move humanize to test deps 2025-04-04 18:42:13 +11:00
psychedelicious
8e013ef3be build: remove unused albumentations dependency
This is not used
2025-04-04 18:42:13 +11:00
psychedelicious
8188484a40 tidy: delete unused file 2025-04-04 18:42:13 +11:00
psychedelicious
5d8fe9fb56 build: remove controlnet_aux dependency, remove pin for timm 2025-04-04 18:42:13 +11:00
psychedelicious
8d3743c6f2 tidy(nodes): rename controlnet_image_processors.py -> controlnet.py 2025-04-04 18:42:13 +11:00
psychedelicious
986b7426d2 tidy(nodes): remove unused old dw openpose detector class 2025-04-04 18:42:13 +11:00
psychedelicious
8d8150b47e tidy(nodes): remove deprecated controlnet "processor" nodes 2025-04-04 18:42:13 +11:00
psychedelicious
ae3944b4e0 build: upgrade python to 3.12 in pins 2025-04-04 18:42:13 +11:00
psychedelicious
6f0c5c9c05 build: update uv.lock 2025-04-04 18:42:13 +11:00
psychedelicious
89c999ca58 fix(backend): remove mps_fixes
The fixes in this module monkeypatched `torch` to resolve some issues with FP16 on macOS. These issues have long since been resolved.

Included in the now-removed fixes is `CustomSlicedAttentionProcessor`, which is intended to reduce memory requirements for MPS. This overrides `diffusers`' own `SlicedAttentionProcessor`.

Unfortunately, `attention_type: sliced` produces hot garbage with the fixes and black images without the fixes. So this class appears to now be a moot point.

Regardless, SDPA is supported on MPS and very efficient, so sliced attention is largely obsolete.
2025-04-04 18:42:13 +11:00
psychedelicious
89cefc6a88 chore: bump version to v5.10.0dev2
Doing a dev build so I can test the launcher.
2025-04-04 18:42:13 +11:00
psychedelicious
79e384e71c build: downgrade python to 3.11 in pins 2025-04-04 18:42:13 +11:00
psychedelicious
3ebe96765a build: restore prev setuptools config to fix wheel build 2025-04-04 18:42:13 +11:00
psychedelicious
97e158f13a ci: use py3.12 to build installer 2025-04-04 18:42:13 +11:00
psychedelicious
2b1a36ef4a experiment: add pins.json to repo
The launcher will query this file to get the pins needed for installation
2025-04-04 18:42:13 +11:00
psychedelicious
6824b4b036 chore: bump version to v5.10.0dev1
Doing a dev build so I can test the launcher.
2025-04-04 18:42:13 +11:00
psychedelicious
e8a09a5ed8 chore: update uv.lock for latest pydantic
Ran `uv lock --upgrade-package pydantic`
2025-04-04 18:42:13 +11:00
psychedelicious
c4df7d3cb9 fix(ui): handle updated schema structure during invocation parsing
In https://github.com/pydantic/pydantic/pull/10029, pydantic made an improvement to its generated JSON schemas (OpenAPI schemas). The previous and new generated schemas both meet the schema spec.

When we parse the OpenAPI schema to generate node templates, we use a typeguard to narrow schema components from generic OpenAPI schema objects to node field schema objects. The narrower node field schema objects contain extra data.

For example, they contain a `field_kind` attribute that indicates whether the field is an input field or output field. These extra attributes are not part of the OpenAPI spec (but the spec does allow for this extra data).

This typeguard relied on a pydantic implementation detail. This was changed in the linked pydantic PR, which released with v2.9.0. With the change, our typeguard rejects input field schema objects, causing parsing to fail with errors/warnings like `Unhandled input property` in the JS console.

In the UI, this causes many fields - mostly model fields - to not show up in the workflow editor.

The fix for this is very simple - instead of relying on an implementation detail for the typeguard, we can check if the incoming schema object has any of our invoke-specific extra attributes. Specifically, we now look for the presence of the `field_kind` attribute on the incoming schema object. If it is present, we know we are dealing with an invocation input field and can parse it appropriately.
2025-04-04 18:42:13 +11:00
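A sketch of the fixed typeguard described above, with minimal local types standing in for the full OpenAPI schema types: instead of relying on how pydantic happens to structure the generated schema, it checks for the invoke-specific `field_kind` attribute directly.

```ts
// Minimal stand-in types - the real code narrows full OpenAPI schema objects.
type SchemaObject = { type?: string; title?: string; [key: string]: unknown };

type InvocationFieldSchema = SchemaObject & {
  // Invoke-specific extra attribute indicating whether the field is an input or output field.
  field_kind: 'input' | 'output';
};

// Narrow by presence of the extra attribute, not by pydantic implementation details.
export const isInvocationFieldSchema = (schema: SchemaObject): schema is InvocationFieldSchema =>
  'field_kind' in schema;
```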
psychedelicious
b9e76afbf5 chore: typegen 2025-04-04 18:42:13 +11:00
psychedelicious
dfd8b8f220 chore: remove pydantic pin 2025-04-04 18:42:13 +11:00
psychedelicious
a089e1bf5c chore(ui): typegen 2025-04-04 18:42:13 +11:00
psychedelicious
875f3fe779 tests: update tests/test_object_serializer_disk.py 2025-04-04 18:42:13 +11:00
psychedelicious
5fa2cf59e2 fix(app): add trusted classes to torch safe globals to prevent errors when loading them
In `ObjectSerializerDisk`, we use `torch.load` to load serialized objects from disk. With torch 2.6.0, torch defaults to `weights_only=True`. As a result, torch will raise when attempting to deserialize anything with an unrecognized class.

For example, our `ConditioningFieldData` class is untrusted. When we load conditioning from disk, we will get a runtime error.

Torch provides a method to add trusted classes to an allowlist. This change adds an arg to `ObjectSerializerDisk` to add a list of safe globals to the allowlist and uses it for both `ObjectSerializerDisk` instances.

Note: My first attempt inferred the class from the generic type arg that `ObjectSerializerDisk` accepts, and added that to the allowlist. Unfortunately, this doesn't work.

For example, `ConditioningFieldData` has a `conditionings` attribute that may be one of several other untrusted classes representing model-specific conditioning data. So, even if we allowlist `ConditioningFieldData`, loading will fail when torch deserializes the `conditionings` attribute.
2025-04-04 18:42:13 +11:00
Eugene Brodsky
4d58c222f3 resolve conflict between timm version needed by LLaVA and controlnet-aux 2025-04-04 18:42:13 +11:00
Eugene Brodsky
c27142bb02 reintroduce GPU_DRIVER build arg in CI container build, as it has apparently been removed 2025-04-04 18:42:13 +11:00
Eugene Brodsky
e3c441fda4 remove obsolete dependencies that were used by the CLI 2025-04-04 18:42:13 +11:00
Eugene Brodsky
6bb102f860 modify docs for python 3.12 2025-04-04 18:42:13 +11:00
Eugene Brodsky
5c45ef1a8c update nodes schema / typegen 2025-04-04 18:42:13 +11:00
Eugene Brodsky
7a218a8040 update uv.lock 2025-04-04 18:42:13 +11:00
Eugene Brodsky
929d86768f refactor Dockerfile; get rid of multi-stage build; upgrade to python 3.12 2025-04-04 18:42:13 +11:00
Eugene Brodsky
3676160496 use uv.lock to pin dependencies 2025-04-04 18:42:13 +11:00
Eugene Brodsky
8e6ebb537b upgrade pytorch and unpin some of the strict dependency pins to facilitate upgrading co-dependencies.
we will use uv.lock to ensure reproducibility
2025-04-04 18:42:13 +11:00
Chantell
2b5da91beb Update manual.md
Removed a redundant package specifier in step 6.
2025-04-04 16:52:04 +11:00
psychedelicious
74bede14be feat(ui): put all validation run data into a single object 2025-04-04 11:38:04 +11:00
psychedelicious
04ea3c491a chore(ui): typegen 2025-04-04 11:38:04 +11:00
psychedelicious
38e7b23d18 feat(api): put all validation run data into a single object 2025-04-04 11:38:04 +11:00
psychedelicious
c052846e05 feat(ui): ensure workflow id is passed when doing validation run 2025-04-04 11:38:04 +11:00
psychedelicious
af3a31dfec chore(ui): typegen 2025-04-04 11:38:04 +11:00
psychedelicious
571710fab6 feat(app): add optional published_workflow_id to enqueue payloads and queue item 2025-04-04 11:38:04 +11:00
psychedelicious
a175a5c252 feat(ui): add safeguard against accidentally loading non-library workflow as library workflow 2025-04-04 11:38:04 +11:00
psychedelicious
8b3c36c6fa refactor(ui): better UX for choosing output nodes 2025-04-04 11:38:04 +11:00
psychedelicious
b9ffacd4bf fix(ui): disable publish button when not ready to enqueue (i.e. invalid graph) 2025-04-04 11:38:04 +11:00
psychedelicious
ae45fc8a74 gh: update codeowners
- Add @psychedelicious as codeowner for docs
- Remove inactive contributors
2025-04-03 18:34:39 -04:00
psychedelicious
85db9c65e5 fix(ui): add missing tkey 2025-04-03 12:42:28 +11:00
psychedelicious
ddddaef7ca refactor(ui): use dedicated allowPublishWorkflows instead of disabledFeatures 2025-04-03 12:42:28 +11:00
psychedelicious
e4678201cb feat(ui): add conditionally-enabled workflow publishing ui
This is a squash of a lot of scattered commits that became very difficult to clean up and make individually. Sorry.

Besides the new UI, there are a number of notable changes:
- Publishing logic is disabled in OSS by default. To enable it, provide a `disabledFeatures` prop _without_ "publishWorkflow".
- Enqueuing a workflow is no longer handled in a redux listener. It was hard to track the state of the enqueue logic in the listener. It is now in a hook. I did not migrate the canvas and upscaling tabs - their enqueue logic is still in the listener.
- When queueing a validation run, the new `useEnqueueWorkflows()` hook will update the payload with the required data for the run.
- Some logic is added to the socket event listeners to handle workflow publish runs completing.
- The workflow library side nav has a new "published" view. It is hidden when the "publishWorkflow" feature is disabled.
- I've added `Safe` and `OrThrow` versions of some workflows hooks. These hooks typically retrieve some data from redux. For example, a node. The `Safe` hooks return the node or null if it cannot be found, while the `OrThrow` hooks return the node or raise if it cannot be found. The `OrThrow` hooks should be used within one of the gate components. These components use the `Safe` hooks and render a fallback if e.g. the node isn't found. This change is required for some of the publish flow UI.
- Add support for locking the workflow editor. When locked, you can pan and zoom but that's it. Currently, it is only locked during publish flow and if a published workflow is opened.
2025-04-03 12:42:28 +11:00
psychedelicious
d66fdfde71 chore(ui): typegen 2025-04-03 12:42:28 +11:00
psychedelicious
08ee08557b feat(app): add noop api validation run stuff to routes and methods 2025-04-03 12:42:28 +11:00
psychedelicious
496f1262c6 feat(app): truncate warnings for invalid model config in db
This message is logged _every_ time we retrieve a list of models if there is an invalid model. Previously it logged the _whole_ row which can be a lot of data. Truncate the row to 64 characters to reduce log pollution.
2025-04-03 12:42:28 +11:00
psychedelicious
188d52e4a5 chore(ui): bump tsafe to latest 2025-04-03 12:42:28 +11:00
Riku
db03c196a1 translationBot(ui): update translation (German)
Currently translated at 66.8% (1230 of 1840 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2025-04-03 07:42:43 +11:00
Riccardo Giovanetti
6bc36b697d translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1818 of 1840 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.6% (1816 of 1840 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.7% (1816 of 1839 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-04-03 07:42:43 +11:00
Linos
b7d71d3028 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1840 of 1840 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1838 of 1838 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-04-03 07:42:43 +11:00
psychedelicious
fa1ebd9d2f fix(ui): do not switch between images when focused on a tab element
Arrow keys should only navigate between tabs, not gallery images.
2025-04-03 07:40:10 +11:00
psychedelicious
eed5d02069 fix(ui): handling for invalid edges when loading workflows
Previously, reactflow appears to have handled an edge case when using its `applyChanges` utility. If a change was provided without an item, it would skip that change. For example, an "add edge" change that somehow passed `null` as the edge, instead of a valid edge.

In our workflow loading and validation logic, invalid edges were removed from the array using `delete edges[i]`. This left "holes" in the array of edges. We then asked `reactflow` to add these edges to state. When it encountered one of the "holes", it skipped over it.

In a recent release (unsure which, somewhere between the latest v11 and ~v12.4) this seems to have changed. It no longer skips over the "holes" and instead trusts the data. This can cause a couple issues:
- Error when loading the workflow if `reactflow` attempts to do anything with the nonexistent edge.
- If somehow the workflow makes it into state with "holes" in the array of edges, all sorts of other stuff breaks when our code does anything with the nonexistent edge.

Two-part fix:
- Update the invalid edge handling to not use `delete edges[i]`. Instead, as we check each edge, we add invalid ones to a set. Then, after all the checks are finished, filter out the invalid edges. The resultant edges array has no holes.
- Simplify the logic around setting nodes and edges in redux. Previously we were using `reactflow`'s `applyChanges` utils, but this does literally nothing except take extra CPU cycles. We can simply set the loaded nodes and edges directly in redux. Perhaps we were using `applyChanges` because it addressed the "holes" issue? Not sure. But we don't need it now.

Closes #7868
2025-04-03 07:37:49 +11:00
psychedelicious
3650d91045 chore(ui): bump @xyflow/react to latest 2025-04-03 07:37:49 +11:00
Eugene Brodsky
6c7d08cacb Change timm and controlnet-aux pins to fix LLaVA model support (#7846)
## Summary

`timm` below 1.0.0 prevents llava models from working (broken in transformers), but `controlnet-aux` pins `timm` to an earlier version because otherwise it was breaking the ZoeDepth controlnet.

We don't use ZoeDepth (replaced by depthAnything), and downgrading controlnet-aux seems to be acceptable.

more context here:

https://github.com/huggingface/controlnet_aux/issues/106
https://github.com/huggingface/controlnet_aux/pull/101


Note that this results in some warnings on startup, stemming from
controlnet-aux:

![image](https://github.com/user-attachments/assets/fa908837-6154-42a2-a93b-eb5e363f5783)

we can probably silence the warnings as a separate enhancement

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-04-01 21:16:40 -04:00
Eugene Brodsky
bb1c40f222 Merge branch 'main' into pin-timm-for-llava 2025-04-01 21:10:30 -04:00
jazzhaiku
bfb117d0e0 Port LoRA to new classification API (#7849)
## Summary

- Port LoRA to new classification API
- Add 2 additional test cases (ControlLora and Flux Diffusers LoRA)
- Moved `ModelOnDisk` to its own module

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-04-01 08:05:48 +11:00
jazzhaiku
b31c1022c3 Merge branch 'main' into lora-classification 2025-04-01 07:58:36 +11:00
Mary Hipp
a5851ca31c fix from leftover testing 2025-03-31 12:45:53 -04:00
Mary Hipp
77bf5c15bb GET presigned URLs directly instead of trying to use redirects 2025-03-31 12:45:53 -04:00
Eugene Brodsky
d26b7a1a12 Merge branch 'main' into pin-timm-for-llava 2025-03-31 11:37:29 -04:00
psychedelicious
595133463e feat(nodes): add methods to invalidate invocation typeadapters 2025-03-31 19:15:59 +11:00
psychedelicious
6155f9ff9e feat(nodes): move invocation/output registration to separate class 2025-03-31 19:15:59 +11:00
psychedelicious
7be87c8048 refactor(nodes): simpler logic for baseinvocation typeadapter handling 2025-03-31 19:15:59 +11:00
jazzhaiku
9868c3bfe3 Merge branch 'main' into lora-classification 2025-03-31 16:43:26 +11:00
psychedelicious
8b299d0bac chore: prep for v5.9.1 2025-03-31 13:40:07 +11:00
psychedelicious
a44bfb4658 fix(mm): handle FLUX models w/ diff in_channels keys
Before FLUX Fill was merged, we didn't do any checks for the model variant. We always returned "normal".

To determine if a model is a FLUX Fill model, we need to check the state dict for a specific key. Initially, this logic was too strict and rejected quantized FLUX models. This issue was resolved, but it turns out there is another failure mode - some fine-tunes use a different key.

This change further reduces the strictness, handling the alternate key and also falling back to "normal" if we don't see either key. This effectively restores the previous probing behaviour for all FLUX models.
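A minimal sketch of the loosened check, assuming the input-channel count can be read from a state-dict key (the key names, helper name, and return values below are illustrative, not the exact ones used by the probe):

```python
FLUX_FILL_IN_CHANNELS = 384

def flux_variant_from_state_dict(state_dict: dict) -> str:
    for key in ("img_in.weight", "model.diffusion_model.img_in.weight"):  # assumed keys
        weight = state_dict.get(key)
        if weight is not None:
            # Only FLUX Fill is special-cased; any other channel count means "normal".
            return "inpaint" if weight.shape[-1] == FLUX_FILL_IN_CHANNELS else "normal"
    return "normal"  # neither key present: preserve the previous probing behaviour
```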

Closes #7856
Closes #7859
2025-03-31 12:32:55 +11:00
psychedelicious
96fb5f6881 feat(ui): disable denoising strength when selected models flux fill 2025-03-31 11:31:02 +11:00
psychedelicious
4109ea5324 fix(nodes): expanded masks not 100% transparent outside the fade out region
The polynomial fit isn't perfect and we end up with alpha values of 1 instead of 0 when applying the mask. This in turn causes issues on canvas where outputs aren't 100% transparent and individual layer bbox calculations are incorrect.
2025-03-31 11:17:00 +11:00
jazzhaiku
f6c2ee5040 Merge branch 'main' into lora-classification 2025-03-31 09:01:16 +11:00
Billy
965753bf8b Ruff formatting 2025-03-31 08:18:00 +11:00
Billy
40c53ab95c Guard 2025-03-29 09:58:02 +11:00
psychedelicious
aaa6211625 chore(backend): ruff C420 2025-03-28 18:28:32 -04:00
psychedelicious
f6d770eac9 ci: add python 3.12 to test matrix 2025-03-28 18:28:32 -04:00
psychedelicious
47cb61cd62 ci: remove python 3.10 from test matrix 2025-03-28 18:28:32 -04:00
psychedelicious
b0fdc8ae1c ci: bump linux-cpu test runner to ubuntu 24.04 2025-03-28 18:28:32 -04:00
psychedelicious
ed9b30efda ci: bump uv to 0.6.10 2025-03-28 18:28:32 -04:00
psychedelicious
168e5eeff0 ci: use uv in typegen-checks
ci: use uv in typegen-checks to generate types

experiment: simulate typegen-checks failure

Revert "experiment: simulate typegen-checks failure"

This reverts commit f53c6876fe8311de236d974194abce93ed84930c.
2025-03-28 18:28:32 -04:00
psychedelicious
7acaa86bdf ci: get ci working with uv instead of pip
Lots of squashed experimentation heh:

ci: manually specify python version in tests

ci: whoops typo in ruff cmds

ci: specify python versions for uv python install

ci: install python verbosely

ci: try forcing python preference?

ci: try forcing python preference a different way?

ci: try in a venv?

ci: it works, but try without venv

ci: oh maybe we need --preview?

ci: poking it with a stick

ci: it works, add summary to pytest output

ci: fix pytest output

experiment: simulate test failure

Revert "experiment: simulate test failure"

This reverts commit b99ca512f6e61a2a04a1c0636d44018c11019954.

ci: just use default pytest output

cI: attempt again to use uv to install python

cI: attempt again again to use uv to install python

Revert "cI: attempt again again to use uv to install python"

This reverts commit 3cba861c90738081caeeb3eca97b60656ab63929.

Revert "cI: attempt again to use uv to install python"

This reverts commit b30f2277041dc999ed514f6c594c6d6a78f5c810.
2025-03-28 18:28:32 -04:00
psychedelicious
96c0393fe7 ci: bump ruff to 0.11.2
Need to bump both CI and pyproject.toml at the same time
2025-03-28 18:28:32 -04:00
psychedelicious
403f795c5e ci: remove linux-cuda-11_7 & linux-rocm-5_2 from test matrix
We only have CPU runners, so these tests are not doing anything useful.
2025-03-28 18:28:32 -04:00
psychedelicious
c0f88a083e ci: use uv for python-tests 2025-03-28 18:28:32 -04:00
psychedelicious
542b182899 ci: use uv for python-checks 2025-03-28 18:28:32 -04:00
Mary Hipp
3f58c68c09 fix tag invalidation 2025-03-28 10:52:27 -04:00
Mary Hipp
e50c7e5947 restore multiple key 2025-03-28 10:52:27 -04:00
Mary Hipp
4a83700fe4 if clientSideUploading is enabled, handle bulk uploads using that flow 2025-03-28 10:52:27 -04:00
Eugene Brodsky
c9992914d6 Merge branch 'main' into pin-timm-for-llava 2025-03-28 09:20:30 -04:00
jazzhaiku
c25f6d1f84 Merge branch 'main' into lora-classification 2025-03-28 12:32:22 +11:00
jazzhaiku
a53e1ccf08 Small improvements (#7842)
## Summary

- Extend `ModelOnDisk` with caching, type hints, default args
- Fail early if there is an error classifying a config

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-03-28 12:21:41 +11:00
jazzhaiku
1af9930951 Merge branch 'main' into small-improvements 2025-03-28 12:11:09 +11:00
Billy
c276c1cbee Comment 2025-03-28 10:57:46 +11:00
Billy
c619348f29 Extract ModelOnDisk to its own module 2025-03-28 10:35:13 +11:00
psychedelicious
c6f96613fc chore(ui): typegen 2025-03-28 08:14:06 +11:00
psychedelicious
258bf736da fix(nodes): handle zero fade size (e.g. mask blur 0)
Closes #7850
2025-03-28 08:14:06 +11:00
Billy
0d75c99476 Caching 2025-03-27 17:55:09 +11:00
Billy
323d409fb6 Make ruff happy 2025-03-27 17:47:57 +11:00
Billy
f251722f56 LoRA classification API 2025-03-27 17:47:01 +11:00
psychedelicious
7004fde41b fix(mm): vllm model calculates its own size 2025-03-27 09:36:14 +11:00
jazzhaiku
c9dc27afbb Merge branch 'main' into small-improvements 2025-03-27 08:14:48 +11:00
Billy
efd14ec0e4 Make ruff happy 2025-03-27 08:11:39 +11:00
Billy
21ee2b6251 Merge branch 'small-improvements' of github.com:invoke-ai/InvokeAI into small-improvements 2025-03-27 08:10:38 +11:00
Billy
82dd2d508f Deprecate checkpoint as file, diffusers as directory terminology 2025-03-27 08:10:12 +11:00
psychedelicious
ffb5f6c6a6 chore: bump version to v5.9.0 2025-03-27 08:08:44 +11:00
psychedelicious
5c5fff9ecb chore(ui): update whatsnew 2025-03-27 08:08:44 +11:00
psychedelicious
9ca071819b chore(nodes): remove beta/prototype flag from a lot of stable nodes 2025-03-27 08:08:44 +11:00
psychedelicious
b14d8e8192 chore(nodes): mark llava_onevision_vllm as beta 2025-03-27 08:08:44 +11:00
Eugene Brodsky
3f12a43e75 remove pin for controlnet-aux and pin timm to a version that works with llava
timm < 1.0.0 prevents llava models from working (broken in transformers), but controlnet-aux pinned it to an earlier version because otherwise it was breaking the ZoeDepth controlnet.

We don't use ZoeDepth (replaced by depthAnything), and downgrading controlnet-aux seems to be acceptable.

more context here:

https://github.com/huggingface/controlnet_aux/issues/106
https://github.com/huggingface/controlnet_aux/pull/101
2025-03-26 16:58:18 -04:00
jazzhaiku
5a59f6e3b8 Merge branch 'main' into small-improvements 2025-03-27 07:38:13 +11:00
Billy
60b5aef16a Log error -> warning 2025-03-27 06:56:22 +11:00
jazzhaiku
35222a8835 Taxonomy (#7833)
## Summary

This PR moves type definitions out of `config.py` into a new
`taxonomy.py` module.
The goal is to reduce clutter in `config.py`, and to resolve circular
import issues by isolating these types in a dedicated module with
(almost) no internal dependencies.
Because so many places import these definitions, these changes touch 73
files.

Additional changes:
- Removed star imports using "removestar" tool
- Added the commit to `.git-blame-ignore-revs` to avoid noise in git
blame history


## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-03-26 22:44:41 +11:00
Billy
0e8b5484d5 Error handling 2025-03-26 19:31:57 +11:00
Billy
454506c83e Type hints 2025-03-26 19:12:49 +11:00
Billy
8f6ab67376 Logs 2025-03-26 16:34:32 +11:00
Billy
5afcc7778f Redundant 2025-03-26 16:32:19 +11:00
Billy
325e07d330 Error handling 2025-03-26 16:30:45 +11:00
Billy
a016bdc159 Add todo 2025-03-26 16:17:26 +11:00
Billy
a14f0b2864 Fail early on invalid config 2025-03-26 16:10:32 +11:00
Billy
721483318a Extend ModelOnDisk 2025-03-26 16:10:00 +11:00
jazzhaiku
be04743649 Merge branch 'main' into taxonomy 2025-03-26 15:09:26 +11:00
psychedelicious
92f0c28d6c fix(ui): correctly render whitespace in strings in string generator previews
This is a visual issue - the underlying strings are not trimmed.

Closes #7830
2025-03-26 13:52:31 +11:00
Billy
a6b94e8ca4 Revert some files 2025-03-26 13:18:50 +11:00
Billy
00b11ef795 Git blame ignore revs 2025-03-26 12:56:04 +11:00
Billy
182580ff69 Imports 2025-03-26 12:55:10 +11:00
Billy
8e9d5c1187 Ruff formatting 2025-03-26 12:30:31 +11:00
Billy
99aac5870e Remove star imports 2025-03-26 12:27:00 +11:00
psychedelicious
c1b475c585 feat(ui): add getRuntimeConfig query and show it all in the about modal 2025-03-26 11:39:21 +11:00
psychedelicious
ec44e68cbf chore(ui): typegen 2025-03-26 11:39:21 +11:00
psychedelicious
73dbebbcc3 feat(api): add route to get app config and set config fields 2025-03-26 11:39:21 +11:00
psychedelicious
09f971467d feat(app): do not set port unless necessary 2025-03-26 11:39:21 +11:00
psychedelicious
2c71b0e873 fix(ui): long node titles overflow 2025-03-26 10:24:46 +11:00
Kevin Turner
92f69ac463 fix: make source location discovery more robust
The top-level `invokeai` package may have an obscured origin due to the way editable installs work, but it's much more likely that this module is from a specific file.
2025-03-26 10:12:36 +11:00
jazzhaiku
3b154df71a Import Smoke Test (#7835)
## Summary

This test imports all modules in the invokeai package and fails if there
are any exceptions.
Existing issues are excluded to avoid blocking main.
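A rough sketch of the approach, assuming a pytest-style test that walks the package with `pkgutil` (the known-failures list is a placeholder for the exclusions mentioned above):

```python
import importlib
import pkgutil

import invokeai

# Placeholder: modules with pre-existing import errors are listed here so they don't block main.
KNOWN_FAILURES: set[str] = set()

def test_all_invokeai_modules_import() -> None:
    failures: list[tuple[str, str]] = []
    for info in pkgutil.walk_packages(invokeai.__path__, prefix="invokeai."):
        if info.name in KNOWN_FAILURES:
            continue
        try:
            importlib.import_module(info.name)
        except Exception as exc:  # any exception means the module cannot be imported
            failures.append((info.name, repr(exc)))
    assert not failures, f"Modules failed to import: {failures}"
```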

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-03-26 08:40:07 +11:00
Billy
64aa965160 Set ordering 2025-03-25 19:21:14 +11:00
Billy
d715c27d07 Add more known failures 2025-03-25 17:59:28 +11:00
Billy
515084577c Test all imports work 2025-03-25 17:45:22 +11:00
psychedelicious
7596c07a64 chore: prep for v5.9.0rc2 2025-03-25 10:21:23 +11:00
Kevin Turner
98fd1d949b fix: make dev_reload work for files in nodes/ 2025-03-25 10:04:17 +11:00
Linos
6312e6aa8f translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1832 of 1832 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-03-25 08:00:45 +11:00
Riccardo Giovanetti
6435f11bae translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1815 of 1838 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.7% (1809 of 1832 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-03-25 08:00:45 +11:00
psychedelicious
1c69b9b1fa fix(ui): restore display: flex to image viewer and node editor
This was inadvertently removed in #7786 and caused some minor layout overflow.
2025-03-25 07:44:07 +11:00
psychedelicious
731970ff88 fix(ui): use expanded mask for paste-back when inpainting 2025-03-25 00:03:13 +11:00
psychedelicious
038bac1614 feat(ui): make it clearer that we are doing scale before processing in graph builders 2025-03-25 00:03:13 +11:00
jazzhaiku
ed9efe7740 Port LLaVA to new API (#7817)
## Summary

- Port LLaVA model config to new classification API
- Add 2 test cases (stripped LLaVA model variants added via git-lfs)

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-03-24 22:50:54 +11:00
jazzhaiku
ffa0beba7a Merge branch 'main' into llava 2025-03-24 15:17:33 +11:00
psychedelicious
75d793f1c4 fix(ui): siglip model translation key 2025-03-24 13:26:38 +11:00
psychedelicious
2b086917e0 chore(ui): lint 2025-03-24 13:24:13 +11:00
psychedelicious
a9f2738086 feat(ui): layout improvements for string field collection input 2025-03-24 13:24:13 +11:00
psychedelicious
3a56799ea5 tidy(ui): remove unused code 2025-03-24 13:24:13 +11:00
psychedelicious
3162ce94dc tidy(ui): use settings for node field settings instead of config
Non-functional naming change to clarify the logic
2025-03-24 13:24:13 +11:00
psychedelicious
c0dc6ac4e1 fix(ui): issue where string drop-down options are not removed when changing component to a different type 2025-03-24 13:24:13 +11:00
psychedelicious
fed1995525 chore(ui): lint 2025-03-24 13:24:13 +11:00
psychedelicious
5006e23456 feat(ui): added reset options button 2025-03-24 13:24:13 +11:00
psychedelicious
2f063bddda fix(ui): restore field-node overlay
Accidentally removed it
2025-03-24 13:24:13 +11:00
psychedelicious
23a26422fd feat(ui): support for custom string field dropdowns in builder 2025-03-24 13:24:13 +11:00
psychedelicious
434f195a96 feat(ui): add empty string placeholder to string fields 2025-03-24 13:24:13 +11:00
psychedelicious
6a4c2d692c chore(ui): typegen 2025-03-24 12:45:46 +11:00
psychedelicious
5127a07cf9 feat(nodes): clean up lora node names
I had named them wonkily and caused some user confusion.
2025-03-24 12:45:46 +11:00
psychedelicious
0b4c6f0ab4 fix(mm): flux model variant probing
In #7780 we added FLUX Fill support, and needed the probe to be able to distinguish between "normal" FLUX models and FLUX Fill models.

Logic was added to the probe to check a particular state dict key (input channels), which should be 384 for FLUX Fill and 64 for other FLUX models.

The new logic was stricter and instead of falling back on the "normal" variant, it raised when an unexpected value for input channels was detected.

This caused failures to probe for BNB-NF4 quantized FLUX Dev/Schnell, which apparently only have 1 input channel.

After checking a variety of FLUX models, I loosened the strictness of the variant probing logic to only special-case the new FLUX Fill model, and otherwise fall back to returning the "normal" variant. This better matches the old behaviour and fixes the import errors.

Closes #7822
2025-03-24 12:36:18 +11:00
Billy
d8450033ea Fix 2025-03-21 17:46:18 +11:00
Billy
3938736bd8 Ruff formatting 2025-03-21 17:35:12 +11:00
Billy
fb2c7b9566 Defaults 2025-03-21 17:35:04 +11:00
Billy
29449ec27d Implement new api for LLaVA 2025-03-21 17:17:56 +11:00
Billy
e38f778d28 Extend ModelOnDisk 2025-03-21 17:17:15 +11:00
Billy
f5e78436a8 Update regression test 2025-03-21 17:14:02 +11:00
Billy
6a15b5d9be Add stripped models for testing llava 2025-03-21 15:34:20 +11:00
psychedelicious
a629102c87 chore(ui): update whatsnew 2025-03-21 13:09:27 +11:00
psychedelicious
848ade8ab8 chore: prep for v5.9.0rc1 2025-03-21 13:09:27 +11:00
Hosted Weblate
2110feb01c translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-03-21 12:55:07 +11:00
Riku
f3e1821957 translationBot(ui): update translation (German)
Currently translated at 67.0% (1224 of 1826 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2025-03-21 12:55:07 +11:00
Linos
bbcf93089a translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1827 of 1827 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1826 of 1826 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1825 of 1825 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-03-21 12:55:07 +11:00
Riccardo Giovanetti
66f41aa307 translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1804 of 1827 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.7% (1803 of 1825 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-03-21 12:55:07 +11:00
psychedelicious
8a709766b3 feat(ui): better error for unknown fields in builder view mode 2025-03-21 12:51:12 +11:00
psychedelicious
efaa20a7a1 feat(ui): better labels for missing/unexpected fields 2025-03-21 12:51:12 +11:00
psychedelicious
3e4c808b23 refactor(ui): organise useInputFieldTemplate hooks again & add useInputFieldTemplateSafe 2025-03-21 12:51:12 +11:00
psychedelicious
00e3931af4 chore(ui): "useInputFieldLabel" -> "useInputFieldLabelSafe"
Also update docstrings
2025-03-21 12:51:12 +11:00
psychedelicious
08bea07f8b chore(ui): "useInputFieldDescription" -> "useInputFieldDescriptionSafe"
Also update docstrings
2025-03-21 12:51:12 +11:00
psychedelicious
166d2f0e39 chore(ui): "useInputFieldTemplate" -> "useInputFieldTemplateOrThrow" 2025-03-21 12:51:12 +11:00
psychedelicious
21f346717a docs(ui): add docstring to useInputFieldTemplate 2025-03-21 12:51:12 +11:00
psychedelicious
f966fb8b9c docs(ui): add docstring to useInputFieldDescription 2025-03-21 12:51:12 +11:00
psychedelicious
c2b20a5387 feat(ui): hide guidance when FLUX Fill model selected 2025-03-21 10:24:03 +11:00
psychedelicious
bed9089fe6 refactor(ui): just always set guidance to 30 when using FLUX Fill 2025-03-21 10:24:03 +11:00
psychedelicious
d34a4f765c feat(ui): better error for FLUX Fill + t2i/i2i incompatibility 2025-03-21 10:24:03 +11:00
psychedelicious
efe4708b8b feat(ui): better error message/warning for FLUX Fill w/ Control LoRA 2025-03-21 10:24:03 +11:00
psychedelicious
7cb1f61a9e feat(ui): bump FLUX guidance up to 30 if it's too low during graph building 2025-03-21 10:24:03 +11:00
psychedelicious
6e2ef34cba feat(ui): add warning for FLUX Fill + Control LoRA 2025-03-21 10:24:03 +11:00
psychedelicious
d208b99a47 feat(ui): pass the full model config throughout validation logic 2025-03-21 10:24:03 +11:00
psychedelicious
47eeafa5cb feat(ui): add selector to select the main model full config object 2025-03-21 10:24:03 +11:00
psychedelicious
0cb00fbe53 refactor(ui): use new compositing nodes for inpaint/outpaint graphs 2025-03-21 10:24:03 +11:00
psychedelicious
a7e8ed3bc2 feat(ui): add FLUX Fill graph builder util 2025-03-21 10:24:03 +11:00
psychedelicious
22eb25be48 refactor(ui): use more succinct syntax to opt out of RTKQ caching for model fetching utils 2025-03-21 10:24:03 +11:00
psychedelicious
a077f3fefc chore(ui): typegen 2025-03-21 10:24:03 +11:00
psychedelicious
c013a6e38d feat(nodes): deprecate canvas_v2_mask_and_crop 2025-03-21 10:24:03 +11:00
psychedelicious
6cfeb71bed feat(nodes): add expand_mask_with_fade to better handle canvas compositing needs
Previously we used erode/dilate and a Gaussian blur to expand and fade the edges of Canvas masks. The implementation had a number of problems:
- Erode/dilate kernel sizes were not calculated correctly, and extra iterations were run to compensate. As a result, the blur size, which should have been in pixels, was very inaccurate and unreliable.
- What we want is to add a "soft bleed" - like a drop shadow with no offset - starting from the edge of the mask, extending out by however many pixels. But Gaussian blur does not do this. The blurred area starts _inside_ the mask and extends outside it. So it kinda blurs inwards and outwards. We compensated for this by expanding the mask.
- Using a Gaussian blur can cause banding artifacts. Gaussian blur doesn't have a "size" or "radius" parameter in the sense that you think it should. It's a convolution matrix and there are _no zero values in the result_. This means that, far away from the mask, once compositing completes, we have some values that are very close to zero but not quite zero. These values are quantized by HTML Canvas, resulting in banding artifacts where you'd expect the blur to have faded to 0% alpha. At least, that is my understanding of why the banding artifacts occur.

The new node uses a better strategy to expand the mask and add the fade out effect:
- Calculate the distance from each white pixel to the nearest black pixel.
- Normalize this distance by dividing by the fade size in px, then clip the values to 0 - 1. The result represents the distance of each white pixel to its nearest black pixel as a percentage of the fade size. At this point, it is a linear distribution.
- Create a polynomial to describe the fade's intensity so that we can have a smooth transition from the masked region (black) to unmasked (white). There are some magic numbers here, determined experimentally.
- Evaluate the polynomial over the normalized distances, so we now have a matrix representing the fade intensity for every pixel
- Convert this matrix back to uint8 and apply it to the mask

This works soooo much better than the previous method. Not only does it fix the banding issues, but when we enable "output only generated regions", we get a much smaller image. Will add images to the PR to clarify.
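A rough sketch of the steps above, assuming an OpenCV distance transform and a smoothstep-style polynomial as a stand-in for the experimentally determined one (this is not the actual node implementation):

```python
import cv2
import numpy as np

def expand_mask_with_fade(mask: np.ndarray, fade_size_px: int) -> np.ndarray:
    """Expand a binary mask (0 = masked/black, 255 = unmasked/white) with a soft outward fade."""
    if fade_size_px == 0:
        return mask  # nothing to fade
    # Distance from each white pixel to the nearest black pixel.
    binary = (mask > 127).astype(np.uint8)
    dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
    # Normalize by the fade size and clip to 0..1 (a linear falloff at this point).
    d = np.clip(dist / fade_size_px, 0.0, 1.0)
    # Shape the falloff so it eases smoothly from black to white (placeholder polynomial).
    fade = 3 * d**2 - 2 * d**3
    return (fade * 255.0).astype(np.uint8)
```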
2025-03-21 10:24:03 +11:00
psychedelicious
534f993023 feat(nodes): add apply_mask_to_image node
It simply applies the mask to an image.
2025-03-21 10:24:03 +11:00
psychedelicious
67f9b6420c fix(nodes): ensure alpha mask is opened as RGBA 2025-03-21 10:24:03 +11:00
psychedelicious
61bf065237 feat(nodes): rename "FLUX Fill" -> "FLUX Fill Conditioning" 2025-03-21 10:24:03 +11:00
psychedelicious
e78cf889ee fix(ui): clip shift-draw strokes to bbox when clip to bbox enabled
Closes #7809
2025-03-21 08:14:20 +11:00
psychedelicious
5d13f0ba15 tidy(ui): remove recommended flag from workflow (believe was for testing purposes) 2025-03-20 08:50:01 -04:00
psychedelicious
633b9afa46 fix(ui): recommended star stretches tag list layout 2025-03-20 08:50:01 -04:00
psychedelicious
f1889b259d tidy(ui): split browse workflows button into own component 2025-03-20 08:50:01 -04:00
psychedelicious
ed21d0b57e tidy(ui): remove noop useEffect 2025-03-20 08:50:01 -04:00
Mary Hipp
df90da28e1 tsc fix 2025-03-20 15:43:57 +11:00
Mary Hipp
702054aa62 make sure browse is selected 2025-03-20 15:43:57 +11:00
Mary Hipp
636ec1de6e add viewAllWorkflowsRecommended to studio init action to show library with only recommended workflows 2025-03-20 15:43:57 +11:00
Mary Hipp
063d07fd41 switch to using recommended with star instead of auto-selecting 2025-03-20 15:43:57 +11:00
Mary Hipp
c78eac624e update workflow tag/categories so that we can pass in 1+ selected tags to start with 2025-03-20 15:43:57 +11:00
Mary Hipp
05de3b7a84 workflow library UI updates: scrollbar to make it obvious it's overflowing, move deselect all tags to be next to browse button 2025-03-20 15:43:57 +11:00
Ryan Dick
9cc2232b6f Bump FluxDenoise invocation version and typegen. 2025-03-19 14:45:18 +11:00
Ryan Dick
9fdc06b447 Add FLUX Fill input validation and error/warning reporting. 2025-03-19 14:45:18 +11:00
Ryan Dick
5ea3ec5cc8 Get FLUX Fill working. Note: To use FLUX Fill, set guidance to ~30. 2025-03-19 14:45:18 +11:00
Ryan Dick
f13a07ba6a WIP on updating FluxDenoise to support FLUX Fill. 2025-03-19 14:45:18 +11:00
Ryan Dick
a913f0163d WIP - Add FluxFillInvocation 2025-03-19 14:45:18 +11:00
Ryan Dick
f7cfbd1323 Add FLUX Fill starter model. 2025-03-19 14:45:18 +11:00
Ryan Dick
2806b60701 Add logic to probe FLUX variant (NORMAL vs INPAINT). 2025-03-19 14:45:18 +11:00
psychedelicious
d8c3af624b Use git-lfs for larger assets (#7804)
## Summary

- Integrate Git LFS to our automated Python tests in CI
- Add stripped model files with git-lfs
- `README.md` instructions to install and configure git-lfs
- Unrelated change (skip hashing to make unit test run faster)

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-03-19 09:53:26 +11:00
psychedelicious
feed44b68d Stripped models (#7797)
## Summary

**Problem**
We want to have automated tests for model classification/probing, but
model files are too large to include in the source.

**Proposed Solution**
Classification/probing only requires metadata (key names, tensor
shapes), not weights.
This PR introduces "stripped" models - lightweight versions that retain only essential metadata.

- Added script to strip models
- Added stripped models to automated tests


**Model size before and after "stripping":**
```
LLaVA Onevision Qwen2 0.5b-ov-hf before: 1.8 GB, after: 11.6 MB
text_encoder before: 246.1 MB, after: 35.6 kB
llava-onevision-qwen2-7b-si-hf before: 16.1 GB, after: 11.7 MB
RealESRGAN_x2plus.pth before: 67.1 MB, after: 143.0 kB
IP Adapter SD1 before: 2.5 GB, after: 94.9 kB
Hard Edge Detection (canny) before: 722.6 MB, after: 63.6 kB
Lineart before: 722.6 MB, after: 63.6 kB
Segmentation Map before: 722.6 MB, after: 63.6 kB
EasyNegative before: 24.7 kB, after: 151 Bytes
Face Reference (IP Adapter Plus Face) before: 98.2 MB, after: 13.7 kB
Standard Reference (IP Adapter) before: 44.6 MB, after: 6.0 kB
shinkai_makoto_offset before: 151.1 MB, after: 160.0 kB
thickline_fp16 before: 151.1 MB, after: 160.0 kB
Alien Style before: 228.5 MB, after: 582.6 kB
Noodles Style before: 228.5 MB, after: 582.6 kB
Juggernaut XL v9 before: 6.9 GB, after: 3.7 MB
dreamshaper-8 before: 168.9 MB, after: 1.6 MB
```
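As a small illustration of the premise (this is not the stripping script from the PR): everything classification needs can be read from a checkpoint's keys, shapes, and dtypes, which are tiny compared to the weights themselves.

```python
import torch

def state_dict_metadata(checkpoint_path: str) -> dict[str, tuple[torch.Size, torch.dtype]]:
    # Only keys, shapes and dtypes are needed to classify a model; the weights are not.
    sd = torch.load(checkpoint_path, map_location="cpu")
    if isinstance(sd, dict) and "state_dict" in sd:  # some checkpoints nest weights one level down
        sd = sd["state_dict"]
    return {k: (v.shape, v.dtype) for k, v in sd.items() if isinstance(v, torch.Tensor)}
```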





## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-03-19 08:13:10 +11:00
Billy
247f3b5d67 Merge branch 'stripped-models' into git-lfs 2025-03-19 07:53:27 +11:00
Billy
8e14f9d971 Merge branch 'main' into stripped-models 2025-03-19 07:52:56 +11:00
Billy
bdb44ee48d Merge branch 'git-lfs' of github.com:invoke-ai/InvokeAI into git-lfs 2025-03-19 07:30:34 +11:00
Billy
b57f5330c5 Pin action to commit 2025-03-19 07:28:28 +11:00
jazzhaiku
ade3c015b4 Update docs/contributing/dev-environment.md
Co-authored-by: Eugene Brodsky <ebr@users.noreply.github.com>
2025-03-19 07:23:23 +11:00
psychedelicious
7fe4d4c21a feat(app): better errors when scanning models with picklescan
Differentiate between malware detection and scan error.
2025-03-19 07:20:25 +11:00
psychedelicious
133a7fde55 Model classification api (#7742)
## Summary
The _goal_ of this PR is to make it easier to add a new config type.
The _scope_ of this PR is to integrate the API; it does not include adding new configs (outside tests) or porting existing ones.


One of the glaring issues of the existing *legacy probe* is that the
logic for each type is spread across multiple classes and intertwined
with the other configs. This means that adding a new config type (or
modifying an existing one) is complex and error prone.

This PR attempts to remedy this by providing a new API for adding
configs that:

- Is backwards compatible with the existing probe.
- Encapsulates fields and logic in a single class, keeping things
self-contained and easy to modify safely.

Below is a minimal toy example illustrating the proposed new structure:

```python
class MinimalConfigExample(ModelConfigBase):
    type: ModelType = ModelType.Main
    format: ModelFormat = ModelFormat.Checkpoint
    fun_quote: str

    @classmethod
    def matches(cls, mod: ModelOnDisk) -> bool:
        return mod.path.suffix == ".json"

    @classmethod
    def parse(cls, mod: ModelOnDisk) -> dict[str, Any]:
        with open(mod.path, "r") as f:
            contents = json.load(f)

        return {
            "fun_quote": contents["quote"],
            "base": BaseModelType.Any,
        }
```

To create a new config type, one needs to inherit from `ModelConfigBase`
and implement its interface.

The code falls back to the legacy model probe for existing models using
the old API.
This allows us to incrementally port the configs one by one.



## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-03-18 15:25:56 +11:00
Billy
6375214878 Merge branch 'stripped-models' into git-lfs 2025-03-18 14:57:58 +11:00
Billy
b9972be7f1 Merge branch 'model-classification-api' into stripped-models 2025-03-18 14:57:23 +11:00
Billy
e61c5a3f26 Merge 2025-03-18 14:55:11 +11:00
Billy
8c633786f6 Remove accidentally included files 2025-03-18 14:16:51 +11:00
Billy
8703eea49b LFS cache 2025-03-18 14:08:21 +11:00
Billy
c8888be4c3 Formatting 2025-03-18 13:10:07 +11:00
Billy
11963a65a4 CI/CD 2025-03-18 12:56:28 +11:00
Billy
ab6422fdf7 Add to README.md 2025-03-18 12:37:32 +11:00
psychedelicious
1f8632029e fix(nodes): add validator to vllm node images field to handle single image field inputs 2025-03-18 11:53:06 +11:00
Ryan Dick
88a762474d typegen 2025-03-18 11:53:06 +11:00
Ryan Dick
e6dd721e33 Add max_length=3 to the LLaVA OneVision image input field. 2025-03-18 11:53:06 +11:00
Billy
2a09604baf Formatting 2025-03-18 11:53:06 +11:00
Billy
f94f00ede0 Ruff formatting 2025-03-18 11:53:06 +11:00
Billy
37af281299 WIP - model selection for LLaVA 2025-03-18 11:53:06 +11:00
Billy
fc82775d7a WIP - model selection for LLaVA 2025-03-18 11:53:06 +11:00
Billy
9ed46f60b7 Add LLaVA OneVision to Config dropdown in UI 2025-03-18 11:53:06 +11:00
Ryan Dick
9a389e6b93 Add a LLaVA OneVision starter model. 2025-03-18 11:53:06 +11:00
Ryan Dick
2ef1ecf381 Fix copy-paste errors. 2025-03-18 11:53:06 +11:00
Ryan Dick
41de112932 Make LLaVA Onevision node work with 0 images, and other minor improvements. 2025-03-18 11:53:06 +11:00
Ryan Dick
e9714fe476 Add LLaVA Onevision model loading and inference support. 2025-03-18 11:53:06 +11:00
Ryan Dick
3f29293e39 Add LlavaOnevision model type and probing logic. 2025-03-18 11:53:06 +11:00
Billy
db1aa38e98 Warning 2025-03-18 09:55:13 +11:00
Billy
12717d4a4d Stripped model data 2025-03-18 09:51:10 +11:00
Billy
1953f3cbcd Skip hashing to make test quicker 2025-03-18 09:50:18 +11:00
Billy
3469fc9843 Ruff 2025-03-18 09:22:16 +11:00
Billy
7cdd4187a9 Update classify script 2025-03-18 09:21:38 +11:00
Billy
ad66c101d2 Remove stripped model files 2025-03-18 09:10:37 +11:00
psychedelicious
28d3356710 chore: prep for v5.8.1 2025-03-18 09:06:47 +11:00
psychedelicious
81e70fb9d2 tidy(app): errant character 2025-03-18 08:00:51 +11:00
psychedelicious
971c425734 fix(app): incorrect values inserted when retrying queue item
In #7688 we optimized queuing preparation logic. This inadvertently broke retrying queue items.

Previously, a `NamedTuple` was used to store the values to insert in the DB when enqueuing. This handy class provides an API similar to a dataclass, where you can instantiate it with kwargs in any order. The resultant tuple re-orders the kwargs to match the order in the class definition.

For example, consider this `NamedTuple`:
```py
class SessionQueueValueToInsert(NamedTuple):
    foo: str
    bar: str
```

When instantiating it, no matter the order of the kwargs, if you make a normal tuple out of it, the tuple values are in the same order as in the class definition:

```
t1 = SessionQueueValueToInsert(foo="foo", bar="bar")
print(tuple(t1)) # -> ('foo', 'bar')

t2 = SessionQueueValueToInsert(bar="bar", foo="foo")
print(tuple(t2)) # -> ('foo', 'bar')
```

So, in the old code, when we used the `NamedTuple`, it implicitly normalized the order of the values we insert into the DB.

In the retry logic, the values of the tuple were not ordered correctly, but the use of `NamedTuple` had secretly fixed the order for us.

In the linked PR, `NamedTuple` was dropped for a normal tuple, after profiling showed `NamedTuple` to be meaningfully slower than a normal tuple.

The implicit order normalization behaviour wasn't understood, and the order wasn't fixed when changing the retry logic to use a normal tuple instead of `NamedTuple`. This results in a bug where we incorrectly create queue items in the DB. For example, we stored the `destination` in the `field_values` column.

When such an incorrectly-created queue item is dequeued, it fails pydantic validation and causes what appears to be an endless loop of errors.

The only user-facing solution is to add this line to `invokeai.yaml` and restart the app:
```yaml
clear_queue_on_startup: true
```

On next startup, the queue is forcibly cleared before the error loop is triggered. Then the user should remove this line so their queue is persisted across app launches per usual.

The solution is simple - fix the ordering of the tuple. I also added a type annotation and comment to the tuple type alias definition.

Note: The endless error loop, as a general problem, will take some thinking to fix. The queue service methods to cancel and fail a queue item still retrieve it and parse it. And the list queue items methods parse the queue items. Bit of a catch 22, maybe the solution is to simply delete totally borked queue items and log an error.
2025-03-18 08:00:51 +11:00
psychedelicious
b09008c530 feat(ui): add cancel and clear all as toggleable app feature 2025-03-18 06:48:10 +11:00
Billy
f9f99f873d More models 2025-03-17 04:18:44 +00:00
Billy
7f93f1b600 Dependencies 2025-03-17 12:57:13 +11:00
Billy
b1d336ce8a Ruff 2025-03-17 12:19:27 +11:00
Billy
40c7be8f5d Warning about missing test cases 2025-03-17 12:19:15 +11:00
Billy
24218b34bf Make ruff happy 2025-03-17 12:04:26 +11:00
Billy
d970c6d6d5 Use override fixture 2025-03-17 11:58:13 +11:00
Billy
e5308be0bb Use override fixture 2025-03-17 11:31:20 +11:00
Billy
7d5687e9ff Disable device meta for spandrel 2025-03-17 11:30:05 +11:00
Riccardo Giovanetti
7adac4581a translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1800 of 1822 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.7% (1798 of 1820 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.7% (1796 of 1818 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-03-17 10:49:22 +11:00
Hosted Weblate
962db86cac translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-03-17 10:49:22 +11:00
psychedelicious
d65ec0e250 feat(ui): configurable form field constraints (WIP3) 2025-03-17 10:47:01 +11:00
psychedelicious
7fdde5e84a tests(ui): fix constrainNumber 2025-03-17 10:47:01 +11:00
psychedelicious
895956bcfe chore(ui): lint 2025-03-17 10:47:01 +11:00
psychedelicious
f27d26cfa2 feat(ui): configurable form field constraints (WIP2) 2025-03-17 10:47:01 +11:00
psychedelicious
965bcba6c2 feat(ui): configurable form field constraints (WIP) 2025-03-17 10:47:01 +11:00
psychedelicious
c9f2460ff2 fix(ui): generator widget should stretch to fill when added to builder 2025-03-17 10:41:59 +11:00
psychedelicious
5abbbf4b5b feat(ui): allow pasting images on workflows tab when workflows not focused 2025-03-17 10:37:27 +11:00
psychedelicious
e66688edbf feat(ui): only paste into canvas when canvas is focused 2025-03-17 10:37:27 +11:00
joshistoast
a519483f95 refactor(ui): ♻️ memoize merged styles, simplify data attribute conditional 2025-03-17 10:34:49 +11:00
joshistoast
75c91604bb fix: 🐛 export the region wrapper
am silly
2025-03-17 10:34:49 +11:00
joshistoast
53bdaba7b6 style: 🚨 linting 2025-03-17 10:34:49 +11:00
joshistoast
f3f405ca77 refactor(ui): ♻️ remove forward ref usage 2025-03-17 10:34:49 +11:00
joshistoast
dda69950a7 refactor(ui): ♻️ apply memoization, system style objects, and data attribute to region highlight wrapper 2025-03-17 10:34:49 +11:00
joshistoast
b2198b9fa7 feat: 🔧 region highlighting disabled by default
some users may not like this
2025-03-17 10:34:49 +11:00
joshistoast
02b91e8e7b feat: highlight focused regions
adds a region wrapper with a highlight effect when that region is focused, this behavior can be toggled as a setting
2025-03-17 10:34:49 +11:00
psychedelicious
09bf7c35eb chore(ui): typegen 2025-03-17 10:32:19 +11:00
psychedelicious
deb9a65b3d chore(ui): update whats new 2025-03-17 10:32:19 +11:00
psychedelicious
5be9a7227c chore: remove all explicit image references in default workflows 2025-03-17 10:32:19 +11:00
psychedelicious
bb9f886bd4 docs: update default workflows dev docs 2025-03-17 10:32:19 +11:00
psychedelicious
46520946f8 chore: remove all explicit model references in default workflows 2025-03-17 10:32:19 +11:00
psychedelicious
830880a6fc chore(nodes): update titles of all model-specific nodes to reference their models
Also bump versions on all of them.
2025-03-17 10:32:19 +11:00
psychedelicious
63b94a8ff3 feat(ui): add sd3.5 default workflows tag 2025-03-17 10:32:19 +11:00
psychedelicious
f12924a1e1 chore: update default workflow tags & names 2025-03-17 10:32:19 +11:00
psychedelicious
f8e51c86f5 chore: bump version to v5.8.0 2025-03-17 10:32:19 +11:00
Billy
654e992630 Accept extra args 2025-03-17 10:25:16 +11:00
Billy
21f247f499 Stripped models script 2025-03-17 09:18:58 +11:00
Billy
8bcd9fe4b7 Extend ModelOnDisk 2025-03-17 09:18:51 +11:00
psychedelicious
c84a646735 ci: pin tj-actions/changed-files
Closes #7793
2025-03-17 08:36:17 +11:00
psychedelicious
b52f8121af fix(ui): duplicate edges on reconnect
Closes #7127
2025-03-15 10:12:50 +11:00
psychedelicious
05bed3fddd fix(ui): do not mark workflow as touched when setting form field initial values 2025-03-15 10:10:21 +11:00
psychedelicious
87ea20192f chore(ui): knip 2025-03-14 20:54:58 +11:00
psychedelicious
2f9c95c462 fix(ui): return early in error-selecting hooks
Prevent an error when a node is deleted and the hook is being called
2025-03-14 20:54:58 +11:00
psychedelicious
47cadbb48e feat(ui): show field errors in tooltips 2025-03-14 20:54:58 +11:00
psychedelicious
23518b9830 feat(ui): useDebouncedAppSelector
Hook that replicates `useSelector`, but debounces calling the selector.
2025-03-14 20:54:58 +11:00
psychedelicious
94dcf391a6 tweak(ui): styling for image collection fields 2025-03-14 20:50:35 +11:00
Billy
637b93d2d8 Ruff 2025-03-14 10:18:25 +11:00
Billy
565b160060 More tests 2025-03-14 10:17:43 +11:00
psychedelicious
e7a60c01ed fix(ui): prevent vertical scrolling on row containers 2025-03-14 07:15:58 +11:00
Mary Hipp
4b54ccc29c getting started copy for workflows 2025-03-13 12:25:14 -04:00
Mary Hipp
c4183ec98c add with_hash to prevent rerenders on default 2025-03-13 10:29:22 -04:00
Mary Hipp
5a9cbe35e0 typegen fix 2025-03-13 10:29:22 -04:00
Mary Hipp
df18fe0298 make sure that recent view always sorts by opened_at even if not available as sort option in UI 2025-03-13 10:29:22 -04:00
Mary Hipp
e5591d145f allow workflow sort options to be passed in 2025-03-13 08:27:51 -04:00
psychedelicious
371c187fc3 chore: bump version to v5.8.0rc1 2025-03-13 23:00:01 +11:00
Billy
bdd0b90769 Merge branch 'model-classification-api' of github.com:invoke-ai/InvokeAI into model-classification-api 2025-03-13 13:37:15 +11:00
Billy
4377158503 Variant 2025-03-13 13:32:57 +11:00
Billy
c8c27079ed Codegen 2025-03-13 13:12:12 +11:00
Billy
d8b9a8d0dd Merge branch 'main' into model-classification-api 2025-03-13 13:03:51 +11:00
Billy
39a4608d15 Fix annotations compatibility 3.11 2025-03-13 13:01:19 +11:00
jazzhaiku
cd2d5431db Merge branch 'main' into model-classification-api 2025-03-13 11:21:18 +11:00
Billy
c04cdd9779 Typegen 2025-03-13 11:00:26 +11:00
Billy
b86ac5e049 Explicit union 2025-03-13 10:28:07 +11:00
psychedelicious
e982c95687 fix(ui): respect line breaks in builder text and heading elements 2025-03-13 09:39:41 +11:00
Billy
665236bb79 Type hints 2025-03-13 09:21:58 +11:00
psychedelicious
0eeb0dd67b feat(ui): use invoke logo for thumbnail fallback for default workflows 2025-03-13 08:45:12 +11:00
psychedelicious
28c74cbe38 revert(app): remove test image from default workflow thumbnails 2025-03-13 08:45:12 +11:00
psychedelicious
7414f68acc fix(ui): save as marks workflow as not touched 2025-03-13 08:45:12 +11:00
psychedelicious
a984462b80 tweak(ui): workflow library card layout to fit 2 lines of title and 3 lines of desc 2025-03-13 08:45:12 +11:00
psychedelicious
c6c2567203 tweak(ui): workflow description shows 1 line w/ tooltip for full content 2025-03-13 08:45:12 +11:00
psychedelicious
f05c8b909f fix(ui): mark workflow touched on form builder state changes 2025-03-13 07:10:59 +11:00
psychedelicious
73330a1308 chore(ui): lint 2025-03-13 07:10:59 +11:00
psychedelicious
6f568d48ed fix(ui): studio init action workflow loading 2025-03-13 07:10:59 +11:00
psychedelicious
81a97f3796 fix(ui): load workflow from object 2025-03-13 07:10:59 +11:00
psychedelicious
3f9535d2f9 fix(ui): load workflow from graph 2025-03-13 07:10:59 +11:00
psychedelicious
83bfbdcad4 feat(ui): more workflow loading standardization
There is now a single entrypoint for loading a workflow - `useLoadWorkflowWithDialog`.

The hook:
Handles loading workflows from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow.

It returns  a function that:
Loads a workflow from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow. The workflow will be loaded immediately if there are no unsaved changes. On success, error or completion, the corresponding callback will be called.

WHEW
2025-03-13 07:10:59 +11:00
psychedelicious
729428084c feat(ui): prompt when loading workflow from file if unsaved changes 2025-03-13 07:10:59 +11:00
psychedelicious
523a932ecc feat(ui): accept button on workflow load dialog is "Load" 2025-03-13 07:10:59 +11:00
psychedelicious
21be7d7157 feat(ui): allow load workflow confirm dialog to load workflows from object instead of only id 2025-03-13 07:10:59 +11:00
psychedelicious
a29fb18c0b feat(ui): standardize and clean up workflow loading hooks and logic 2025-03-13 07:10:59 +11:00
psychedelicious
aed446f013 fix(ui): make the workflow load from file menu item work the same as the button in library
Upload and save as instead of just upload as draft.
2025-03-13 07:10:59 +11:00
Mary Hipp
e81c9b0d6e add default for opened_at 2025-03-12 14:35:34 -04:00
Billy
f45400a275 Remove hash algo 2025-03-12 18:39:29 +11:00
psychedelicious
89f457c486 fix(ui): mark workflow as opened when creating a new workflow 2025-03-12 12:11:00 +11:00
psychedelicious
30ed09a36e fix(ui): default categories for oss 2025-03-12 12:11:00 +11:00
psychedelicious
3334652acc feat(db): drop the opened_at column instead of marking deprecated 2025-03-12 12:11:00 +11:00
psychedelicious
e83536f396 chore(ui): lint 2025-03-12 12:11:00 +11:00
psychedelicious
97593f95f6 feat(ui): on first load, if the selected library view has no workflows, switch to the first view that has workflows 2025-03-12 12:11:00 +11:00
psychedelicious
7f14cee17e chore(ui): typegen 2025-03-12 12:11:00 +11:00
psychedelicious
0a836d6fc1 feat(app): add method and route to get workflow library counts by category 2025-03-12 12:11:00 +11:00
psychedelicious
54e781d5bb tidy(app): remove unused method in workflow records service 2025-03-12 12:11:00 +11:00
psychedelicious
aa71d0c817 tweak(ui): 'is_recent' -> 'has_been_opened' 2025-03-12 12:11:00 +11:00
psychedelicious
07313e429d chore(ui): typegen 2025-03-12 12:11:00 +11:00
psychedelicious
bad5023238 tweak(app): 'is_recent' -> 'has_been_opened' 2025-03-12 12:11:00 +11:00
psychedelicious
73a0d2c06c fix(ui): memo WorkflowLibraryModal 2025-03-12 12:11:00 +11:00
psychedelicious
918e9c8ccc feat(app): drop and recreate index on opened_at
Not sure if this is strictly required but doing it anyways.
2025-03-12 12:11:00 +11:00
psychedelicious
1e388e9ca4 tweak(ui): align new and upload workflow buttons 2025-03-12 12:11:00 +11:00
psychedelicious
5b84d45932 perf(ui): memoize workflow library components 2025-03-12 12:11:00 +11:00
psychedelicious
dc3f1184b2 fix(ui): other stuff borked by rebase 2025-03-12 12:11:00 +11:00
psychedelicious
87438bcad7 fix(ui): rebase broke things 2025-03-12 12:11:00 +11:00
Mary Hipp
afd894fd04 update recent workflows UI 2025-03-12 12:11:00 +11:00
Mary Hipp
df305c0b99 allow opened_at to be nullable for workflows that the user has never opened 2025-03-12 12:11:00 +11:00
psychedelicious
deecb7f3c3 feat(ui): "Reset Filters" -> "Deselect All" 2025-03-12 08:00:18 +11:00
psychedelicious
dd5f353465 revert(ui): use reverted API for workflow library 2025-03-12 08:00:18 +11:00
psychedelicious
a8759ea0a6 chore(ui): typegen 2025-03-12 08:00:18 +11:00
psychedelicious
3ff529c718 revert(app): use OR logic for workflow library filtering 2025-03-12 08:00:18 +11:00
psychedelicious
3b0fecafb0 fix(ui): URL mismatch for tag_counts_with_filter 2025-03-12 08:00:18 +11:00
psychedelicious
099011000f chore(ui): lint 2025-03-12 08:00:18 +11:00
psychedelicious
155daa3137 feat(ui): hide filters with no workflows 2025-03-12 08:00:18 +11:00
psychedelicious
c493e223cf feat(ui): "Reset Tags" -> "Reset Filters" 2025-03-12 08:00:18 +11:00
psychedelicious
124ca23f8b feat(ui): use new tag filtering for workflow library 2025-03-12 08:00:18 +11:00
psychedelicious
a8023cbcb6 chore(ui): typegen 2025-03-12 08:00:18 +11:00
psychedelicious
b733d3897e feat(app): revised workflow library filtering by tag
- Replace `get_counts` method with `get_tag_counts_with_filter` which gets the counts for a list of tags, filtering by a list of selected tags
- Update `get_many` logic to apply tag filtering with AND logic (see the sketch below), to match the new `get_tag_counts_with_filter` method
- Update workflow library router
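A tiny in-memory sketch of the AND semantics described above (the real filtering happens in SQL inside the workflow records service):

```python
def matches_all_selected_tags(workflow_tags: set[str], selected_tags: set[str]) -> bool:
    # AND logic: a workflow is included only if it carries every selected tag.
    return selected_tags.issubset(workflow_tags)
```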
2025-03-12 08:00:18 +11:00
psychedelicious
ef95b37ace fix(ui): workflow library infinite query providesTags 2025-03-12 08:00:18 +11:00
psychedelicious
4feff5a185 chore(ui): bump @reduxjs/toolkit from 1.6.0 to 1.6.1
This brings in some fixes for the new infinite query support.
2025-03-12 08:00:18 +11:00
psychedelicious
6c8dc32d5c docs(ui): add comments to workflow library cache invalidation 2025-03-12 08:00:18 +11:00
psychedelicious
e5da808b2f fix(ui): updating workflow content should not invalidate the infinite query cache 2025-03-12 08:00:18 +11:00
psychedelicious
7d3434da62 fix(ui): updating workflow opened at invalidates infinite query cache 2025-03-12 08:00:18 +11:00
psychedelicious
4cc70d9f16 feat(ui): add cache tags for workflow library's infinite query 2025-03-12 08:00:18 +11:00
psychedelicious
7988bc1a59 chore(ui): remove unused WorkflowsRecent RTKQ tag
This didn't actually do anything. Will be implementing the actual functionality that you'd _think_ this tag would do in a future change.
2025-03-12 08:00:18 +11:00
psychedelicious
1756d885f6 refactor(ui): split workflow library state into separate slice
Has no business being in the workflow state slice.
2025-03-12 08:00:18 +11:00
psychedelicious
9ec4d968aa chore: bump version to v5.8.0a2 2025-03-11 13:29:26 +11:00
Riccardo Giovanetti
76c09301f9 translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1794 of 1816 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-03-11 11:33:01 +11:00
Linos
1cf8749754 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1816 of 1816 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 99.9% (1815 of 1816 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-03-11 11:33:01 +11:00
Riku
5d6c468833 translationBot(ui): update translation (German)
Currently translated at 67.2% (1221 of 1816 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2025-03-11 11:33:01 +11:00
Hosted Weblate
80b3f44ae8 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-03-11 11:33:01 +11:00
psychedelicious
c77c12aa1d fix(ui): missing builder translations 2025-03-11 11:28:51 +11:00
psychedelicious
731992c5ec fix(ui): restore accidentally deleted line 2025-03-11 11:17:19 +11:00
psychedelicious
c259899bf4 feat(ui): support for FLUX Redux in canvas
User facing:

When a FLUX main model is selected, users may now add Regional Reference Image layers.

When switching between FLUX Redux and FLUX IP Adapter, the settings will change to match the model type. (IP Adapter has weight, begin/end step, but Redux does not.) The image will be retained when switching between the two.

Otherwise it works the same way as IP Adapter - both in Global and Regional Reference Image layers.

---

Internal state handling:

Slightly awkward, but it was easiest to make FLUX Redux a second type of IP Adapter in redux state.

Global and regional reference images still have a single `ipAdapter` field, but it can have a type of `ip_adapter` or `flux_redux`.

Ideally, this field is called `config` or `settings` or something, but we are past that point. We _could_ do a migration to rename it, but I don't think it's worth the effort.

---

Other changes:
- Updated canvas layer validators to handle FLUX Redux.
- Updated model list loading logic to un-set FLUX Redux models in Canvas if they are not in the list (e.g. if the user deletes the model in the main app).
- Updated graph builders - new `addFLUXRedux` util & updated `addRegions` util.
- Updated the `buildModelsHook` util to return a hook that accepts a filter callback. This handles a discrepancy: FLUX IP Adapter does not support regional guidance, but FLUX Redux does. The Regional Guidance settings provide a filter to exclude FLUX IP Adapter models from the combined list of IP Adapter and Redux models.
2025-03-11 11:17:19 +11:00
psychedelicious
f62b9ad919 chore(ui): typegen 2025-03-11 11:17:19 +11:00
psychedelicious
57533657f9 feat(nodes): remove siglip from flux_redux, dl it jit when needed if we cannot find it
This follows the same pattern for IP Adapter w/ its CLIP Vision model. The SigLIP model is unlikely to ever change and we don't want to force the user to select it anywhere. Hardcoding it is safe and makes the UX much nicer.

The alternative is a model dropdown that will likely only ever have one valid choice in it.
2025-03-11 11:17:19 +11:00
psychedelicious
e35537e60a fix(mm): move flux_redux starter model to the flux bundle, make siglip a dependency of it 2025-03-11 11:17:19 +11:00
Billy
be53b89203 Remove redundant hash_algo field 2025-03-11 09:28:57 +11:00
Billy
a215eeaabf Update schema 2025-03-11 09:22:29 +11:00
Billy
d86b392bfd Remove redundant hash_algo field 2025-03-11 09:16:59 +11:00
Billy
3e9e45b177 Update comments 2025-03-11 09:04:19 +11:00
Billy
907d960745 PR suggestions 2025-03-11 08:37:43 +11:00
Billy
bfdace6437 New API for model classification 2025-03-11 08:34:34 +11:00
psychedelicious
a89d68b93a fix(ui): hide shared on workflow library 2025-03-10 12:29:48 -04:00
psychedelicious
59a8c0d441 feat(app): less janky custom node loading
- We don't need to copy the init file. Just crawl the custom nodes dir for modules and import them all. Dunno why I didn't do this initially.
- Pass the logger in as an arg. There was a race condition where if we got the logger directly in the load_custom_nodes function, the config would not have been loaded fully yet and we'd end up with the wrong custom nodes path!
- Remove permissions-setting logic, I do not believe it is relevant for custom nodes
- Minor cleanup of the utility
2025-03-08 09:42:13 +11:00
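A minimal sketch of the crawl-and-import approach, with the logger injected by the caller so the app config is fully loaded before the custom nodes path is resolved. The flat `*.py` layout and function shape are assumptions; the real loader is more involved.

```python
import importlib.util
import sys
from logging import Logger
from pathlib import Path


def load_custom_nodes(custom_nodes_path: Path, logger: Logger) -> None:
    """Import every Python module found in the custom nodes directory."""
    for module_path in sorted(custom_nodes_path.glob("*.py")):
        module_name = f"custom_nodes.{module_path.stem}"
        try:
            spec = importlib.util.spec_from_file_location(module_name, module_path)
            assert spec is not None and spec.loader is not None
            module = importlib.util.module_from_spec(spec)
            sys.modules[module_name] = module
            spec.loader.exec_module(module)
            logger.info(f"Loaded custom node module {module_path.name}")
        except Exception:
            # A broken custom node should not prevent the app from starting.
            logger.exception(f"Failed to load custom node module {module_path.name}")
```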
Riku
d5d08f6569 fix(ui): add webp to supported image types in toast messages 2025-03-07 20:38:16 +11:00
psychedelicious
8a4282365e chore: bump version to v5.8.0a1 2025-03-07 12:21:46 +11:00
psychedelicious
b9c7bc8b0e chore: ruff 2025-03-07 11:45:49 +11:00
psychedelicious
0f45ee04a2 tests: fix test_extract_valid_metadata_from_image to accommodate prev commit 2025-03-07 11:45:49 +11:00
psychedelicious
839a791509 fix(api): loosen graph parsing in extract_metadata_from_image
There's a pydantic thing that causes the graphs to fail validation erroneously. Details in the comments - not a high priority to fix but we should figure it out someday.
2025-03-07 11:45:49 +11:00
psychedelicious
f03a2bf03f chore(ui): typegen 2025-03-07 11:45:49 +11:00
psychedelicious
4136817d30 chore(ui): typegen 2025-03-07 11:45:49 +11:00
psychedelicious
7f0452173b feat(api): use extract_metadata_from_image in upload router 2025-03-07 11:45:49 +11:00
psychedelicious
8e46b03f09 tests: add tests for extract_metadata_from_image 2025-03-07 11:45:49 +11:00
psychedelicious
9045237bfb feat(api): add util to extract metadata from image 2025-03-07 11:45:49 +11:00
psychedelicious
58959a18cb chore: ruff 2025-03-07 08:44:15 +11:00
psychedelicious
e51588197f chore(ui): lint 2025-03-07 08:44:15 +11:00
psychedelicious
c5319ac48c feat(ui): restore new workflow button 2025-03-07 08:44:15 +11:00
psychedelicious
50657650c2 feat(ui): rough out recent workflows 2025-03-07 08:44:15 +11:00
psychedelicious
f657c95e45 chore(ui): lint 2025-03-07 08:44:15 +11:00
psychedelicious
2d3a2f9842 feat(app): add update_opened_at method for workflows
This method simply sets the `opened_at` attribute to the current time.

Previously `opened_at` was set when calling `get`, but that is not correct. We `get` workflows often, even when not opening them, so this needs to be a separate operation.
2025-03-07 08:44:15 +11:00
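A minimal sketch of what the new method amounts to, assuming a hypothetical SQLite `workflow_library` table with an `opened_at` column; the key point is that `get` no longer touches the timestamp.

```python
import sqlite3
from datetime import datetime, timezone


def update_opened_at(conn: sqlite3.Connection, workflow_id: str) -> None:
    """Set opened_at to now. Called only when a workflow is actually opened, never from get()."""
    conn.execute(
        "UPDATE workflow_library SET opened_at = ? WHERE workflow_id = ?;",
        (datetime.now(timezone.utc).isoformat(), workflow_id),
    )
    conn.commit()
```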
psychedelicious
008837642e feat(ui): restore upload workflow button 2025-03-07 08:44:15 +11:00
psychedelicious
1a84a2fb7e feat(ui): restore share workflow button 2025-03-07 08:44:15 +11:00
psychedelicious
b87febcf4c chore(ui): lint 2025-03-07 08:44:15 +11:00
psychedelicious
95a9bb6c7b fix(ui): missing translation 2025-03-07 08:44:15 +11:00
psychedelicious
93ec9a048f fix(ui): workflow library overflow 2025-03-07 08:44:15 +11:00
psychedelicious
ec6cea6705 feat(ui): workflow library styling 2025-03-07 08:44:15 +11:00
psychedelicious
bfbcaad8c2 tweak(ui): workflow tag names 2025-03-07 08:44:15 +11:00
psychedelicious
3694158434 feat(ui): workflow library tags 2025-03-07 08:44:15 +11:00
psychedelicious
814fb939c0 chore: update default workflow tags 2025-03-07 08:44:15 +11:00
psychedelicious
4cb73e6c19 chore(ui): typegen 2025-03-07 08:44:15 +11:00
psychedelicious
e8aed67cf1 feat(app): add workflow library get_counts method
Get the counts of workflows for the given tags and/or categories. Made a separate method because get_many will deserialize all matching workflows, which is unnecessary for this use case.
2025-03-07 08:44:15 +11:00
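The contrast is a single `COUNT(*)` versus fetching and deserializing every matching row; a hedged sketch against a hypothetical table, not the actual service code:

```python
import sqlite3


def get_counts(conn: sqlite3.Connection, categories: list[str]) -> int:
    """Count matching workflows without deserializing them (unlike get_many)."""
    # Assumes at least one category is supplied.
    placeholders = ", ".join("?" for _ in categories)
    row = conn.execute(
        f"SELECT COUNT(*) FROM workflow_library WHERE category IN ({placeholders});",
        categories,
    ).fetchone()
    return row[0]
```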
psychedelicious
f56dd01419 feat(ui): workflow library infinite scrolling 2025-03-07 08:44:15 +11:00
psychedelicious
ed9cd6a7a2 feat(ui): simpler workflow action buttons 2025-03-07 08:44:15 +11:00
psychedelicious
c44c28ec4c feat(ui): workflow library modal styling 2025-03-07 08:44:15 +11:00
psychedelicious
e1f7359171 feat(ui): set up RTKQ endpoint for infinite workflows list 2025-03-07 08:44:15 +11:00
psychedelicious
3e97d49a69 chore(ui): bump RTKQ to latest to get infinite query support 2025-03-07 08:44:15 +11:00
psychedelicious
c12585e52d fix(app): incorrect number of bindings for query 2025-03-07 08:44:15 +11:00
psychedelicious
b39774a57c feat(app): add searching by tags to workflow library APIs 2025-03-07 08:44:15 +11:00
psychedelicious
8988539cd5 feat(db): add generated column for tags in db migration 2025-03-07 08:44:15 +11:00
psychedelicious
88c68e8016 tidy(app): workflow records get_many 2025-03-07 08:44:15 +11:00
psychedelicious
5073c7d0a3 fix(app): ensure workflow record get_many stmt is terminated 2025-03-07 08:44:15 +11:00
psychedelicious
84e86819b8 chore(ui): lint 2025-03-07 08:44:15 +11:00
psychedelicious
440e3e01ac fix(ui): show workflow thumbnails in library 2025-03-07 08:44:15 +11:00
psychedelicious
c2302f7ab1 fix(ui): ts issues 2025-03-07 08:44:15 +11:00
Mary Hipp
2594eed1af add comments 2025-03-07 08:44:15 +11:00
Mary Hipp
e8db1c1d5a break out actions, start on marketplace categories 2025-03-07 08:44:15 +11:00
Mary Hipp
d5c5e8e8ed another new workflow library 2025-03-07 08:44:15 +11:00
Jonathan
518a7c941f Changed version of FluxDenoiseInvocation
A Redux field was added but the node version wasn't updated.
2025-03-07 07:33:31 +11:00
psychedelicious
bdafe53f2e repo: add @jazzhaiku to codeowners for CI, app and backend 2025-03-06 10:19:18 -05:00
psychedelicious
cf0cbaf0ae chore: ruff (more) 2025-03-06 10:57:54 +11:00
psychedelicious
ac6fc6eccb chore: ruff 2025-03-06 10:57:54 +11:00
psychedelicious
07d65b8fd1 refactor(ui): workflow loading, saving and saved status tracking
This big chungus reworks and simplifies much of the logic around loading and saving workflows. It also makes some minor changes to how we store the current workflow and determine whether it is a draft, user workflow or default workflow.

---

The lower-level hooks to save a workflow have been revised:
- `useSaveLibraryWorkflow`: Saves a user or project workflow that has had changes made to it.
- `useCreateNewWorkflow`: Saves a workflow as a new entity.

A new higher-level hook `useSaveOrSaveAsWorkflow` is intended to be used by components. It returns a single function that:
- Constructs the workflow payload to be sent to the server
- Checks if the workflow is an existing user workflow. If so, it immediately saves (updates) that workflow.
- If it's not an existing user workflow, it opens the save as dialog so the user can choose a name for it and create a new workflow. This occurs for both draft workflows and loaded default workflows.

---

The logic to build the current redux state into a workflow - either to be saved as JSON, to update an existing user workflow, or save as - was a bit convoluted.

Changes to redux state triggered a debounced function to build the workflow, setting it in a global nanostores atom. Then, all of the functions that consumed the "built workflow" referenced this atom.

Now, this logic is strictly imperative. When a consumer wants to save a workflow, we build it on the spot. This removes a layer of indirection.

The logic is in the `useBuildWorkflowFast` hook.

---

The logic for loading a workflow is also revised. Previously, it happened in an RTK listener. You'd need to dispatch an action to load a workflow, and wouldn't know if it succeeded or not (though the listener would make a toast if the load failed).

This is now done in a callback, outside redux middleware. The callback is returned from the `useLoadWorkflow` hook.

---

Previously, we stripped the id from default workflows when loading them. Then, when saving the workflow, we built a workflow object from redux state and hit the API with it.

This has two issues:
- It relies on redux state never having an ID set when a default workflow is loaded. If we somehow ended up with a default workflow's ID in redux, then when we go to save the workflow, we'd get an error or it wouldn't work, because you cannot save a default workflow. You can only save-as it.
- We do not know the default workflow from which the current workflow was loaded. And because we don't know the default workflow, we cannot show a thumbnail image.

The responsibilities have been shifted around a bit.

Now, when we load a workflow, we load it as-is. The default workflow IDs are saved in redux state. We can render the thumbnail, and if the user goes to save the workflow, we detect that it is a default workflow and save-as it.

---

In `App.tsx`, the long list of modals is moved into its own "isolator" component to ensure any re-renders there do not affect the rest of the app.

---

The save-workflow-as modal is restructured to be a bit simpler. Still works the same. On commercial, "save to project" will be enabled by default.

---

The workflow JSON tab uses a debounced version of "buildWorkflow" to build the workflow as JSON.

---

`buildWorkflowFast` is updated to deep-copy its _whole_ output, preventing issues where field types could accidentally get mutated. I don't think this has ever happened but we may as well be safe.

---

Fixed an issue where the edit button in the workflow list didn't open the workflow in edit mode.
2025-03-06 10:57:54 +11:00
psychedelicious
3c2e6378ca chore(ui): typegen 2025-03-06 10:57:54 +11:00
psychedelicious
445f122f37 fix(api): allow deleting a workflow even if the thumbnail file doesn't exist 2025-03-06 10:57:54 +11:00
psychedelicious
8c0ee9c48f fix(app): fix import of WorkflowThumbnailServiceBase 2025-03-06 10:57:54 +11:00
psychedelicious
0eb237ac64 feat(app): make category required on workflows
It's only by misunderstanding the pydantic API that this field was typed as optional. Workflows must _always_ have a category, and indeed they do.

Fixing this makes the generated types in the frontend easier to work with.
2025-03-06 10:57:54 +11:00
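A tiny pydantic sketch of the difference; the model and field names are illustrative, not the app's actual schema:

```python
from enum import Enum

from pydantic import BaseModel


class WorkflowCategory(str, Enum):
    user = "user"
    default = "default"


class WorkflowBefore(BaseModel):
    # Optional with a default: the generated frontend type ends up nullable/optional.
    category: WorkflowCategory | None = WorkflowCategory.user


class WorkflowAfter(BaseModel):
    # Required: every workflow always has a category, and the generated type reflects that.
    category: WorkflowCategory
```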
psychedelicious
9aa04f0bea feat(app): support thumbnails for default workflow images 2025-03-06 10:57:54 +11:00
psychedelicious
76e2f41ec7 feat(app): throw as early as possible when attempting to create, update or delete a default workflow 2025-03-06 10:57:54 +11:00
psychedelicious
1353c3301a typo(app): style_preset_id -> workflow_id 2025-03-06 10:57:54 +11:00
psychedelicious
bf209663ac tidy(app): make workflow thumbnails base class an ABC, move it to own file 2025-03-06 10:57:54 +11:00
psychedelicious
04b96dd7b4 feat(app): stable default workflows
There was a bit of wonk with default workflows. On every app startup, we wiped them all out and recreated them with new IDs. This was a quick-and-dirty way to ensure default workflows are always in sync.

Unfortunately, it also means default workflows are newly-created entities on every app load. Any thumbnails associated with them will be lost (because they have new IDs), and `updated_at` doesn't work.

This change makes default workflows stable entities.

The workflows we bundle in the python package in JSON format are still the source of truth for default workflows, but the startup logic that syncs them to the user DB is a bit smarter.

- All bundled workflows have an ID. It is prefixed with "default_" for clarity.
- Any default workflows in the user's DB that are not in the bundled default workflows are deleted from the DB.
- Any bundled default workflows that are not in the user's DB are added to the DB.
- If a default workflow in the user's DB does not match the content of its corresponding bundled workflow, it is updated in the DB.

The end result is that default workflows are still kept in sync for the user, but they don't change their identity.

We may now add thumbnails to default workflows, and sorting by `updated_at` is now meaningful.
2025-03-06 10:57:54 +11:00
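A condensed sketch of the sync rules listed above, using plain dicts keyed by workflow id; the real startup logic works against the DB and full workflow records, so treat this as illustration only.

```python
def sync_default_workflows(bundled: dict[str, str], db: dict[str, str]) -> dict[str, str]:
    """Reconcile default workflows in the DB with the bundled JSON while keeping IDs stable.

    Both mappings are workflow id -> serialized workflow content.
    """
    synced = dict(db)
    # Default workflows in the DB that are no longer bundled are deleted.
    for workflow_id in list(synced):
        if workflow_id not in bundled:
            del synced[workflow_id]
    # Bundled workflows that are missing or whose content changed are (re)written.
    for workflow_id, content in bundled.items():
        if synced.get(workflow_id) != content:
            synced[workflow_id] = content
    return synced
```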
psychedelicious
79b2c68853 fix(ui): hide workflow thumbnail for unsaved and default workflows 2025-03-06 10:41:47 +11:00
psychedelicious
aac456527e refactor(ui): make workflow thumbnail rendering more explicit 2025-03-06 10:41:47 +11:00
psychedelicious
c88b835373 fix(ui): remove unused redux action & selector 2025-03-06 10:41:47 +11:00
Mary Hipp
9da116fd3d how to only show thumbnail for saved non-default workflows 2025-03-06 10:41:47 +11:00
Mary Hipp
201d7f1fdb fix test 2025-03-06 10:41:47 +11:00
Mary Hipp
17a5b1bd28 fix test 2025-03-06 10:41:47 +11:00
Mary Hipp
a409aec00f update schema 2025-03-06 10:41:47 +11:00
Mary Hipp
b0593eda92 ruff 2025-03-06 10:41:47 +11:00
Mary Hipp
9acb24914f tsc fix 2025-03-06 10:41:47 +11:00
Mary Hipp
ab4433da2f refactor workflow thumbnails to be separate flow/endpoints 2025-03-06 10:41:47 +11:00
Mary Hipp
d4423aa16f WIP workflow thumbnails - how to add to redux state? 2025-03-06 10:41:47 +11:00
Ryan Dick
1f6430c1b0 typegen 2025-03-06 10:31:17 +11:00
Ryan Dick
8e28888bc4 Fix SigLipPipeline model size calculation. 2025-03-06 10:31:17 +11:00
Ryan Dick
b6b21dbcbf Add model selection fields to the FluxReduxInvocation. 2025-03-06 10:31:17 +11:00
Ryan Dick
7b48ef2264 First pass at frontend integration for FLUX Redux and SigLIP model types. 2025-03-06 10:31:17 +11:00
Ryan Dick
9c542ed655 typegen 2025-03-06 10:31:17 +11:00
Ryan Dick
4c02ba908a Add support for FLUX Redux masks. 2025-03-06 10:31:17 +11:00
Ryan Dick
82293ae3b2 Add helpful error messages when FLUX Redux starter models are not installed. 2025-03-06 10:31:17 +11:00
Ryan Dick
f1fde792ee Get FLUX Redux working: model loading and inference. 2025-03-06 10:31:17 +11:00
Ryan Dick
e82393f7ed Add FLUX Redux to starter models list. 2025-03-06 10:31:17 +11:00
Ryan Dick
d5211a8088 Add FluxRedux model type and probing logic. 2025-03-06 10:31:17 +11:00
Ryan Dick
3b095b5945 Add SigLIP starter model. 2025-03-06 10:31:17 +11:00
Ryan Dick
34959ef573 Add SigLIP model type and probing. 2025-03-06 10:31:17 +11:00
jazzhaiku
7f10f8f96a Ruff upgrade (#7741)
## Summary

Upgrade ruff version to 0.9.9 and format existing code.

## Related Issues / Discussions

## QA Instructions

## Merge Plan

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-03-06 10:06:02 +11:00
Billy
f2689598c0 Formatting 2025-03-06 09:11:00 +11:00
Billy
551c78d9f3 Update ruff version 2025-03-06 09:10:50 +11:00
psychedelicious
0cfd713b93 fix(ui): typo 2025-03-06 08:52:10 +11:00
psychedelicious
45f5d7617a chore: bump version to v5.7.0 2025-03-06 08:38:59 +11:00
psychedelicious
f49df7d327 chore(ui): update whats new 2025-03-06 08:38:59 +11:00
Linos
87ed0ed48a translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1802 of 1802 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-03-06 08:00:35 +11:00
Riccardo Giovanetti
d445c88e4c translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1782 of 1802 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.8% (1782 of 1802 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-03-06 08:00:35 +11:00
Riku
c15c43ed2a translationBot(ui): update translation (German)
Currently translated at 67.2% (1212 of 1802 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2025-03-06 08:00:35 +11:00
psychedelicious
d2f8db9745 tidy: remove unused utils 2025-03-06 07:49:35 +11:00
psychedelicious
c1cf01a038 tests: use dangerously_run_function_in_subprocess to fix configure_torch_cuda_allocator tests 2025-03-06 07:49:35 +11:00
psychedelicious
2bfb4fc79c tests: add util to run a function in separate process
This allows our tests to run in an isolated environment. For tests that implicitly depend on import behaviour, this can prevent side-effects.

The function should only be used for tests.
2025-03-06 07:49:35 +11:00
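A minimal sketch of such a helper using only the standard library; the real utility's name (`dangerously_run_function_in_subprocess`, per the commit above it) and its error handling may differ.

```python
import multiprocessing
from typing import Callable


def run_function_in_subprocess(fn: Callable[[], None]) -> None:
    """Run fn in a fresh process so import-time side effects cannot leak between tests.

    With the "spawn" start method the child gets a clean interpreter, so fn must be
    defined at module level (it is pickled and re-imported in the child).
    """
    ctx = multiprocessing.get_context("spawn")
    proc = ctx.Process(target=fn)
    proc.start()
    proc.join()
    if proc.exitcode != 0:
        raise RuntimeError(f"subprocess exited with code {proc.exitcode}")
```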
psychedelicious
d037d8f9aa tests: update tests for configure_torch_cuda_allocator 2025-03-06 07:49:35 +11:00
psychedelicious
d5401e8443 tests: add testing utils to set/unset env var 2025-03-06 07:49:35 +11:00
psychedelicious
d193e4f02a feat(app): log warning instead of raising if PYTORCH_CUDA_ALLOC_CONF is already set 2025-03-06 07:49:35 +11:00
psychedelicious
ec493e30ee feat(app): make logger a required arg in configure_torch_cuda_allocator 2025-03-06 07:49:35 +11:00
Jonathan
081b931edf Update util.py
Changed string to a literal
2025-03-05 14:39:17 +11:00
Jonathan
8cd7035494 Fixed validation of begin and end steps
Fixed logic to match the error message - begin should be <= end.
2025-03-05 14:39:17 +11:00
Eugene Brodsky
4de6fd3ae6 chore(docker): reduce size between docker builds (#7571)
by adding a layer with all the pytorch dependencies that don't change
most of the time.

## Summary

Every time the [`main` docker
images](https://github.com/invoke-ai/InvokeAI/pkgs/container/invokeai)
rebuild and I pull `main-cuda`, it gets another 3+ GB, which seems like
about a zillion times too much since most things don't change from one
commit on `main` to the next.

This is an attempt to follow the guidance in [Using uv in Docker:
Intermediate
Layers](https://docs.astral.sh/uv/guides/integration/docker/#intermediate-layers)
so there's one layer that installs all the dependencies—including
PyTorch with its bundled nvidia libraries—_before_ the project's own
frequently-changing files are copied into the image.


## Related Issues / Discussions

- [Improved docker layer cache with
uv](https://discord.com/channels/1020123559063990373/1329975172022927370)
- [astral: Can `uv pip install` torch, but not `uv sync`
it](https://discord.com/channels/1039017663004942429/1329986610770612347)


## QA Instructions

Hopefully the CI system building the docker images is sufficient.

But there is one change to `pyproject.toml` related to xformers, so it'd
be worth checking that `python -m xformers.info` still says it has
triton on the platforms that expect it.


## Merge Plan

I don't expect this to be a disruptive merge.

(An earlier revision of this PR moved the venv, but I've reverted that
change at ebr's recommendation.)


## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-03-04 20:42:28 -05:00
Eugene Brodsky
3feb1a6600 Merge branch 'main' into build/docker-dependency-layer 2025-03-04 20:33:24 -05:00
psychedelicious
ea2320c57b feat(ui): add button ref image layer empty state to pull bbox 2025-03-05 08:00:20 +11:00
psychedelicious
0ad0016c2d chore: bump version to v5.7.2rc2 2025-03-04 08:48:28 +11:00
psychedelicious
c2a3c66e49 feat(app): avoid nested cursors in workflow_records service 2025-03-04 08:33:42 +11:00
psychedelicious
c0a0d20935 feat(app): avoid nested cursors in style_preset_records service 2025-03-04 08:33:42 +11:00
psychedelicious
028d8d8ead feat(app): avoid nested cursors in model_records service 2025-03-04 08:33:42 +11:00
psychedelicious
657095d2e2 feat(app): avoid nested cursors in image_records service 2025-03-04 08:33:42 +11:00
psychedelicious
1c47dc997e feat(app): avoid nested cursors in board_records service 2025-03-04 08:33:42 +11:00
psychedelicious
a3de6b6165 feat(app): avoid nested cursors in board_image_records service 2025-03-04 08:33:42 +11:00
psychedelicious
e57f0ff055 experiment(app): avoid nested cursors in session_queue service
SQLite cursors are meant to be lightweight and not reused. For whatever reason, we reuse one per service for the entire app lifecycle.

This can cause issues where a cursor is used twice at the same time in different transactions.

This experiment makes the session queue use a fresh cursor for each method, hopefully fixing the issue.
2025-03-04 08:33:42 +11:00
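The gist is one short-lived cursor per call instead of a single cursor shared for the app lifetime; a sketch, not the actual service code:

```python
import sqlite3


class SessionQueueSketch:
    """Illustrative only: each method opens and closes its own cursor."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def count_pending(self) -> int:
        cursor = self._conn.cursor()  # fresh, lightweight cursor for this call only
        try:
            cursor.execute("SELECT COUNT(*) FROM session_queue WHERE status = 'pending';")
            return cursor.fetchone()[0]
        finally:
            cursor.close()
```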
Eugene Brodsky
0362bd5a06 Merge branch 'main' into build/docker-dependency-layer 2025-03-03 09:32:04 -05:00
Linos
feee4c49a2 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1798 of 1798 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-03-03 14:50:08 +11:00
Riccardo Giovanetti
42e052d6f2 translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1777 of 1798 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-03-03 14:50:08 +11:00
psychedelicious
b03e429b26 fix(ui): add missing builder translations 2025-03-03 14:43:23 +11:00
psychedelicious
7399909029 feat(app): use simpler syntax for enqueue_batch threaded execution 2025-03-03 14:40:48 +11:00
psychedelicious
c8aaf5e76b tidy(app): remove extraneous class attr type annotations 2025-03-03 14:40:48 +11:00
psychedelicious
0cdf7a7048 Revert "experiment(app): simulate very long enqueue operations (15s)"
This reverts commit eb6a323d0b70004732de493d6530e08eb5ca8acf.
2025-03-03 14:40:48 +11:00
psychedelicious
41985487d3 Revert "experiment(app): make socketio server ping every 1s"
This reverts commit ddf00bf260167092a3bc2afdce1244c6b116ebfb.
2025-03-03 14:40:48 +11:00
psychedelicious
41d5a17114 fix(ui): set RTKQ tag invalidationBehaviour to immediate
This allows tags to be invalidated while mutations are executing, resolving an issue in this situation:
- A long-running mutation starts.
- A tag is invalidated; for example, user edits a board name, and the boards list query tag is invalidated.
- The boards list query isn't fired, and the board name isn't updated.
- The long-running mutation finishes.
- Finally, the boards list query fires and the board name is updated.

This is the "delayed" behaviour. The "immediately" behaviour has the fires requests from tag invalidation immediately, without waiting for all mutations to finish.

It may cause extra network requests and stale data if we are mutating a lot of things very quickly. I don't think it will be an issue in practice and the improved responsiveness will be a net benefit.
2025-03-03 14:40:48 +11:00
psychedelicious
14f9d5b6bc experiment(app): remove db locking logic
Rely on WAL mode and the busy timeout.

Also changed:
- Remove extraneous rollbacks when we were only doing a `SELECT`
- Remove try/catch blocks that were made extraneous when removing the extraneous rollbacks
2025-03-03 14:40:48 +11:00
psychedelicious
eec4bdb038 experiment(app): enable WAL mode and set busy_timeout
This allows for read and write concurrency without using a global mutex. Operations may still fail if they take longer than the busy timeout (5s).

If we get a database lock error after waiting 5s for an operation, we have a problem. So, I think it's actually better to use a busy timeout instead of a global mutex.

Alternatively, we could add a timeout to the global mutex.
2025-03-03 14:40:48 +11:00
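Concretely, both settings can be applied as pragmas when the connection is opened; a minimal sketch assuming a plain sqlite3 connection:

```python
import sqlite3


def open_db(db_path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    # WAL lets readers and writers proceed concurrently instead of serializing on a mutex.
    conn.execute("PRAGMA journal_mode=WAL;")
    # On a locked database, wait up to 5000 ms before raising instead of failing immediately.
    conn.execute("PRAGMA busy_timeout=5000;")
    return conn
```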
psychedelicious
f3dd44044a experiment(app): run enqueue_batch async in a thread 2025-03-03 14:40:48 +11:00
psychedelicious
61a22eb8cb experiment(app): make socketio server ping every 1s 2025-03-03 14:40:48 +11:00
psychedelicious
03ca83fe13 experiment(app): simulate very long enqueue operations (15s) 2025-03-03 14:40:48 +11:00
psychedelicious
8f1e25c387 chore: bump version to v5.7.2rc1 2025-03-03 09:46:16 +11:00
Kevin Turner
29cf4bc002 feat: accept WebP uploads for assets 2025-03-02 08:50:38 -05:00
psychedelicious
9428642806 fix(ui): single or collection field rendering
Fixes an issue where fields like control weight on ControlNet nodes and image on IP Adapter nodes didn't render.

These are "single or collection" fields. They accept a single input object, or collection. They are supposed to render the UI input for a single object.

In a7a71ca935 a performance optimisation for a hot code-path inadvertently broke this.

The determination of which UI component to render for a given field was done using a type guard function for the field's template. Previously, this used a zod schema to parse the template. This is very slow, especially when the template was not the expected type.

The optimization changed the type guards to check the field's type name (integer, image, etc.) and cardinality directly, without any zod parsing.

It's much faster, but subtly changed the behaviour because it was a bit stricter. For some fields, it rejected "single or collection" cardinalities when it should have accepted them.

When these fields - like the aforementioned Control Weight and Image - were being rendered, none of the type guards passed and they rendered nothing.

The fix here updates the type guard functions to support multiple cardinalities. So now, when we go to render a "single or collection" field, we will render the "single" input component as it should be.
2025-03-01 10:54:31 +11:00
psychedelicious
8620572524 docs: update RELEASE.md 2025-02-28 18:43:52 -05:00
psychedelicious
f44c7e824d chore(ui): lint 2025-02-28 18:09:54 -05:00
psychedelicious
c5b8bde285 fix(ui): download button in workflow library downloads wrong workflow 2025-02-28 18:09:54 -05:00
Ryan Dick
4c86a7ecbf Update Low-VRAM docs guidance around max_cache_vram_gb. 2025-02-28 17:18:57 -05:00
Ryan Dick
b9f9d1c152 Increase the VAE decode memory estimates to account for memory reserved by the memory allocator but not allocated, and to generally be more conservative. 2025-02-28 17:18:57 -05:00
Ryan Dick
7567ee2adf Add pytorch_cuda_alloc_conf config to tune VRAM memory allocation (#7673)
## Summary

This PR adds a `pytorch_cuda_alloc_conf` config flag to control the
torch memory allocator behavior.

- `pytorch_cuda_alloc_conf` defaults to `None`, preserving the current
behavior.
- The configuration options are explained here:
https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf.
Tuning this configuration can reduce peak reserved VRAM and improve
performance.
- Setting `pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"` in
`invokeai.yaml` is expected to work well on many systems. This is a good
first step for those looking to tune this config. (We may make this the
default in the future.)
- The optimal configuration seems to be dependent on a number of factors
such as device version, VRAM, CUDA kernel version, etc. For now, users
will have to experiment with this config to see if it hurts or helps on
their systems. In most cases, I expect it to help.

### Memory Tests

```
VAE decode memory usage comparison:

- SDXL, fp16, 1024x1024:
  - `cudaMallocAsync`: allocated=2593 MB, reserved=3200 MB
  - `native`:          allocated=2595 MB, reserved=4418 MB

- SDXL, fp32, 1024x1024:
  - `cudaMallocAsync`: allocated=3982 MB, reserved=5536 MB
  - `native`:          allocated=3982 MB, reserved=7276 MB

- SDXL, fp32, 1536x1536:
  - `cudaMallocAsync`: allocated=8643 MB, reserved=12032 MB
  - `native`:          allocated=8643 MB, reserved=15900 MB
```

## Related Issues / Discussions

N/A

## QA Instructions

- [x] Performance tests with `pytorch_cuda_alloc_conf` unset.
- [x] Performance tests with `pytorch_cuda_alloc_conf:
"backend:cudaMallocAsync"`.

## Merge Plan

- [x] Merge #7668 first and change target branch to `main`

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-02-28 16:47:01 -05:00
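Under the hood the config maps to the `PYTORCH_CUDA_ALLOC_CONF` environment variable, which only takes effect if it is set before `torch` is imported; a sketch of that ordering (the app's own config plumbing is more involved):

```python
import os
import sys


def configure_torch_cuda_allocator(alloc_conf: str) -> None:
    """Set the CUDA allocator config; must run before torch is imported anywhere in the process."""
    if "torch" in sys.modules:
        raise RuntimeError("torch is already imported; the allocator config would have no effect")
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = alloc_conf


if __name__ == "__main__":
    # e.g. invokeai.yaml: pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"
    configure_torch_cuda_allocator("backend:cudaMallocAsync")
    import torch  # noqa: E402  # only imported after the allocator is configured
```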
Ryan Dick
0e632dbc5c (minor) typo 2025-02-28 21:39:09 +00:00
Ryan Dick
49191709a0 Mark test_configure_torch_cuda_allocator_raises_if_torch_is_already_imported() to only run if CUDA is available. 2025-02-28 21:39:09 +00:00
Ryan Dick
3af7fc26fa Update low-vram docs with info about . 2025-02-28 21:39:09 +00:00
Ryan Dick
a36a627f83 Switch from use_cuda_malloc flag to a general pytorch_cuda_alloc_conf config field that allows full customization of the CUDA allocator. 2025-02-28 21:39:09 +00:00
Ryan Dick
b31c71f302 Simplify is_torch_cuda_malloc_enabled() implementation and add unit tests. 2025-02-28 21:39:09 +00:00
Ryan Dick
5302d4890f Add use_cuda_malloc config option. 2025-02-28 21:39:09 +00:00
Ryan Dick
766b752572 Add utils for configuring the torch CUDA allocator. 2025-02-28 21:39:09 +00:00
Eugene Brodsky
7feae5e5ce do not cache image layers in CI docker build 2025-02-28 16:24:50 -05:00
Ryan Dick
26730ca702 Tidy app entrypoint (#7668)
## Summary

Prior to this PR, most of the app setup was being done in `api_app.py`
at import time. This PR cleans this up, by:
- Splitting app setup into more modular functions
- Narrower responsibility for the `api_app.py` file - it just
initializes the `FastAPI` app

The main motivation for this change is to make it easier to support an
upcoming torch configuration feature that requires more careful ordering
of app initialization steps.

## Related Issues / Discussions

N/A

## QA Instructions

- [x] Launch the app via invokeai-web.py and smoke test it.
- [ ] Launch the app via the installer and smoke test it.
- [x] Test that generate_openapi_schema.py produces the same result
before and after the change.
- [x] No regression in unit tests that directly interact with the app.
(test_images.py)

## Merge Plan

- [x] Check to see if there are any commercial implications to modifying
the app entrypoint.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-02-28 16:07:30 -05:00
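A rough sketch of the shape described above: a thin FastAPI factory in one module and a separate entrypoint that owns the ordering-sensitive setup. Names are illustrative, not the project's actual code.

```python
from fastapi import FastAPI
import uvicorn


def create_app() -> FastAPI:
    """api_app.py's narrow job: build and return the FastAPI app."""
    app = FastAPI()

    @app.get("/health")
    def health() -> dict[str, str]:
        return {"status": "ok"}

    return app


def run_app() -> None:
    """Entrypoint: ordering-sensitive setup (torch config, mime types, custom nodes)
    happens here, before the app is created and served."""
    app = create_app()
    uvicorn.run(app, host="127.0.0.1", port=9090)


if __name__ == "__main__":
    run_app()
```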
Ryan Dick
1e2c7c51b5 Move load_custom_nodes() to run_app() entrypoint. 2025-02-28 20:54:26 +00:00
Ryan Dick
da2b6815ac Make InvokeAILogger an inline import in startup_utils.py in response to review comment. 2025-02-28 20:10:24 +00:00
Ryan Dick
68d14de3ee Split run_app.py and api_app.py so that api_app.py is more narrowly responsible for just initializing the FastAPI app. This also gives clearer control over the order of the initialization steps, which will be important as we add planned torch configurations that must be applied before torch is imported. 2025-02-28 20:10:24 +00:00
Ryan Dick
38991ffc35 Add register_mime_types() startup util. 2025-02-28 20:10:24 +00:00
Ryan Dick
f345c0fabc Create an apply_monkeypatches() startup util. 2025-02-28 20:10:24 +00:00
Ryan Dick
ca23b5337e Simplify port selection logic to avoid the need for a global port variable. 2025-02-28 20:10:19 +00:00
Ryan Dick
35910d3952 Move check_cudnn() and jurigged setup to startup_utils.py. 2025-02-28 20:08:53 +00:00
Ryan Dick
6f1dcf385b Move find_port() util to its own file. 2025-02-28 20:08:53 +00:00
psychedelicious
84c9ecc83f chore: bump version to v5.7.1 2025-02-28 13:23:30 -05:00
Thomas Bolteau
52aa839b7e translationBot(ui): update translation (French)
Currently translated at 99.1% (1782 of 1797 strings)

Co-authored-by: Thomas Bolteau <thomas.bolteau50@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/fr/
Translation: InvokeAI/Web UI
2025-02-28 17:07:11 +11:00
Hiroto N
316ed1d478 translationBot(ui): update translation (Japanese)
Currently translated at 42.6% (766 of 1797 strings)

Co-authored-by: Hiroto N <hironow365@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
2025-02-28 17:07:11 +11:00
Hosted Weblate
3519e8ae39 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-02-28 17:07:11 +11:00
psychedelicious
82f645c7a1 feat(ui): add new workflow button to library menu 2025-02-28 16:06:02 +11:00
psychedelicious
cc36cfb617 feat(ui): reorg workflow menu buttons 2025-02-28 16:06:02 +11:00
psychedelicious
ded8a84284 feat(ui): increase spacing in form builder view mode 2025-02-28 16:06:02 +11:00
psychedelicious
94771ea626 feat(ui): add auto-links to text, heading, field description and workflow descriptions 2025-02-28 16:06:02 +11:00
psychedelicious
51d661023e Revert "feat(ui): increase spacing in form builder view mode"
This reverts commit 3766a3ba1e082f31bce09f794c47eb95cd76f1b1.
2025-02-28 16:06:02 +11:00
psychedelicious
d215829b91 feat(ui): increase spacing in form builder view mode 2025-02-28 16:06:02 +11:00
psychedelicious
fad6c67f01 fix(ui): workflow description cut off 2025-02-28 16:06:02 +11:00
psychedelicious
f366640d46 fix(ui): invoke button not showing loading indicator on canvas tab
On the Canvas tab, when we made the network request to enqueue a batch, we were immediately resetting the request. This effectively disabled RTKQ's tracking of the request - including the loading state.

As a result, when you click the Invoke button on the Canvas tab, it didn't show a spinner, and it was not clear that anything was happening.

The solution is simple - just await the enqueue request before resetting the tracking, same as we already did on the workflows and upscaling tabs.

I also added some extra logging messages for enqueuing, so we get the same JS console logs for each tab on success or failure.
2025-02-28 15:58:17 +11:00
skunkworxdark
36a3fba8cb Update metadata_linked.py
Fix input type of default_value on MetadataToFloatInvocation
2025-02-27 04:55:29 -05:00
psychedelicious
b2ff83092f fix(ui): form element settings obscured by container 2025-02-27 14:49:52 +11:00
psychedelicious
d2db38a5b9 chore(ui): update whats new 2025-02-27 13:01:07 +11:00
psychedelicious
fa988a6273 chore: bump version to v5.7.0 2025-02-27 13:01:07 +11:00
HAL
149f60946c translationBot(ui): update translation (Japanese)
Currently translated at 37.7% (680 of 1801 strings)

Co-authored-by: HAL <HALQME@users.noreply.hosted.weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
2025-02-27 12:42:03 +11:00
Hiroto N
ee9d620a36 translationBot(ui): update translation (Japanese)
Currently translated at 40.3% (727 of 1801 strings)

translationBot(ui): update translation (Japanese)

Currently translated at 37.7% (680 of 1801 strings)

Co-authored-by: Hiroto N <hironow365@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
2025-02-27 12:42:03 +11:00
psychedelicious
4e8ce4abab feat(app): more detailed messages when loading custom nodes 2025-02-27 12:39:37 +11:00
psychedelicious
d40f2fa37c feat(app): improved custom node loading ordering
Previously, custom node loading occurred _during module imports_. A consequence of this is that when a custom node import fails (e.g. its type clobbers an existing node), the app fails to start up.

In fact, any time we import basically anything from the app, we trigger custom node imports! Not good.

This logic is now in its own function, called as the API app starts up.

If a custom node load fails for any reason, it no longer prevents the app from starting up.

One other bonus we get from this is that we can now ensure custom nodes are loaded _after_ core nodes.

Any clobbering that may occur while loading custom nodes is now guaranteed to be a custom node clobbering a core node's type - and not the other way round.
2025-02-27 12:39:37 +11:00
psychedelicious
933f4f6857 feat(app): improve error messages when registering invocations and they clobber 2025-02-27 12:39:37 +11:00
psychedelicious
f499b2db7b feat(app): add get_invocation_for_type method to BaseInvocation 2025-02-27 12:39:37 +11:00
psychedelicious
706aaf7460 tidy(app): remove unused variable 2025-02-27 12:39:37 +11:00
psychedelicious
4a706d00bb feat(app): use generic for append_list util 2025-02-27 12:28:00 +11:00
psychedelicious
2a8bff601f chore(ui): typegen 2025-02-27 12:28:00 +11:00
psychedelicious
3f0e3192f6 chore(app): mark metadata_field_extractor as deprecated 2025-02-27 12:28:00 +11:00
psychedelicious
c65147e2ff feat(app): adopt @skunkworxdark's popular metadata nodes
Thank you!
2025-02-27 12:28:00 +11:00
psychedelicious
1c14e257a3 feat(app): do not pull PIL image from disk in image primitive 2025-02-27 12:19:27 +11:00
psychedelicious
fe24217082 fix(ui): image usage checks collection fields
When deleting a board w/ images, the image usage checking logic was not checking image collection fields. This could result in a nonexistent image lingering in a node.

We already handle single image fields correctly; it's only the image collection fields that were affected.
2025-02-27 10:24:59 +11:00
psychedelicious
aee847065c revert(ui): images from board generator only works on boards 2025-02-27 10:19:13 +11:00
psychedelicious
525da3257c chore(ui): typegen 2025-02-27 10:19:13 +11:00
psychedelicious
559654f0ca revert(app): get_all_board_image_names_for_board requires board_id 2025-02-27 10:19:13 +11:00
Eugene Brodsky
5d33874d58 fix(backend): ValuesToInsertTuple.retried_from_item_id should be an int 2025-02-27 07:35:41 +11:00
Mary Hipp
0063315139 fix(api): add new args to all uses of get_all_board_image_names_for_board 2025-02-26 15:05:40 -05:00
psychedelicious
1cbd609860 chore: bump version to v5.7.0rc2 2025-02-26 21:04:23 +11:00
psychedelicious
047c643295 tidy(app): document & clean up batch prep logic 2025-02-26 21:04:23 +11:00
psychedelicious
d1e03aa1c5 tidy(app): remove timing debug logs 2025-02-26 21:04:23 +11:00
psychedelicious
1bb8edf57e perf(app): optimise batch prep logic even more
Found another place where we deepcopy a dict, but it is safe to mutate.

Restructured the prep logic a bit to support this. Updated tests to use the new structure.
2025-02-26 21:04:23 +11:00
psychedelicious
a3e78f0db6 perf(app): optimise batch prep logic
- Avoid pydantic models when dict manipulation works
- Avoid extraneous deep copies when we can safely mutate
- Avoid NamedTuple construct and its overhead
- Fix tests to use altered function signatures
- Remove extraneous populate_graph function
2025-02-26 21:04:23 +11:00
Hosted Weblate
1ccf43aa1e translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-02-26 18:27:50 +11:00
Linos
a290975fae translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1795 of 1795 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 98.2% (1763 of 1795 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-02-26 18:27:50 +11:00
psychedelicious
43c2116d64 chore(ui): lint 2025-02-26 18:25:23 +11:00
psychedelicious
9d0a24ead3 fix(ui): race condition with node-form-field relationship overlay 2025-02-26 18:25:23 +11:00
psychedelicious
d61a3d2950 chore(ui): typegen 2025-02-26 18:25:23 +11:00
psychedelicious
7b63858802 fix(ui): hide node footer on batch and generator nodes 2025-02-26 18:25:23 +11:00
psychedelicious
fae23a744f fix(ui): always check batch sizes when there is at least 1 batch node
Not sure why I had this only checking if the size was >1. Doesn't make sense...
2025-02-26 18:25:23 +11:00
psychedelicious
7c574719e5 feat(ui): image generator w/ image to board type 2025-02-26 18:25:23 +11:00
psychedelicious
43a212dd47 tidy(ui): remove generator fields' explicit "value" parameter
This was a half-baked attempt to work around the issue with async generator nodes. It's not needed; the values are never referenced.
2025-02-26 18:25:23 +11:00
psychedelicious
a103bc8a0a feat(ui): update delete boards modal logic for updated board images endpoint
The functionality is the same - just need to explicitly opt out of categories and is_intermediate constraints.
2025-02-26 18:25:23 +11:00
psychedelicious
1a42fbf541 feat(ui): update listAllImageNamesForBoard query to match updated route 2025-02-26 18:25:23 +11:00
psychedelicious
d550067dd4 chore(ui): typegen 2025-02-26 18:25:23 +11:00
psychedelicious
7003bcad62 feat(nodes): add image generator node 2025-02-26 18:25:23 +11:00
psychedelicious
ef95f4962c feat(app): extend "all image names for board" apis
The method and route now supports:
- "none" as a board ID, sentinel value for uncategorized
- Optionally specify image categories
- Optionally specify is_intermediate
2025-02-26 18:25:23 +11:00
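A hedged sketch of the extended query shape against a hypothetical `images` table: "none" is the sentinel for uncategorized, and the other two filters are applied only when provided.

```python
import sqlite3


def get_all_board_image_names_for_board(
    conn: sqlite3.Connection,
    board_id: str,
    categories: list[str] | None = None,
    is_intermediate: bool | None = None,
) -> list[str]:
    """Illustrative only; the app's real query joins its actual board/image tables."""
    clauses: list[str] = []
    params: list[object] = []
    if board_id == "none":
        clauses.append("board_id IS NULL")  # sentinel: images not assigned to any board
    else:
        clauses.append("board_id = ?")
        params.append(board_id)
    if categories:
        clauses.append(f"image_category IN ({', '.join('?' for _ in categories)})")
        params.extend(categories)
    if is_intermediate is not None:
        clauses.append("is_intermediate = ?")
        params.append(is_intermediate)
    query = "SELECT image_name FROM images WHERE " + " AND ".join(clauses) + ";"
    return [row[0] for row in conn.execute(query, params)]
```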
psychedelicious
2e13bbbe1b refactor(ui): make all readiness checking async
This fixes the broken readiness checks introduced in the previous commit.

To support async batch generators, all of the validation of the generators needs to be async. This is problematic because a lot of the validation logic was in redux selectors, which are necessarily synchronous.

To resolve this, the readiness checks and related logic are restructured to be run async in response to redux state changes via `useEffect` (another option is to directly subscribe to redux store). These async functions then set some react state. The checks are debounced to prevent thrashing the UI.

See #7580 for more context about this issue.

Other changes:
- Fix a minor issue where empty collections were also checked against their min and max sizes, and errors were shown for all the checks. If a collection is empty, we don't need those checks, so we now skip them and do not report their errors to the user.
- When a field is connected, do not attempt to check its value. This fixes an issue where collection fields with a connection could erroneously appear to be invalid.
- Improved error messages for batch nodes.
2025-02-26 18:25:23 +11:00
psychedelicious
43349cb5ce feat(ui): fix dynamic prompts generators (but break readiness checks) 2025-02-26 18:25:23 +11:00
psychedelicious
d037eea42a feat(ui): debouncedUpdateReasons is async 2025-02-26 18:25:23 +11:00
psychedelicious
42c5be16d1 tidy(ui): extract resolveBatchValues to own file 2025-02-26 18:25:23 +11:00
psychedelicious
c7c4453a92 feat(ui): add overlay to show related fields/nodes 2025-02-26 17:25:58 +11:00
psychedelicious
c71ddf6e5d perf(ui): use css to hide/show node selection borders 2025-02-26 17:25:58 +11:00
psychedelicious
c33ed68f78 perf(ui): use css to hide/show field action buttons 2025-02-26 17:25:58 +11:00
psychedelicious
48e389f155 tweak(ui): form element header hover color 2025-02-26 17:25:58 +11:00
psychedelicious
5c423fece4 fix(ui): container view mode layout 2025-02-26 17:25:58 +11:00
psychedelicious
3f86049802 fix(ui): text & heading view mode layout 2025-02-26 17:25:58 +11:00
psychedelicious
47d395d0a8 chore(ui): knip 2025-02-26 17:25:58 +11:00
psychedelicious
b666ef41ff fix(ui): various styling fixes 2025-02-26 17:25:58 +11:00
psychedelicious
375f62380b fix(ui): disable autoscroll on column layout containers 2025-02-26 17:25:58 +11:00
psychedelicious
42c4462edc refactor(ui): styling for form edit mode (maybe done?)
- Restructure components
- Let each element render its own edit mode
- arrrrghh
2025-02-26 17:25:58 +11:00
psychedelicious
7591adebd5 refactor(ui): styling for form edit mode (wip) 2025-02-26 17:25:58 +11:00
psychedelicious
9d9b2f73db feat(ui): styling for dnd buttons 2025-02-26 17:25:58 +11:00
Mary Hipp
abaae39c29 make sure notes node exists like we do for invocation nodes 2025-02-26 07:33:22 +11:00
Mary Hipp
b1c9f59c30 add actions for copying image and opening image in new tab 2025-02-25 11:55:36 -05:00
psychedelicious
7bcbe180df tests(ui): fix test to account for new board field template default 2025-02-25 11:10:06 +11:00
psychedelicious
a626387a0b feat(ui): use auto-add board as default for nodes
Board fields in the workflow editor now default to using the auto-add board by default.

**This is a change in behaviour - previously, we defaulted to no board (i.e. Uncategorized).**

There is some translation needed between the UI field values for a board and what the graph expects.

A "BoardField" is an object in the shape of `{board_id: string}`.

Valid board field values in the graph:
- undefined
- a BoardField

Valid UI values and their mapping to the graph values:
- 'none' -> undefined
- 'auto' -> BoardField for the auto-add board, or if the auto-add board is Uncategorized, undefined
- undefined -> undefined (this is a fallback case with the new logic)
- a BoardField -> the same BoardField
2025-02-25 11:10:06 +11:00
psychedelicious
759229e3c8 fix(ui): reset form initial values when workflow is saved 2025-02-25 11:04:44 +11:00
Mary Hipp
ad4b81ba21 do not render Whats New until app is ready 2025-02-24 11:56:16 -05:00
Mary Hipp
637b629b95 lint 2025-02-24 11:56:16 -05:00
psychedelicious
4aaa807415 experiment(ui): show loader until studio init actions are complete 2025-02-24 11:56:16 -05:00
Riccardo Giovanetti
e884be5042 translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1737 of 1755 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.9% (1735 of 1753 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.9% (1731 of 1749 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.9% (1731 of 1749 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.6% (1726 of 1749 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-02-24 08:28:55 +11:00
psychedelicious
13e129bef2 fix(ui): star button not working on Chrome
Not sure why the perf optimisation doesn't work on Chrome, so I reverted it.
2025-02-24 08:01:14 +11:00
psychedelicious
157904522f feat(ui): add zoom to node button to node field headers 2025-02-21 08:21:56 -05:00
psychedelicious
3045cd7b3a tidy(ui): split up FormElementEditModeHeader components 2025-02-21 08:21:56 -05:00
psychedelicious
e9e2bab4ee feat(ui): make useZoomToNode not rely on reactflow ctx 2025-02-21 08:21:56 -05:00
psychedelicious
6cd794d860 tweak(ui): container settings popover placement @ top 2025-02-21 08:21:56 -05:00
psychedelicious
c9b0307bcd fix(ui): non-direct input field names do not block reactflow drag 2025-02-21 08:21:56 -05:00
psychedelicious
55aee034b0 fix(ui): do not zoom when double clicking switch 2025-02-21 08:21:56 -05:00
psychedelicious
e81ef0a090 tweak(ui): "Description" -> "Show Description" 2025-02-21 08:21:56 -05:00
psychedelicious
1a806739f2 fix(ui): missing translation for string field component 2025-02-21 08:21:56 -05:00
psychedelicious
067aeeac23 tweak(ui): heading and text elements editable styling 2025-02-21 08:21:56 -05:00
psychedelicious
47b37d946f fix(ui): prevent selecting edit mode header 2025-02-21 08:21:56 -05:00
psychedelicious
ddfdeca8bd tweak(ui): make editable form headers less bright 2025-02-21 08:21:56 -05:00
psychedelicious
55b2a4388d fix(ui): overflow in workflow title 2025-02-21 08:21:56 -05:00
psychedelicious
6ab2bebfa6 chore: bump version to v5.7.0rc1 2025-02-21 13:00:01 +11:00
psychedelicious
3f18bfed4e feat(ui): add loading state for builder 2025-02-21 12:24:03 +11:00
psychedelicious
012054acaa feat(ui): add dialog when loading workflow if unsaved changes 2025-02-21 12:24:03 +11:00
psychedelicious
efb7f36f28 chore(ui): typegen 2025-02-21 12:24:03 +11:00
psychedelicious
05ea1c7637 chore(ui): fix circular dep 2025-02-21 12:24:03 +11:00
psychedelicious
2ba0f920d2 feat(ui): hide workflow desc in builder edit mode 2025-02-21 12:24:03 +11:00
psychedelicious
c3ab4f4d6e feat(ui): tweak dnd button styling 2025-02-21 12:24:03 +11:00
psychedelicious
36b3089d5d feat(ui): tweak dnd element buttons styling 2025-02-21 12:24:03 +11:00
psychedelicious
6c4d002bd6 feat(ui): hide reset node field value button when value is unchanged 2025-02-21 12:24:03 +11:00
psychedelicious
b2cfa137a3 feat(ui): when migrating pre-builder workflows, hide description for node fields by default, matching prev behaviour 2025-02-21 12:24:03 +11:00
psychedelicious
9d57bc1697 feat(ui): node text areas resizable
There's a reactflow issue that prevents the size from being applied when a workflow is loaded, but at least you can resize the fields.
2025-02-21 12:24:03 +11:00
psychedelicious
e6db36d0c4 feat(ui): hide the root container frame and header 2025-02-21 12:24:03 +11:00
psychedelicious
78832e546a feat(ui): restore plus sign button to add node field to form 2025-02-21 12:24:03 +11:00
psychedelicious
6cfeadb33b feat(ui): add fake dnd node field element w/ info tooltip 2025-02-21 12:24:03 +11:00
psychedelicious
d1d3971ee3 feat(ui): make index optional when adding elements, update tests 2025-02-21 12:24:03 +11:00
psychedelicious
e9ce259d43 feat(ui): smaller buttons for builder dnd elements 2025-02-21 12:24:03 +11:00
psychedelicious
34d988063f feat(ui): change reset button to menu 2025-02-21 12:24:03 +11:00
psychedelicious
e2bdbfe721 fix(ui): use getIsFormEmpty util when validating workflow 2025-02-21 12:24:03 +11:00
psychedelicious
fe7e1958ea fix(ui): fall back to empty form if invalid during validation 2025-02-21 12:24:03 +11:00
psychedelicious
cf8f18e690 feat(ui): add getIsFormEmpty util & tests 2025-02-21 12:24:03 +11:00
psychedelicious
da7b31b2a8 fix(app): add form to Workflow pydantic schema so it gets saved 2025-02-21 12:24:03 +11:00
psychedelicious
fb82664944 fix(ui): update linear view field migration logic to work w/ new data structure 2025-02-21 12:24:03 +11:00
psychedelicious
58ae9ed8a5 feat(ui): add form structure validation and tests 2025-02-21 12:24:03 +11:00
psychedelicious
d142a94b67 chore(ui): knip 2025-02-21 12:24:03 +11:00
psychedelicious
c8135126f2 fix(ui): use "native" reactflow interaction class names 2025-02-21 12:24:03 +11:00
psychedelicious
560910ed2f feat(ui): workflows panel redesign WIP 2025-02-21 12:24:03 +11:00
psychedelicious
b78ac40a22 feat(ui): workflows panel redesign WIP 2025-02-21 12:24:03 +11:00
psychedelicious
9ecafc8706 feat(ui): workflows panel redesign WIP 2025-02-21 12:24:03 +11:00
psychedelicious
871cb54988 feat(ui): panel resize handles have grab icon 2025-02-21 12:24:03 +11:00
psychedelicious
e3069ad336 fix(ui): remove ancient node selection logic that created duplicate node selection actions 2025-02-21 12:24:03 +11:00
psychedelicious
28027702dd feat(ui): add useZoomToNode hook 2025-02-21 12:24:03 +11:00
psychedelicious
d72840620a feat(ui): remove extraneous formElementNodeFieldInitialValueChanged action 2025-02-21 12:24:03 +11:00
psychedelicious
4f2de2674e feat(ui): remove extraneous formContainerChildrenReordered action 2025-02-21 12:24:03 +11:00
psychedelicious
340c9c0697 feat(ui): make builder heading a bit smaller 2025-02-21 12:24:03 +11:00
psychedelicious
f77549dc4f feat(ui): use constants for reactflow opt-out classNames 2025-02-20 14:25:51 +11:00
psychedelicious
5653352ae8 feat(ui): double click to zoom to node
Requires a bit of finagling to ensure the double click doesn't interfere w/ other stuff
2025-02-20 14:25:51 +11:00
psychedelicious
f1bc2ea962 fix(ui): allow pasting of collapsed edges 2025-02-20 14:25:51 +11:00
psychedelicious
2a9f7b2e38 feat(ui): abstract node/field validation logic, use error color for node title when node has errors 2025-02-20 14:25:51 +11:00
psychedelicious
c379d76844 feat(ui): add "unsafe" version of field instance selector 2025-02-20 14:25:51 +11:00
psychedelicious
6496fcdcbd feat(ui): make field names draggable, not the whole field name "bar" 2025-02-20 14:25:51 +11:00
psychedelicious
812b8fddd6 feat(ui): slimmer image component 2025-02-20 14:25:51 +11:00
psychedelicious
dc9165dfc1 chore(ui): bump vitest to latest
All but the core `vitest` package were updated recently. Tests still ran but the test UI dashboard didn't. After updating, all tests still run, seems fine.

Also tested building in app and package mode.
2025-02-20 09:08:24 +11:00
psychedelicious
59826438f6 fix(ui): failing test cases for form manip utils 2025-02-20 09:08:24 +11:00
psychedelicious
87cd52241d tests(ui): coverage for form-manipulation.ts 2025-02-20 09:08:24 +11:00
psychedelicious
7506b0e7ae feat(ui): require parentId when adding form elements 2025-02-20 09:08:24 +11:00
psychedelicious
4b29a2f395 refactor(ui): validateWorkflow takes a single object as arg 2025-02-20 09:08:24 +11:00
psychedelicious
3bcaa42309 tidy(ui): more file/variable organisation 2025-02-20 09:08:24 +11:00
psychedelicious
8e14cdb8b6 feat(ui): make dnd hooks never throw
Just log errors.
2025-02-20 09:08:24 +11:00
psychedelicious
9ef6e52ad8 tidy(ui): organize & document builder dnd logic 2025-02-20 09:08:24 +11:00
psychedelicious
148bd70a24 refactor(ui): revert to using single tree for form data 2025-02-20 09:08:24 +11:00
psychedelicious
1461c88c12 lint model 2025-02-20 09:08:24 +11:00
psychedelicious
bcfeae94d2 fix(ui): node title shows text cursor 2025-02-20 09:08:24 +11:00
psychedelicious
40eedfebf7 fix(ui): zoom reset on first interaction
Closes #7648
2025-02-20 09:08:24 +11:00
psychedelicious
d0a231d59e fix(ui): model field types not recognized as such during workflow validation and field styling 2025-02-20 09:08:24 +11:00
Mary Hipp
4bba7de070 fix omnipresent pencil 2025-02-19 09:52:37 -05:00
psychedelicious
e1f2b232c8 feat(ui): color picker improvements
- Support transparency w/ color picker. To do this, we need to hide the bg layer before sampling. In testing, this has a negligible performance impact.
- Add an RGBA value readout next to the color picker ring.
2025-02-18 15:38:50 +11:00
psychedelicious
2c5b0195fc fix(ui): straight lines drawn with shift-click get cut off when canvas moved between clicks
Need to opt-out of the clipping logic when using shift-click to not cut off the line.
2025-02-18 15:38:50 +11:00
psychedelicious
56792b2d2c fix(ui): mask layers not showing up until you zoom
Unfortunately I couldn't reliably reproduce the issue, so I'm not 100% sure this fixes it. But I think there is a race condition that results in `updateCompositingRectSize` erroneously seeing the layer has no objects and skipping the update.

To address this, the compositing rect fill/size/pos are all now force-updated when the fill/objects are changed. Theoretically it should be impossible for the issue to occur now.
2025-02-18 15:38:50 +11:00
psychedelicious
d71e8b4980 fix(ui): cursor visibility
- Fix an issue where the cursor disappeared when selecting a non-renderable entity. For example, when selecting a reference image layer and certain tools, the cursor would disappear.
- Ensure color picker works no matter what layer types are selected.

The logic for showing/hiding the cursor needed to be rearranged a bit for this fix.
2025-02-18 15:38:50 +11:00
Mary Hipp
ca50f8193c add AppFeature for retryQueueItem in case we want to easily disable 2025-02-18 09:14:03 +11:00
psychedelicious
7ee636b68b feat(ui): add retry buttons to queue tab
- Add the new HTTP endpoint to the queue client
- Add buttons to the queue items to retry them
2025-02-18 09:14:03 +11:00
psychedelicious
926f69677a chore(ui): typegen 2025-02-18 09:14:03 +11:00
psychedelicious
675ac348de feat(app): add retry queue item functionality
Retrying a queue item means cloning it and resetting all execution-related state. Retried queue items reference the item they were retried from by ID. This relationship is not enforced by any DB constraints.

- Add `retried_from_item_id` to `session_queue` table in DB in a migration.
- Add `retry_items_by_id` method to session queue service. Accepts a list of queue item IDs and clones them (minus execution state). Returns a list of retried items. Items that are not in a canceled or failed state are skipped.
- Add `retry_items_by_id` HTTP endpoint that maps 1-to-1 to the queue service method.
- Add `queue_items_retried` event, which includes the list of retried items.
2025-02-18 09:14:03 +11:00
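For readers skimming the log, the retry semantics above boil down to a clone-and-reset operation. Here is a minimal Python sketch of that service-level logic under the description's assumptions; `QueueItem`, the status strings, and the ID handling are illustrative stand-ins, not InvokeAI's actual schema or service code.

```python
from dataclasses import dataclass, replace
from itertools import count
from typing import Optional

_new_ids = count(1000)  # stand-in for whatever generates new queue item IDs

@dataclass
class QueueItem:
    # Illustrative stand-in for a session queue item, not the real DB schema.
    item_id: int
    status: str                          # e.g. "pending", "failed", "canceled", "completed"
    graph: dict                          # the work to execute
    error: Optional[str] = None          # execution-related state, reset on retry
    retried_from_item_id: Optional[int] = None

def retry_items_by_id(queue: dict[int, QueueItem], item_ids: list[int]) -> list[QueueItem]:
    """Clone canceled/failed items with execution state reset; skip everything else."""
    retried: list[QueueItem] = []
    for item_id in item_ids:
        item = queue.get(item_id)
        if item is None or item.status not in ("canceled", "failed"):
            continue  # only canceled or failed items are eligible for retry
        clone = replace(
            item,
            item_id=next(_new_ids),
            status="pending",                    # execution state is reset on the clone
            error=None,
            retried_from_item_id=item.item_id,   # soft reference back to the source item
        )
        queue[clone.item_id] = clone
        retried.append(clone)
    return retried  # e.g. the payload of a "queue_items_retried" event

queue = {1: QueueItem(item_id=1, status="failed", graph={"nodes": []})}
print(retry_items_by_id(queue, [1, 2]))  # clones item 1, silently skips the missing item 2
```

A thin HTTP route can then map 1-to-1 onto a method like this, and the returned list is what a `queue_items_retried` event would carry.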
psychedelicious
62e5b9da18 docs(ui): add comments for recent perf optimizations 2025-02-17 09:28:13 +11:00
psychedelicious
65eabde297 perf(ui): move field desc content to own component 2025-02-17 09:28:13 +11:00
psychedelicious
6bebd2bfc8 chore(ui): lint 2025-02-17 09:28:13 +11:00
psychedelicious
cd785ba64b perf(ui): optimize field handle/title/etc rendering 2025-02-17 09:28:13 +11:00
psychedelicious
726b4637db perf(ui): optimize workflow editor inspector panel rendering 2025-02-17 09:28:13 +11:00
psychedelicious
b50241fe6a perf(ui): make field description popover rendering lazy 2025-02-17 09:28:13 +11:00
psychedelicious
5b8735db3b perf(ui): optimize node update checking 2025-02-17 09:28:13 +11:00
psychedelicious
ce286363d0 perf(ui): optimize checking if a field value is changed by wrapping in single selector 2025-02-17 09:28:13 +11:00
psychedelicious
2fa47cf270 perf(ui): use lazy rendering for builder element settings popovers 2025-02-17 09:28:13 +11:00
psychedelicious
3446486f40 perf(ui): do not use memoized selector for control adapter state 2025-02-17 09:28:13 +11:00
psychedelicious
a0cdcdef57 perf(ui): debounce invoke readiness calculations 2025-02-17 09:28:13 +11:00
psychedelicious
abbb3609c8 fix(ui): race condition that causes non-user-facing error when handling canvas filter cancelations
The abortController could be null by the time we attempt to abort it
2025-02-17 09:28:13 +11:00
psychedelicious
700ad78f87 Revert "perf(ui): connection line issue on chrome"
This reverts commit 9d482e5fe621c2dbbde18ed17301a12b0e7f2580.
2025-02-17 09:28:13 +11:00
psychedelicious
cfb08f326e perf(ui): fix issue w/ add node cmdk component (more fixed) 2025-02-17 09:28:13 +11:00
psychedelicious
aae4fa3cca perf(ui): reduce animations which slow down reactflow 2025-02-17 09:28:13 +11:00
psychedelicious
109adc5a93 perf(ui): fix issue w/ add node cmdk component 2025-02-17 09:28:13 +11:00
psychedelicious
acb7ef8837 perf(ui): slightly more efficient gallery pagination components 2025-02-17 09:28:13 +11:00
psychedelicious
3c5e829c72 feat(ui): use new more efficient RTK upsert methods 2025-02-17 09:28:13 +11:00
psychedelicious
10d9e75391 fix(ui): rtk upgrade TS issues 2025-02-17 09:28:13 +11:00
psychedelicious
b6a892a673 chore(ui): bump @reduxjs/toolkit to latest 2025-02-17 09:28:13 +11:00
psychedelicious
479d5cc362 perf(ui): isolate a lot of root-level hooks in a memoized component 2025-02-17 09:28:13 +11:00
psychedelicious
01e4fd100f perf(ui): optimized invocation node component structure 2025-02-17 09:28:13 +11:00
psychedelicious
8ecf9fb7e3 perf(ui): connection line issue on chrome 2025-02-17 09:28:13 +11:00
psychedelicious
436d5ee0c6 chore(ui): lint 2025-02-17 09:28:13 +11:00
psychedelicious
0671fec844 perf(ui): workflow editor misc
- Optimize component and hook structure for input fields to reduce rerenders of component tree
- Remove memoization on some selectors where it serves no purpose (bc the object will have a stable identity until it changes, at which point we need to re-render anyways)
- Shift the connection error selector logic around to rely more on the stable identity of pending connection objects
2025-02-17 09:28:13 +11:00
Kevin Turner
80d38c0e47 chore(docker): include fewer files while installing dependencies
including just invokeai/version seems sufficient to appease uv sync here. including everything else would invalidate the cache we're trying to establish.
2025-02-16 12:31:14 -08:00
Kevin Turner
22362350dc chore(docker): revert to keeping venv in /opt/venv 2025-02-16 11:26:06 -08:00
Kevin Turner
275d891f48 Merge branch 'main' into build/docker-dependency-layer 2025-02-16 10:34:17 -08:00
Eugene Brodsky
4dbde53f9b fix(docker): use the node22 image for the frontend build 2025-02-15 17:21:34 -05:00
psychedelicious
f6c4682b99 fix(ui): builder alpha status alert not visible when many elements added 2025-02-14 15:33:02 +11:00
psychedelicious
b3288ed64e chore: bump version to v5.7.0a1 2025-02-14 15:33:02 +11:00
psychedelicious
f3dfb1b6ea chore(ui): knip 2025-02-14 14:50:56 +11:00
psychedelicious
65a37ca4ff feat(ui): give vertical dividers a min height 2025-02-14 14:50:56 +11:00
psychedelicious
9adbe31fec tweak(ui): form element edit mode styling 2025-02-14 14:50:56 +11:00
psychedelicious
0a2925f02b feat(ui): add warning about alpha status of builder 2025-02-14 14:50:56 +11:00
psychedelicious
877dcc73c3 feat(ui): check image access for image collections when loading workflows 2025-02-14 14:50:56 +11:00
psychedelicious
aec2136323 fix(ui): force refetch when checking image access to ensure stale RTK query cache isn't used 2025-02-14 14:50:56 +11:00
psychedelicious
8ef5c54ffe feat(ui): add delete button to missing image placeholder for image collection fields 2025-02-14 14:50:56 +11:00
psychedelicious
6faed4f1ec fix(ui): remove images from node image collections when deleted 2025-02-14 14:50:56 +11:00
psychedelicious
aa71db4d31 tidy(ui): remove nonfunctional conditionals 2025-02-14 14:50:56 +11:00
psychedelicious
6407ab4a2e tweak(ui): builder padding 2025-02-14 14:50:56 +11:00
psychedelicious
a91b0f25cb feat(ui): consolidate row/column dnd draggables into container 2025-02-14 14:50:56 +11:00
psychedelicious
ef664863b5 feat(ui): remove separate flag for form vs workflow edit mode 2025-02-14 14:50:56 +11:00
psychedelicious
bf8ba1bb37 feat(ui): text and heading element default content is empty string 2025-02-14 14:50:56 +11:00
psychedelicious
54747bd521 feat(ui): remove element id from edit mode header 2025-02-14 14:50:56 +11:00
psychedelicious
d040a6953f tweak(ui): styling for edit mode 2025-02-14 14:50:56 +11:00
psychedelicious
828497cf89 feat(ui): remove node field reset button from edit mode header 2025-02-14 14:50:56 +11:00
psychedelicious
28950a4891 fix(ui): ignore dropping on self 2025-02-14 14:50:56 +11:00
psychedelicious
1c92838bf9 tidy(ui): builder dnd monitor logic rearrange 2025-02-14 14:50:56 +11:00
psychedelicious
71f6737e19 feat(ui): remove the showLabel flag for node fields 2025-02-14 14:50:56 +11:00
psychedelicious
dcac65f46b feat(ui): add initial values for builder fields 2025-02-14 14:50:56 +11:00
psychedelicious
46f549a57a feat(ui): better placeholders for text/heading 2025-02-14 14:50:56 +11:00
psychedelicious
fb93101085 tweak(ui): layout of workflow builder field settings 2025-02-14 14:50:56 +11:00
psychedelicious
9aabcfa4b8 feat(ui): default form field settings 2025-02-14 14:50:56 +11:00
psychedelicious
64587b37db refactor(ui): remove confusing containerId from various builder actions 2025-02-14 14:50:56 +11:00
psychedelicious
c673b6e11d feat(ui): demote dnd logs to trace 2025-02-14 14:50:56 +11:00
psychedelicious
a3a49ddda0 tidy(ui): useNodeFieldDnd 2025-02-14 14:50:56 +11:00
psychedelicious
330a0f0028 tidy(ui): extract util in dnd 2025-02-14 14:50:56 +11:00
psychedelicious
1104d2a00f feat(ui): initial values for form fields (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
aed802fa74 feat(ui): rearrange builder buttons to be less annoying 2025-02-14 14:50:56 +11:00
psychedelicious
498d99c828 fix(ui): handle form fields not existing on node on workflow load 2025-02-14 14:50:56 +11:00
psychedelicious
3d19b98208 chore(ui): lint 2025-02-14 14:50:56 +11:00
psychedelicious
85f5bb4a02 fix(ui): incorrect node data used during update 2025-02-14 14:50:56 +11:00
psychedelicious
269f718d2c tidy(ui): node description components 2025-02-14 14:50:56 +11:00
psychedelicious
211bb8a204 feat(ui): auto-update nodes on loading workflow 2025-02-14 14:50:56 +11:00
psychedelicious
ef0ef875dd feat(ui): migrated linear view exposed fields to builder form on load 2025-02-14 14:50:56 +11:00
psychedelicious
9c62648283 fix(ui): do not error when node/field selectors are used outside field gate components 2025-02-14 14:50:56 +11:00
psychedelicious
4ca45f7651 feat(ui): be double extra sure migrated workflows are parsed before loading 2025-02-14 14:50:56 +11:00
psychedelicious
2abe2f52f7 feat(ui): workflow builder layout 2025-02-14 14:50:56 +11:00
psychedelicious
6f1c814af4 revert(ui): code lint that broke stuff 2025-02-14 14:50:56 +11:00
psychedelicious
1ad6ccc426 tidy(ui): dnd code lint 2025-02-14 14:50:56 +11:00
psychedelicious
aedee536a0 tidy(ui): rename builder dnd file 2025-02-14 14:50:56 +11:00
psychedelicious
d2b15fba12 tidy(ui): improve dnd hook names 2025-02-14 14:50:56 +11:00
psychedelicious
a674e781a1 tidy(ui): dnd logic formatting 2025-02-14 14:50:56 +11:00
psychedelicious
0db74f0cde refactor(ui): add vars in dnd logic for conciseness 2025-02-14 14:50:56 +11:00
psychedelicious
d66db67d1a refactor(ui): clean up dnd logic 2025-02-14 14:50:56 +11:00
psychedelicious
2507a7f674 tidy(ui): rename root utils in dnd 2025-02-14 14:50:56 +11:00
psychedelicious
145503a0a0 refactor(ui): add dispatchAndFlash util for dnd 2025-02-14 14:50:56 +11:00
psychedelicious
32e8dd5647 fix(ui): divider rendering 2025-02-14 14:50:56 +11:00
psychedelicious
fe87adcb52 feat(ui): builder edit/view buttons 2025-02-14 14:50:56 +11:00
psychedelicious
e95255f6e8 feat(ui): make drop targets that are in root sticky 2025-02-14 14:50:56 +11:00
psychedelicious
efec224523 fix(ui): remove node field from form correctly when node is deleted 2025-02-14 14:50:56 +11:00
psychedelicious
e948e236e7 feat(ui): iterate on builder data structure 2025-02-14 14:50:56 +11:00
psychedelicious
189eb85663 feat(ui): delete form elements when node is deleted from workflow 2025-02-14 14:50:56 +11:00
psychedelicious
94f90f4082 feat(ui): string field settings 2025-02-14 14:50:56 +11:00
psychedelicious
1eb491fdaa feat(ui): builder empty state (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
176248a023 feat(ui): empty state for drop containers 2025-02-14 14:50:56 +11:00
psychedelicious
3c676ed11a fix(ui): drop target jank 2025-02-14 14:50:56 +11:00
psychedelicious
7a9340b850 fix(ui): tsc issues 2025-02-14 14:50:56 +11:00
psychedelicious
2c0b474f55 feat(ui): editable node form field labels & descriptions 2025-02-14 14:50:56 +11:00
psychedelicious
74c76611a9 feat(ui): add float field display settings 2025-02-14 14:50:56 +11:00
psychedelicious
1c7176b3f4 feat(ui): use useEditable in builder 2025-02-14 14:50:56 +11:00
psychedelicious
30363a0018 feat(ui): builder field settings (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
b46dbcc76d fix(ui): divider layout 2025-02-14 14:50:56 +11:00
psychedelicious
09879f4e19 feat(ui): builder field settings (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
4daa82c912 feat(ui): builder field settings (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
1cb04d9a4a refactor(ui): updated component structure for input and output fields 2025-02-14 14:50:56 +11:00
psychedelicious
3e6969128c feat(ui): remove sizes from text & heading 2025-02-14 14:50:56 +11:00
psychedelicious
e14c490ac6 fix(ui): drop indicator getting greyed out when dragging over self 2025-02-14 14:50:56 +11:00
psychedelicious
3ef3b97c58 feat(ui): editable heading and text elements 2025-02-14 14:50:56 +11:00
psychedelicious
3baaefb0cc chore(ui): bump @invoke-ai/ui-library 2025-02-14 14:50:56 +11:00
psychedelicious
98b0a8ffb2 feat(ui): plumbing for editable form elements 2025-02-14 14:50:56 +11:00
psychedelicious
4f85bf078a tidy(ui): import reactflow css in main theme provider 2025-02-14 14:50:56 +11:00
psychedelicious
f0563d41db fix(ui): circular dep 2025-02-14 14:50:56 +11:00
psychedelicious
a7a71ca935 perf(ui): faster InputFieldRenderer
Use non-zod type guards for input field types and fail early when possible
2025-02-14 14:50:56 +11:00
psychedelicious
c04822054b chore(ui): lint 2025-02-14 14:50:56 +11:00
psychedelicious
132e9bebd7 chore(ui): lint 2025-02-14 14:50:56 +11:00
psychedelicious
0dc45ac903 fix(ui): node-autoconnect showing invalid connection options 2025-02-14 14:50:56 +11:00
psychedelicious
4f9d81917c fix(ui): do not render dashed edges unless animation is enabled 2025-02-14 14:50:56 +11:00
psychedelicious
d3c22eceaf tweak(ui): node selection colors 2025-02-14 14:50:56 +11:00
psychedelicious
fb77d271ab refactor(ui): edge rendering
- Fix issues with positioning of labels
- Optimize styling to be less reliant on JS
2025-02-14 14:50:56 +11:00
psychedelicious
0371881349 chore(ui): upgrade reactflow to v12 2025-02-14 14:50:56 +11:00
psychedelicious
4b178fdeca fix(ui): hide nonfunctional delete button on root form element 2025-02-14 14:50:56 +11:00
psychedelicious
b53e36aaaa tidy(ui): remove unused mock form builder data 2025-02-14 14:50:56 +11:00
psychedelicious
c061cd5e54 fix(ui): use redux store for form 2025-02-14 14:50:56 +11:00
psychedelicious
ddda915ebd fix(ui): start workflow w/ single column as root 2025-02-14 14:50:56 +11:00
psychedelicious
9a2d8844a2 fix(ui): allow root element to be drop target 2025-02-14 14:50:56 +11:00
psychedelicious
48583df02e feat(ui): support adding form elements and node fields with dnd 2025-02-14 14:50:56 +11:00
psychedelicious
f9432d10d2 feat(ui): improved drop target styling 2025-02-14 14:50:56 +11:00
psychedelicious
0d28cd7ebe fix(ui): do not allow reparenting to self 2025-02-14 14:50:56 +11:00
psychedelicious
c9f9a2f2d4 feat(ui): dnd drop target styling 2025-02-14 14:50:56 +11:00
psychedelicious
a05d10f648 feat(ui): improved dnd hitbox for edges when center drop is allowed 2025-02-14 14:50:56 +11:00
psychedelicious
14845932fb feat(ui): dnd almost fully working (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
2aa1fc9301 feat(ui): dnd mostly working (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
98139562f3 feat(ui): dim form element while dragging 2025-02-14 14:50:56 +11:00
psychedelicious
8365bba5ba feat(ui): hacking on dnd (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
9f07e83a23 feat(ui): iterate on builder (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
1f995d0257 feat(ui): iterate on builder (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
6ae2d5ef9d feat(ui): iterate on builder (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
55973b4c66 feat(ui): iterate on builder (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
d8c6531b70 feat(ui): getPrefixedId supports custom separator 2025-02-14 14:50:56 +11:00
psychedelicious
81e385a756 feat(ui): iterate on builder (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
f6cb1a455f feat(ui): iterate on builder (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
bf60be99dc feat(ui): iterate on builder (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
bee0e8248f feat(ui): iterate on builder (WIP) 2025-02-14 14:50:56 +11:00
psychedelicious
1e658cf9e7 chore(ui): lint 2025-02-14 14:50:56 +11:00
psychedelicious
f130fa4d66 feat(ui): rough out workflow builder data structure 2025-02-14 14:50:56 +11:00
psychedelicious
02a47a6806 refactor(ui): split integer, float and string field components in prep for builder 2025-02-14 14:50:56 +11:00
psychedelicious
1063498458 revert(ui): rip out linear view config stuff 2025-02-14 14:50:56 +11:00
psychedelicious
e9a13ec882 refactor(ui): split up float and integer field renderers 2025-02-14 14:50:56 +11:00
psychedelicious
bd0765b744 feat(ui): rough out workflow builder data structure & dummy data 2025-02-14 14:50:56 +11:00
psychedelicious
6e1388f4fc fix(ui): dynamic prompts infinite recursion (wip) 2025-02-14 14:50:56 +11:00
psychedelicious
2a9f2b2fe2 feat(ui): use workflows view context 2025-02-14 14:50:56 +11:00
psychedelicious
0a6b0dc3bf feat(ui): get configurable notes display working 2025-02-14 14:50:56 +11:00
psychedelicious
8753406a6c fix(ui): color field component layout 2025-02-14 14:50:56 +11:00
psychedelicious
e2b09bed62 refactor(ui): continued reorg of components & hooks 2025-02-14 14:50:56 +11:00
psychedelicious
011910a08c refactor(ui): continued reorg of components & hooks 2025-02-14 14:50:56 +11:00
psychedelicious
bfd70be50b fix(ui): remove accidental change to zFieldInput schema 2025-02-14 14:50:56 +11:00
psychedelicious
9c53bd6a3b refactor(ui): workflows left panel internal components structure 2025-02-14 14:50:56 +11:00
psychedelicious
e479cb5fe4 refactor(ui): workflows component structure (WIP)
- Simplify and de-insane-ify component structure, hooks, selectors, etc.
- Some perf improvements by using data attributes for styling instead of dynamic CSS-in-JS.
- Add field notes and the start of linear view config; I got blocked when I ran into deeper layout issues that made it very difficult to handle field configs, so those are WIP in this commit.
2025-02-14 14:50:56 +11:00
psychedelicious
52947f40c3 perf(ui): use data attribute for input field wrapper styles 2025-02-14 14:50:56 +11:00
psychedelicious
bce9a23b25 feat(ui): add ViewContext so components can know where they are being rendered (user-linear view, editor-linear view, or editor-nodes view) 2025-02-14 14:50:56 +11:00
psychedelicious
2d05579568 feat(ui): clean up user-linear view styling 2025-02-14 14:50:56 +11:00
psychedelicious
11aabb5693 feat(ui): show notes icon on user-linear view, replacing info icon 2025-02-14 14:50:56 +11:00
psychedelicious
1e1e31d5b7 feat(ui): show notes icon on editor linear view 2025-02-14 14:50:56 +11:00
psychedelicious
fe86cf6d99 feat(ui): add notes popover to field title bar 2025-02-14 14:50:56 +11:00
psychedelicious
cfb63c1b81 feat(ui): add notes state to fields 2025-02-14 14:50:56 +11:00
Ryan Dick
b44415415a Use a default tile size of 1024 for VAE encode/decode operations in upscaling workflows. Previously, the model default was used (512 for SD1, 1024 for SDXL). Larger tile sizes help to prevent tiling artifacts. 2025-02-14 14:23:42 +11:00
psychedelicious
9353298b4f chore: bump version to v5.6.2 2025-02-14 13:13:33 +11:00
Eugene Brodsky
cf22e09b28 chore(ui): upgrade vite, vitest, and related plugins to latest versions 2025-02-14 11:09:51 +11:00
Linos
6e5ca7ece8 translationBot(ui): update translation (Vietnamese)
Currently translated at 99.8% (1753 of 1755 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 99.8% (1751 of 1753 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-02-13 19:26:55 +11:00
Thomas Bolteau
b81209e751 translationBot(ui): update translation (French)
Currently translated at 91.7% (1609 of 1753 strings)

Co-authored-by: Thomas Bolteau <thomas.bolteau50@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/fr/
Translation: InvokeAI/Web UI
2025-02-13 19:26:55 +11:00
Riccardo Giovanetti
c4040eb2f0 translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1735 of 1753 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.9% (1731 of 1749 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.9% (1731 of 1749 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.6% (1726 of 1749 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-02-13 19:26:55 +11:00
Billy Lisner
046ea611f9 Remove files 2025-02-13 19:24:01 +11:00
Billy Lisner
1439da5e88 Make ruff gods happy 2025-02-13 19:24:01 +11:00
Billy Lisner
69a504710f More detailed error messages 2025-02-13 19:24:01 +11:00
Billy Lisner
842b770938 Update OpenAI Schema 2025-02-13 19:24:01 +11:00
Billy Lisner
ba39331594 Make ruff happy 2025-02-13 19:24:01 +11:00
Billy Lisner
8ee9509eec Add Metadata Field Extractor 2025-02-13 19:24:01 +11:00
psychedelicious
7b5dcffb3f fix(ui): prevent overflow on document root 2025-02-13 09:10:38 +11:00
Mary Hipp
6927e95444 update defaults 2025-02-12 15:49:15 -05:00
Mary Hipp
76618fee9c feat(ui): separate upscaling settings so that tab does not inherit from main generation settings 2025-02-12 15:49:15 -05:00
Maxim Evtush
b51312f1ba Update model_images_common.py 2025-02-11 20:03:11 +11:00
Maxim Evtush
c2b71854be Update useGalleryNavigation.ts 2025-02-11 20:03:11 +11:00
Maxim Evtush
df793c898f Update denoise_latents.py 2025-02-11 20:03:11 +11:00
Maxim Evtush
d6181e4d64 Update useImageViewer.ts 2025-02-11 20:03:11 +11:00
Maxim Evtush
0a4ea9ac6f Update validateWorkflow.ts 2025-02-11 20:03:11 +11:00
dunkeroni
9e6f3e9338 image channel multiply node loads as RGBA now 2025-02-11 18:32:56 +11:00
psychedelicious
d3a40d85b9 Update invokeai_version.py 2025-02-11 11:36:15 +11:00
psychedelicious
b224cc8158 fix(ui): canvas image error placeholder never disappears 2025-02-11 11:10:14 +11:00
psychedelicious
b75d08a2d0 fix(ui): ensure CanvasObjectImage's visibility is set correctly when updating it 2025-02-11 11:10:14 +11:00
psychedelicious
5f1a30ea82 fix(ui): flicker when transitioning from an output image to next generation's progress image 2025-02-11 11:10:14 +11:00
psychedelicious
d09e600802 feat(ui): add more to CanvasStagingAreaModule repr 2025-02-11 11:10:14 +11:00
psychedelicious
f4ee59b92a feat(ui): add more to CanvasObjectImage repr 2025-02-11 11:10:14 +11:00
psychedelicious
ad0b40b669 feat(ui): differentiate between failure modes for canvas image rendering 2025-02-11 11:10:14 +11:00
Mary Hipp
f3fbcf0014 fix [object object] OOM error 2025-02-11 07:04:11 +11:00
Mary Hipp
588e8a0195 fix oom toast title 2025-02-11 07:04:11 +11:00
psychedelicious
c194281f4d docs: install troubleshooting 2025-02-08 10:40:04 +11:00
psychedelicious
7daff465d3 docs: remove outdated info & update other items in FAQ 2025-02-07 12:14:23 +11:00
psychedelicious
0747a5f464 docs: add link to low vram in requirements 2025-02-07 12:14:23 +11:00
psychedelicious
e7aafdfdbf feat(ui): migrate all clipboard stuff to useClipboard 2025-02-07 11:08:03 +11:00
psychedelicious
ecb38c2bae feat(ui): disallow direct access to clipboard via eslint 2025-02-07 11:08:03 +11:00
psychedelicious
d3ef94cb3e feat(ui): add useClipboard hook to safely wrap clipboard access 2025-02-07 11:08:03 +11:00
psychedelicious
eb27b437ee docs: add firefox clipboard fix 2025-02-07 11:08:03 +11:00
Mary Hipp
25bb96ed66 restore missing translation 2025-02-06 14:10:28 -05:00
psychedelicious
a9568e00a7 chore(ui): updates whats new copy 2025-02-06 13:49:57 +11:00
psychedelicious
c8a5d3bbf9 chore: bump version to v5.6.1rc1 2025-02-06 13:49:57 +11:00
psychedelicious
ed2b2868ce chore(ui): update whats new copy 2025-02-06 13:49:57 +11:00
Hosted Weblate
35de49aa01 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-02-06 13:25:57 +11:00
Riccardo Giovanetti
8bac8d3d3a translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1716 of 1734 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-02-06 13:25:57 +11:00
psychedelicious
e63bd26b19 feat(ui): tweak paste buttons 2025-02-06 12:56:21 +11:00
psychedelicious
91ded4bd15 feat(ui): tweak styling of canvas paste modal 2025-02-06 12:56:21 +11:00
psychedelicious
1656d3dd21 feat(ui): better canvas paste modal copy 2025-02-06 12:56:21 +11:00
psychedelicious
fe67dfefab fix(ui): fall back to pasting to bbox when no raster layers 2025-02-06 12:56:21 +11:00
psychedelicious
6420882a5b feat(ui): add helper text to paste modal 2025-02-06 12:56:21 +11:00
psychedelicious
568e3bd714 chore(ui): lint 2025-02-06 12:56:21 +11:00
psychedelicious
d9c2115396 feat(ui): support pasting directly to canvas 2025-02-06 12:56:21 +11:00
psychedelicious
3e13249983 test(ui): remove test for collect -> iterate validation 2025-02-06 07:57:26 +11:00
psychedelicious
2c2ee7fe20 feat(ui): allow collect -> iterate connections 2025-02-06 07:57:26 +11:00
psychedelicious
50cb27cd0b feat(nodes): support collect -> iterate node connections w/ validation 2025-02-06 07:57:26 +11:00
psychedelicious
d66cd4e81b tests(nodes): add test for collect -> iterate type validation 2025-02-06 07:57:26 +11:00
psychedelicious
8556a2558e chore(nodes): better naming for graph validation utils 2025-02-06 07:57:26 +11:00
psychedelicious
2fb35d25dd feat(nodes): field type Any accepts collections 2025-02-06 07:57:26 +11:00
psychedelicious
a8eb47769a feat(ui): improved enqueue error messages 2025-02-06 07:57:26 +11:00
psychedelicious
592e45a078 feat(nodes): improved graph validation error messages 2025-02-06 07:57:26 +11:00
psychedelicious
c5e5641f0e feat(ui): add menu items to copy canvas/bbox to clipboard 2025-02-04 23:21:20 -05:00
skunkworxdark
dfb9e300d4 typegen + Suggested changes (fix typo + remove asserts) 2025-02-04 21:37:04 +11:00
skunkworxdark
d7f80fc299 Update invokeai/app/invocations/flux_lora_loader.py
good catch

Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2025-02-04 21:37:04 +11:00
skunkworxdark
c9b1eb2d83 update to include T5EncoderField lora changes 2025-02-04 21:37:04 +11:00
skunkworxdark
13d505a621 Fix github test errors
Fix errors with typegen and py3.10 macos-default github tests
2025-02-04 21:37:04 +11:00
skunkworxdark
6674d95dae fix typegen error
fix typegen error
2025-02-04 21:37:04 +11:00
skunkworxdark
c1f5383e63 Fix typegen error
Fix typegen error
2025-02-04 21:37:04 +11:00
skunkworxdark
71690715db fix typegen
fix typegen
2025-02-04 21:37:04 +11:00
skunkworxdark
641489c2f8 fix typegen error
fix typegen
2025-02-04 21:37:04 +11:00
skunkworxdark
5f0bd2e1db Fix typegen issue
Fix typegen issue
2025-02-04 21:37:04 +11:00
skunkworxdark
98b8ab0147 LoRA Loader optional LoRA Collection
Update the LoRA loaders to make the LoRA collection optional
2025-02-04 21:37:04 +11:00
Thomas Bolteau
50bf5b7f44 translationBot(ui): update translation (French)
Currently translated at 93.1% (1588 of 1705 strings)

Co-authored-by: Thomas Bolteau <thomas.bolteau50@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/fr/
Translation: InvokeAI/Web UI
2025-02-04 16:45:12 +11:00
Riku
0184cb27c4 translationBot(ui): update translation (German)
Currently translated at 70.2% (1197 of 1705 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2025-02-04 16:45:12 +11:00
Linos
c374ab24cb translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1708 of 1708 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1705 of 1705 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-02-04 16:45:12 +11:00
Riccardo Giovanetti
6313ab6a40 translationBot(ui): update translation (Italian)
Currently translated at 99.2% (1695 of 1708 strings)

translationBot(ui): update translation (Italian)

Currently translated at 99.2% (1692 of 1705 strings)

translationBot(ui): update translation (Italian)

Currently translated at 99.2% (1691 of 1704 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-02-04 16:45:12 +11:00
Riku
a4d58aab09 feat(ui): add cancel all except current queue item functionality 2025-02-04 12:23:23 +11:00
Riku
b74fb40cbc chore(ui): update typegen schema 2025-02-04 12:23:23 +11:00
Riku
47dc954385 feat(app): add cancel all except current queue item functionality 2025-02-04 12:23:23 +11:00
psychedelicious
8fc5d3dd20 chore(nodes): bump versions of changed nodes 2025-02-04 12:06:17 +11:00
dunkeroni
6f1a198af4 better granularity on image adjust slider 2025-02-04 12:06:17 +11:00
dunkeroni
9c7bac693b fix image adjust hue handling 2025-02-04 12:06:17 +11:00
dunkeroni
8c9fc45341 add labels 2025-02-04 12:06:17 +11:00
dunkeroni
f93571f7ef update default filter 2025-02-04 12:06:17 +11:00
dunkeroni
cc27730cb4 fix: image channel invocations respect alpha 2025-02-04 12:06:17 +11:00
dunkeroni
fdf9740f3c fix: offsets to integers 2025-02-04 12:06:17 +11:00
dunkeroni
58255ab7ba add adjust image filter to canvas 2025-02-04 12:06:17 +11:00
Mary Hipp
64475b8f21 feat(ui): add button to clear model cache 2025-01-30 09:18:28 -05:00
Ryan Dick
cc9d215a9b Add endpoint for emptying the model cache. Also, adds a threading lock to the ModelCache to make it thread-safe. 2025-01-30 09:18:28 -05:00
Ryan Dick
f7315f0432 Make the default max RAM cache size more conservative. 2025-01-30 08:46:59 -05:00
Ryan Dick
285313b282 Fix T5EncoderField initialization in SD3 model loader. 2025-01-29 09:27:52 -05:00
Ryan Dick
debcbd6e2c Support FLUX OneTrainer LoRA formats (incl. DoRA) (#7590)
## Summary

This PR adds support for the FLUX LoRA model format produced by
OneTrainer.

Specifically, this PR adds:
- Support for DoRA patches
- Support for patch models that modify the FLUX T5 encoder
- Probing / loading support for OneTrainer models

## Known limitations

- DoRA patches cannot currently be applied to base weights that are
quantized with `bitsandbytes`. The DoRA algorithm requires accessing the
original model weight in order to compute the patch diff, and the
bitsandbytes quantization layers make this difficult. DoRA patches can
be applied to non-quantized and GGUF-quantized layers without issue.
- This PR results in a slight speed regression for a very particular
inference combination: quantized base model + LoRA with diffusers keys
(i.e. uses the `MergedLayerPatch`). Now that more LoRA formats are using
the `MergedLayerPatch`, it was becoming too much work to maintain this
optimization. Regression from ~1.7 it/s to ~1.4 it/s.

## Future Notes

- We may want to consider dropping support for bitsandbytes
quantization. It is very difficult to maintain compatibility across
features like partial-loading and LoRA patching.
- At a future time, we should refactor the LoRA parsing logic to be more
generalized rather than handling each format independently.
- There are some redundant device casts and dequantizations in
`autocast_linear_forward_sidecar_patches(...)` (and its sub-calls).
Optimizing this is left for future work.

## Related Issues / Discussions

- This PR should address a handful of the LoRAs reported in
https://github.com/invoke-ai/InvokeAI/issues/7131 (specifically, most of
the `envy*` LoRAs).
- This PR should address the example in
https://github.com/invoke-ai/InvokeAI/issues/6912 (though the intended
effect of that LoRA is not totally clear, so it's hard to verify with
full confidence).

## QA Instructions


OneTrainer test models:
-
https://civitai.com/models/844821/envy-flux-dark-watercolor-01?modelVersionId=945159
(DoRA, transformer only)
-
https://civitai.com/models/836757/envy-flux-digital-brush-01?modelVersionId=936167
(hada, transformer only)
- ball_flux from https://github.com/invoke-ai/InvokeAI/issues/6912
(DoRA, transformer/clip/t5)

The following tests were repeated with each of the OneTrainer test
models:

- [x] Test with non-quantized base model
- [x] Test with GGUF-quantized base model
- [x] Test with BnB-quantized base model
- [x] Test with non-quantized base model that is partially-loaded onto
the GPU

Other regression test:

- [x] Test some SD1 LoRAs
- [x] Test some SDXL LoRAs
- [x] Test a variety of existing FLUX LoRA formats
- [x] Test a FLUX Control LoRA on all base model quantization formats. 

## Merge Plan

No special instructions.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-28 12:50:52 -05:00
Ryan Dick
229834a5e8 Performance optimizations for LoRAs applied on top of GGML-quantized tensors. 2025-01-28 14:51:35 +00:00
Ryan Dick
6c919e1bca Handle DoRA layer device casting when model is partially-loaded. 2025-01-28 14:51:35 +00:00
Ryan Dick
5357d6e08e Rename ConcatenatedLoRALayer to MergedLayerPatch. And other minor cleanup. 2025-01-28 14:51:35 +00:00
Ryan Dick
7fef569e38 Update frontend graph building logic to support FLUX LoRAs that modify the T5 encoder weights. 2025-01-28 14:51:35 +00:00
Ryan Dick
e7fb435cc5 Update DoRALayer with a custom get_parameters() override that 1) applies alpha scaling to delta_v, and 2) warns if the base model is incompatible. 2025-01-28 14:51:35 +00:00
Ryan Dick
5d472ac1b8 Move quantized weight handling for patch layers up from ConcatenatedLoRALayer to CustomModuleMixin. 2025-01-28 14:51:35 +00:00
Ryan Dick
28514ba59a Update ConcatenatedLoRALayer to work with all sub-layer types. 2025-01-28 14:51:35 +00:00
Ryan Dick
5ea7953537 Update GGMLTensor with ops necessary to work with ConcatenatedLoRALayer. 2025-01-28 14:51:35 +00:00
Ryan Dick
0db6639b4b Add FLUX OneTrainer model probing. 2025-01-28 14:51:35 +00:00
Ryan Dick
b8eed2bdcb Relax lora_layers_from_flux_diffusers_grouped_state_dict(...) so that it can work with more LoRA variants (e.g. hada) 2025-01-28 14:51:35 +00:00
Ryan Dick
1054283f5c Fix bug in FLUX T5 Koyha-style LoRA key parsing. 2025-01-28 14:51:35 +00:00
Ryan Dick
f4a0b78a8d Update FLUX invocations to support LoRAs that modify the T5 text encoder. 2025-01-28 14:51:35 +00:00
Ryan Dick
409b69ee5d Fix typo in DoRALayer. 2025-01-28 14:51:35 +00:00
Ryan Dick
206f261e45 Add utils for loading FLUX OneTrainer DoRA models. 2025-01-28 14:51:35 +00:00
Ryan Dick
7eee4da896 Further updates to lora_model_from_flux_diffusers_state_dict() so that it can be re-used for OneTrainer LoRAs. 2025-01-28 14:51:35 +00:00
Ryan Dick
908976ac08 Add support for LyCoris-style LoRA keys in lora_model_from_flux_diffusers_state_dict(). Previously, it only supported PEFT-style LoRA keys. 2025-01-28 14:51:35 +00:00
Ryan Dick
dfa253e75b Add utils for working with Kohya LoRA keys. 2025-01-28 14:51:35 +00:00
Ryan Dick
4f369e3dfb First draft of DoRALayer. Not tested yet. 2025-01-28 14:51:35 +00:00
Ryan Dick
faa4fa02c0 Expand unit tests to test for confusion between FLUX LoRA formats. 2025-01-28 14:51:35 +00:00
Ryan Dick
5bd6428fdd Add is_state_dict_likely_in_flux_onetrainer_format() util function. 2025-01-28 14:51:35 +00:00
Ryan Dick
8b4f411f7b Add a test state dict for the OneTrainer DoRA format. 2025-01-28 14:51:35 +00:00
Ryan Dick
9d2f8b4ac8 Improve MaskOutput dimension consistency (#7591)
## Summary

This PR fixes an issue with mask dimension consistency. Prior to this
change, the following workflow would fail with `tuple out of range`
error:

<img width="1072" alt="image"
src="https://github.com/user-attachments/assets/d0a9e658-1d64-4db4-adee-973bbdaca745"
/>

### Before this PR

Dimension compatibility for invocations that take a mask input:
- `ApplyMaskTensorToImageInvocation`: 2 or 3
- `MaskTensorToImageInvocation`: 2 or 3
- `InvertTensorMaskInvocation`: 3

Mask dimension for invocations that produce a MaskOutput:
- `RectangleMaskInvocation`: 3
- `AlphaMaskToTensorInvocation`: 3
- `InvertTensorMaskInvocation`: 3
- `ImageMaskToTensorInvocation`: 3
- `SegmentAnythingInvocation`: 2

### After this PR (changes in bold)

Dimension compatibility for invocations that take a mask input:
- `ApplyMaskTensorToImageInvocation`: 2 or 3
- `MaskTensorToImageInvocation`: 2 or 3
- `InvertTensorMaskInvocation`: **2 or 3** <----------------

Mask dimension for invocations that produce a MaskOutput:
- `RectangleMaskInvocation`: 3
- `AlphaMaskToTensorInvocation`: 3
- `InvertTensorMaskInvocation`: 3
- `ImageMaskToTensorInvocation`: 3
- `SegmentAnythingInvocation`: **3** <-------------------


## QA Instructions

I tested the workflow in the PR description and this workflow:
<img width="872" alt="image"
src="https://github.com/user-attachments/assets/20496860-ce81-47c0-a46a-a611b73faa22"
/>


## Merge Plan

No special instructions.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-28 09:42:39 -05:00
Ryan Dick
80c3d8bc5c pnpm typegen 2025-01-28 14:30:15 +00:00
Ryan Dick
b681132da4 Update InvertTensorMaskInvocation to handle mask tensors with dim 2 or 3. 2025-01-24 22:04:46 +00:00
Ryan Dick
f60a5a5015 Update SegmentAnythingInvocation invocations to return masks with a channel dimension of size 1. This is the convention used by other nodes that produce a MaskOutput. 2025-01-24 22:04:10 +00:00
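In concrete terms, the convention change is just whether the mask carries an explicit channel axis: (H, W) before, (1, H, W) after. A tiny numpy illustration of the two shapes, plus the kind of normalization helper a consumer might use (not the actual invocation code):

```python
import numpy as np

h, w = 4, 6
mask_2d = np.zeros((h, w), dtype=bool)   # the shape SegmentAnything previously returned
mask_2d[1:3, 2:5] = True

mask_3d = mask_2d[None, ...]             # add a channel dimension of size 1 -> (1, H, W)
assert mask_3d.shape == (1, h, w)

# Downstream consumers that accept either shape can normalize like this:
def normalize_mask(mask: np.ndarray) -> np.ndarray:
    return mask if mask.ndim == 3 else mask[None, ...]

assert normalize_mask(mask_2d).shape == normalize_mask(mask_3d).shape == (1, h, w)
```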
psychedelicious
6efd108481 docs: typo in manual docs install command
Thanks to ShaneDK on discord for catching this.
2025-01-23 14:57:22 +11:00
Ryan Dick
f88c1ba0c3 Fix bug with some LoRA variants when applied to a BnB NF4 quantized model. Note the previous commit which added a unit test to trigger this bug. 2025-01-22 09:20:40 +11:00
Ryan Dick
e2f05d0800 Add unit tests for LoKR patch layers. The new tests trigger a bug when LoKR layers are applied to BnB-quantized layers (also impacts several other LoRA variant types). 2025-01-22 09:20:40 +11:00
psychedelicious
83e33a4810 chore: bump version to v5.6.0 2025-01-21 17:58:47 +11:00
psychedelicious
e635028477 chore(ui): update whats new copy 2025-01-21 17:58:47 +11:00
psychedelicious
b7b8f8a9e5 fix(nodes): remove WithMetadata from non-image-outputting node 2025-01-21 17:58:47 +11:00
psychedelicious
e926d2f24b fix(nodes): add beta classification to new inpainting support nodes 2025-01-21 17:58:47 +11:00
psychedelicious
ad8885c456 chore(ui): typegen 2025-01-21 17:45:32 +11:00
psychedelicious
cf4c79fe2e feat(nodes): add PasteImageIntoBoundingBoxInvocation 2025-01-21 17:45:32 +11:00
psychedelicious
e0edfe6c40 feat(nodes): add CropImageToBoundingBoxInvocation 2025-01-21 17:45:32 +11:00
psychedelicious
8a0a37191a feat(nodes): add GetMaskBoundingBoxInvocation 2025-01-21 17:45:32 +11:00
psychedelicious
7dbd5f150a feat(nodes): add BoundingBoxField.tuple() to get bbox as PIL tuple 2025-01-21 17:45:32 +11:00
psychedelicious
1ad65ffd53 feat(nodes): re-title "Mask from ID" -> "Mask from Segmented Image" 2025-01-21 17:45:32 +11:00
psychedelicious
14b5c871dc feat(nodes): simplify MaskFromIDInvocation 2025-01-21 17:45:32 +11:00
psychedelicious
8d2b4e2bf5 feat(nodes): support FLUX, SD3 in ideal_size 2025-01-21 17:45:32 +11:00
psychedelicious
aba70eacab fix(ui): field handle positioning for non-batch fields
Accidentally overwrote some reactflow styles which caused field handles to be positioned differently for non-batch fields. Just a minor visual issue.
2025-01-21 11:49:49 +11:00
Riccardo Giovanetti
4b67175b1b translationBot(ui): update translation (Italian)
Currently translated at 99.1% (1690 of 1704 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-01-21 09:12:45 +11:00
Hosted Weblate
e3423d1ba8 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-01-21 09:12:45 +11:00
Linos
87fb00ff5d translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1697 of 1697 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 99.2% (1684 of 1697 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 99.7% (1676 of 1681 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 99.3% (1670 of 1681 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 99.5% (1658 of 1666 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1652 of 1652 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-01-21 09:12:45 +11:00
Riccardo Giovanetti
d99a9ffb72 translationBot(ui): update translation (Italian)
Currently translated at 99.3% (1642 of 1652 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-01-21 09:12:45 +11:00
Hosted Weblate
7964f438dc translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-01-21 09:12:45 +11:00
Linos
b130a3a9ee translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1652 of 1652 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-01-21 09:12:45 +11:00
Riccardo Giovanetti
a6b32160b2 translationBot(ui): update translation (Italian)
Currently translated at 99.3% (1642 of 1652 strings)

translationBot(ui): update translation (Italian)

Currently translated at 99.3% (1641 of 1652 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-01-21 09:12:45 +11:00
psychedelicious
7d110cc9d3 fix(ui): disable dynamic prompts generators pending resolution of infinite recursion issue
Dynamic prompts string generators can cause an infinite feedback loop when added to the linear view.

The root cause is how these generators handle "resolving" their collections. They hit the dynamic prompts HTTP API within the view component to get the prompts, then set the batch node's internal state with those values.

When the same generator is rendered in both the node editor view and linear view and the timing is just right, that state update causes an infinite feedback loop between the two components as they respond to the state updates from the other component.

The other generators never store the generated values in the batch node's internal state. The values are "resolved" just-in-time as they are needed.

To fix this, the batch value "resolver" utilities could be made async and hit the API. But there's a problem - the resolver utilities are used within the "are we ready to invoke? are there any problems with the current settings?" redux selectors, which are strictly synchronous. To fix that, we can refactor that "are we ready to invoke?" logic to not use redux selectors, so the whole thing could be async.

It's not a big change but I'm not going to spend time on it at the moment.

So, until I address this, the dynamic prompts generators are disabled.
2025-01-21 09:00:40 +11:00
psychedelicious
82122645e8 refactor(ui): organize special handling for batch field types 2025-01-21 07:17:29 +11:00
psychedelicious
f5c5b73383 fix(ui): string batch nodes' inputs get batch type 2025-01-21 07:17:29 +11:00
psychedelicious
2b2ec67cd6 fix(nodes): allow connection input on string batch nodes 2025-01-21 07:17:29 +11:00
Ryan Dick
66bc225bd3 Add a troubleshooting instructions for the Windows page file issue to the Low-VRAM docs. 2025-01-20 08:58:41 +11:00
psychedelicious
7535d2e188 feat(ui): use translation for load from file buttons 2025-01-20 08:57:42 +11:00
psychedelicious
3dff87aeee feat(ui): better layout for generator load from file buttons 2025-01-20 08:57:42 +11:00
psychedelicious
b14bf1e0f4 chore(ui): lint 2025-01-20 08:57:42 +11:00
psychedelicious
4fdc6eec9d feat(ui): support loading from file for string input generators 2025-01-20 08:57:42 +11:00
psychedelicious
180a67d11b feat(ui): small fontsize on generator textareas 2025-01-20 08:57:42 +11:00
psychedelicious
ec816d3c04 feat(ui): improved dynamicprompts generator
- Split into two (random and combinatorial) - lots of fiddly logic to do both in one generator.
- Update to support seeds for random.
2025-01-20 08:57:42 +11:00
psychedelicious
7dcc2dafbc chore(ui): typegen 2025-01-20 08:57:42 +11:00
psychedelicious
81da5210f0 feat(api): add seed field to dynamicprompts 2025-01-20 08:57:42 +11:00
psychedelicious
eb976a2ab0 feat(ui): add dynamic prompts string generator (WIP) 2025-01-20 08:57:42 +11:00
psychedelicious
724028d974 feat(ui): port improved string parsing logic from string generator to float & int 2025-01-20 08:57:42 +11:00
psychedelicious
43c98fd99e feat(ui): add string generator 2025-01-20 08:57:42 +11:00
psychedelicious
526d64a5e2 feat(nodes): add string generator 2025-01-20 08:57:42 +11:00
psychedelicious
58c6c6db53 feat(ui): make string collection component same as number collection
Same UI & better perf thanks to a different structure.
2025-01-20 08:57:42 +11:00
Kevin Turner
3848e1926b chore(docker): reduce size between docker builds
by adding a layer with all the pytorch dependencies that don't change most of the time.
2025-01-18 09:10:54 -08:00
psychedelicious
8a41e09de3 feat(ui): seeded random generators
- Add JS Mersenne Twister implementation dependency to use as seeded PRNG. This is not a cryptographically secure algorithm.
- Add nullish seed field to float and integer random generators.
- Add UI to control the seed.
- When seed is not set, behaviour is unchanged - the values are randomized when you Invoke. When seed is set, the random distribution is deterministic depending on the seed. In this case, we can display the values to the user.
2025-01-18 08:45:56 +11:00
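The frontend uses a JS Mersenne Twister for this, but the seed semantics described above are easy to illustrate in a few lines of Python. This is a conceptual sketch only (the function name and signature are invented for illustration), not the UI code:

```python
import random

def uniform_random_values(count: int, lo: float, hi: float, seed: int | None = None) -> list[float]:
    """Generate `count` values in [lo, hi); deterministic iff a seed is provided."""
    rng = random.Random(seed) if seed is not None else random.Random()
    return [rng.uniform(lo, hi) for _ in range(count)]

# Seeded: the same values every time, so a preview can be shown before invoking.
assert uniform_random_values(3, 0, 1, seed=42) == uniform_random_values(3, 0, 1, seed=42)

# Unseeded: values differ between calls, so there is nothing stable to preview.
print(uniform_random_values(3, 0, 1))
```

This is also why the UI only shows a preview for seeded generators: without a seed there is no stable sequence to display before Invoke.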
psychedelicious
c24eae1968 chore: bump version to v5.6.0rc4 2025-01-17 16:29:20 +11:00
psychedelicious
a6b207a0d9 fix(ui): string field textarea accidentally readonly 2025-01-17 16:17:13 +11:00
psychedelicious
eea5ecdd69 Update invokeai_version.py 2025-01-17 13:15:20 +11:00
psychedelicious
50de54dcfd chore(ui): lint 2025-01-17 12:48:58 +11:00
psychedelicious
04b893f982 chore(ui): typegen 2025-01-17 12:48:58 +11:00
psychedelicious
4c655eeb48 chore(ui): lint 2025-01-17 12:48:58 +11:00
psychedelicious
298abab883 feat(ui): improved generator text area styling 2025-01-17 12:48:58 +11:00
psychedelicious
bd477ded2e feat(ui): better preview for generators 2025-01-17 12:48:58 +11:00
psychedelicious
0b64d21980 tidy(ui): remove extraneous reset button on generators 2025-01-17 12:48:58 +11:00
psychedelicious
91d5f8537d feat(ui): add integer & float parse string generators 2025-01-17 12:48:58 +11:00
psychedelicious
e498e1f07c feat(ui): reworked float/int generators (arithmetic sequence, linear dist, uniform rand dist) 2025-01-17 12:48:58 +11:00
psychedelicious
73a3f195dc fix(ui): remove nonfunctional button 2025-01-17 12:48:58 +11:00
psychedelicious
8cc790a030 fix(ui): batch size calculations 2025-01-17 12:48:58 +11:00
psychedelicious
57265c8869 feat(ui): rip out generator modal functionality 2025-01-17 12:48:58 +11:00
psychedelicious
66d08eaa1c fix(ui): translation for generators 2025-01-17 12:48:58 +11:00
psychedelicious
d69e90ca5e feat(ui): support integer generators 2025-01-17 12:48:58 +11:00
psychedelicious
f345fde512 fix(ui): use utils to get default float generator values 2025-01-17 12:48:58 +11:00
psychedelicious
508c702289 feat(nodes): remove default values for generator; let UI handle it 2025-01-17 12:48:58 +11:00
psychedelicious
8fbd2f9a97 feat(nodes): add integer generator nodes 2025-01-17 12:48:58 +11:00
psychedelicious
bfb26af36a chore(ui): lint 2025-01-17 12:48:58 +11:00
psychedelicious
4400bc69f2 feat(ui): don't show generator preview for random generators 2025-01-17 12:48:58 +11:00
psychedelicious
10f2c0dc9a feat(ui): support generator nodes (wip)
- Add `batch` property to field type object to differentiate between executable nodes and batch/generator nodes.
- Support for float generators
2025-01-17 12:48:58 +11:00
psychedelicious
5b0326fc49 chore(ui): typegen 2025-01-17 12:48:58 +11:00
psychedelicious
2f9a0a250d feat(nodes): generators as nodes 2025-01-17 12:48:58 +11:00
psychedelicious
5d03328dc6 tidy(nodes): code dedupe for batch node init errors 2025-01-17 12:48:58 +11:00
psychedelicious
1fb32aec28 tidy(nodes): move batch nodes to own file 2025-01-17 12:48:58 +11:00
psychedelicious
2bbcd42036 chore(ui): knip 2025-01-17 12:34:54 +11:00
psychedelicious
2f40f7bafd tweak(ui): error verbiage for collection size mismatch 2025-01-17 12:34:54 +11:00
psychedelicious
65dd01bf3a fix(ui): invoke tooltip for invalid/empty batches 2025-01-17 12:34:54 +11:00
psychedelicious
81fc525f8a chore(ui): lint 2025-01-17 12:34:54 +11:00
psychedelicious
d2dd5ee408 fix(ui): unclosed JSX tag 2025-01-17 12:34:54 +11:00
psychedelicious
b4b1daeb26 feat(ui): validate all batch nodes have connection 2025-01-17 12:34:54 +11:00
psychedelicious
90c4c10e14 feat(ui): show batch group in node title 2025-01-17 12:34:54 +11:00
psychedelicious
30e33d30d5 fix(ui): handle batch group ids of "None" correctly 2025-01-17 12:34:54 +11:00
psychedelicious
3df3be6c34 tweak(ui): enum field selects have size="sm" 2025-01-17 12:34:54 +11:00
psychedelicious
4e917bf2b2 chore(ui): typegen 2025-01-17 12:34:54 +11:00
psychedelicious
26e6e28a13 feat(nodes): add title for batch_group_id field 2025-01-17 12:34:54 +11:00
psychedelicious
f9cee42a06 tweak(ui): node editor layout padding 2025-01-17 12:34:54 +11:00
psychedelicious
1b8da023b8 chore(ui): typegen 2025-01-17 12:34:54 +11:00
psychedelicious
05f1026812 feat(nodes): batch_group_id is a literal of options 2025-01-17 12:34:54 +11:00
psychedelicious
ca1bd254ea feat(ui): rename "link_id" -> "batch_group_id" 2025-01-17 12:34:54 +11:00
psychedelicious
29645326b9 chore(ui): typegen 2025-01-17 12:34:54 +11:00
psychedelicious
c23a2abc82 feat(nodes): rename "link_id" -> "batch_group_id" 2025-01-17 12:34:54 +11:00
psychedelicious
803ec8e904 feat(ui): add zipped batch collection size validation 2025-01-17 12:34:54 +11:00
psychedelicious
0abc0be931 fix(ui): allow batch nodes without link id (i.e. product batch nodes) to have mismatched collection sizes 2025-01-17 12:34:54 +11:00
psychedelicious
edff16124f feat(ui): support zipped batch nodes 2025-01-17 12:34:54 +11:00
psychedelicious
2e4110a29a chore(ui): typegen 2025-01-17 12:34:54 +11:00
psychedelicious
7ee51f3e14 feat(nodes): add link_id field to batch nodes
This is used to link batch nodes into zipped batch data collections.
2025-01-17 12:34:54 +11:00
psychedelicious
8ae75dbc35 chore(ui): typegen 2025-01-17 12:34:54 +11:00
psychedelicious
9265716b07 chore(ui): lint 2025-01-17 12:19:04 +11:00
psychedelicious
27b9c07711 chore(ui): typegen 2025-01-17 12:19:04 +11:00
psychedelicious
9dcbe3cc8f tweak(ui): number collection styling 2025-01-17 12:19:04 +11:00
psychedelicious
30165f66c3 feat(ui): string collection batch items are input not textarea 2025-01-17 12:19:04 +11:00
psychedelicious
deb70edc75 fix(ui): translation key 2025-01-17 12:19:04 +11:00
psychedelicious
d82d990b23 feat(ui): add number range generators 2025-01-17 12:19:04 +11:00
psychedelicious
2c64b60d32 Revert "feat(ui): rough out number generators for number collection fields"
This reverts commit 41cc6f1f96bca2a51727f21bd727ca48eab669bc.
2025-01-17 12:19:04 +11:00
psychedelicious
4e8c6d931d Revert "feat(ui): number collection generator supports floats"
This reverts commit 9da3339b513de9575ffbf6ce880b3097217b199d.
2025-01-17 12:19:04 +11:00
psychedelicious
9049e6e0f3 Revert "feat(ui): more batch generator stuff"
This reverts commit 111a29c7b4fc6b5062a0a37ce704a6508ff58dd8.
2025-01-17 12:19:04 +11:00
psychedelicious
3cb5f8536b feat(ui): more batch generator stuff 2025-01-17 12:19:04 +11:00
psychedelicious
38e50cc7aa tidy(ui): abstract out batch detection logic 2025-01-17 12:19:04 +11:00
psychedelicious
5bff6123b9 feat(nodes): add default value for batch nodes 2025-01-17 12:19:04 +11:00
psychedelicious
d63ff560d6 feat(ui): number collection generator supports floats 2025-01-17 12:19:04 +11:00
psychedelicious
acceac8304 fix(ui): do not set number collection field to undefined when removing last item 2025-01-17 12:19:04 +11:00
psychedelicious
96671d12bd fix(ui): filter out batch nodes when checking readiness on workflows tab 2025-01-17 12:19:04 +11:00
psychedelicious
584601d03f perf(ui): memoize selector in workflows 2025-01-17 12:19:04 +11:00
psychedelicious
b1c4ec0888 feat(ui): rough out number generators for number collection fields 2025-01-17 12:19:04 +11:00
psychedelicious
db5f016826 fix(nodes): allow batch datum items to mix ints and floats
Unfortunately we cannot do strict floats or ints.

The batch data models don't specify the value types, it instead relies on pydantic parsing. JSON doesn't differentiate between float and int, so a float `1.0` gets parsed as `1` in python.

As a result, we _must_ accept mixed floats and ints for BatchDatum.items.

Tests and validation updated to handle this.

Maybe we should update the BatchDatum model to have a `type` field? Then we could parse as float or int, depending on the inputs...
2025-01-17 12:19:04 +11:00
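Given that constraint, item validation can only assert "all strings" or "all numbers" and must tolerate ints and floats side by side. A hypothetical sketch of such a check (names invented for illustration; not the actual BatchDatum validators):

```python
def validate_batch_items(items: list) -> None:
    """Allow a homogeneous list of strings, or a list of numbers that may mix ints and floats."""
    if all(isinstance(i, str) for i in items):
        return
    if all(isinstance(i, (int, float)) and not isinstance(i, bool) for i in items):
        return  # 1 and 1.0 are both fine here; JSON round-trips can blur the distinction
    raise ValueError("Batch items must be all strings or all numbers")

validate_batch_items(["a", "b"])      # ok
validate_batch_items([1, 2.5, 3])     # ok: mixed ints and floats are accepted
try:
    validate_batch_items(["a", 1])
except ValueError as e:
    print(e)
```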
psychedelicious
c1fd28472d fix(ui): float batch data creation 2025-01-17 12:19:04 +11:00
psychedelicious
0c5958675a chore(ui): lint 2025-01-17 12:19:04 +11:00
psychedelicious
912e07f2c8 tidy(ui): use zod typeguard builder util for fields 2025-01-17 12:19:04 +11:00
psychedelicious
f853b24868 chore(ui): typegen 2025-01-17 12:19:04 +11:00
psychedelicious
4f900b22dc feat(ui): validate number item multipleOf 2025-01-17 12:19:04 +11:00
psychedelicious
5823532941 feat(ui): validate string item lengths 2025-01-17 12:19:04 +11:00
psychedelicious
bfe6d98cba feat(ui): support float batches 2025-01-17 12:19:04 +11:00
psychedelicious
c26b3cd54f refactor(ui): abstract out helper to add batch data 2025-01-17 12:19:04 +11:00
psychedelicious
c012d832d2 fix(ui): typo 2025-01-17 12:19:04 +11:00
psychedelicious
9d11d2aabd refactor(ui): abstract out field validators 2025-01-17 12:19:04 +11:00
psychedelicious
a5f1587ce7 feat(ui): add template validation for integer collection items 2025-01-17 12:19:04 +11:00
psychedelicious
0b26bb1ca3 feat(ui): add template validation for string collection items 2025-01-17 12:19:04 +11:00
psychedelicious
0f1e632117 feat(nodes): add float batch node 2025-01-17 12:19:04 +11:00
psychedelicious
b212332b3e feat(ui): support integer batches 2025-01-17 12:19:04 +11:00
psychedelicious
90a91ff438 feat(nodes): add integer batch node 2025-01-17 12:19:04 +11:00
psychedelicious
b52b271dc4 feat(ui): support string batches 2025-01-17 12:19:04 +11:00
psychedelicious
e077fe8046 refactor(ui): streamline image field collection input logic, support multiple images w/ same name in collection 2025-01-17 12:19:04 +11:00
psychedelicious
368957b208 tweak(ui): image field collection input component styling 2025-01-17 12:19:04 +11:00
psychedelicious
27277e1fd6 docs(ui): improved comments for image batch node special handling 2025-01-17 12:19:04 +11:00
psychedelicious
236c0d89e7 feat(nodes): add string batch node 2025-01-17 12:19:04 +11:00
psychedelicious
b807170701 fix(ui): typo in error message for image collection fields 2025-01-17 12:19:04 +11:00
Ryan Dick
c5d2de3169 Revise the default logic for the model cache RAM limit (#7566)
## Summary

This PR revises the logic for calculating the model cache RAM limit. See
the code for thorough documentation of the change.

The updated logic is more conservative in the amount of RAM that it will
use. This will likely be a better default for more users. Of course,
users can still choose to set a more aggressive limit by overriding the
logic with `max_cache_ram_gb`.

## Related Issues / Discussions

- Should help with https://github.com/invoke-ai/InvokeAI/issues/7563

## QA Instructions

Exercise all heuristics:
- [x] Heuristic 1
- [x] Heuristic 2
- [x] Heuristic 3
- [x] Heuristic 4

## Merge Plan

- [x] Merge https://github.com/invoke-ai/InvokeAI/pull/7565 first and
update the target branch

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-16 19:59:14 -05:00
Ryan Dick
f7511bfd94 Add keep_ram_copy_of_weights config option (#7565)
## Summary

This PR adds a `keep_ram_copy_of_weights` config option. The default (and legacy) behavior is `true`. The tradeoffs for this setting are as follows:
- `keep_ram_copy_of_weights: true`: Faster model switching and LoRA
patching.
- `keep_ram_copy_of_weights: false`: Lower average RAM load (may not
help significantly with peak RAM).

## Related Issues / Discussions

- Helps with https://github.com/invoke-ai/InvokeAI/issues/7563
- The Low-VRAM docs are updated to include this feature in
https://github.com/invoke-ai/InvokeAI/pull/7566

## QA Instructions

- Test with `enable_partial_load: false` and `keep_ram_copy_of_weights:
false`.
  - [x] RAM usage when model is loaded is reduced.
  - [x] Model loading / unloading works as expected.
  - [x] LoRA patching still works.
- Test with `enable_partial_load: false` and `keep_ram_copy_of_weights:
true`.
  - [x] Behavior should be unchanged.
- Test with `enable_partial_load: true` and `keep_ram_copy_of_weights:
false`.
  - [x] RAM usage when model is loaded is reduced.
  - [x] Model loading / unloading works as expected.
  - [x] LoRA patching still works.
- Test with `enable_partial_load: true` and `keep_ram_copy_of_weights:
true`.
  - [x] Behavior should be unchanged.

- [x] Smoke test CPU-only and MPS with default configs.

## Merge Plan

- [x] Merge https://github.com/invoke-ai/InvokeAI/pull/7564 first and
change target branch.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-16 19:57:02 -05:00
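A rough sketch of the tradeoff described above, assuming a simplified cache wrapper (names and structure are illustrative, not the actual `CachedModelWithPartialLoad` / `CachedModelOnlyFullLoad` API):

```python
# Illustrative sketch - not the real cached-model classes. Keeping a CPU
# copy of the weights costs RAM but makes unloading from VRAM (and undoing
# LoRA patches) a cheap dict assignment instead of a device-to-device copy.
import torch


class CachedModelSketch:
    def __init__(self, model: torch.nn.Module, keep_ram_copy: bool):
        self._model = model
        self._cpu_state_dict = (
            {k: v.detach().cpu() for k, v in model.state_dict().items()}
            if keep_ram_copy
            else None
        )

    def unload_from_vram(self) -> None:
        if self._cpu_state_dict is not None:
            # Fast path: reuse the CPU copy we kept around.
            self._model.load_state_dict(self._cpu_state_dict, assign=True)
        else:
            # Slow path: copy every parameter back from the GPU.
            self._model.to("cpu")
```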
Ryan Dick
0abb5ea114 Reduce peak memory during FLUX model load (#7564)
## Summary

Prior to this change, there were several cases where we initialized the
weights of a FLUX model before loading its state dict (and, to make
things worse, in some cases the weights were in float32). This PR fixes
a handful of these cases. (I think I found all instances for the FLUX
family of models.)

## Related Issues / Discussions

- Helps with https://github.com/invoke-ai/InvokeAI/issues/7563

## QA Instructions

I tested that model loading still works and that there is no
virtual memory reservation on model initialization for the following
models:
- [x] FLUX VAE
- [x] Full T5 Encoder
- [x] Full FLUX checkpoint
- [x] GGUF FLUX checkpoint

## Merge Plan

No special instructions.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-16 18:47:17 -05:00
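A short sketch of the pattern this PR applies, assuming a stand-in module (the real change targets the FLUX VAE, T5 encoder, and transformer):

```python
# Sketch of the meta-device pattern described above. Constructing the module
# on the "meta" device allocates no real memory; load_state_dict(assign=True)
# then materializes it directly from the checkpoint tensors, so the weights
# are never held twice (and never initialized in float32 only to be discarded).
import torch


def load_without_init_spike(state_dict: dict[str, torch.Tensor]) -> torch.nn.Module:
    with torch.device("meta"):
        model = torch.nn.Linear(4096, 4096)  # stand-in for a real FLUX submodule
    model.load_state_dict(state_dict, assign=True)
    return model
```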
Ryan Dick
ce57c4ed2e Update the Low-VRAM docs. 2025-01-16 23:46:07 +00:00
Ryan Dick
0cf51cefe8 Revise the logic for calculating the RAM model cache limit. 2025-01-16 23:46:07 +00:00
Ryan Dick
e5e848d239 Update config docstring. 2025-01-16 22:34:23 +00:00
Ryan Dick
da589b3f1f Memory optimization to load state dicts one module at a time in CachedModelWithPartialLoad when we are not storing a CPU copy of the state dict (i.e. when keep_ram_copy_of_weights=False). 2025-01-16 17:00:33 +00:00
Ryan Dick
36a3869af0 Add keep_ram_copy_of_weights config option. 2025-01-16 15:35:25 +00:00
Ryan Dick
c76d08d1fd Add keep_ram_copy option to CachedModelOnlyFullLoad. 2025-01-16 15:08:23 +00:00
Ryan Dick
04087c38ce Add keep_ram_copy option to CachedModelWithPartialLoad. 2025-01-16 14:51:44 +00:00
Ryan Dick
b2bb359d47 Update the model loading logic for several of the large FLUX-related models to ensure that the model is initialized on the meta device prior to loading the state dict into it. This helps to keep peak memory down. 2025-01-16 02:30:28 +00:00
Mary Hipp
b57aa06d9e take out AbortController logic and simplify dependencies 2025-01-16 09:39:32 +11:00
Mary Hipp
f856246c36 try removing abortcontroller 2025-01-16 09:39:32 +11:00
Mary Hipp
195df2ebe6 remove logic changes, keep logging 2025-01-16 09:39:32 +11:00
Mary Hipp
7b5cef6bd7 lint fix 2025-01-16 09:39:32 +11:00
Mary Hipp
69e7ffaaf5 add logging, remove deps 2025-01-16 09:39:32 +11:00
psychedelicious
993401ad6c fix(ui): hide layer when previewing filter
Previously, when previewing a filter on a layer with some transparency or a filter that changes the alpha, the preview was rendered on top of the layer. The preview blended with the layer, which isn't right.

In this change, the layer is hidden during the preview, and when the filter finishes (having been applied or canceled - the two possible paths), the layer is shown.

Technically, we are hiding and showing the layer's object renderer's konva group, which contains the layer's "real" data.

Another small change was made to prevent a flash of empty layer, by waiting to destroy a previous filter preview image until the new preview image is ready to display.
2025-01-16 09:27:36 +11:00
psychedelicious
8d570dcffc chore(ui): typegen 2025-01-16 09:27:36 +11:00
psychedelicious
3f70e947fd chore: ruff 2025-01-16 09:27:36 +11:00
dunkeroni
157290bef4 add: size option for image noise node and filter 2025-01-16 09:27:36 +11:00
dunkeroni
b7389da89b add: Noise filter on Canvas 2025-01-16 09:27:36 +11:00
dunkeroni
254b89b1f5 add: Blur filter option on canvas 2025-01-16 09:27:36 +11:00
dunkeroni
2b122d7882 add: image noise invocation 2025-01-16 09:27:36 +11:00
dunkeroni
ded9213eb4 trim blur splitting logic 2025-01-16 09:27:36 +11:00
dunkeroni
9d51eb49cd fix: ImageBlurInvocation handles transparency now 2025-01-16 09:27:36 +11:00
dunkeroni
0a6e22bc9e fix: ImagePasteInvocation respects transparency 2025-01-16 09:27:36 +11:00
Ryan Dick
b301785dc8 Normalize the T5 model identifiers so that a FLUX T5 or an SD3 T5 model can be used interchangeably. 2025-01-16 08:33:58 +11:00
psychedelicious
edcdff4f78 fix(ui): round rects when applying transform
Due to the limited floating point precision, and konva's `scale` properties, it is possible for the relative rect of an object to have non-integer coordinates and dimensions.

When we go to rasterize and otherwise export images, the HTML canvas API truncates these numbers.

So, we can end up with situations where the relative width and height of a layer are very close to the "real" value, but slightly off.

For example, width and height might be 512px, but the relative rect is calculated to be something like 512.000000003 or 511.9999999997.

In the first case, the truncation results in 512x512 for the dimensions - which is correct. But in the second case, it results in 511x511!

One place where this causes issues is the image action `New Canvas from image -> As Raster Layer (resize)`. For certain input image sizes, this results in an incorrectly resized image. For example, a 1496x1946 input image is resized to 511x511 pixels when the bbox is 512x512.

To fix this, we can round both coords and dimensions of rects when rasterizing.

I've thought through the implications and done some testing. I believe this change will not cause any regressions and only fix edge cases. But, it's possible that something was inadvertently relying on the old behavior.
2025-01-16 01:17:30 +11:00
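The arithmetic behind the fix, as a tiny illustration (the actual change lives in the canvas TypeScript code):

```python
# Truncation vs rounding for a "nearly 512" relative rect dimension.
width = 511.9999999997

print(int(width))    # 511 - what canvas-style truncation produces
print(round(width))  # 512 - what the rasterizer actually intended
```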
psychedelicious
66e04ea7ab fix(ui): sticky preset image tooltip
There's a bug where preset image tooltips get stuck open in the list.

After much fiddling, debugging, and review of upstream dependencies, I have determined that this is a bug in Chakra-UI v2.

Specifically, it appears to be a race condition related to the Tooltip component's internal use of the `useDisclosure` hook to manage tooltip open state, and the react render cycle.

Unfortunately, Chakra v2 is no longer being updated, and it's a pain in the butt to vendor and fix that component given its dependencies. Not 100% sure I could easily fix it, anyways.

Fortunately, there is a workaround - reduce the tooltip openDelay to 0ms. I prefer the current 500ms delay but I think it's preferable to have too-quick tooltips than too-sticky tooltips...
2025-01-15 09:12:46 -05:00
Ryan Dick
497bc916cc Add unet_config to get_scheduler(...) call in TiledMultiDiffusionDenoiseLatents. 2025-01-15 08:44:08 -05:00
dunkeroni
ebe1873712 fix: only add prediction type if it exists 2025-01-15 08:44:08 -05:00
dunkeroni
59926c320c support v-prediction in denoise_latents.py 2025-01-15 08:44:08 -05:00
Mary Hipp
2d3e2f1907 use window instead of document 2025-01-14 20:01:08 -05:00
psychedelicious
d88b59c5c4 Revert "feat(ui): rearrange canvas paste back nodes to save an image step"
This reverts commit 7cdda00a54.
2025-01-10 15:59:29 +11:00
Simon Fuhrmann
1c7adb5c70 Update communityNodes.md - Fix broken image
The image under https://invoke-ai.github.io/InvokeAI/nodes/communityNodes/#stereogram-nodes is broken. Changing img src to fix.
2025-01-09 07:29:02 -05:00
psychedelicious
8da9d3bc19 chore: bump version to v5.6.0rc2 2025-01-09 14:12:46 +11:00
psychedelicious
d9c099bd3a docs: fix incorrect macOS launcher fix command 2025-01-09 11:26:59 +11:00
psychedelicious
a329588e5a feat: add link to low vram guide to OOM toast (local only)
Needed to do a bit of refactoring to support this. Overall, the error toast components are easier to understand now.
2025-01-09 11:20:05 +11:00
psychedelicious
e09cf64779 feat: more updates to first run view 2025-01-09 11:20:05 +11:00
psychedelicious
fc8cf224ca docs: typo 2025-01-09 11:20:05 +11:00
psychedelicious
3e1ed18a1f Update docs/features/low-vram.md
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
2025-01-09 11:20:05 +11:00
psychedelicious
9a84c85486 docs: add section about disabling the sysmem fallback 2025-01-09 11:20:05 +11:00
psychedelicious
e6deaa2d2f feat(ui): minor layout tweaks for first run screen 2025-01-09 11:20:05 +11:00
psychedelicious
5246b31347 feat(ui): add low vram link to first run page 2025-01-09 11:20:05 +11:00
psychedelicious
b15dd00840 docs: add docs for low vram mode 2025-01-09 11:20:05 +11:00
psychedelicious
8808c36028 docs: update example yaml file 2025-01-09 11:20:05 +11:00
psychedelicious
89b576f10d fix(ui): prevent canvas & main panel content from scrolling
Hopefully fixes issues where, when run via the launcher, the main panel kinda just scrolls out of bounds.
2025-01-09 09:14:22 +11:00
psychedelicious
d7893a52c3 tweak(ui): whats new copy 2025-01-08 15:26:26 +11:00
Mary Hipp
b9c45c3232 Whats new update 2025-01-08 15:26:26 +11:00
David Burnett
afc9d3b98f more ruff formatting 2025-01-07 20:18:19 -05:00
David Burnett
7ddc757bdb ruff format changes 2025-01-07 20:18:19 -05:00
David Burnett
d8da9b45cc Fix for DEIS / DPM clash 2025-01-07 20:18:19 -05:00
Ryan Dick
607d19f4dd We should not trust the value of `model.device` since the model could be partially-loaded. 2025-01-07 19:22:31 -05:00
psychedelicious
32286f321c docs: note that version is not req for editable install 2025-01-07 17:17:40 -05:00
psychedelicious
03f7bdc9f9 docs: fix manual install rocm pypi indices 2025-01-07 17:17:40 -05:00
Ryan Dick
4df3d0861b Deprecate ram/vram configs for smoother migration path to dynamic limits (#7526)
## Summary

Changes:
- Deprecate `ram` and `vram` configs. If these are set in invokeai.yaml,
they will be ignored.
- Create new `max_cache_ram_gb` and `max_cache_vram_gb` configs with the
same definitions as the old configs.

The main motivation of this change is to make the migration path
smoother for users who had previously added `ram` /`vram` to their
config files. Now, these users will be automatically migrated into the
new dynamic limit behavior (which is better in most cases). These users
will have to manually re-add `max_cache_ram_gb` and `max_cache_vram_gb`
to their configs if they wish to go back to specifying manual limits.

## Related Issues / Discussions

See the release notes for RC v5.6.0rc1 for the old migration behavior
that we are trying to improve:
https://github.com/invoke-ai/InvokeAI/releases/tag/v5.6.0rc1

## QA Instructions

- [x] Test that if `ram` or `vram` are present in a user's
`invokeai.yaml`, these values are ignored.
- [x] Test that `max_cache_ram_gb` and `max_cache_vram_gb` are applied,
if set.

## Merge Plan

- Don't forget to update the RC release notes accordingly.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-07 17:03:11 -05:00
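A sketch of the precedence this PR describes (illustrative only, not the actual config code):

```python
# Illustrative precedence: legacy `ram`/`vram` are ignored with a warning,
# `max_cache_ram_gb` wins when set, otherwise the dynamic limit is used.
from typing import Optional


def effective_ram_limit_gb(
    legacy_ram: Optional[float],
    max_cache_ram_gb: Optional[float],
    dynamic_limit_gb: float,
) -> float:
    if legacy_ram is not None:
        print("warning: `ram` is deprecated and ignored; use `max_cache_ram_gb`")
    return max_cache_ram_gb if max_cache_ram_gb is not None else dynamic_limit_gb
```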
Ryan Dick
974b4671b1 Deprecate the ram and vram configs to make the migration to dynamic
memory limits smoother for users who had previously overriden these
values.
2025-01-07 16:45:29 +00:00
Ryan Dick
6b18f270dd Bugfix: Offload of GGML-quantized model in torch.inference_mode() cm (#7525)
## Summary

This PR contains a bugfix for an edge case with model unloading (from
VRAM to RAM). Thanks to @JPPhoto for finding it.

The bug was triggered under the following conditions:
- A GGML-quantized model is loaded in VRAM
- We run a Spandrel image-to-image invocation (which is wrapped in a `torch.inference_mode()` context manager).
- The model cache attempts to unload the GGML-quantized model from VRAM
to RAM.
- Doing this inside of the `torch.inference_mode()` cm results in the
following error:
```
 [2025-01-07 15:48:17,744]::[InvokeAI]::ERROR --> Error while invoking session 98a07259-0c03-4111-a8d8-107041cb86f9, invocation d8daa90b-7e4c-4fc4-807c-50ba9be1a4ed (spandrel_image_to_image): Cannot set version_counter for inference tensor
[2025-01-07 15:48:17,744]::[InvokeAI]::ERROR --> Traceback (most recent call last):
  File "/home/ryan/src/InvokeAI/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
    output = invocation.invoke_internal(context=context, services=self._services)
  File "/home/ryan/src/InvokeAI/invokeai/app/invocations/baseinvocation.py", line 300, in invoke_internal
    output = self.invoke(context)
  File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/ryan/src/InvokeAI/invokeai/app/invocations/spandrel_image_to_image.py", line 167, in invoke
    with context.models.load(self.image_to_image_model) as spandrel_model:
  File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/load_base.py", line 60, in __enter__
    self._cache.lock(self._cache_record, None)
  File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 224, in lock
    self._load_locked_model(cache_entry, working_mem_bytes)
  File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 272, in _load_locked_model
    vram_bytes_freed = self._offload_unlocked_models(model_vram_needed, working_mem_bytes)
  File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 458, in _offload_unlocked_models
    cache_entry_bytes_freed = self._move_model_to_ram(cache_entry, vram_bytes_to_free)
  File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 330, in _move_model_to_ram
    return cache_entry.cached_model.partial_unload_from_vram(
  File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/cached_model/cached_model_with_partial_load.py", line 182, in partial_unload_from_vram
    cur_state_dict = self._model.state_dict()
  File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1939, in state_dict
    module.state_dict(destination=destination, prefix=prefix + name + '.', keep_vars=keep_vars)
  File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1936, in state_dict
    self._save_to_state_dict(destination, prefix, keep_vars)
  File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1843, in _save_to_state_dict
    destination[prefix + name] = param if keep_vars else param.detach()
RuntimeError: Cannot set version_counter for inference tensor
```

### Explanation

From the `torch.inference_mode()` docs:
> Code run under this mode gets better performance by disabling view
tracking and version counter bumps.

Disabling version counter bumps results in the aforementioned error when
saving `GGMLTensor`s to a state_dict.

This incompatibility between `GGMLTensors` and `torch.inference_mode()`
is likely caused by the custom tensor type implementation. There may
very well be a way to get these to cooperate, but for now it is much
simpler to remove the `torch.inference_mode()` contexts.

Note that there are several other uses of `torch.inference_mode()` in
the Invoke codebase, but they are all tight wrappers around the
inference forward pass and do not contain the model load/unload process.

## Related Issues / Discussions

Original discussion:
https://discord.com/channels/1020123559063990373/1149506274971631688/1326180753159094303

## QA Instructions

Find a sequence of operations that triggers the condition. For me, this
was:
- Reserve VRAM in a separate process so that there was ~12GB left.
- Fresh start of Invoke
- Run FLUX inference with a GGML 8K model
- Run Spandrel upscaling

Tests:
- [x] Confirmed that I can reproduce the error and that it is no longer
hit after the change
- [x] Confirm that there is no speed regression from switching from
`torch.inference_mode()` to `torch.no_grad()`.
    - Before: `50.354s`, After: `51.536s`


## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-07 11:31:20 -05:00
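A minimal sketch of the fix direction, assuming a simplified invocation (the real Spandrel invocation also handles model loading and tiling):

```python
# Sketch only: torch.no_grad() still skips autograd bookkeeping, but unlike
# torch.inference_mode() it does not mark tensors as inference tensors, so
# the model cache can later call state_dict() on a GGML-quantized model
# while offloading it from VRAM to RAM.
import torch


@torch.no_grad()  # was @torch.inference_mode(), which broke GGML offload
def spandrel_upscale_sketch(model: torch.nn.Module, image: torch.Tensor) -> torch.Tensor:
    return model(image)
```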
Ryan Dick
85eb4f0312 Fix an edge case with model offloading from VRAM to RAM. If a GGML-quantized model is offloaded from VRAM inside of a torch.inference_mode() context manager, this will cause the following error: 'RuntimeError: Cannot set version_counter for inference tensor'. 2025-01-07 15:59:50 +00:00
psychedelicious
67e948b50d chore: bump version to v5.6.0rc1 2025-01-07 19:41:56 +11:00
Riccardo Giovanetti
d9a20f319f translationBot(ui): update translation (Italian)
Currently translated at 99.3% (1639 of 1649 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-01-07 19:32:50 +11:00
Riku
38d4863e09 translationBot(ui): update translation (German)
Currently translated at 71.7% (1181 of 1645 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2025-01-07 19:32:50 +11:00
Nik Nikovsky
cd7ba14adc translationBot(ui): update translation (Polish)
Currently translated at 16.5% (273 of 1645 strings)

translationBot(ui): update translation (Polish)

Currently translated at 15.4% (254 of 1645 strings)

translationBot(ui): update translation (Polish)

Currently translated at 10.8% (178 of 1645 strings)

Co-authored-by: Nik Nikovsky <zejdzztegomaila@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/pl/
Translation: InvokeAI/Web UI
2025-01-07 19:32:50 +11:00
Linos
e5b6beb24d translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1649 of 1649 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1645 of 1645 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1645 of 1645 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1645 of 1645 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-01-07 19:32:50 +11:00
Ryan Dick
0258b6a04f Partial Loading PR5: Dynamic cache ram/vram limits (#7509)
## Summary

This PR enables RAM/VRAM cache size limits to be determined dynamically
based on availability.

**Config Changes**

This PR modifies the app configs in the following ways:
- A new `device_working_mem_gb` config was added. This is the amount of
non-model working memory to keep available on the execution device (i.e.
GPU) when using dynamic cache limits. It defaults to 3GB.
- The `ram` and `vram` configs now default to `None`. If these configs
are set, they will take precedence over the dynamic limits. **Note: Some
users may have previously overridden the `ram` and `vram` values in their
`invokeai.yaml`. They will need to remove these configs to enable the
new dynamic limit feature.**

**Working Memory**

In addition to the new `device_working_mem_gb` config described above,
memory-intensive operations can estimate the amount of working memory
that they will need and request it from the model cache. This is
currently applied to the VAE decoding step for all models. In the
future, we may apply this to other operations as we work out which ops
tend to exceed the default working memory reservation.

**Mitigations for https://github.com/invoke-ai/InvokeAI/issues/7513**

This PR includes some mitigations for the issue described in
https://github.com/invoke-ai/InvokeAI/issues/7513. Without these
mitigations, it would occur with higher frequency when dynamic RAM
limits are used and the RAM is close to maxed-out.

## Limitations / Future Work

- Only _models_ can be offloaded to RAM to conserve VRAM. I.e. if VAE
decoding requires more working VRAM than available, the best we can do
is keep the full model on the CPU, but we will still hit an OOM error.
In the future, we could detect this ahead of time and switch to running
inference on the CPU for those ops.
- There is often a non-negligible amount of VRAM 'reserved' by the torch
CUDA allocator, but not used by any allocated tensors. We may be able to
tune the torch CUDA allocator to work better for our use case.
Reference:
https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf
- There may be some ops that require high working memory that haven't
been updated to request extra memory yet. We will update these as we
uncover them.
- If a model is 'locked' in VRAM, it won't be partially unloaded if a
later model load requests extra working memory. This should be uncommon,
but I can think of cases where it would matter.

## Related Issues / Discussions

- #7492 
- #7494 
- #7500 
- #7505 

## QA Instructions

Run a variety of models near the cache limits to ensure that model
switching works properly for the following configurations:
- [x] CUDA, `enable_partial_loading=true`, all other configs default
(i.e. dynamic memory limits)
- [x] CUDA, `enable_partial_loading=true`, CPU and CUDA memory reserved
in another process so there is limited RAM/VRAM remaining, all other
configs default (i.e. dynamic memory limits)
- [x] CUDA, `enable_partial_loading=false`, all other configs default
(i.e. dynamic memory limits)
- [x] CUDA, ram/vram limits set (these should take precedence over the
dynamic limits)
- [x] MPS, all other default (i.e. dynamic memory limits)
- [x] CPU, all other default (i.e. dynamic memory limits) 

## Merge Plan

- [x] Merge #7505 first and change target branch to main

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-07 00:35:39 -05:00
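A rough sketch of the idea (not the actual ModelCache logic; `psutil` is assumed here for the RAM side):

```python
# Illustrative only: size the caches from currently free memory, keep
# `device_working_mem_gb` aside for non-model working memory, and let
# explicit `max_cache_*_gb` configs take precedence when set.
from typing import Optional

import psutil
import torch

GB = 2**30


def dynamic_vram_limit_gb(device_working_mem_gb: float = 3.0,
                          max_cache_vram_gb: Optional[float] = None) -> float:
    if max_cache_vram_gb is not None:
        return max_cache_vram_gb
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    return max(free_bytes / GB - device_working_mem_gb, 0.0)


def dynamic_ram_limit_gb(max_cache_ram_gb: Optional[float] = None) -> float:
    if max_cache_ram_gb is not None:
        return max_cache_ram_gb
    return psutil.virtual_memory().available / GB
```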
Ryan Dick
87fdcb7f6f Partial Loading PR4: Enable partial loading (behind config flag) (#7505)
## Summary

This PR adds support for partial loading of models onto the GPU. This
enables models to run with much lower peak VRAM requirements (e.g. full
FLUX dev with 8GB of VRAM).

The partial loading feature is enabled behind a new config flag:
`enable_partial_loading=True`. This flag defaults to `False`.

**Note about performance:**
The `ram` and `vram` config limits are still applied when
`enable_partial_loading=True` is set. This can result in significant
slowdowns compared to the 'old' behaviour. Consider the case where the
VRAM limit is set to `vram=0.75` (GB) and we are trying to run an 8GB
model. When `enable_partial_loading=False`, we attempt to load the
entire model into VRAM, and if it fits (no OOM error) then it will run
at full speed. When `enable_partial_loading=True`, since we have the
option to partially load the model, we will only load 0.75 GB into VRAM and leave the remaining 7.25 GB in RAM. This will cause inference to be much slower than before. To work around this, it is important that your
`ram` and `vram` configs are carefully tuned. In a future PR, we will
add the ability to dynamically set the RAM/VRAM limits based on the
available memory / VRAM.

## Related Issues / Discussions

- #7492 
- #7494 
- #7500

## QA Instructions

Tests with `enable_partial_loading=True`, `vram=2`, on CUDA device:
For all tests, we expect model memory to stay below 2 GB. Peak working
memory will be higher.
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests

Tests with `enable_partial_loading=True`, and hack to force all models
to load 10%, on CUDA device:
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests

Tests with `enable_partial_loading=False`, `vram=30`:
We expect no change in behaviour when  `enable_partial_loading=False`.
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests

Other platforms:
- [x] No change in behavior on MPS, even if
`enable_partial_loading=True`.
- [x] No change in behavior on CPU-only systems, even if
`enable_partial_loading=True`.

## Merge Plan

- [x] Merge #7500 first, and change the target branch to main

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-06 23:18:31 -05:00
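A very rough sketch of what partial loading means at the parameter level (illustrative; the real `CachedModelWithPartialLoad` also relies on the autocast modules from PR2 to run the CPU-resident layers):

```python
# Illustrative only: move parameters to the GPU until the VRAM budget is
# spent; whatever doesn't fit stays on the CPU and is streamed at runtime.
import torch


def partially_load_sketch(model: torch.nn.Module, vram_budget_bytes: int,
                          device: str = "cuda") -> int:
    used = 0
    for param in model.parameters():
        size = param.numel() * param.element_size()
        if used + size > vram_budget_bytes:
            continue  # this parameter stays on the CPU
        param.data = param.data.to(device, non_blocking=True)
        used += size
    return used  # bytes of model weights now resident in VRAM
```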
Ryan Dick
d7ab464176 Offload the current model when locking if it is already partially loaded and we have insufficient VRAM. 2025-01-07 02:53:44 +00:00
Ryan Dick
5eafe1ec7a Fix ModelCache execution device selection in unit tests. 2025-01-07 01:20:15 +00:00
Ryan Dick
548b3eddb8 pnpm typegen 2025-01-07 01:20:15 +00:00
Ryan Dick
5b42b7bd45 Add a utility to help with determining the working memory required for expensive operations. 2025-01-07 01:20:15 +00:00
Ryan Dick
71b97ce7be Reduce the likelihood of encountering https://github.com/invoke-ai/InvokeAI/issues/7513 by eliminating places where the door was left open for this to happen. 2025-01-07 01:20:15 +00:00
Ryan Dick
b343f81644 Use torch.cuda.memory_allocated() rather than torch.cuda.memory_reserved() to be more conservative in setting dynamic VRAM cache limits. 2025-01-07 01:20:15 +00:00
Ryan Dick
4abfb35321 Tune SD3 VAE decode working memory estimate. 2025-01-07 01:20:15 +00:00
Ryan Dick
cba6528ea7 Add a 20% buffer to all VAE decode working memory estimates. 2025-01-07 01:20:15 +00:00
Ryan Dick
6a5cee61be Tune the working memory estimate for FLUX VAE decoding. 2025-01-07 01:20:15 +00:00
Ryan Dick
bd8017ecd5 Update working memory estimate for VAE decoding when tiling is being applied. 2025-01-07 01:20:15 +00:00
Ryan Dick
299eb94a05 Estimate the working memory required for VAE decoding, since this operations tends to be memory intensive. 2025-01-07 01:20:15 +00:00
Ryan Dick
fc4a22fe78 Allow expensive operations to request more working memory. 2025-01-07 01:20:13 +00:00
Ryan Dick
a167632f09 Calculate model cache size limits dynamically based on the available RAM / VRAM. 2025-01-07 01:14:20 +00:00
Ryan Dick
1321fac8f2 Remove get_cache_size() and set_cache_size() endpoints. These were unused by the frontend and refer to cache fields that are no longer accessible. 2025-01-07 01:06:20 +00:00
Ryan Dick
6a9de1fcf3 Change definition of VRAM in use for the ModelCache from sum of model weights to the total torch.cuda.memory_allocated(). 2025-01-07 00:31:53 +00:00
Ryan Dick
e5180c4e6b Add get_effective_device(...) utility to aid in determining the effective device of models that are partially loaded. 2025-01-07 00:31:00 +00:00
Ryan Dick
2619ef53ca Handle device casting in ia2_layer.py. 2025-01-07 00:31:00 +00:00
Ryan Dick
bcd29c5d74 Remove all cases where we check the 'model.device'. This is no longer trustworthy now that partial loading is permitted. 2025-01-07 00:31:00 +00:00
Ryan Dick
1b7bb70bde Improve handling of cases when application code modifies the size of a model after registering it with the model cache. 2025-01-07 00:31:00 +00:00
Ryan Dick
402dd840a1 Add seed to flaky unit test. 2025-01-07 00:31:00 +00:00
Ryan Dick
7127040c3a Remove unused function set_nested_attr(...). 2025-01-07 00:31:00 +00:00
Ryan Dick
ceb2498a67 Add log prefix to model cache logs. 2025-01-07 00:31:00 +00:00
Ryan Dick
d0bfa019be Add 'enable_partial_loading' config flag. 2025-01-07 00:31:00 +00:00
Ryan Dick
535e45cedf First pass at adding partial loading support to the ModelCache. 2025-01-07 00:30:58 +00:00
Ryan Dick
782ee7a0ec Partial Loading PR 3.5: Fix pre-mature model drops from the RAM cache (#7522)
## Summary

This is an unplanned fix between PR3 and PR4 in the sequence of partial
loading (i.e. low-VRAM) PRs. This PR restores the 'Current Workaround'
documented in https://github.com/invoke-ai/InvokeAI/issues/7513. In
other words, to work around a flaw in the model cache API, this fix
allows models to be loaded into VRAM _even if_ they have been dropped
from the RAM cache.

This PR also adds an info log each time that this workaround is hit. In
a future PR (#7509), we will eliminate the places in the application
code that are capable of triggering this condition.

## Related Issues / Discussions

- #7492 
- #7494
- #7500 
- https://github.com/invoke-ai/InvokeAI/issues/7513

## QA Instructions

- Set RAM cache limit to a small value. E.g. `ram: 4`
- Run FLUX text-to-image with the full T5 encoder, which exceeds 4GB.
This will trigger the error condition.
- Before the fix, this test configuration would cause a `KeyError`.
After the fix, we should see an info-level log explaining that the
condition was hit, but that generation should continue successfully.

## Merge Plan

No special instructions.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-01-06 19:05:48 -05:00
Ryan Dick
c579a218ef Allow models to be locked in VRAM, even if they have been dropped from the RAM cache (related: https://github.com/invoke-ai/InvokeAI/issues/7513). 2025-01-06 23:02:52 +00:00
Riku
f4f7415a3b fix(app): remove obsolete DEFAULT_PRECISION variable 2025-01-06 11:14:58 +11:00
Mary Hipp
7d6c443d6f fix(api): limit board_name length to 300 characters 2025-01-06 10:49:49 +11:00
psychedelicious
868e06eb8b tests: fix test_model_install.py 2025-01-03 11:21:23 -05:00
psychedelicious
40e4dbe1fb docs: add blurb about setting a HF token when downloading HF models by URL and not repo id 2025-01-03 11:21:23 -05:00
psychedelicious
4815b4ea80 feat(ui): tweak verbiage for model install errors 2025-01-03 11:21:23 -05:00
psychedelicious
d77a6ccd76 fix(ui): model install error toasts not updating correctly 2025-01-03 11:21:23 -05:00
psychedelicious
3e860c8338 feat(ui): starter models filter works with model base
For example, "flux" now matches any starter model with a model base of "FLUX".
2025-01-03 11:21:23 -05:00
psychedelicious
4f2ef7ce76 refactor(ui): handle hf vs civitai/other url model install errors separately
Previously, we didn't differentiate between model install errors for different types of model install sources, resulting in a buggy UX:
- If a HF model install failed, but it was a HF URL install and not a repo id install, the link to the HF model page was incorrect.
- If a non-HF URL install (e.g. civitai) failed, we treated it as a HF URL install. In this case, if the user's HF token was invalid or unset, we directed the user to set it. If the HF token was valid, we displayed an empty red toast. If it's not a HF URL install, then of course neither of these are correct.

Also, the logic for handling the toasts was a bit complicated.

This change does a few things:
- Consolidate the model install error toasts into one place - the socket.io event handler for the model install error event. There is no more global state for the toasts and there are no hooks managing them.
- Handle the different cases for errors, including all combinations of HF/non-HF and unauthorized/forbidden/unknown.
2025-01-03 11:21:23 -05:00
psychedelicious
d7e9ad52f9 chore(ui): typegen 2025-01-03 11:21:23 -05:00
psychedelicious
b6d7a44004 refactor(events): include full model source in model install events
This is required to fix an issue with the MM UI's error handling.

Previously, we only included the model source as a string. That could be an arbitrary URL, file path or HF repo id, but the frontend has no parsing logic to differentiate between these different model sources.

Without access to the type of model source, it is difficult to determine how the user should proceed. For example, if it's HF URL with an HTTP unauthorized error, we should direct the user to log in to HF. But if it's a civitai URL with the same error, we should not direct the user to HF.

There are a variety of related edge cases.

With this change, the full `ModelSource` object is included in each model install event, including error events.

I had to fix some circular import issues, hence the import changes to files other than `events_common.py`.
2025-01-03 11:21:23 -05:00
psychedelicious
e18100ae7e refactor(ui): move model install error event handling to own file
No logic change.
2025-01-03 11:21:23 -05:00
psychedelicious
ad0aa0e6b2 feat(ui): reset canvas layers only resets the layers 2025-01-03 11:02:04 -05:00
psychedelicious
157b92e0fd docs: no need to specify version for dev env setup 2025-01-03 10:59:39 -05:00
psychedelicious
fd838ad9d4 docs: update dev env docs to mirror the launcher's install method 2025-01-03 14:27:45 +11:00
psychedelicious
5e9227c052 docs: update manual install docs to mirror the launcher's install method 2025-01-03 14:27:45 +11:00
Kent Keirsey
94785231ce Update href to correct link 2025-01-02 09:39:41 +11:00
Ryan Dick
b46d7abfb0 Partial Loading PR3: Integrate 1) partial loading, 2) quantized models, 3) model patching (#7500)
## Summary

This PR is the third in a sequence of PRs working towards support for
partial loading of models onto the compute device (for low-VRAM
operation). This PR updates the LoRA patching code so that the following
features can cooperate fully:
- Partial loading of weights onto the GPU
- Quantized layers / weights
- Model patches (e.g. LoRA)

Note that this PR does not yet enable partial loading. It adds support
in the model patching code so that partial loading can be enabled in a
future PR.

## Technical Design Decisions

The layer patching logic has been integrated into the custom layers (via
`CustomModuleMixin`) rather than keeping it in a separate set of wrapper
layers, as before. This has the following advantages:
- It makes it easier to calculate the modified weights on the fly and
then reuse the normal forward() logic.
- In the future, it makes it possible to pass original parameters that
have been cast to the device down to the LoRA calculation without having
to re-cast (but the current implementation hasn't fully taken advantage
of this yet).

## Known Limitations

1. I haven't fully solved device management for patch types that require
the original layer value to calculate the patch. These aren't very
common, and are not compatible with some quantized layers, so I'm leaving this for the future if there's demand.
2. There is a small speed regression for models that have CPU
bottlenecks. This seems to be caused by slightly slower method
resolution on the custom layer sub-classes. The regression does not
show up on larger models, like FLUX, that are almost entirely
GPU-limited. I think this small regression is tolerable, but if we
decide that it's not, then the slowdown can easily be reclaimed by
optimizing other CPU operations (e.g. if we only sent every 2nd progress
image, we'd see a much more significant speedup).

## Related Issues / Discussions

- https://github.com/invoke-ai/InvokeAI/pull/7492
- https://github.com/invoke-ai/InvokeAI/pull/7494

## QA Instructions

Speed tests:
- Vanilla SD1 speed regression
    - Before: 3.156s (8.78 it/s)
    - After: 3.54s (8.35 it/s)
- Vanilla SDXL speed regression
    - Before: 6.23s (4.46 it/s)
    - After: 6.45s (4.31 it/s)
- Vanilla FLUX speed regression
    - Before: 12.02s (2.27 it/s)
    - After: 11.91s (2.29 it/s)

LoRA tests with default configuration:
- [x] SD1: A handful of LoRA variants
- [x] SDXL: A handful of LoRA variants
- [x] flux non-quantized: multiple lora variants
- [x] flux bnb-quantized: multiple lora variants
- [x] flux ggml-quantized: multiple lora variants
- [x] flux non-quantized: FLUX control LoRA
- [x] flux bnb-quantized: FLUX control LoRA
- [x] flux ggml-quantized: FLUX control LoRA

LoRA tests with sidecar patching forced:
- [x] SD1: A handful of LoRA variants
- [x] SDXL: A handful of LoRA variants
- [x] flux non-quantized: multiple lora variants
- [x] flux bnb-quantized: multiple lora variants
- [x] flux ggml-quantized: multiple lora variants
- [x] flux non-quantized: FLUX control LoRA
- [x] flux bnb-quantized: FLUX control LoRA
- [x] flux ggml-quantized: FLUX control LoRA

Other:
- [x] Smoke testing of IP-Adapter, ControlNet

All tests repeated on:
- [x] cuda
- [x] cpu (only test SD1, because larger models are prohibitively slow)
- [x] mps (skipped FLUX tests, because my Mac doesn't have enough memory
to run them in a reasonable amount of time)

## Merge Plan

No special instructions.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2024-12-31 13:58:13 -05:00
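A sketch of the design decision described above - the patch is applied inside the custom layer's forward so the normal linear math can be reused (names are illustrative, not the real `CustomModuleMixin`):

```python
# Illustrative sketch: the layer computes the patched weight on the fly and
# then falls through to an ordinary linear forward. down/up/scale mimic a
# LoRA patch; the real code also supports sidecar patching and quantized layers.
import torch
import torch.nn.functional as F


class PatchedLinearSketch(torch.nn.Linear):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.patches: list[tuple[torch.Tensor, torch.Tensor, float]] = []

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.weight
        for down, up, scale in self.patches:  # down: (rank, in), up: (out, rank)
            weight = weight + scale * (up @ down).to(weight.device, weight.dtype)
        return F.linear(x, weight, self.bias)
```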
Ryan Dick
9a0a226ce1 Fix bitsandbytes imports in unit tests on MacOS. 2024-12-30 10:41:48 -05:00
Ryan Dick
477d87ec31 Fix layer patch dtype selection for CLIP text encoder models. 2024-12-29 21:48:51 +00:00
Ryan Dick
8b4b0ff0cf Fix bug in CustomConv1d and CustomConv2d patch calculations. 2024-12-29 19:10:19 +00:00
Ryan Dick
6fd9b0a274 Delete old sidecar wrapper implementation. This functionality has moved into the custom layers. 2024-12-29 17:33:08 +00:00
Ryan Dick
52fc5a64d4 Add a unit test for a LoRA patch applied to a quantized linear layer with weights streamed from CPU to GPU. 2024-12-29 17:14:55 +00:00
Ryan Dick
a8bef59699 First pass at making custom layer patches work with weights streamed from the CPU to the GPU. 2024-12-29 17:01:37 +00:00
Ryan Dick
6d49ee839c Switch the LayerPatcher to use 'custom modules' to manage layer patching. 2024-12-29 01:18:30 +00:00
Ryan Dick
0525f967c2 Fix the _autocast_forward_with_patches() function for CustomConv1d and CustomConv2d. 2024-12-29 00:22:37 +00:00
Ryan Dick
2855bb6b41 Update BaseLayerPatch.get_parameters(...) to accept a dict of orig_parameters rather than orig_module. This will enable compatibility between patching and cpu->gpu streaming. 2024-12-28 21:12:53 +00:00
Ryan Dick
20acfc9a00 Raise in CustomEmbedding and CustomGroupNorm if a patch is applied. 2024-12-28 20:49:17 +00:00
Ryan Dick
918f541af8 Add unit test for a SetParameterLayer patch applied to a CustomFluxRMSNorm layer. 2024-12-28 20:44:48 +00:00
Ryan Dick
93e76b61d6 Add CustomFluxRMSNorm layer. 2024-12-28 20:33:38 +00:00
Ryan Dick
f692e217ea Add patch support to CustomConv1d and CustomConv2d (no unit tests yet). 2024-12-27 22:23:17 +00:00
Ryan Dick
f2981979f9 Get custom layer patches working with all quantized linear layer types. 2024-12-27 22:00:22 +00:00
Ryan Dick
ef970a1cdc Add support for FluxControlLoRALayer in CustomLinear layers and add a unit test for it. 2024-12-27 21:00:47 +00:00
Ryan Dick
5ee7405f97 Add more unit tests for custom module LoRA patching: multiple LoRAs and ConcatenatedLoRALayers. 2024-12-27 19:47:21 +00:00
Ryan Dick
e24e386a27 Add support for patches to CustomModuleMixin and add a single unit test (more to come). 2024-12-27 18:57:13 +00:00
Ryan Dick
b06d61e3c0 Improve custom layer wrap/unwrap logic. 2024-12-27 16:29:48 +00:00
Ryan Dick
6bf5b747ce Partial Loading PR2: Add utils to support partial loading of models from CPU to GPU (#7494)
## Summary

This PR adds utilities to support partial loading of models from CPU to
GPU. The new utilities are not yet being used by the ModelCache, so
there should be no functional behavior changes in this PR.

Detailed changes:

- Add autocast modules that are designed to wrap common
`torch.nn.Module`s and enable them to run with automatic device casting.
E.g. a linear layer on the CPU can be executed with an input tensor on
the GPU by streaming the weights to the GPU at runtime.
- Add unit tests for the aforementioned autocast modules to verify that
they work for all supported quantization formats (GGUF, BnB NF4, BnB
LLM.int8()).
- Add `CachedModelWithPartialLoad` and `CachedModelOnlyFullLoad` classes
to manage partial loading at the model level.

## Alternative Implementations

Several options were explored for supporting inference on
partially-loaded models. The pros/cons of the explored options are
summarized here for reference. In the end, wrapper modules were selected
as the best overall solution for our use case.

Option 1: Re-implement the .forward() methods of modules to add support
for device conversions
- This is the option implemented in this PR.
- This approach is the most manual of the three, but as a result offers
the broadest compatibility with unusual model types. It is manual in
that we have to explicitly add support for all module types that we wish
to support. Fortunately, the list of foundational module types is
relatively small (e.g. the current set of implemented layers covers all
but 0.04 MB of the full FLUX model.).

Option 2: Implement a custom Tensor type that casts tensors to a
`target_device` each time the tensor is used
- This approach has the nice property that it is injected at the tensor
level, and the model does not need to be modified in any way.
- One challenge with this approach is handling interactions with other
custom tensor types (e.g. GGMLTensor). This problem is solvable, but
definitely introduces a layer of complexity. (There are likely to also
be some similar issues with interactions with the BnB quantization, but
I didn't get as far as testing BnB.)

Option 3: Override the `__torch_function__` dispatch calls globally and
cast all params to the execution device.
- This approach is nice and simple: just apply a global context manager
and all operations will happen on the compute device regardless of the
device of the participating tensors.
- Challenges:
- Overriding the `__torch_function__` dispatch calls introduces some
overhead even if the tensors are already on the correct device.
- It is difficult to manage the autocasting context manager. E.g. it is
tempting to apply it to the model's `.forward(...)` method, but we use
some models with non-standard entrypoints. And we don't want to end up
with nested autocasting context managers.
- BnB applies quantization side effects when a param is moved to the GPU
- this interacts in unexpected ways with a global context manager.


## QA Instructions

Most of the changes in this PR should not impact active code, and thus
should not cause any changes to behavior. The main risks come from
bumping the bitsandbytes dependency and some minor modifications to the
bitsandbytes quantization code.

- [x] Regression test bitsandbytes NF4 quantization
- [x] Regression test bitsandbytes LLM.int8() quantization
- [x] Regression test on MacOS (to ensure that there are no lingering
bitsandbytes import errors)

I also tested the new utilities for inference on full models in another
branch to validate that there were not major issues. This functionality
will be tested more thoroughly in a future PR.

## Merge Plan

- [x] #7492 should be merged first so that the target branch can be
updated to main.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2024-12-27 09:20:24 -05:00
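A minimal sketch of "Option 1" above - a wrapped linear layer whose forward streams CPU-resident weights to the input's device at call time (illustrative, not the real custom modules):

```python
# Illustrative sketch: if the weights still live on the CPU, cast them to
# the input's device just-in-time so a partially-loaded model can run on GPU.
import torch
import torch.nn.functional as F


class AutocastLinearSketch(torch.nn.Linear):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.weight.to(x.device, non_blocking=True)
        bias = self.bias.to(x.device, non_blocking=True) if self.bias is not None else None
        return F.linear(x, weight, bias)
```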
Ryan Dick
7d6ab0ceb2 Add a CustomModuleMixin class with a flag for enabling/disabling autocasting (since it incurs some runtime speed overhead.) 2024-12-26 20:08:30 +00:00
Ryan Dick
9692a36dd6 Use a fixture to parameterize tests in test_all_custom_modules.py so that a fresh instance of the layer under test is initialized for each test. 2024-12-26 19:41:25 +00:00
Ryan Dick
b0b699a01f Add unit test to test that isinstance(...) behaves as expected with custom module types. 2024-12-26 18:45:56 +00:00
Ryan Dick
a8b2c4c3d2 Add inference tests for all custom module types (i.e. to test autocasting from cpu to device). 2024-12-26 18:33:46 +00:00
Ryan Dick
03944191db Split test_autocast_modules.py into separate test files to mirror the source file structure. 2024-12-24 22:29:11 +00:00
Ryan Dick
987c9ae076 Move custom autocast modules to separate files in a custom_modules/ directory. 2024-12-24 22:21:31 +00:00
Ryan Dick
6d7314ac0a Consolidate the LayerPatching patching modes into a single implementation. 2024-12-24 15:57:54 +00:00
Ryan Dick
80db9537ff Rename model_patcher.py -> layer_patcher.py. 2024-12-24 15:57:54 +00:00
Ryan Dick
6f926f05b0 Update apply_smart_model_patches() so that layer restore matches the behavior of non-smart mode. 2024-12-24 15:57:54 +00:00
Ryan Dick
61253b91f1 Enable LoRAPatcher.apply_smart_lora_patches(...) throughout the stack. 2024-12-24 15:57:54 +00:00
Ryan Dick
0148512038 (minor) Rename num_layers -> num_loras in unit tests. 2024-12-24 15:57:54 +00:00
Ryan Dick
d0f35fceed Add test_apply_smart_lora_patches_to_partially_loaded_model(...). 2024-12-24 15:57:54 +00:00
Ryan Dick
cefcb340d9 Add LoRAPatcher.smart_apply_lora_patches() 2024-12-24 15:57:54 +00:00
Ryan Dick
0fc538734b Skip flaky test when running on Github Actions, and further reduce peak unit test memory. 2024-12-24 14:32:11 +00:00
Ryan Dick
7214d4969b Workaround a weird quirk of QuantState.to() and add a unit test to exercise it. 2024-12-24 14:32:11 +00:00
Ryan Dick
a83a999b79 Reduce peak memory used for unit tests. 2024-12-24 14:32:11 +00:00
Ryan Dick
f8a6accf8a Fix bitsandbytes imports to avoid ImportErrors on MacOS. 2024-12-24 14:32:11 +00:00
Ryan Dick
f8ab414f99 Add CachedModelOnlyFullLoad to mirror the CachedModelWithPartialLoad for models that cannot or should not be partially loaded. 2024-12-24 14:32:11 +00:00
Ryan Dick
c6795a1b47 Make CachedModelWithPartialLoad work with models that have non-persistent buffers. 2024-12-24 14:32:11 +00:00
Ryan Dick
0a8fc74ae9 Add CachedModelWithPartialLoad to manage partially-loaded models using the new autocast modules. 2024-12-24 14:32:11 +00:00
Ryan Dick
dc54e8763b Add CustomInvokeLinearNF4 to enable CPU -> GPU streaming for InvokeLinearNF4 layers. 2024-12-24 14:32:11 +00:00
Ryan Dick
1b56020876 Add CustomInvokeLinear8bitLt layer for device streaming with InvokeLinear8bitLt layers. 2024-12-24 14:32:11 +00:00
Ryan Dick
3f990393a1 Simplify the state management in InvokeLinear8bitLt and add unit tests. This is in preparation for wrapping it to support streaming of weights from cpu to gpu. 2024-12-24 14:32:11 +00:00
Ryan Dick
97d56f7dc9 Add torch module autocast unit test for GGUF-quantized models. 2024-12-24 14:32:11 +00:00
Ryan Dick
fe0ef2c27c Add torch module autocast utilities. 2024-12-24 14:32:11 +00:00
Ryan Dick
65fcbf5f60 Bump bitsandbytes. The new version contains improvements to state_dict loading/saving for LLM.int8 and promises improved speed on some HW. 2024-12-24 14:32:11 +00:00
Ryan Dick
d3916dbdb6 Partial Loading PR1: Tidy ModelCache (#7492)
## Summary

This PR tidies up the model cache code in preparation for further
refactoring to support partial loading of models onto the GPU. **These
code changes should not change the functional behavior in any way.**

Changes:
- Remove the `ModelCacheBase` class. `ModelCache` is the only
implementation, so there is no benefit to the separate abstract class.
- Split `CacheRecord` and `CacheStats` out into their own files.
- Remove the `ModelLocker` class. This extra layer of indirection was
not providing any benefit. Locking is now done directly with the
`ModelCache`.
- Tidy up relative imports that were contributing to circular import
issues.
- Pull the 'submodel' concern out of the `ModelCache`. The `ModelCache`
should not need to be aware of the model manager submodel system.
- Delete unused properties from the `ModelCache` (e.g.
`.lazy_offloading`, `.storage_device`, etc.)

## QA Instructions

I ran smoke tests with a variety of SD1, SDXL and FLUX models. No change
to behavior is expected.

## Merge Plan


## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2024-12-24 09:30:44 -05:00
Ryan Dick
55b13c1da3 (minor) Add TODO comment regarding the location of get_model_cache_key(). 2024-12-24 14:23:19 +00:00
Ryan Dick
7dc3e0fdbe Get rid of ModelLocker. It was an unnecessary layer of indirection. 2024-12-24 14:23:18 +00:00
Ryan Dick
a39bcf7e85 Move lock(...) and unlock(...) logic from ModelLocker to the ModelCache and make a bunch of ModelCache properties/methods private. 2024-12-24 14:23:18 +00:00
Ryan Dick
a7c72992a6 Pull get_model_cache_key(...) out of ModelCache. The ModelCache should not be concerned with implementation details like the submodel_type. 2024-12-24 14:23:18 +00:00
Ryan Dick
d30a9ced38 Rename model_cache_default.py -> model_cache.py. 2024-12-24 14:23:18 +00:00
Ryan Dick
e0bfa6157b Remove ModelCacheBase. 2024-12-24 14:23:18 +00:00
Ryan Dick
83ea6420e2 Move CacheStats to its own file. 2024-12-24 14:23:18 +00:00
Ryan Dick
ce11a1952e Move CacheRecord out to its own file. 2024-12-24 14:23:18 +00:00
Ryan Dick
e48dee4c4a Rip out ModelLockerBase. 2024-12-24 14:23:18 +00:00
Simon Fuhrmann
712674b6dd Add Stereogram Nodes to communityNodes.md 2024-12-23 13:51:53 -05:00
psychedelicious
de0043f443 docs: update download links for launcher 2024-12-23 13:23:14 +11:00
Riku
d21506da6f feat(ci): add typegen check workflow 2024-12-22 06:05:17 +11:00
psychedelicious
a49894901a docs: fix installation docs home again 2024-12-20 17:35:50 +11:00
psychedelicious
e7e26c8a93 docs: fix installation docs home 2024-12-20 17:12:44 +11:00
psychedelicious
9adcd2cc31 docs: update install-related docs 2024-12-20 17:01:34 +11:00
Kent Keirsey
f9edd009f5 Update README.md 2024-12-20 17:01:34 +11:00
Kent Keirsey
91a4160e36 Update Installation Docs 2024-12-20 17:01:34 +11:00
Kent Keirsey
9c9cec1b43 Update README.md 2024-12-20 17:01:34 +11:00
psychedelicious
948ecf9333 chore: bump version to v5.5.0 2024-12-20 16:17:23 +11:00
psychedelicious
1038f7bcab Update invokeai_version.py 2024-12-20 10:17:09 +11:00
Riccardo Giovanetti
c7d9e2d62a translationBot(ui): update translation (Italian)
Currently translated at 99.3% (1635 of 1645 strings)

translationBot(ui): update translation (Italian)

Currently translated at 99.3% (1634 of 1645 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-12-20 10:07:15 +11:00
Riku
11c3a2e15d translationBot(ui): update translation (German)
Currently translated at 70.8% (1165 of 1645 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2024-12-20 10:07:15 +11:00
psychedelicious
9e3ca383ec fix(ui): add missing model config to AnyModelConfig union type 2024-12-20 09:45:04 +11:00
Riku
bda83c2634 chore(ui): update typegen schema 2024-12-20 09:45:04 +11:00
Riku
525cb38c71 fix(app): fixed InputField default values 2024-12-20 09:30:56 +11:00
psychedelicious
a9a6720bad feat(app): change queue item execution log from debug to info
This provides useful context for subsequent logs during queue item execution.
2024-12-20 09:19:04 +11:00
psychedelicious
858bf9cf8c feat(api): less verbose uvicorn logs
Uvicorn's logging is rather verbose. This change adds a `log_level_network` config setting to independently control uvicorn's log outputs. The setting defaults to warning.

The change hides the helpful startup message that says the host and port we are running on.

For example: `Uvicorn running on http://0.0.0.0:9090 (Press CTRL+C to quit)`

The ASGI lifespan handler is updated to log an equivalent message on startup, regardless of log level settings.

Besides being helpful, the launcher relies on a message like this to launch the app. So, previously, if the user set their log level to anything above info (e.g. warning or error), the launcher would fail to open the app. This change prevents that edge case.
2024-12-20 09:19:04 +11:00
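A minimal sketch of this approach (illustrative only, not the app's actual code): quiet uvicorn's own loggers via its `log_level` option while emitting an equivalent startup message from the ASGI lifespan handler, so the host/port line survives regardless of uvicorn's verbosity.

```python
# Hypothetical sketch: independent log level for uvicorn, plus a lifespan
# startup message that is always emitted at the application's log level.
import logging
from contextlib import asynccontextmanager

import uvicorn
from fastapi import FastAPI

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("app")

HOST, PORT = "0.0.0.0", 9090


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Logged here so the message appears even when uvicorn's own loggers
    # are set to WARNING and its "Uvicorn running on ..." line is hidden.
    logger.info("Running on http://%s:%s (Press CTRL+C to quit)", HOST, PORT)
    yield


app = FastAPI(lifespan=lifespan)

if __name__ == "__main__":
    # uvicorn's verbosity is controlled separately from the app's log level.
    uvicorn.run(app, host=HOST, port=PORT, log_level="warning")
```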
David Hauptman
74a29c3735 re-format to fix ruff error 2024-12-19 22:33:17 +11:00
David Hauptman
6fc6be3aa0 Fix error message when adding a local path with quotes around the string 2024-12-19 22:33:17 +11:00
Mary Hipp
174ea021a6 lint 2024-12-18 12:48:15 -05:00
Mary Hipp
50b804e087 remove space 2024-12-18 12:48:15 -05:00
Mary Hipp
23270d7dfe update copy again 2024-12-18 12:48:15 -05:00
Mary Hipp
39e6f6d53f update whats new copy for control LOras 2024-12-18 12:48:15 -05:00
Mary Hipp
c154d833b9 raise error if control lora used with schnell 2024-12-18 10:19:28 -05:00
Mary Hipp
899a00af62 fix double filter on slow networks 2024-12-18 08:40:50 -05:00
Hosted Weblate
7c9ecdb362 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2024-12-18 18:05:42 +11:00
Riccardo Giovanetti
4a5255611b translationBot(ui): update translation (Italian)
Currently translated at 99.3% (1634 of 1644 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-12-18 18:05:42 +11:00
Thomas Bolteau
b5b39db304 translationBot(ui): update translation (French)
Currently translated at 97.0% (1595 of 1643 strings)

Co-authored-by: Thomas Bolteau <thomas.bolteau50@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/fr/
Translation: InvokeAI/Web UI
2024-12-18 18:05:42 +11:00
Linos
2cb5743cc5 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1644 of 1644 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1643 of 1643 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1643 of 1643 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2024-12-18 18:05:42 +11:00
Riku
64ee8d491e translationBot(ui): update translation (German)
Currently translated at 70.3% (1156 of 1643 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2024-12-18 18:05:42 +11:00
psychedelicious
d70d48de45 chore(ui): update whats new 2024-12-18 17:52:39 +11:00
psychedelicious
3f8636330f chore: bump version to v5.4.4rc1 2024-12-18 17:52:39 +11:00
Mary Hipp
0c2f96daf1 add probe for ControlLoRA x diffusers 2024-12-17 14:01:41 -05:00
Brandon Rising
c9b2cce627 Add diffusers config object for control loras 2024-12-17 14:01:41 -05:00
Mary Hipp
401fb392b8 add FLUX control loras to starter models 2024-12-17 09:29:21 -05:00
Ryan Dick
594511cf4a Add FLUX Control LoRA weight param (#7452)
## Summary

Add the ability to control the weight of a FLUX Control LoRA.

## Example

Original image:
<div style="display: flex; gap: 10px;">
<img
src="https://github.com/user-attachments/assets/4a2d9f4a-b58b-4df6-af90-67b018763a38"
alt="Image 1" width="300"/>
</div>

Prompt: `a scarecrow playing tennis`
Weights: 0.4, 0.6, 0.8, 1.0
<div style="display: flex; gap: 10px;">
<img
src="https://github.com/user-attachments/assets/62b83fd6-46ce-460a-8d51-9c2cda9b05c9"
alt="Image 1" width="300"/>
<img
src="https://github.com/user-attachments/assets/75442207-1538-46bc-9d6b-08ac5c235c93"
alt="Image 2" width="300"/>
</div>
<div style="display: flex; gap: 10px;">
<img
src="https://github.com/user-attachments/assets/4a9dc9ea-9757-4965-837e-197fc9243007"
alt="Image 1" width="300"/>
<img
src="https://github.com/user-attachments/assets/846f6918-ca82-4482-8c19-19172752fa8c"
alt="Image 2" width="300"/>
</div>

## QA Instructions

- [x] weight control changes strength of control image
- [x] Test that results match across both quantized and non-quantized.

## Merge Plan

**_Do not merge this PR yet._**

1. Merge #7450 
2. Merge #7446 
3. Change target branch to main
4. Merge this branch.

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2024-12-17 08:46:31 -05:00
psychedelicious
d764aa4a2a fix(ui): ensure only the expected properties are used when converting between control layer adapter settings 2024-12-17 13:36:11 +00:00
psychedelicious
ea34726329 chore(ui): lint 2024-12-17 13:36:11 +00:00
Ryan Dick
9b615e0de7 Fix bugs when switching control layer type. This logic still feels very hacky. 2024-12-17 13:36:11 +00:00
Ryan Dick
a463e97269 Bump FluxControlLoRALoaderInvocation version. 2024-12-17 13:36:10 +00:00
Ryan Dick
b272d46056 Enable ability to control the weight of FLUX Control LoRAs. 2024-12-17 13:36:10 +00:00
Ryan Dick
4d5f74c05b LoRA refactor to enable FLUX control LoRAs w/ quantized transformers (#7446)
## Summary

This PR refactors the LoRA handling code to enable the use of FLUX
control LoRAs on top of quantized transformers.

Changes:
- Renamed a bunch of the model patching utilities to reflect that they
are not LoRA-specific
- Improved the unit test coverage.
- Refactored the handling of 'sidecar' patch layers to make them work
with more layer patch types. (This was necessary to get FLUX control
LoRAs working on top of quantized models.)
- Removed `ONNXModelPatcher`. It is out-of-date and hasn't been used in
a while.


## QA Instructions

I completed the following tests.

**These should be repeated after changing the target branch to main.**

**Due to the large surface area of this PR, reviewers should do
regression tests on a range of LoRA formats. There is a risk of
regression on a specific format that was missed during the
refactoring.**

- [x] FLUX Control LoRA + full FLUX transformer
- [x] FLUX Control LoRA + BnB NF4 quantized transformer
- [x] FLUX Control LoRA + GGUF quantized transformer
- [x] FLUX Control LoRA + non-control LoRA + full FLUX transformer
- [x] FLUX Control LoRA + non-control LoRA + BnB quantized transformer
- [x] FLUX Control LoRA + non-control LoRA + GGUF quantized transformer
- Test the following cases for regression:
    - [x] Misc SD1/SDXL LoRA variants (LoRA, LoKr, IA3)
    - [x] FLUX, non-quantized, variety of LoRA formats
    - [x] FLUX, quantized, variety of LoRA formats

## Merge Plan

**_Don't merge this PR yet._**

Merge plan:
1. First merge brandon/flux-tools-loras into main
2. Change the target branch of this PR to main
3. Review / test / merge this PR

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2024-12-17 08:30:50 -05:00
Ryan Dick
dd09509dbd Rename ModelPatcher -> LayerPatcher to avoid conflicts with another ModelPatcher definition. 2024-12-17 13:20:19 +00:00
Ryan Dick
7fad4c9491 Rename LoRAModelRaw to ModelPatchRaw. 2024-12-17 13:20:19 +00:00
Ryan Dick
b820862eab Rename ModelPatcher methods to reflect that they are general model patching methods and are not LoRA-specific. 2024-12-17 13:20:19 +00:00
Ryan Dick
c604a0956e Rename LoRAPatcher -> ModelPatcher. 2024-12-17 13:20:19 +00:00
Ryan Dick
9369b39a12 Add GGMLTensor op. 2024-12-17 13:20:19 +00:00
Ryan Dick
80f64abd1e Use a FluxControlLoRALayer when loading FLUX control LoRAs. 2024-12-17 13:20:19 +00:00
Ryan Dick
37e3089457 Push LoRA layer reshaping down into the patch layers and add a new FluxControlLoRALayer type. 2024-12-17 13:20:19 +00:00
Ryan Dick
fe09f2d27a Move handling of LoRA scale and patch weight down into the layer patch classes. 2024-12-17 13:20:19 +00:00
Ryan Dick
e7e3f7e144 Ensure that patches are on the correct device when used in sidecar wrappers. 2024-12-17 13:20:19 +00:00
Ryan Dick
606d58d7db Add sidecar wrapper for FLUX RMSNorm layers to support SetParameterLayers used by FLUX structural control LoRAs. 2024-12-17 13:20:19 +00:00
Ryan Dick
c76a448846 Delete old sidecar_layers/ dir. 2024-12-17 13:20:19 +00:00
Ryan Dick
46133b5656 Switch LoRAPatcher to use the new sidecar_wrappers/ rather than sidecar_layers/. 2024-12-17 13:20:19 +00:00
Ryan Dick
ac28370fd2 Break up functions in LoRAPatcher in preparation for more refactoring. 2024-12-17 13:20:19 +00:00
Ryan Dick
1e0552c813 Add optimized implementations for the LinearSidecarWrapper when using LoRALayer or ConcatenatedLoRALayer patch types (since these are the most common). 2024-12-17 13:20:19 +00:00
Ryan Dick
e2451ef5ca Add unit tests for LinearSidecarWrapper (and fix a bug). 2024-12-17 13:20:19 +00:00
Ryan Dick
443d838fd0 Add initial basic implementation of sidecar wrappers. 2024-12-17 13:20:19 +00:00
Ryan Dick
3a8a5442ea Add basic unit tests for SetParameterLayer. 2024-12-17 13:20:19 +00:00
Ryan Dick
808e3770d3 Remove AnyLoRALayer type definition in favor of using BaseLayerPatch base class. 2024-12-17 13:20:19 +00:00
Ryan Dick
2b441d6a2d Add BaseLayerPatch ABC to clarify the intended patch interface. 2024-12-17 13:20:19 +00:00
Ryan Dick
58de93a89e Delete empty file. 2024-12-17 13:20:19 +00:00
Ryan Dick
1eede4315e Delete ONNXModelPatcher. It is outdated and hasn't been used for a long time. 2024-12-17 13:20:19 +00:00
Ryan Dick
8ea697d733 Mark LoRALayerBase.rank(...) as a private method. 2024-12-17 13:20:19 +00:00
Ryan Dick
693d42661c Add basic unit tests for LoRALayer. 2024-12-17 13:20:19 +00:00
Ryan Dick
41664f88db Rename backend/patches/conversions/ to backend/patches/lora_conversions/ 2024-12-17 13:20:19 +00:00
Ryan Dick
42f8d6aa11 Rename backend/lora/ to backend/patches 2024-12-17 13:20:19 +00:00
psychedelicious
5f41a69665 feat(ui): prevent invoking when >1 control lora enabled 2024-12-17 07:28:45 -05:00
Ryan Dick
7da90a9b6b Ensure that model probe does not crash with integer state dict keys. 2024-12-17 07:28:45 -05:00
Ryan Dick
440185cc40 Simplify FLUX control LoRA probing. 2024-12-17 07:28:45 -05:00
Ryan Dick
26edc71268 ruff format 2024-12-17 07:28:45 -05:00
Ryan Dick
a4bed7aee3 Minor tidy of FLUX control LoRA implementation. (mostly documentation) 2024-12-17 07:28:45 -05:00
Ryan Dick
5fcd76a712 Fix frontend FLUX graph construction for FLUX control LoRAs. 2024-12-17 07:28:45 -05:00
Mary Hipp
516ffa641c add logic to change type to control_lora properly 2024-12-17 07:28:45 -05:00
Ryan Dick
d84adfd39f Clean up FLUX control LoRA pre-processing logic. 2024-12-17 07:28:45 -05:00
Ryan Dick
ac82f73dbe Make FluxControlLoRALoaderOutput.control_lora non-optional. 2024-12-17 07:28:45 -05:00
Brandon Rising
70811d0bd0 Remove unexpected artifacts in output images 2024-12-17 07:28:45 -05:00
Mary Hipp
e0344a302c feat(ui): update FLUX graph building to include control layers with control loras 2024-12-17 07:28:45 -05:00
Mary Hipp
92b0d89b70 (ui): replace logic for controlnet/t2i to include control_loras and display default settings in model manager 2024-12-17 07:28:45 -05:00
Mary Hipp
da213e4638 feat(ui): add control loras to control adapter model options, add default settings for preprocessor in probe 2024-12-17 07:28:45 -05:00
Brandon Rising
246b59f148 Run pnpm fix, regenerate schema 2024-12-17 07:28:45 -05:00
Brandon Rising
046d19446c Rename Structural Lora to Control Lora 2024-12-17 07:28:45 -05:00
Ryan Dick
040551d4fb Fixes to get FLUX Control LoRA working. 2024-12-17 07:28:45 -05:00
Brandon Rising
f53da60b84 Lots of updates centered around using the lora patcher rather than changing the modules in the transformer model 2024-12-17 07:28:45 -05:00
Brandon Rising
5a035dd19f Support bnb quantized nf4 flux models, use controlnet VAE, only support 1 structural lora per transformer. Various other refactors and bugfixes 2024-12-17 07:28:45 -05:00
Brandon Rising
f3b253987f Initial setup for flux tools control loras 2024-12-17 07:28:45 -05:00
psychedelicious
25ff7918e8 chore(ui): knip 2024-12-16 18:57:43 -08:00
psychedelicious
09fc60acb0 feat(ui): show toasts when filter, transform, select or crop fails 2024-12-16 18:57:43 -08:00
psychedelicious
6f55f2c723 refactor(ui): simpler handling for graph building in enqueuerequested listener 2024-12-16 18:57:43 -08:00
psychedelicious
03b815c884 feat(ui): improved error handling for generation mode calculation
Wrap logic that might throw in a result and log the error if it fails before throwing.
2024-12-16 18:57:43 -08:00
psychedelicious
9cecdd17eb feat(ui): improved error handling when getting composite canvas images
Wrap logic that might throw in a result and log the error if it fails before throwing.
2024-12-16 18:57:43 -08:00
psychedelicious
6b0f7ab57c feat(ui): improved error handling during rasterization
- Ensure the currently-rasterizing adapter is reset to `null` on success or failure of a rasterization operation. In case of failure, this prevents the UI from getting stuck with a disabled Invoke button and tooltip message "Canvas is busy (rasterizing)".
- Log the error if there is one.
2024-12-16 18:57:43 -08:00
psychedelicious
c805e38da2 fix(ui): remove duplicate log on socket connect 2024-12-16 18:57:43 -08:00
psychedelicious
2c1de0f07d fix(ui): missing translation string 2024-12-12 22:44:43 -08:00
psychedelicious
261d5ab488 docs: add redirect for patchmatch docs
The patchmatch lib links directly to our docs: https://invoke-ai.github.io/InvokeAI/installation/060_INSTALL_PATCHMATCH/

That URL doesn't exist any more. Added a redirect to the new URL.
2024-12-12 22:41:05 -08:00
Mary Hipp
ca571cd7a9 swap global and regional 2024-12-12 15:53:18 -05:00
Eugene Brodsky
4c94d41fa9 (chore) ruff format 2024-12-04 17:02:08 +00:00
Eugene Brodsky
4036244ee9 (app) clarify log message when migrating old .cache 2024-12-04 17:02:08 +00:00
Eugene Brodsky
d06232d9ba (config) ensure legacy model configs and node template are writable by the user even if the source files are read-only 2024-12-04 17:02:08 +00:00
Eugene Brodsky
bacbdfb8fc (docker) add comments in docker-entrypoint.sh and ensure variables are not null in bash expansion 2024-12-04 17:02:08 +00:00
Eugene Brodsky
59f42f4682 (pkg) reduce max supported python version as we have not yet tested 3.12 well enough 2024-12-04 17:02:08 +00:00
Eugene Brodsky
a636ac2899 (docker) use 'uv' to manage python installation and the invoke dependencies, since Ubuntu 24.04 comes with Python 3.12 which we do not yet support 2024-12-04 17:02:08 +00:00
Richard Lyons
bd478360d9 Upgrade docker build to ubuntu 24 2024-12-04 17:02:08 +00:00
Richard Lyons
ac0db07649 Fix docker deployment 2024-12-04 17:02:08 +00:00
psychedelicious
b7132ce9e7 fix(ui): capitalization for vietnamese language 2024-12-03 14:52:28 -08:00
psychedelicious
90f30e7748 chore: bump version to v5.4.3 2024-12-03 14:50:09 -08:00
Riccardo Giovanetti
6b86a66bc7 translationBot(ui): update translation (Italian)
Currently translated at 99.3% (1633 of 1643 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-12-03 13:16:12 -08:00
Linos
aa97e626e9 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1643 of 1643 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 99.8% (1641 of 1643 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2024-12-03 13:13:26 -08:00
Ryan Dick
c90736093f Revert FLUX performance improvement that fails on MacOS (#7423)
## Summary

https://github.com/invoke-ai/InvokeAI/issues/7422

As reported in the above ticket, a recent FLUX performance improvement
caused a regression on MacOS. This PR reverts the offending part of the
change.

## Related Issues / Discussions

- Closes #7422 
- Original perf improvement:
https://github.com/invoke-ai/InvokeAI/pull/7399

## QA Instructions

I don't have a Mac capable of running this test, so trusting the report
in #7422 that this fixes the problem.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2024-12-03 10:58:00 -05:00
Ryan Dick
0bff4ace1b Revert performance improvement, because it caused flux inference to fail on Mac: https://github.com/invoke-ai/InvokeAI/issues/7422 2024-12-03 15:18:58 +00:00
psychedelicious
5eb382074e tweak(ui): slightly clearer logic for skipping regional guidance 2024-12-02 23:46:21 -05:00
psychedelicious
46aa930526 fix(ui): skip disabled ref images 2024-12-02 23:46:21 -05:00
psychedelicious
3305bad0c2 fix(app): queue item id check before setting cancel flag should use != instead of is not
The `is` operator compares references, not values. Thanks to a wonderfully unintuitive quirk of python, `is` works on integers from `-5` to `256`, inclusive.

Whenever integers in this range are used for a value, internally python returns a reference to a stable object in memory. When integers outside this range are used as a value, python creates a new object in memory for that integer.

See `PyLong_FromLong` documentation here: https://docs.python.org/3/c-api/long.html

Tying this back to our session processor, we were using `is` to compare the queue item ids for equality. Our queue item ids start at 0, and each queue item created increments this by one. So this comparison works only for the first 256 queue items on the machine.

Starting with the 257th queue item, the comparison starts returning `False`, and cancelation gets weird.

Easy fix - use `!=` instead of `is not`.
2024-12-02 23:22:58 -05:00
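A quick demonstration of the quirk described above (illustrative only; the ints are constructed at runtime so the compiler cannot fold them into shared constants):

```python
# Demonstration of CPython's small-int cache (-5..256).
def make(n: int) -> int:
    return int(str(n))  # force a runtime-constructed int object

a, b = make(256), make(256)
print(a is b, a == b)  # True True   -- ints in -5..256 are cached singletons

c, d = make(257), make(257)
print(c is d, c == d)  # False True  -- larger ints are distinct objects

# Comparing queue item ids by identity therefore breaks once ids pass 256:
current_id, cancel_id = make(300), make(300)
print(current_id is not cancel_id)  # True, despite the ids being equal
print(current_id != cancel_id)      # False -- the value comparison we want
```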
psychedelicious
13703d8f55 chore: bump version to v5.4.3rc2 2024-12-02 15:02:30 -08:00
psychedelicious
60d838d0a5 chore(ui): update whats new copy 2024-12-02 15:02:30 -08:00
Riccardo Giovanetti
2a157a44bf translationBot(ui): update translation (Italian)
Currently translated at 99.3% (1633 of 1643 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-12-02 14:52:05 -08:00
James Reynolds
d61b5833c2 Fix documentation broken links and remove whitespace at end of lines 2024-12-02 14:49:53 -08:00
Jonathan
c094838c6a Update model_util.py 2024-12-02 14:35:02 -08:00
Hosted Weblate
2d334c8dd8 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2024-12-02 14:05:51 -08:00
Mary Hipp
a6be26e174 fix(worker): only apply processor cancel logic if cancel event is for current queue item 2024-12-02 14:03:05 -08:00
psychedelicious
f8c7adddd0 feat(ui): add vietnamese to language picker
Closes #7384
2024-12-02 08:12:14 -05:00
psychedelicious
17da1d92e9 fix(ui): remove "adding to" text on Invoke tooltip on Workflows/Upscaling tabs
The "adding to" text indicates if images are going to the gallery or staging area. This info is relevant only to the canvas tab, but was displayed on Upscaling and Workflows tabs. Removed it from those tabs.
2024-12-02 08:08:16 -05:00
psychedelicious
1cc57a4854 chore(ui): lint 2024-12-02 07:59:12 -05:00
psychedelicious
3993fae331 fix(ui): unable to invoke w/ empty inpaint mask or raster layer
Removed the empty state checks for these layer types - it's always OK to invoke when they are empty.
2024-12-02 07:59:12 -05:00
psychedelicious
1446526d55 tidy(ui): translation keys for canvas layer warnings 2024-12-02 07:59:12 -05:00
psychedelicious
62c024e725 feat(ui): add gallery image ctx menu items to create ref image from image
Appears these actions disappeared at some point. Restoring them.
2024-12-02 07:52:58 -05:00
psychedelicious
1e92bb4e94 fix(ui): ref image defaults to prev ref image's image selection
A redux selector is used to get the "default" IP Adapter. The selector uses the model list query result to select an IP Adapter model to be preset by default.

The selector is memoized, so if we mutate the returned default IP Adapter state, it mutates the result of the selector for all consumers.

For example, the `image` property of the default IP Adapter selector result is `null`. When we set the `image` property of the selector result while creating an IP Adapter, this does not trigger the selector to recompute its result. We end up setting the image for the selector result directly, and all other consumers now have that same image set.

Solution - we need to clone the selector result everywhere it is used. This was missed in a few spots, causing the issue.
2024-12-02 07:48:39 -05:00
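The same class of bug, sketched in Python for illustration (the real fix is in the Redux selector; the names below are hypothetical): mutating a memoized result leaks the change to every consumer of that cache.

```python
# Illustrative Python analogue of the memoized-selector mutation bug.
from copy import deepcopy
from functools import lru_cache


@lru_cache(maxsize=None)
def default_ip_adapter() -> dict:
    # The cache stores and returns the *same* dict object on every call.
    return {"model": "some-ip-adapter", "image": None}


# Bug: mutating the memoized result is visible to all other consumers.
adapter_a = default_ip_adapter()
adapter_a["image"] = "portrait.png"
print(default_ip_adapter()["image"])  # 'portrait.png' -- no longer None

# Fix, mirroring the commit above: clone the memoized result before using it.
adapter_b = deepcopy(default_ip_adapter())
adapter_b["image"] = "landscape.png"  # only this copy is affected
```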
psychedelicious
db6398fdf6 feat(ui): less confusing empty state for rg ref images
It was easy to misunderstand the empty state for a regional guidance reference image. There was no label, so it seemed like it was the whole region that was empty.

This small change adds the "Reference Image" heading to the empty state, so it's clear that the empty state messaging refers to this reference image, not the whole regional guidance layer.
2024-12-02 07:46:10 -05:00
Riccardo Giovanetti
ebd73a2ac2 translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1622 of 1643 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-12-02 02:13:51 -08:00
Hosted Weblate
8ee95cab00 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2024-12-02 02:13:51 -08:00
Linos
d1184201a8 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1643 of 1643 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1638 of 1638 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2024-12-02 02:13:51 -08:00
Nik Nikovsky
5887891654 translationBot(ui): update translation (Polish)
Currently translated at 4.9% (81 of 1638 strings)

Co-authored-by: Nik Nikovsky <zejdzztegomaila@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/pl/
Translation: InvokeAI/Web UI
2024-12-02 02:13:51 -08:00
Riku
765ca4e004 translationBot(ui): update translation (German)
Currently translated at 69.7% (1142 of 1638 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2024-12-02 02:13:51 -08:00
Riku
159b00a490 fix(app): adjust session queue api type 2024-12-01 20:06:05 -08:00
Riku
3fbf6f2d2a chore(ui): update typegen schema 2024-12-01 19:56:09 -08:00
Riku
931fca7cd1 fix(ui): call cancel instead of clear queue 2024-12-01 19:53:12 -08:00
Riku
db84a3a5d4 refactor(ui): move clear queue hook to separate file 2024-12-01 19:42:25 -08:00
psychedelicious
ca8313e805 feat(ui): add new layer from image menu items for staging area
The layers are disabled when created so as to not interfere with the canvas state.
2024-12-01 19:37:49 -08:00
psychedelicious
df849035ee feat(ui): allow setting isEnabled, isLocked and name in createNewCanvasEntityFromImage util 2024-12-01 19:37:49 -08:00
psychedelicious
8d97fe69ca feat(ui): use imageDTOToFile in staging area save to gallery button 2024-12-01 19:37:49 -08:00
psychedelicious
9044e53a9b feat(ui): add imageDTOToFile util 2024-12-01 19:37:49 -08:00
Jonathan
6012b0f912 Update flux_text_encoder.py
Updated version number for FLUX Text Encoding.
2024-11-30 08:29:21 -05:00
Jonathan
bb0ed5dc8a Update flux_denoise.py
Updated node version for FLUX Denoise.
2024-11-30 08:29:21 -05:00
Ryan Dick
021552fd81 Avoid unnecessary dtype conversions with rope encodings. 2024-11-29 12:32:50 -05:00
Ryan Dick
be73dbba92 Use view() instead of rearrange() for better performance. 2024-11-29 12:32:50 -05:00
Ryan Dick
db9c0cad7c Replace custom RMSNorm implementation with torch.nn.functional.rms_norm(...) for improved speed. 2024-11-29 12:32:50 -05:00
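For reference, a small sketch (assuming PyTorch 2.4+, where `torch.nn.functional.rms_norm` is available) showing that the built-in op matches a hand-rolled RMSNorm:

```python
# Sketch: compare a hand-rolled RMSNorm with the built-in op (PyTorch 2.4+).
import torch
import torch.nn.functional as F


def manual_rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Scale by the reciprocal root-mean-square over the last dimension.
    rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x * rms * weight


x = torch.randn(2, 8, 64)
w = torch.ones(64)
print(torch.allclose(manual_rms_norm(x, w), F.rms_norm(x, (64,), weight=w, eps=1e-6), atol=1e-5))
# expected: True
```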
Ryan Dick
54b7f9a063 FLUX Regional Prompting (#7388)
## Summary

This PR adds support for regional prompting with FLUX.

### Example 1
Global prompt: `An architecture rendering of the reception area of a
corporate office with modern decor.`
<img width="1386" alt="image"
src="https://github.com/user-attachments/assets/c8169bdb-49a9-44bc-bd9e-58d98e09094b">

![image](https://github.com/user-attachments/assets/4a426be9-9d7a-4527-b27c-2d2514ee73fe)

## QA Instructions

- [x] Test that there is no slowdown in the base case with a single
global prompt.
- [x] Test image fully covered by regional masks.
- [x] Test image covered by region masks with small gaps.
- [x] Test region masks with large unmasked ‘background’ regions
- [x] Test region masks with significant overlap
- [x] Test multiple global prompts.
- [x] Test no global prompt.
- [x] Test regional negative prompts (It runs... but results are not
great. Needs more tuning to be useful.)
- Test compatibility with:
    - [x] ControlNet
    - [x] LoRA
    - [x] IP-Adapter

## Remaining TODO

- [x] Disable the following UI features for FLUX prompt regions:
negative prompts, reference images, auto-negative.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2024-11-29 08:56:42 -05:00
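A rough, simplified illustration of the regional masking idea (not the actual implementation; the region masks, text spans, and token layout below are assumed inputs): image tokens attend only to the text and image tokens of the region that covers them.

```python
# Simplified illustration only; not InvokeAI's implementation.
import torch


def build_regional_attn_mask(
    region_masks: torch.Tensor,        # (num_regions, num_img_tokens), bool
    txt_spans: list[tuple[int, int]],  # per-region (start, end) into the text sequence
    txt_len: int,
) -> torch.Tensor:
    num_regions, num_img = region_masks.shape
    total = txt_len + num_img
    allow = torch.zeros(total, total, dtype=torch.bool)
    allow[:txt_len, :txt_len] = True  # text-to-text attention is unrestricted
    for r in range(num_regions):
        t0, t1 = txt_spans[r]
        img_idx = txt_len + region_masks[r].nonzero(as_tuple=True)[0]
        allow[img_idx, t0:t1] = True  # region's image tokens -> region's text tokens
        allow[t0:t1, img_idx] = True  # region's text tokens -> region's image tokens
        allow[img_idx.unsqueeze(1), img_idx.unsqueeze(0)] = True  # image self-attention within region
    return allow


# Two regions covering an 8-token image, each paired with 4 text tokens:
masks = torch.zeros(2, 8, dtype=torch.bool)
masks[0, :4] = True
masks[1, 4:] = True
print(build_regional_attn_mask(masks, [(0, 4), (4, 8)], txt_len=8).shape)  # (16, 16)
```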
psychedelicious
7d488a5352 feat(ui): add delete button to regional ref image empty state 2024-11-29 15:51:24 +10:00
psychedelicious
4d7667f63d fix(ui): add missing translations 2024-11-29 15:43:49 +10:00
psychedelicious
08704ee8ec feat(ui): use canvas layer validators in control/ip adapter graph builders 2024-11-29 15:32:48 +10:00
psychedelicious
5910892c33 Merge remote-tracking branch 'origin/main' into ryan/flux-regional-prompting 2024-11-29 15:19:39 +10:00
psychedelicious
46a09d9e90 feat(ui): format warnings tooltip 2024-11-29 13:32:51 +10:00
psychedelicious
df0c7d73f3 feat(ui): use regional guidance validation utils in graph builders 2024-11-29 13:26:09 +10:00
psychedelicious
3905c97e32 feat(ui): return translation keys from validation utils instead of translated strings 2024-11-29 13:25:09 +10:00
psychedelicious
0be796a808 feat(ui): use layer validation utils in invoke readiness utils 2024-11-29 13:14:26 +10:00
psychedelicious
7dd33b0f39 feat(ui): add indicator to canvas layer headers, displaying validation warnings
If there are any issues with the layer, the icon is displayed. If the layer is disabled, the icon is greyed out but still visible.
2024-11-29 13:13:47 +10:00
psychedelicious
484aaf1595 feat(ui): add canvas layer validation utils
These helpers consolidate layer validation checks. For example, checking that the layer has content drawn, is compatible with the selected main model, has valid reference images, etc.
2024-11-29 13:12:32 +10:00
psychedelicious
c276b60af9 tidy(ui): use object for addRegions graph builder util arg 2024-11-29 08:49:41 +10:00
Ryan Dick
5d8dd6e26e Fix FLUX regional negative prompts. 2024-11-28 18:49:29 +00:00
Emmanuel Ferdman
5bca68d873 docs: update code of conduct reference
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2024-11-27 17:38:33 -08:00
Ryan Dick
64364e7911 Short-circuit if there are no region masks in FLUX and don't apply attention masking. 2024-11-27 22:40:10 +00:00
Ryan Dick
6565cea039 Comment unused _prepare_unrestricted_attn_mask(...) for future reference. 2024-11-27 22:16:44 +00:00
Ryan Dick
3ebd8d6c07 Delete outdated TODO comment. 2024-11-27 22:13:25 +00:00
Ryan Dick
e970185161 Tweak flux regional prompting attention scheme based on latest experimentation. 2024-11-27 22:13:07 +00:00
Ryan Dick
fa5653cdf7 Remove unused 'denoise' param to addRegions(). 2024-11-27 17:08:42 +00:00
Ryan Dick
9a7b000995 Update frontend to support regional prompts with FLUX in the canvas. 2024-11-27 17:04:43 +00:00
Ryan Dick
3a27242838 Bump transformers. The main motivation for this bump is to ingest a fix for DepthAnything postprocessing artifacts. 2024-11-27 07:46:16 -08:00
Ryan Dick
8cfb032051 Add utility ImagePanelLayoutInvocation for working with In-Context LoRA workflows. 2024-11-26 20:58:31 -08:00
Ryan Dick
06a9d4e2b2 Use a Textarea component for the FluxTextEncoderInvocation prompt field. 2024-11-26 20:58:31 -08:00
Brandon Rising
ed46acee79 fix: Fail scan on InvalidMagicError in picklescan, update default for read_checkpoint_meta to scan unless explicitly told not to 2024-11-26 16:17:12 -05:00
Ryan Dick
b54463d294 Allow regional prompting background regions to attend to themselves and to the entire txt embedding. 2024-11-26 17:57:31 +00:00
Ryan Dick
faee79dc95 Distinguish between restricted and unrestricted attn masks in FLUX regional prompting. 2024-11-26 16:55:52 +00:00
Mary Hipp
965cd76e33 lint fix 2024-11-26 11:25:53 -05:00
Mary Hipp
e5e8cbf34c shorten reference image mode descriptions; 2024-11-26 11:25:53 -05:00
Mary Hipp
3412a52594 (ui): updates various informational tooltips, adds descriptons to IP adapter method options 2024-11-26 11:25:53 -05:00
Ryan Dick
e01f66b026 Apply regional attention masks in the single stream blocks in addition to the double stream blocks. 2024-11-25 22:40:08 +00:00
Ryan Dick
53abdde242 Update Flux RegionalPromptingExtension to prepare both a mask with restricted image self-attention and a mask with unrestricted image self attention. 2024-11-25 22:04:23 +00:00
Ryan Dick
94c088300f Be smarter about selecting the global CLIP embedding for FLUX regional prompting. 2024-11-25 20:15:04 +00:00
Ryan Dick
3741a6f5e0 Fix device handling for regional masks and apply the attention mask in the FLUX double stream block. 2024-11-25 16:02:03 +00:00
Kent Keirsey
059336258f Create SECURITY.md 2024-11-25 04:10:03 -08:00
Ryan Dick
2c23b8414c Use a single global CLIP embedding for FLUX regional guidance. 2024-11-22 23:01:43 +00:00
Mary Hipp
271cc52c80 fix(ui): use token for download if its in store 2024-11-22 12:08:05 -05:00
Ryan Dick
20356c0746 Fixup the logic for preparing FLUX regional prompt attention masks. 2024-11-21 22:46:25 +00:00
psychedelicious
e44458609f chore: bump version to v5.4.3rc1 2024-11-21 10:32:43 -08:00
psychedelicious
69d86a7696 feat(ui): address feedback 2024-11-21 09:54:35 -08:00
Hippalectryon
56db1a9292 Use proxyrect and setEntityPosition to sync transformer position 2024-11-21 09:54:35 -08:00
Hippalectryon
cf50e5eeee Make sure the canvas is focused 2024-11-21 09:54:35 -08:00
Hippalectryon
c9c07968d2 lint 2024-11-21 09:54:35 -08:00
Hippalectryon
97d0757176 use $isInteractable instead of $isDisabled 2024-11-21 09:54:35 -08:00
Hippalectryon
0f51b677a9 refactor 2024-11-21 09:54:35 -08:00
Hippalectryon
56ca94c3a9 Don't move if the layer is disabled
Lint
2024-11-21 09:54:35 -08:00
Hippalectryon
28d169f859 Allow moving layers using the keyboard 2024-11-21 09:54:35 -08:00
psychedelicious
92f71d99ee tweak(ui): use X icon for rg ref image delete button 2024-11-21 08:50:39 -08:00
psychedelicious
0764c02b1d tweak(ui): code style 2024-11-21 08:50:39 -08:00
psychedelicious
081c7569fe feat(ui): add global ref image empty state 2024-11-21 08:50:39 -08:00
psychedelicious
20f6532ee8 feat(ui): add empty state for regional guidance ref image 2024-11-21 08:50:39 -08:00
Mary Hipp
b9e8910478 feat(ui): add actions for video modal clicks 2024-11-21 11:15:55 -05:00
Mary Hipp
ded8391e3c use nanostore for schema parsed instead 2024-11-20 20:13:31 -05:00
Mary Hipp
e9dd2c396a limit to one hook 2024-11-20 20:13:31 -05:00
Mary Hipp
0d86de0cb5 fix(ui): make sure schema has loaded before trying to load any workflows 2024-11-20 20:13:31 -05:00
Ryan Dick
bad1149504 WIP - add rough logic for preparing the FLUX regional prompting attention mask. 2024-11-20 22:29:36 +00:00
Ryan Dick
fda7aaa7ca Pass RegionalPromptingExtension down to the CustomDoubleStreamBlockProcessor in FLUX. 2024-11-20 19:48:04 +00:00
Ryan Dick
85c616fa34 WIP - Pass prompt masks to FLUX model during denoising. 2024-11-20 18:51:43 +00:00
psychedelicious
549f4e9794 feat(ui): set default infill method to lama 2024-11-20 11:19:17 -05:00
psychedelicious
ef8ededd2f fix(ui): disable width and height output on image batch output
There's a technical challenge with outputting these values directly. `ImageField` does not store them, so the batch's `ImageField` collection does not have width and height for each image.

In order to set up the batch and pass along width and height for each image, we'd need to make a network request for each image when the user clicks Invoke. It would often be cached, but this will eventually create a scaling issue and poor user experience.

As a very simple workaround, users can output the batch image output into an `Image Primitive` node to access the width and height.

This change is implemented by adding some simple special handling when parsing the output fields for the `image_batch` node.

I'll keep this situation in mind when extending the batching system to other field types.
2024-11-20 11:16:54 -05:00
Mary Hipp
1948ffe106 make sure Soft Edge Detection has preprocessor applied 2024-11-20 08:46:02 -05:00
psychedelicious
c70f4404c4 fix(ui): special node icon tooltip 2024-11-19 14:29:09 -08:00
psychedelicious
b157ae928c chore(ui): update what's new copy 2024-11-19 14:29:09 -08:00
psychedelicious
7a0871992d chore: bump version to v5.4.2 2024-11-19 14:29:09 -08:00
Hosted Weblate
b38e2e14f4 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

translationBot(ui): update translation files

Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2024-11-19 14:12:00 -08:00
psychedelicious
7c0e70ec84 tweak(ui): "Watch on YouTube" -> "Watch" 2024-11-19 14:02:11 -08:00
psychedelicious
a89ae9d2bf feat(ui): add links to studio sessions/discord 2024-11-19 14:02:11 -08:00
psychedelicious
ad1fcb3f07 chore(ui): bump @invoke-ai/ui-library
Brings in a fix for `ExternalLink`
2024-11-19 14:02:11 -08:00
psychedelicious
87d74b910b feat(ui): support videos modal 2024-11-19 14:02:11 -08:00
psychedelicious
7ad1c297a4 feat(ui): add actions for reset canvas layers / generation settings to session menus 2024-11-19 13:55:16 -08:00
psychedelicious
fbc629faa6 feat(ui): change reset canvas button to new session menu 2024-11-19 13:55:16 -08:00
psychedelicious
7baa6b3c09 feat(ui): split up new from image into submenus
- `New Canvas from Image` -> `As Raster Layer`, `As Raster Layer (Resize)`, `As Control Layer`, `As Control Layer (Resize)`
- `New Layer from Image` -> (each layer type)
2024-11-19 10:34:00 -08:00
psychedelicious
53d482bade feat(ui): add image ctx menu new canvas without resize option 2024-11-19 10:34:00 -08:00
psychedelicious
5aca04b51b feat(ui): change reset canvas icon to "empty" 2024-11-19 09:56:25 -08:00
psychedelicious
ea8787c8ff feat(ui): update invoke button tooltip for batching
- Split up logic to determine reason why the user cannot invoke for each tab.
- Fix issue where the workflows tab would show reasons related to canvas/upscale tab. The tooltip now only shows information relevant to the current tab.
- Add calculation for batch size to the queue count prediction.
- Use a constant for the enqueue mutation's fixed cache key, instead of a string. Just some typo protection.
2024-11-19 09:53:59 -08:00
psychedelicious
cead2c4445 feat(ui): split up selector utils for useIsReadyToEnqueue 2024-11-19 09:53:59 -08:00
Mary Hipp
f76ac1808c fix(ui): simplify logic for non-local invocation progress alerts 2024-11-19 12:40:40 -05:00
psychedelicious
f01210861b chore: ruff 2024-11-19 07:02:37 -08:00
psychedelicious
f757f23ef0 chore(ui): typegen 2024-11-19 07:02:37 -08:00
psychedelicious
872a6ef209 tidy(nodes): extract slerp from lblend to util fn 2024-11-19 07:02:37 -08:00
psychedelicious
4267e5ffc4 tidy(nodes): bring masked blend latents masking logic into invoke core 2024-11-19 07:02:37 -08:00
Brandon Rising
a69c5ff9ef Add copyright notice for CIELab_to_UPLab.icc 2024-11-19 07:02:37 -08:00
Brandon Rising
3ebd8d7d1b Fix .icc asset file in pyproject.toml 2024-11-19 07:02:37 -08:00
Brandon Rising
1fd80d54a4 Run Ruff 2024-11-19 07:02:37 -08:00
Brandon Rising
991f63e455 Store CIELab_to_UPLab.icc within the repo 2024-11-19 07:02:37 -08:00
Brandon Rising
6a1efd3527 Add validation to some of the node inputs 2024-11-19 07:02:37 -08:00
Brandon Rising
0eadc0dd9e feat: Support a subset of composition nodes within base invokeai 2024-11-19 07:02:37 -08:00
1177 changed files with 67780 additions and 23928 deletions


@@ -1,9 +1,11 @@
*
!invokeai
!pyproject.toml
!uv.lock
!docker/docker-entrypoint.sh
!LICENSE
**/dist
**/node_modules
**/__pycache__
**/*.egg-info
**/*.egg-info


@@ -1,2 +1,5 @@
b3dccfaeb636599c02effc377cdd8a87d658256c
218b6d0546b990fc449c876fb99f44b50c4daa35
182580ff6970caed400be178c5b888514b75d7f2
8e9d5c1187b0d36da80571ce4c8ba9b3a37b6c46
99aac5870e1092b182e6c5f21abcaab6936a4ad1

3
.gitattributes vendored

@@ -2,4 +2,5 @@
# Only affects text files and ignores other file types.
# For more info see: https://www.aleksandrhovhannisyan.com/blog/crlf-vs-lf-normalizing-line-endings-in-git/
* text=auto
docker/** text eol=lf
docker/** text eol=lf
tests/test_model_probe/stripped_models/** filter=lfs diff=lfs merge=lfs -text

10
.github/CODEOWNERS vendored

@@ -1,12 +1,12 @@
# continuous integration
/.github/workflows/ @lstein @blessedcoolant @hipsterusername @ebr
/.github/workflows/ @lstein @blessedcoolant @hipsterusername @ebr @jazzhaiku
# documentation
/docs/ @lstein @blessedcoolant @hipsterusername @Millu
/mkdocs.yml @lstein @blessedcoolant @hipsterusername @Millu
/docs/ @lstein @blessedcoolant @hipsterusername @psychedelicious
/mkdocs.yml @lstein @blessedcoolant @hipsterusername @psychedelicious
# nodes
/invokeai/app/ @Kyle0654 @blessedcoolant @psychedelicious @brandonrising @hipsterusername
/invokeai/app/ @blessedcoolant @psychedelicious @brandonrising @hipsterusername @jazzhaiku
# installation and configuration
/pyproject.toml @lstein @blessedcoolant @hipsterusername
@@ -22,7 +22,7 @@
/invokeai/backend @blessedcoolant @psychedelicious @lstein @maryhipp @hipsterusername
# generation, model management, postprocessing
/invokeai/backend @damian0815 @lstein @blessedcoolant @gregghelt2 @StAlKeR7779 @brandonrising @ryanjdick @hipsterusername
/invokeai/backend @lstein @blessedcoolant @brandonrising @hipsterusername @jazzhaiku
# front ends
/invokeai/frontend/CLI @lstein @hipsterusername


@@ -76,9 +76,6 @@ jobs:
latest=${{ matrix.gpu-driver == 'cuda' && github.ref == 'refs/heads/main' }}
suffix=-${{ matrix.gpu-driver }},onlatest=false
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
with:
@@ -100,10 +97,12 @@ jobs:
context: .
file: docker/Dockerfile
platforms: ${{ env.PLATFORMS }}
build-args: |
GPU_DRIVER=${{ matrix.gpu-driver }}
push: ${{ github.ref == 'refs/heads/main' || github.ref_type == 'tag' || github.event.inputs.push-to-registry }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: |
type=gha,scope=${{ github.ref_name }}-${{ matrix.gpu-driver }}
type=gha,scope=main-${{ matrix.gpu-driver }}
cache-to: type=gha,mode=max,scope=${{ github.ref_name }}-${{ matrix.gpu-driver }}
# cache-from: |
# type=gha,scope=${{ github.ref_name }}-${{ matrix.gpu-driver }}
# type=gha,scope=main-${{ matrix.gpu-driver }}
# cache-to: type=gha,mode=max,scope=${{ github.ref_name }}-${{ matrix.gpu-driver }}


@@ -1,6 +1,6 @@
# Builds and uploads the installer and python build artifacts.
# Builds and uploads python build artifacts.
name: build installer
name: build wheel
on:
workflow_dispatch:
@@ -17,7 +17,7 @@ jobs:
- name: setup python
uses: actions/setup-python@v5
with:
python-version: '3.10'
python-version: '3.12'
cache: pip
cache-dependency-path: pyproject.toml
@@ -27,19 +27,12 @@ jobs:
- name: setup frontend
uses: ./.github/actions/install-frontend-deps
- name: create installer
id: create_installer
run: ./create_installer.sh
working-directory: installer
- name: build wheel
id: build_wheel
run: ./scripts/build_wheel.sh
- name: upload python distribution artifact
uses: actions/upload-artifact@v4
with:
name: dist
path: ${{ steps.create_installer.outputs.DIST_PATH }}
- name: upload installer artifact
uses: actions/upload-artifact@v4
with:
name: installer
path: ${{ steps.create_installer.outputs.INSTALLER_PATH }}
path: ${{ steps.build_wheel.outputs.DIST_PATH }}


@@ -44,7 +44,12 @@ jobs:
- name: check for changed frontend files
if: ${{ inputs.always_run != true }}
id: changed-files
uses: tj-actions/changed-files@v42
# Pinned to the _hash_ for v45.0.9 to prevent supply-chain attacks.
# See:
# - CVE-2025-30066
# - https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised
# - https://github.com/tj-actions/changed-files/issues/2463
uses: tj-actions/changed-files@a284dc1814e3fd07f2e34267fc8f81227ed29fb8
with:
files_yaml: |
frontend:


@@ -44,7 +44,12 @@ jobs:
- name: check for changed frontend files
if: ${{ inputs.always_run != true }}
id: changed-files
uses: tj-actions/changed-files@v42
# Pinned to the _hash_ for v45.0.9 to prevent supply-chain attacks.
# See:
# - CVE-2025-30066
# - https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised
# - https://github.com/tj-actions/changed-files/issues/2463
uses: tj-actions/changed-files@a284dc1814e3fd07f2e34267fc8f81227ed29fb8
with:
files_yaml: |
frontend:


@@ -34,6 +34,9 @@ on:
jobs:
python-checks:
env:
# uv requires a venv by default - but for this, we can simply use the system python
UV_SYSTEM_PYTHON: 1
runs-on: ubuntu-latest
timeout-minutes: 5 # expected run time: <1 min
steps:
@@ -43,7 +46,12 @@ jobs:
- name: check for changed python files
if: ${{ inputs.always_run != true }}
id: changed-files
uses: tj-actions/changed-files@v42
# Pinned to the _hash_ for v45.0.9 to prevent supply-chain attacks.
# See:
# - CVE-2025-30066
# - https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised
# - https://github.com/tj-actions/changed-files/issues/2463
uses: tj-actions/changed-files@a284dc1814e3fd07f2e34267fc8f81227ed29fb8
with:
files_yaml: |
python:
@@ -52,25 +60,19 @@ jobs:
- '!invokeai/frontend/web/**'
- 'tests/**'
- name: setup python
- name: setup uv
if: ${{ steps.changed-files.outputs.python_any_changed == 'true' || inputs.always_run == true }}
uses: actions/setup-python@v5
uses: astral-sh/setup-uv@v5
with:
python-version: '3.10'
cache: pip
cache-dependency-path: pyproject.toml
- name: install ruff
if: ${{ steps.changed-files.outputs.python_any_changed == 'true' || inputs.always_run == true }}
run: pip install ruff==0.6.0
shell: bash
version: '0.6.10'
enable-cache: true
- name: ruff check
if: ${{ steps.changed-files.outputs.python_any_changed == 'true' || inputs.always_run == true }}
run: ruff check --output-format=github .
run: uv tool run ruff@0.11.2 check --output-format=github .
shell: bash
- name: ruff format
if: ${{ steps.changed-files.outputs.python_any_changed == 'true' || inputs.always_run == true }}
run: ruff format --check .
run: uv tool run ruff@0.11.2 format --check .
shell: bash


@@ -39,24 +39,15 @@ jobs:
strategy:
matrix:
python-version:
- '3.10'
- '3.11'
- '3.12'
platform:
- linux-cuda-11_7
- linux-rocm-5_2
- linux-cpu
- macos-default
- windows-cpu
include:
- platform: linux-cuda-11_7
os: ubuntu-22.04
github-env: $GITHUB_ENV
- platform: linux-rocm-5_2
os: ubuntu-22.04
extra-index-url: 'https://download.pytorch.org/whl/rocm5.2'
github-env: $GITHUB_ENV
- platform: linux-cpu
os: ubuntu-22.04
os: ubuntu-24.04
extra-index-url: 'https://download.pytorch.org/whl/cpu'
github-env: $GITHUB_ENV
- platform: macos-default
@@ -70,14 +61,22 @@ jobs:
timeout-minutes: 15 # expected run time: 2-6 min, depending on platform
env:
PIP_USE_PEP517: '1'
UV_SYSTEM_PYTHON: 1
steps:
- name: checkout
uses: actions/checkout@v4
# https://github.com/nschloe/action-cached-lfs-checkout
uses: nschloe/action-cached-lfs-checkout@f46300cd8952454b9f0a21a3d133d4bd5684cfc2
- name: check for changed python files
if: ${{ inputs.always_run != true }}
id: changed-files
uses: tj-actions/changed-files@v42
# Pinned to the _hash_ for v45.0.9 to prevent supply-chain attacks.
# See:
# - CVE-2025-30066
# - https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised
# - https://github.com/tj-actions/changed-files/issues/2463
uses: tj-actions/changed-files@a284dc1814e3fd07f2e34267fc8f81227ed29fb8
with:
files_yaml: |
python:
@@ -86,20 +85,25 @@ jobs:
- '!invokeai/frontend/web/**'
- 'tests/**'
- name: setup uv
if: ${{ steps.changed-files.outputs.python_any_changed == 'true' || inputs.always_run == true }}
uses: astral-sh/setup-uv@v5
with:
version: '0.6.10'
enable-cache: true
python-version: ${{ matrix.python-version }}
- name: setup python
if: ${{ steps.changed-files.outputs.python_any_changed == 'true' || inputs.always_run == true }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
cache: pip
cache-dependency-path: pyproject.toml
- name: install dependencies
if: ${{ steps.changed-files.outputs.python_any_changed == 'true' || inputs.always_run == true }}
env:
PIP_EXTRA_INDEX_URL: ${{ matrix.extra-index-url }}
run: >
pip3 install --editable=".[test]"
UV_INDEX: ${{ matrix.extra-index-url }}
run: uv pip install --editable ".[test]"
- name: run pytest
if: ${{ steps.changed-files.outputs.python_any_changed == 'true' || inputs.always_run == true }}


@@ -49,7 +49,7 @@ jobs:
always_run: true
build:
uses: ./.github/workflows/build-installer.yml
uses: ./.github/workflows/build-wheel.yml
publish-testpypi:
runs-on: ubuntu-latest

98
.github/workflows/typegen-checks.yml vendored Normal file

@@ -0,0 +1,98 @@
# Runs typegen schema quality checks.
# Frontend types should match the server.
#
# Checks for changes to files before running the checks.
# If always_run is true, always runs the checks.
name: 'typegen checks'
on:
push:
branches:
- 'main'
pull_request:
types:
- 'ready_for_review'
- 'opened'
- 'synchronize'
merge_group:
workflow_dispatch:
inputs:
always_run:
description: 'Always run the checks'
required: true
type: boolean
default: true
workflow_call:
inputs:
always_run:
description: 'Always run the checks'
required: true
type: boolean
default: true
jobs:
typegen-checks:
runs-on: ubuntu-22.04
timeout-minutes: 15 # expected run time: <5 min
steps:
- name: checkout
uses: actions/checkout@v4
- name: check for changed files
if: ${{ inputs.always_run != true }}
id: changed-files
# Pinned to the _hash_ for v45.0.9 to prevent supply-chain attacks.
# See:
# - CVE-2025-30066
# - https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised
# - https://github.com/tj-actions/changed-files/issues/2463
uses: tj-actions/changed-files@a284dc1814e3fd07f2e34267fc8f81227ed29fb8
with:
files_yaml: |
src:
- 'pyproject.toml'
- 'invokeai/**'
- name: setup uv
if: ${{ steps.changed-files.outputs.src_any_changed == 'true' || inputs.always_run == true }}
uses: astral-sh/setup-uv@v5
with:
version: '0.6.10'
enable-cache: true
python-version: '3.11'
- name: setup python
if: ${{ steps.changed-files.outputs.src_any_changed == 'true' || inputs.always_run == true }}
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: install dependencies
if: ${{ steps.changed-files.outputs.src_any_changed == 'true' || inputs.always_run == true }}
env:
UV_INDEX: ${{ matrix.extra-index-url }}
run: uv pip install --editable .
- name: install frontend dependencies
if: ${{ steps.changed-files.outputs.src_any_changed == 'true' || inputs.always_run == true }}
uses: ./.github/actions/install-frontend-deps
- name: copy schema
if: ${{ steps.changed-files.outputs.src_any_changed == 'true' || inputs.always_run == true }}
run: cp invokeai/frontend/web/src/services/api/schema.ts invokeai/frontend/web/src/services/api/schema_orig.ts
shell: bash
- name: generate schema
if: ${{ steps.changed-files.outputs.src_any_changed == 'true' || inputs.always_run == true }}
run: cd invokeai/frontend/web && uv run ../../../scripts/generate_openapi_schema.py | pnpm typegen
shell: bash
- name: compare files
if: ${{ steps.changed-files.outputs.src_any_changed == 'true' || inputs.always_run == true }}
run: |
if ! diff invokeai/frontend/web/src/services/api/schema.ts invokeai/frontend/web/src/services/api/schema_orig.ts; then
echo "Files are different!";
exit 1;
fi
shell: bash

68
.github/workflows/uv-lock-checks.yml vendored Normal file

@@ -0,0 +1,68 @@
# Check the `uv` lockfile for consistency with `pyproject.toml`.
#
# If this check fails, you should run `uv lock` to update the lockfile.
name: 'uv lock checks'
on:
push:
branches:
- 'main'
pull_request:
types:
- 'ready_for_review'
- 'opened'
- 'synchronize'
merge_group:
workflow_dispatch:
inputs:
always_run:
description: 'Always run the checks'
required: true
type: boolean
default: true
workflow_call:
inputs:
always_run:
description: 'Always run the checks'
required: true
type: boolean
default: true
jobs:
uv-lock-checks:
env:
# uv requires a venv by default - but for this, we can simply use the system python
UV_SYSTEM_PYTHON: 1
runs-on: ubuntu-latest
timeout-minutes: 5 # expected run time: <1 min
steps:
- name: checkout
uses: actions/checkout@v4
- name: check for changed python files
if: ${{ inputs.always_run != true }}
id: changed-files
# Pinned to the _hash_ for v45.0.9 to prevent supply-chain attacks.
# See:
# - CVE-2025-30066
# - https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised
# - https://github.com/tj-actions/changed-files/issues/2463
uses: tj-actions/changed-files@a284dc1814e3fd07f2e34267fc8f81227ed29fb8
with:
files_yaml: |
uvlock-pyprojecttoml:
- 'pyproject.toml'
- 'uv.lock'
- name: setup uv
if: ${{ steps.changed-files.outputs.uvlock-pyprojecttoml_any_changed == 'true' || inputs.always_run == true }}
uses: astral-sh/setup-uv@v5
with:
version: '0.6.10'
enable-cache: true
- name: check lockfile
if: ${{ steps.changed-files.outputs.uvlock-pyprojecttoml_any_changed == 'true' || inputs.always_run == true }}
run: uv lock --locked # this will exit with 1 if the lockfile is not consistent with pyproject.toml
shell: bash

1
.nvmrc Normal file

@@ -0,0 +1 @@
v22.14.0


@@ -4,21 +4,29 @@ repos:
hooks:
- id: black
name: black
stages: [commit]
stages: [pre-commit]
language: system
entry: black
types: [python]
- id: flake8
name: flake8
stages: [commit]
stages: [pre-commit]
language: system
entry: flake8
types: [python]
- id: isort
name: isort
stages: [commit]
stages: [pre-commit]
language: system
entry: isort
types: [python]
types: [python]
- id: uvlock
name: uv lock
stages: [pre-commit]
language: system
entry: uv lock
files: ^pyproject\.toml$
pass_filenames: false


@@ -16,7 +16,7 @@ help:
@echo "frontend-build Build the frontend in order to run on localhost:9090"
@echo "frontend-dev Run the frontend in developer mode on localhost:5173"
@echo "frontend-typegen Generate types for the frontend from the OpenAPI schema"
@echo "installer-zip Build the installer .zip file for the current version"
@echo "wheel Build the wheel for the current version"
@echo "tag-release Tag the GitHub repository with the current version (use at release time only!)"
@echo "openapi Generate the OpenAPI schema for the app, outputting to stdout"
@echo "docs Serve the mkdocs site with live reload"
@@ -64,13 +64,13 @@ frontend-dev:
frontend-typegen:
cd invokeai/frontend/web && python ../../../scripts/generate_openapi_schema.py | pnpm typegen
# Installer zip file
installer-zip:
cd installer && ./create_installer.sh
# Tag the release
wheel:
cd scripts && ./build_wheel.sh
# Tag the release
tag-release:
cd installer && ./tag_release.sh
cd scripts && ./tag_release.sh
# Generate the OpenAPI Schema for the app
openapi:


@@ -30,51 +30,12 @@ Invoke is available in two editions:
|----------------------------------------------------------------------------------------------------------------------------|
| [Installation and Updates][installation docs] - [Documentation and Tutorials][docs home] - [Bug Reports][github issues] - [Contributing][contributing docs] |
</div>
# Installation
## Quick Start
To get started with Invoke, [Download the Installer](https://www.invoke.com/downloads).
1. Download and unzip the installer from the bottom of the [latest release][latest release link].
2. Run the installer script.
For detailed step by step instructions, or for instructions on manual/docker installations, visit our documentation on [Installation and Updates][installation docs]
- **Windows**: Double-click on the `install.bat` script.
- **macOS**: Open a Terminal window, drag the file `install.sh` from Finder into the Terminal, and press enter.
- **Linux**: Run `install.sh`.
3. When prompted, enter a location for the install and select your GPU type.
4. Once the install finishes, find the directory you selected during install. The default location is `C:\Users\Username\invokeai` for Windows or `~/invokeai` for Linux/macOS.
5. Run the launcher script (`invoke.bat` for Windows, `invoke.sh` for macOS and Linux) the same way you ran the installer script in step 2.
6. Select option 1 to start the application. Once it starts up, open your browser and go to <http://localhost:9090>.
7. Open the model manager tab to install a starter model and then you'll be ready to generate.
More detail, including hardware requirements and manual install instructions, are available in the [installation documentation][installation docs].
## Docker Container
We publish official container images in Github Container Registry: https://github.com/invoke-ai/InvokeAI/pkgs/container/invokeai. Both CUDA and ROCm images are available. Check the above link for relevant tags.
> [!IMPORTANT]
> Ensure that Docker is set up to use the GPU. Refer to [NVIDIA][nvidia docker docs] or [AMD][amd docker docs] documentation.
### Generate!
Run the container, modifying the command as necessary:
```bash
docker run --runtime=nvidia --gpus=all --publish 9090:9090 ghcr.io/invoke-ai/invokeai
```
Then open `http://localhost:9090` and install some models using the Model Manager tab to begin generating.
For ROCm, add `--device /dev/kfd --device /dev/dri` to the `docker run` command.
### Persist your data
You will likely want to persist your workspace outside of the container. Use the `--volume /home/myuser/invokeai:/invokeai` flag to mount some local directory (using its **absolute** path) to the `/invokeai` path inside the container. Your generated images and models will reside there. You can use this directory with other InvokeAI installations, or switch between runtime directories as needed.
### DIY
Build your own image and customize the environment to match your needs using our `docker-compose` stack. See [README.md](./docker/README.md) in the [docker](./docker) directory.
## Troubleshooting, FAQ and Support

14
SECURITY.md Normal file

@@ -0,0 +1,14 @@
# Security Policy
## Supported Versions
Only the latest version of Invoke will receive security updates.
We do not currently maintain multiple versions of the application with updates.
## Reporting a Vulnerability
To report a vulnerability, contact the Invoke team directly at security@invoke.ai
At this time, we do not maintain a formal bug bounty program.
You can also share identified security issues with our team on huntr.com


@@ -1,58 +1,8 @@
# syntax=docker/dockerfile:1.4
## Builder stage
#### Web UI ------------------------------------
FROM library/ubuntu:23.04 AS builder
ARG DEBIAN_FRONTEND=noninteractive
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
apt update && apt-get install -y \
git \
python3-venv \
python3-pip \
build-essential
ENV INVOKEAI_SRC=/opt/invokeai
ENV VIRTUAL_ENV=/opt/venv/invokeai
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
ARG GPU_DRIVER=cuda
ARG TARGETPLATFORM="linux/amd64"
# unused but available
ARG BUILDPLATFORM
WORKDIR ${INVOKEAI_SRC}
COPY invokeai ./invokeai
COPY pyproject.toml ./
# Editable mode helps use the same image for development:
# the local working copy can be bind-mounted into the image
# at path defined by ${INVOKEAI_SRC}
# NOTE: there are no pytorch builds for arm64 + cuda, only cpu
# x86_64/CUDA is default
RUN --mount=type=cache,target=/root/.cache/pip \
python3 -m venv ${VIRTUAL_ENV} &&\
if [ "$TARGETPLATFORM" = "linux/arm64" ] || [ "$GPU_DRIVER" = "cpu" ]; then \
extra_index_url_arg="--extra-index-url https://download.pytorch.org/whl/cpu"; \
elif [ "$GPU_DRIVER" = "rocm" ]; then \
extra_index_url_arg="--extra-index-url https://download.pytorch.org/whl/rocm6.1"; \
else \
extra_index_url_arg="--extra-index-url https://download.pytorch.org/whl/cu124"; \
fi &&\
# xformers + triton fails to install on arm64
if [ "$GPU_DRIVER" = "cuda" ] && [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
pip install $extra_index_url_arg -e ".[xformers]"; \
else \
pip install $extra_index_url_arg -e "."; \
fi
# #### Build the Web UI ------------------------------------
FROM node:20-slim AS web-builder
FROM docker.io/node:22-slim AS web-builder
ENV PNPM_HOME="/pnpm"
ENV PATH="$PNPM_HOME:$PATH"
RUN corepack use pnpm@8.x
@@ -64,61 +14,100 @@ RUN --mount=type=cache,target=/pnpm/store \
pnpm install --frozen-lockfile
RUN npx vite build
#### Runtime stage ---------------------------------------
## Backend ---------------------------------------
FROM library/ubuntu:23.04 AS runtime
FROM library/ubuntu:24.04
ARG DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt \
--mount=type=cache,target=/var/lib/apt \
apt update && apt install -y --no-install-recommends \
ca-certificates \
git \
gosu \
libglib2.0-0 \
libgl1 \
libglx-mesa0 \
build-essential \
libopencv-dev \
libstdc++-10-dev
RUN apt update && apt install -y --no-install-recommends \
git \
curl \
vim \
tmux \
ncdu \
iotop \
bzip2 \
gosu \
magic-wormhole \
libglib2.0-0 \
libgl1-mesa-glx \
python3-venv \
python3-pip \
build-essential \
libopencv-dev \
libstdc++-10-dev &&\
apt-get clean && apt-get autoclean
ENV \
PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
VIRTUAL_ENV=/opt/venv \
INVOKEAI_SRC=/opt/invokeai \
PYTHON_VERSION=3.12 \
UV_PYTHON=3.12 \
UV_COMPILE_BYTECODE=1 \
UV_MANAGED_PYTHON=1 \
UV_LINK_MODE=copy \
UV_PROJECT_ENVIRONMENT=/opt/venv \
UV_INDEX="https://download.pytorch.org/whl/cu124" \
INVOKEAI_ROOT=/invokeai \
INVOKEAI_HOST=0.0.0.0 \
INVOKEAI_PORT=9090 \
PATH="/opt/venv/bin:$PATH" \
CONTAINER_UID=${CONTAINER_UID:-1000} \
CONTAINER_GID=${CONTAINER_GID:-1000}
ARG GPU_DRIVER=cuda
ENV INVOKEAI_SRC=/opt/invokeai
ENV VIRTUAL_ENV=/opt/venv/invokeai
ENV INVOKEAI_ROOT=/invokeai
ENV INVOKEAI_HOST=0.0.0.0
ENV INVOKEAI_PORT=9090
ENV PATH="$VIRTUAL_ENV/bin:$INVOKEAI_SRC:$PATH"
ENV CONTAINER_UID=${CONTAINER_UID:-1000}
ENV CONTAINER_GID=${CONTAINER_GID:-1000}
# Install `uv` for package management
COPY --from=ghcr.io/astral-sh/uv:0.6.9 /uv /uvx /bin/
# --link requires buildkit w/ dockerfile syntax 1.4
COPY --link --from=builder ${INVOKEAI_SRC} ${INVOKEAI_SRC}
COPY --link --from=builder ${VIRTUAL_ENV} ${VIRTUAL_ENV}
COPY --link --from=web-builder /build/dist ${INVOKEAI_SRC}/invokeai/frontend/web/dist
# Install python & allow non-root user to use it by traversing the /root dir without read permissions
RUN --mount=type=cache,target=/root/.cache/uv \
uv python install ${PYTHON_VERSION} && \
# chmod --recursive a+rX /root/.local/share/uv/python
chmod 711 /root
WORKDIR ${INVOKEAI_SRC}
# Install project's dependencies as a separate layer so they aren't rebuilt every commit.
# bind-mount instead of copy to defer adding sources to the image until next layer.
#
# NOTE: there are no pytorch builds for arm64 + cuda, only cpu
# x86_64/CUDA is the default
RUN --mount=type=cache,target=/root/.cache/uv \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
--mount=type=bind,source=uv.lock,target=uv.lock \
# this is just to get the package manager to recognize that the project exists, without making changes to the docker layer
--mount=type=bind,source=invokeai/version,target=invokeai/version \
if [ "$TARGETPLATFORM" = "linux/arm64" ] || [ "$GPU_DRIVER" = "cpu" ]; then UV_INDEX="https://download.pytorch.org/whl/cpu"; \
elif [ "$GPU_DRIVER" = "rocm" ]; then UV_INDEX="https://download.pytorch.org/whl/rocm6.2"; \
fi && \
uv sync --frozen
# build patchmatch
RUN cd /usr/lib/$(uname -p)-linux-gnu/pkgconfig/ && ln -sf opencv4.pc opencv.pc
RUN python -c "from patchmatch import patch_match"
# Link amdgpu.ids for ROCm builds
# contributed by https://github.com/Rubonnek
RUN mkdir -p "/opt/amdgpu/share/libdrm" &&\
ln -s "/usr/share/libdrm/amdgpu.ids" "/opt/amdgpu/share/libdrm/amdgpu.ids"
WORKDIR ${INVOKEAI_SRC}
# build patchmatch
RUN cd /usr/lib/$(uname -p)-linux-gnu/pkgconfig/ && ln -sf opencv4.pc opencv.pc
RUN python3 -c "from patchmatch import patch_match"
ln -s "/usr/share/libdrm/amdgpu.ids" "/opt/amdgpu/share/libdrm/amdgpu.ids"
RUN mkdir -p ${INVOKEAI_ROOT} && chown -R ${CONTAINER_UID}:${CONTAINER_GID} ${INVOKEAI_ROOT}
COPY docker/docker-entrypoint.sh ./
ENTRYPOINT ["/opt/invokeai/docker-entrypoint.sh"]
CMD ["invokeai-web"]
# --link requires buildkit w/ dockerfile syntax 1.4, does not work with podman
COPY --link --from=web-builder /build/dist ${INVOKEAI_SRC}/invokeai/frontend/web/dist
# add sources last to minimize image changes on code changes
COPY invokeai ${INVOKEAI_SRC}/invokeai
# this should not increase image size because we've already installed dependencies
# in a previous layer
RUN --mount=type=cache,target=/root/.cache/uv \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
--mount=type=bind,source=uv.lock,target=uv.lock \
if [ "$TARGETPLATFORM" = "linux/arm64" ] || [ "$GPU_DRIVER" = "cpu" ]; then UV_INDEX="https://download.pytorch.org/whl/cpu"; \
elif [ "$GPU_DRIVER" = "rocm" ]; then UV_INDEX="https://download.pytorch.org/whl/rocm6.2"; \
fi && \
uv pip install -e .


@@ -16,6 +16,9 @@ set -e -o pipefail
USER_ID=${CONTAINER_UID:-1000}
USER=ubuntu
# if the user does not exist, create it. It is expected to be present on ubuntu >=24.x
_=$(id ${USER} 2>&1) || useradd -u ${USER_ID} ${USER}
# ensure the UID is correct
usermod -u ${USER_ID} ${USER} 1>/dev/null
### Set the $PUBLIC_KEY env var to enable SSH access.
@@ -36,6 +39,8 @@ fi
mkdir -p "${INVOKEAI_ROOT}"
chown --recursive ${USER} "${INVOKEAI_ROOT}" || true
cd "${INVOKEAI_ROOT}"
export HF_HOME=${HF_HOME:-$INVOKEAI_ROOT/.cache/huggingface}
export MPLCONFIGDIR=${MPLCONFIGDIR:-$INVOKEAI_ROOT/.matplotlib}
# Run the CMD as the Container User (not root).
exec gosu ${USER} "$@"
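For example, you could pass your own UID when starting the container so that files written to a mounted volume stay owned by your host user. This is a sketch that relies on the `CONTAINER_UID` variable read by the entrypoint above:
```bash
docker run --runtime=nvidia --gpus=all --publish 9090:9090 \
  --env CONTAINER_UID=$(id -u) \
  --volume /home/myuser/invokeai:/invokeai \
  ghcr.io/invoke-ai/invokeai
```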


@@ -1,41 +1,50 @@
# Release Process
The app is published twice, in different build formats.
The Invoke application is published as a python package on [PyPI]. This includes both a source distribution and built distribution (a wheel).
- A [PyPI] distribution. This includes both a source distribution and built distribution (a wheel). Users install with `pip install invokeai`. The updater uses this build.
- An installer on the [InvokeAI Releases Page]. This is a zip file with install scripts and a wheel. This is only used for new installs.
Most users install it with the [Launcher](https://github.com/invoke-ai/launcher/), others with `pip`.
The launcher uses GitHub as the source of truth for available releases.
## Broad Strokes
- Merge all changes and bump the version in the codebase.
- Tag the release commit.
- Wait for the release workflow to complete.
- Approve the PyPI publish jobs.
- Write GH release notes.
## General Prep
Make a developer call-out for PRs to merge. Merge and test things out.
While the release workflow does not include end-to-end tests, it does pause before publishing so you can download and test the final build.
Make a developer call-out for PRs to merge. Merge and test things out. Bump the version by editing `invokeai/version/invokeai_version.py`.
## Release Workflow
The `release.yml` workflow runs a number of jobs to handle code checks, tests, build and publish on PyPI.
It is triggered on **tag push**, when the tag matches `v*`. It doesn't matter if you've prepped a release branch like `release/v3.5.0` or are releasing from `main` - it works the same.
> Because commits are reference-counted, it is safe to create a release branch, tag it, let the workflow run, then delete the branch. So long as the tag exists, that commit will exist.
It is triggered on **tag push**, when the tag matches `v*`.
### Triggering the Workflow
Run `make tag-release` to tag the current commit and kick off the workflow.
Ensure all commits that should be in the release are merged, and you have pulled them locally.
The release may also be dispatched [manually].
Double-check that you have checked out the commit that will represent the release (typically the latest commit on `main`).
Run `make tag-release` to tag the current commit and kick off the workflow. You will be prompted to provide a message - use the version specifier.
If this version's tag already exists for some reason (maybe you had to make a last minute change), the script will overwrite it.
> In case you cannot use the Make target, the release may also be dispatched [manually] via GH.
### Workflow Jobs and Process
The workflow consists of a number of concurrently-run jobs, and two final publish jobs.
The workflow consists of a number of concurrently-run checks and tests, then two final publish jobs.
The publish jobs require manual approval and are only run if the other jobs succeed.
#### `check-version` Job
This job checks that the git ref matches the app version. It matches the ref against the `__version__` variable in `invokeai/version/invokeai_version.py`.
When the workflow is triggered by tag push, the ref is the tag. If the workflow is run manually, the ref is the target selected from the **Use workflow from** dropdown.
This job ensures that the `invokeai` python package version specifier matches the tag for the release. The version specifier is pulled from the `__version__` variable in `invokeai/version/invokeai_version.py`.
This job uses [samuelcolvin/check-python-version].
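For reference, the version specifier the job checks is a single module-level variable; a minimal sketch of `invokeai/version/invokeai_version.py` (the version string here is illustrative):
```python
# invokeai/version/invokeai_version.py (sketch - version string is illustrative)
__version__ = "5.7.0"
```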
@@ -43,62 +52,47 @@ This job uses [samuelcolvin/check-python-version].
#### Check and Test Jobs
Next, these jobs run and must pass. They are the same jobs that are run for every PR.
- **`python-tests`**: runs `pytest` on matrix of platforms
- **`python-checks`**: runs `ruff` (format and lint)
- **`frontend-tests`**: runs `vitest`
- **`frontend-checks`**: runs `prettier` (format), `eslint` (lint), `dpdm` (circular refs), `tsc` (static type check) and `knip` (unused imports)
- **`typegen-checks`**: ensures the frontend and backend types are synced
> **TODO** We should add `mypy` or `pyright` to the **`check-python`** job.
#### `build-wheel` Job
> **TODO** We should add an end-to-end test job that generates an image.
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `./scripts/build_wheel.sh` and uploads `dist.zip`, which contains the wheel and unarchived build.
#### `build-installer` Job
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `installer/create_installer.sh` and uploads two artifacts:
- **`dist`**: the python distribution, to be published on PyPI
- **`InvokeAI-installer-${VERSION}.zip`**: the installer to be included in the GitHub release
You don't need to download or test these artifacts.
#### Sanity Check & Smoke Test
At this point, the release workflow pauses as the remaining publish jobs require approval. Time to test the installer.
At this point, the release workflow pauses as the remaining publish jobs require approval.
Because the installer pulls from PyPI, and we haven't published to PyPI yet, you will need to install from the wheel:
It's possible to test the python package before it gets published to PyPI. We've never had problems with it, so it's not necessary to do this.
- Download and unzip `dist.zip` and the installer from the **Summary** tab of the workflow
- Run the installer script using the `--wheel` CLI arg, pointing at the wheel:
But, if you want to be extra-super careful, here's how to test it:
```sh
./install.sh --wheel ../InvokeAI-4.0.0rc6-py3-none-any.whl
```
- Install to a temporary directory so you get the new user experience
- Download a model and generate
> The same wheel file is bundled in the installer and in the `dist` artifact, which is uploaded to PyPI. You should end up with exactly the same installation as if the installer got the wheel from PyPI.
- Download the `dist.zip` build artifact from the `build-wheel` job
- Unzip it and find the wheel file
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/) - but instead of installing from PyPI, install from the wheel
- Test the app
##### Something isn't right
If testing reveals any issues, no worries. Cancel the workflow, which will cancel the pending publish jobs (you didn't approve them prematurely, right?).
Now you can start from the top:
- Fix the issues and PR the fixes per usual
- Get the PR approved and merged per usual
- Switch to `main` and pull in the fixes
- Run `make tag-release` to move the tag to `HEAD` (which has the fixes) and kick off the release workflow again
- Re-do the sanity check
If testing reveals any issues, no worries. Cancel the workflow, which will cancel the pending publish jobs (you didn't approve them prematurely, right?) and start over.
#### PyPI Publish Jobs
The publish jobs will run if any of the previous jobs fail.
The publish jobs will not run if any of the previous jobs fail.
They use [GitHub environments], which are configured as [trusted publishers] on PyPI.
Both jobs require a maintainer to approve them from the workflow's **Summary** tab.
Both jobs require a @hipsterusername or @psychedelicious to approve them from the workflow's **Summary** tab.
- Click the **Review deployments** button
- Select the environment (either `testpypi` or `pypi`)
- Select the environment (either `testpypi` or `pypi` - typically you select both)
- Click **Approve and deploy**
> **If the version already exists on PyPI, the publish jobs will fail.** PyPI only allows a given version to be published once - you cannot change it. If version published on PyPI has a problem, you'll need to "fail forward" by bumping the app version and publishing a followup release.
@@ -113,46 +107,33 @@ If there are no incidents, contact @hipsterusername or @lstein, who have owner a
Publishes the distribution on the [Test PyPI] index, using the `testpypi` GitHub environment.
This job is not required for the production PyPI publish, but included just in case you want to test the PyPI release.
This job is not required for the production PyPI publish, but included just in case you want to test the PyPI release for some reason:
If approved and successful, you could try out the test release like this:
```sh
# Create a new virtual environment
python -m venv ~/.test-invokeai-dist --prompt test-invokeai-dist
# Install the distribution from Test PyPI
pip install --index-url https://test.pypi.org/simple/ invokeai
# Run and test the app
invokeai-web
# Cleanup
deactivate
rm -rf ~/.test-invokeai-dist
```
- Approve this publish job without approving the prod publish
- Let it finish
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/), making sure to use the Test PyPI index URL: `https://test.pypi.org/simple/`
- Test the app
#### `publish-pypi` Job
Publishes the distribution on the production PyPI index, using the `pypi` GitHub environment.
## Publish the GitHub Release with installer
It's a good idea to wait to approve and run this job until you have the release notes ready!
Once the release is published to PyPI, it's time to publish the GitHub release.
## Prep and publish the GitHub Release
1. [Draft a new release] on GitHub, choosing the tag that triggered the release.
1. Write the release notes, describing important changes. The **Generate release notes** button automatically inserts the changelog and new contributors, and you can copy/paste the intro from previous releases.
1. Use `scripts/get_external_contributions.py` to get a list of external contributions to shout out in the release notes.
1. Upload the zip file created in **`build`** job into the Assets section of the release notes.
1. Check **Set as a pre-release** if it's a pre-release.
1. Check **Create a discussion for this release**.
1. Publish the release.
1. Announce the release in Discord.
> **TODO** Workflows can create a GitHub release from a template and upload release assets. One popular action to handle this is [ncipollo/release-action]. A future enhancement to the release process could set this up.
## Manual Build
The `build installer` workflow can be dispatched manually. This is useful to test the installer for a given branch or tag.
No checks are run; it just builds.
2. The **Generate release notes** button automatically inserts the changelog and new contributors. Make sure to select the correct tags for this release and the last stable release. GH often selects the wrong tags - do this manually.
3. Write the release notes, describing important changes. Contributions from community members should be shouted out. Use the GH-generated changelog to see all contributors. If there are Weblate translation updates, open that PR and shout out every person who contributed a translation.
4. Check **Set as a pre-release** if it's a pre-release.
5. Approve and wait for the `publish-pypi` job to finish if you haven't already.
6. Publish the GH release.
7. Post the release in Discord in the [releases](https://discord.com/channels/1020123559063990373/1149260708098359327) channel with abbreviated notes. For example:
> Invoke v5.7.0 (stable): <https://github.com/invoke-ai/InvokeAI/releases/tag/v5.7.0>
>
> It's a pretty big one - Form Builder, Metadata Nodes (thanks @SkunkWorxDark!), and much more.
8. Right click the message in releases and copy the link to it. Then, post that link in the [new-release-discussion](https://discord.com/channels/1020123559063990373/1149506274971631688) channel. For example:
> Invoke v5.7.0 (stable): <https://discord.com/channels/1020123559063990373/1149260708098359327/1344521744916021248>
## Manual Release
@@ -160,12 +141,10 @@ The `release` workflow can be dispatched manually. You must dispatch the workflo
This functionality is available as a fallback in case something goes wonky. Typically, releases should be triggered via tag push as described above.
[InvokeAI Releases Page]: https://github.com/invoke-ai/InvokeAI/releases
[PyPI]: https://pypi.org/
[Draft a new release]: https://github.com/invoke-ai/InvokeAI/releases/new
[Test PyPI]: https://test.pypi.org/
[version specifier]: https://packaging.python.org/en/latest/specifications/version-specifiers/
[ncipollo/release-action]: https://github.com/ncipollo/release-action
[GitHub environments]: https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment
[trusted publishers]: https://docs.pypi.org/trusted-publishers/
[samuelcolvin/check-python-version]: https://github.com/samuelcolvin/check-python-version


@@ -39,7 +39,7 @@ It has two sections - one for internal use and one for user settings:
```yaml
# Internal metadata - do not edit:
schema_version: 4
schema_version: 4.0.2
# Put user settings here - see https://invoke-ai.github.io/InvokeAI/features/CONFIGURATION/:
host: 0.0.0.0 # serve the app on your local network
@@ -83,6 +83,10 @@ A subset of settings may be specified using CLI args:
- `--root`: specify the root directory
- `--config`: override the default `invokeai.yaml` file location
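For example, the two args above can be combined in one invocation. This is a sketch, assuming the `invokeai-web` entry point used elsewhere in these docs and an illustrative `~/invokeai-dev` directory:
```sh
invokeai-web --root ~/invokeai-dev --config ~/invokeai-dev/invokeai.yaml
```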
### Low-VRAM Mode
See the [Low-VRAM mode docs][low-vram] for details on enabling this feature.
### All Settings
Following the table are additional explanations for certain settings.
@@ -114,6 +118,10 @@ remote_api_tokens:
The provided token will be added as a `Bearer` token to the network requests to download the model files. As far as we know, this works for all model marketplaces that require authorization.
!!! tip "HuggingFace Models"
If you get an error when installing a HF model using a URL instead of repo id, you may need to [set up a HF API token](https://huggingface.co/settings/tokens) and add an entry for it under `remote_api_tokens`. Use `huggingface.co` for `url_regex`.
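As a sketch, such an entry might look like this in `invokeai.yaml`. The field names follow the `remote_api_tokens` structure referenced above, and the token value is a placeholder:
```yaml
remote_api_tokens:
  # Hypothetical example - attach an HF access token to downloads from huggingface.co
  - url_regex: huggingface.co
    token: hf_your_token_here
```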
#### Model Hashing
Models are hashed during installation, providing a stable identifier for models across all platforms. Hashing is a one-time operation.
@@ -181,3 +189,4 @@ The `log_format` option provides several alternative formats:
[basic guide to yaml files]: https://circleci.com/blog/what-is-yaml-a-beginner-s-guide/
[Model Marketplace API Keys]: #model-marketplace-api-keys
[low-vram]: ./features/low-vram.md


@@ -50,7 +50,7 @@ Applications are built on top of the invoke framework. They should construct `in
### Web UI
The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.tiangolo.com/) and [Socket.IO](https://socket.io/). The frontend code is found in `/frontend` and the backend code is found in `/ldm/invoke/app/api_app.py` and `/ldm/invoke/app/api/`. The code is further organized as such:
The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.tiangolo.com/) and [Socket.IO](https://socket.io/). The frontend code is found in `/invokeai/frontend` and the backend code is found in `/invokeai/app/api_app.py` and `/invokeai/app/api/`. The code is further organized as such:
| Component | Description |
| --- | --- |
@@ -62,7 +62,7 @@ The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.t
### CLI
The CLI is built automatically from invocation metadata, and also supports invocation piping and auto-linking. Code is available in `/ldm/invoke/app/cli_app.py`.
The CLI is built automatically from invocation metadata, and also supports invocation piping and auto-linking. Code is available in `/invokeai/frontend/cli`.
## Invoke
@@ -70,7 +70,7 @@ The Invoke framework provides the interface to the underlying AI systems and is
### Invoker
The invoker (`/ldm/invoke/app/services/invoker.py`) is the primary interface through which applications interact with the framework. Its primary purpose is to create, manage, and invoke sessions. It also maintains two sets of services:
The invoker (`/invokeai/app/services/invoker.py`) is the primary interface through which applications interact with the framework. Its primary purpose is to create, manage, and invoke sessions. It also maintains two sets of services:
- **invocation services**, which are used by invocations to interact with core functionality.
- **invoker services**, which are used by the invoker to manage sessions and manage the invocation queue.
@@ -82,12 +82,12 @@ The session graph does not support looping. This is left as an application probl
### Invocations
Invocations represent individual units of execution, with inputs and outputs. All invocations are located in `/ldm/invoke/app/invocations`, and are all automatically discovered and made available in the applications. These are the primary way to expose new functionality in Invoke.AI, and the [implementation guide](INVOCATIONS.md) explains how to add new invocations.
Invocations represent individual units of execution, with inputs and outputs. All invocations are located in `/invokeai/app/invocations`, and are all automatically discovered and made available in the applications. These are the primary way to expose new functionality in Invoke.AI, and the [implementation guide](INVOCATIONS.md) explains how to add new invocations.
### Services
Services provide invocations access to AI Core functionality and other necessary functionality (e.g. image storage). These are available in `/ldm/invoke/app/services`. As a general rule, new services should provide an interface as an abstract base class, and may provide a lightweight local implementation by default in their module. The goal for all services should be to enable the usage of different implementations (e.g. using cloud storage for image storage), but should not load any module dependencies unless that implementation has been used (i.e. don't import anything that won't be used, especially if it's expensive to import).
Services provide invocations access to AI Core functionality and other necessary functionality (e.g. image storage). These are available in `/invokeai/app/services`. As a general rule, new services should provide an interface as an abstract base class, and may provide a lightweight local implementation by default in their module. The goal for all services should be to enable the usage of different implementations (e.g. using cloud storage for image storage), but should not load any module dependencies unless that implementation has been used (i.e. don't import anything that won't be used, especially if it's expensive to import).
## AI Core
The AI Core is represented by the rest of the code base (i.e. the code outside of `/ldm/invoke/app/`).
The AI Core is represented by the rest of the code base (i.e. the code outside of `/invokeai/app/`).


@@ -39,7 +39,7 @@ nodes imported in the `__init__.py` file are loaded. See the README in the nodes
folder for more examples:
```py
from .cool_node import CoolInvocation
from .cool_node import ResizeInvocation
```
## Creating A New Invocation
@@ -69,7 +69,10 @@ The first set of things we need to do when creating a new Invocation are -
So let us do that.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.invocation_api import (
BaseInvocation,
invocation,
)
@invocation('resize')
class ResizeInvocation(BaseInvocation):
@@ -103,8 +106,12 @@ create your own custom field types later in this guide. For now, let's go ahead
and use it.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation
from invokeai.app.invocations.primitives import ImageField
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
invocation,
)
@invocation('resize')
class ResizeInvocation(BaseInvocation):
@@ -128,8 +135,12 @@ image: ImageField = InputField(description="The input image")
Great. Now let us create our other inputs for `width` and `height`
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation
from invokeai.app.invocations.primitives import ImageField
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
invocation,
)
@invocation('resize')
class ResizeInvocation(BaseInvocation):
@@ -163,8 +174,13 @@ that are provided by it by InvokeAI.
Let us create this function first.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation, InvocationContext
from invokeai.app.invocations.primitives import ImageField
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
InvocationContext,
invocation,
)
@invocation('resize')
class ResizeInvocation(BaseInvocation):
@@ -191,8 +207,14 @@ all the necessary info related to image outputs. So let us use that.
We will cover how to create your own output types later in this guide.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation, InvocationContext
from invokeai.app.invocations.primitives import ImageField
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
InvocationContext,
invocation,
)
from invokeai.app.invocations.image import ImageOutput
@invocation('resize')
@@ -217,9 +239,15 @@ Perfect. Now that we have our Invocation setup, let us do what we want to do.
So let's do that.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation, InvocationContext
from invokeai.app.invocations.primitives import ImageField
from invokeai.app.invocations.image import ImageOutput, ResourceOrigin, ImageCategory
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
InvocationContext,
invocation,
)
from invokeai.app.invocations.image import ImageOutput
@invocation("resize")
class ResizeInvocation(BaseInvocation):
@@ -287,8 +315,8 @@ new Invocation ready to be used.
Once you've created a Node, the next step is to share it with the community! The
best way to do this is to submit a Pull Request to add the Node to the
[Community Nodes](nodes/communityNodes) list. If you're not sure how to do that,
take a look at our [contributing nodes overview](contributingNodes).
[Community Nodes](../nodes/communityNodes.md) list. If you're not sure how to do that,
take a look at our [contributing nodes overview](../nodes/contributingNodes.md).
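Putting the snippets from the hunks above together, a complete invocation might look roughly like the sketch below. This is not the tutorial's final code: the `context.images.*` helpers and `ImageOutput.build` reflect the invocation API as we understand it and may differ between versions.
```python
from invokeai.invocation_api import (
    BaseInvocation,
    ImageField,
    InputField,
    InvocationContext,
    invocation,
)
from invokeai.app.invocations.image import ImageOutput


@invocation("resize")
class ResizeInvocation(BaseInvocation):
    """Resizes an input image to the given width and height."""

    image: ImageField = InputField(description="The input image")
    width: int = InputField(default=512, description="Width of the output image")
    height: int = InputField(default=512, description="Height of the output image")

    def invoke(self, context: InvocationContext) -> ImageOutput:
        # Load the input image as a PIL image (helper name assumed, see note above)
        image = context.images.get_pil(self.image.image_name)
        # Resize and save the result back to image storage
        resized = image.resize((self.width, self.height))
        image_dto = context.images.save(image=resized)
        # Build the standard image output from the saved image record
        return ImageOutput.build(image_dto)
```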
## Advanced


@@ -9,20 +9,20 @@ model. These are the:
configuration information. Among other things, the record service
tracks the type of the model, its provenance, and where it can be
found on disk.
* _ModelInstallServiceBase_ A service for installing models to
disk. It uses `DownloadQueueServiceBase` to download models and
their metadata, and `ModelRecordServiceBase` to store that
information. It is also responsible for managing the InvokeAI
`models` directory and its contents.
* _DownloadQueueServiceBase_
A multithreaded downloader responsible
for downloading models from a remote source to disk. The download
queue has special methods for downloading repo_id folders from
Hugging Face, as well as discriminating among model versions in
Civitai, but can be used for arbitrary content.
* _ModelLoadServiceBase_
Responsible for loading a model from disk
into RAM and VRAM and getting it ready for inference.
@@ -207,9 +207,9 @@ for use in the InvokeAI web server. Its signature is:
```
def open(
cls,
config: InvokeAIAppConfig,
conn: Optional[sqlite3.Connection] = None,
cls,
config: InvokeAIAppConfig,
conn: Optional[sqlite3.Connection] = None,
lock: Optional[threading.Lock] = None
) -> Union[ModelRecordServiceSQL, ModelRecordServiceFile]:
```
@@ -363,7 +363,7 @@ functionality:
* Registering a model config record for a model already located on the
local filesystem, without moving it or changing its path.
* Installing a model already located on the local filesystem, by
moving it into the InvokeAI root directory under the
`models` folder (or wherever config parameter `models_dir`
@@ -371,21 +371,21 @@ functionality:
* Probing of models to determine their type, base type and other key
information.
* Interface with the InvokeAI event bus to provide status updates on
the download, installation and registration process.
* Downloading a model from an arbitrary URL and installing it in
`models_dir`.
* Special handling for HuggingFace repo_ids to recursively download
the contents of the repository, paying attention to alternative
variants such as fp16.
* Saving tags and other metadata about the model into the invokeai database
when fetching from a repo that provides that type of information,
(currently only HuggingFace).
### Initializing the installer
A default installer is created at InvokeAI API startup time and stored
@@ -461,7 +461,7 @@ revision.
`config` is an optional dict of values that will override the
autoprobed values for model type, base, scheduler prediction type, and
so forth. See [Model configuration and
probing](#Model-configuration-and-probing) for details.
probing](#model-configuration-and-probing) for details.
`access_token` is an optional access token for accessing resources
that need authentication.
@@ -494,7 +494,7 @@ source8 = URLModelSource(url='https://civitai.com/api/download/models/63006', ac
for source in [source1, source2, source3, source4, source5, source6, source7]:
install_job = installer.install_model(source)
source2job = installer.wait_for_installs(timeout=120)
for source in sources:
job = source2job[source]
@@ -504,7 +504,7 @@ for source in sources:
print(f"{source} installed as {model_key}")
elif job.errored:
print(f"{source}: {job.error_type}.\nStack trace:\n{job.error}")
```
As shown here, the `import_model()` method accepts a variety of
@@ -1364,7 +1364,6 @@ the in-memory loaded model:
|----------------|-----------------|------------------|
| `config` | AnyModelConfig | A copy of the model's configuration record for retrieving base type, etc. |
| `model` | AnyModel | The instantiated model (details below) |
| `locker` | ModelLockerBase | A context manager that mediates the movement of the model into VRAM |
### get_model_by_key(key, [submodel]) -> LoadedModel


@@ -1,6 +1,6 @@
# InvokeAI Backend Tests
We use `pytest` to run the backend python tests. (See [pyproject.toml](/pyproject.toml) for the default `pytest` options.)
We use `pytest` to run the backend python tests. (See [pyproject.toml](https://github.com/invoke-ai/InvokeAI/blob/main/pyproject.toml) for the default `pytest` options.)
## Fast vs. Slow
All tests are categorized as either 'fast' (no test annotation) or 'slow' (annotated with the `@pytest.mark.slow` decorator).
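For example, a slow test is simply annotated with the decorator mentioned above. This is a minimal sketch; the test name and body are illustrative:
```python
import pytest


@pytest.mark.slow
def test_full_denoise_pipeline():
    # Loads a real model, so this is excluded from the default 'fast' test run.
    ...
```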
@@ -33,7 +33,7 @@ pytest tests -m ""
## Test Organization
All backend tests are in the [`tests/`](/tests/) directory. This directory mirrors the organization of the `invokeai/` directory. For example, tests for `invokeai/model_management/model_manager.py` would be found in `tests/model_management/test_model_manager.py`.
All backend tests are in the [`tests/`](https://github.com/invoke-ai/InvokeAI/tree/main/tests) directory. This directory mirrors the organization of the `invokeai/` directory. For example, tests for `invokeai/model_management/model_manager.py` would be found in `tests/model_management/test_model_manager.py`.
TODO: The above statement is aspirational. A re-organization of legacy tests is required to make it true.


@@ -2,7 +2,7 @@
## **What do I need to know to help?**
If you are looking to help with a code contribution, InvokeAI uses several different technologies under the hood: Python (Pydantic, FastAPI, diffusers) and Typescript (React, Redux Toolkit, ChakraUI, Mantine, Konva). Familiarity with StableDiffusion and image generation concepts is helpful, but not essential.
## **Get Started**
@@ -12,7 +12,7 @@ To get started, take a look at our [new contributors checklist](newContributorCh
Once you're setup, for more information, you can review the documentation specific to your area of interest:
* #### [InvokeAI Architecure](../ARCHITECTURE.md)
* #### [Frontend Documentation](https://github.com/invoke-ai/InvokeAI/tree/main/invokeai/frontend/web)
* #### [Frontend Documentation](../frontend/index.md)
* #### [Node Documentation](../INVOCATIONS.md)
* #### [Local Development](../LOCAL_DEVELOPMENT.md)
@@ -20,15 +20,15 @@ Once you're setup, for more information, you can review the documentation specif
If you don't feel ready to make a code contribution yet, no problem! You can also help out in other ways, such as [documentation](documentation.md), [translation](translation.md) or helping support other users and triage issues as they're reported in GitHub.
There are two paths to making a development contribution:
1. Choosing an open issue to address. Open issues can be found in the [Issues](https://github.com/invoke-ai/InvokeAI/issues?q=is%3Aissue+is%3Aopen) section of the InvokeAI repository. These are tagged by the issue type (bug, enhancement, etc.) along with the “good first issues” tag denoting if they are suitable for first time contributors.
1. Additional items can be found on our [roadmap](https://github.com/orgs/invoke-ai/projects/7). The roadmap is organized in terms of priority, and contains features of varying size and complexity. If there is an in-flight item you'd like to help with, reach out to the contributor assigned to the item to see how you can help.
2. Opening a new issue or feature to add. **Please make sure you have searched through existing issues before creating new ones.**
*Regardless of what you choose, please post in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord before you start development in order to confirm that the issue or feature is aligned with the current direction of the project. We value our contributors' time and effort and want to ensure that no one's time is being misspent.*
## Best Practices:
* Keep your pull requests small. Smaller pull requests are more likely to be accepted and merged
* Comments! Commenting your code helps reviewers easily understand your contribution
* Use Python's and TypeScript's typing systems, and consider using an editor with [LSP](https://microsoft.github.io/language-server-protocol/) support to streamline development
@@ -38,7 +38,7 @@ There are two paths to making a development contribution:
If you need help, you can ask questions in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord.
For frontend related work, **@psychedelicious** is the best person to reach out to.
For backend related work, please reach out to **@blessedcoolant**, **@lstein**, **@StAlKeR7779** or **@psychedelicious**.


@@ -22,15 +22,15 @@ Before starting these steps, ensure you have your local environment [configured
2. Fork the [InvokeAI](https://github.com/invoke-ai/InvokeAI) repository to your GitHub profile. This means that you will have a copy of the repository under **your-GitHub-username/InvokeAI**.
3. Clone the repository to your local machine using:
```bash
git clone https://github.com/your-GitHub-username/InvokeAI.git
```
If you're unfamiliar with using Git through the command line, [GitHub Desktop](https://desktop.github.com) is an easy-to-use alternative with a UI. You can do all the same steps listed here, but through the interface.
4. Create a new branch for your fix using:
```bash
git checkout -b branch-name-here
```
5. Make the appropriate changes for the issue you are trying to address or the feature that you want to add.
6. Add the file contents of the changed files to the "snapshot" git uses to manage the state of the project, also known as the index:


@@ -1,12 +1,10 @@
# Dev Environment
To make changes to Invoke's backend, frontend, or documentation, you'll need to set up a dev environment.
To make changes to Invoke's backend, frontend or documentation, you'll need to set up a dev environment.
If you just want to use Invoke, you should use the [installer][installer link].
If you only want to make changes to the docs site, you can skip the frontend dev environment setup as described in the below guide.
!!! info "Why do I need the frontend toolchain?"
The repo doesn't contain a build of the frontend. You'll be responsible for rebuilding it every time you pull in new changes, or run it in dev mode (which incurs a substantial performance penalty).
If you just want to use Invoke, you should use the [launcher][launcher link].
!!! warning
@@ -17,84 +15,76 @@ If you just want to use Invoke, you should use the [installer][installer link].
## Setup
1. Run through the [requirements][requirements link].
2. [Fork and clone][forking link] the [InvokeAI repo][repo link].
3. Create a directory for user data (images, models, db, etc). This is typically at `~/invokeai`, but if you already have a non-dev install, you may want to create a separate directory for the dev install.
4. Create a python virtual environment inside the directory you just created:
```sh
python3 -m venv .venv --prompt InvokeAI-Dev
```
5. Activate the venv (you'll need to do this every time you want to run the app):
```sh
source .venv/bin/activate
3. This repository uses Git LFS to manage large files. To ensure all assets are downloaded:
- Install git-lfs → [Download here](https://git-lfs.com/)
- Enable automatic LFS fetching for this repository:
```shell
git config lfs.fetchinclude "*"
```
- Fetch files from LFS (only needs to be done once; subsequent `git pull` will fetch changes automatically):
```
git lfs pull
```
4. Create a directory for user data (images, models, db, etc). This is typically at `~/invokeai`, but if you already have a non-dev install, you may want to create a separate directory for the dev install.
6. Install the repo as an [editable install][editable install link]:
5. Follow the [manual install][manual install link] guide, with some modifications to the install command:
- Use `.` instead of `invokeai` to install from the current directory. You don't need to specify the version.
- Add `-e` after the `install` operation to make this an [editable install][editable install link]. That means your changes to the python code will be reflected when you restart the Invoke server.
- When installing the `invokeai` package, add the `dev`, `test` and `docs` package options to the package specifier. You may or may not need the `xformers` option - follow the manual install guide to figure that out. So, your package specifier will be either `".[dev,test,docs]"` or `".[dev,test,docs,xformers]"`. Note the quotes!
With the modifications made, the install command should look something like this:
```sh
pip install -e ".[dev,test,xformers]" --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu121
uv pip install -e ".[dev,test,docs,xformers]" --python 3.12 --python-preference only-managed --index=https://download.pytorch.org/whl/cu126 --reinstall
```
Refer to the [manual installation][manual install link] instructions for help determining the correct install options. `xformers` is optional, but `dev` and `test` are not.
6. At this point, you should have Invoke installed, a venv set up and activated, and the server running. But you will see a warning in the terminal that no UI was found. If you go to the URL for the server, you won't get a UI.
This is because the UI build is not distributed with the source code. You need to build it manually. End the running server instance.
If you only want to edit the docs, you can stop here and skip to the **Documentation** section below.
7. Install the frontend dev toolchain:
- [`nodejs`](https://nodejs.org/) (recommend v20 LTS)
- [`pnpm`](https://pnpm.io/8.x/installation) (must be v8 - not v9!)
- [`nodejs`](https://nodejs.org/) (v20+)
- [`pnpm`](https://pnpm.io/8.x/installation) (must be v8 - not v9!)
8. Do a production build of the frontend:
```sh
cd PATH_TO_INVOKEAI_REPO/invokeai/frontend/web
cd <PATH_TO_INVOKEAI_REPO>/invokeai/frontend/web
pnpm i
pnpm build
```
9. Start the application:
```sh
cd PATH_TO_INVOKEAI_REPO
python scripts/invokeai-web.py
```
10. Access the UI at `localhost:9090`.
9. Restart the server and navigate to the URL. You should get a UI. After making changes to the python code, restart the server to see those changes.
## Updating the UI
You'll need to run `pnpm build` every time you pull in new changes. Another option is to skip the build and instead run the app in dev mode:
You'll need to run `pnpm build` every time you pull in new changes.
Another option is to skip the build and instead run the UI in dev mode:
```sh
pnpm dev
```
This starts a dev server at `localhost:5173`, which you will use instead of `localhost:9090`.
This starts a vite dev server for the UI at `127.0.0.1:5173`, which you will use instead of `127.0.0.1:9090`.
The dev mode is substantially slower than the production build but may be more convenient if you just need to test things out.
The dev mode is substantially slower than the production build but may be more convenient if you just need to test things out. It will hot-reload the UI as you make changes to the frontend code. Sometimes the hot-reload doesn't work, and you need to manually refresh the browser tab.
## Documentation
The documentation is built with `mkdocs`. To preview it locally, you need an additional set of packages installed.
The documentation is built with `mkdocs`. It provides a hot-reload dev server for the docs. Start it with `mkdocs serve`.
```sh
# after activating the venv
pip install -e ".[docs]"
```
Then, you can start a live docs dev server, which will auto-refresh when you edit the docs:
```sh
mkdocs serve
```
On macOS and Linux, there is a `make` target for this:
```sh
make docs
```
[installer link]: ../installation/installer.md
[launcher link]: ../installation/quick_start.md
[forking link]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo
[requirements link]: ../installation/requirements.md
[repo link]: https://github.com/invoke-ai/InvokeAI


@@ -34,11 +34,11 @@ Please reach out to @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy)
## Contributors
This project is a combined effort of dedicated people from across the world. [Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for their time, hard work and effort.
This project is a combined effort of dedicated people from across the world. [Check out the list of all these amazing people](contributors.md). We thank them for their time, hard work and effort.
## Code of Conduct
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](https://github.com/invoke-ai/InvokeAI/blob/main/CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](../CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
By making a contribution to this project, you certify that:


@@ -1,26 +1,18 @@
# FAQ
!!! info "How to Reinstall"
Many issues can be resolved by re-installing the application. You won't lose any data by re-installing. We suggest downloading the [latest release](https://github.com/invoke-ai/InvokeAI/releases/latest) and using it to re-install the application. Consult the [installer guide](./installation/installer.md) for more information.
When you run the installer, you'll have an option to select the version to install. If you aren't ready to upgrade, you can choose the current version to fix a broken install.
If the troubleshooting steps on this page don't get you up and running, please either [create an issue] or hop on [discord] for help.
## How to Install
You can download the latest installers [here](https://github.com/invoke-ai/InvokeAI/releases).
Note that any releases marked as _pre-release_ are in a beta state. You may experience some issues, but we appreciate your help testing those! For stable/reliable installations, please install the [latest release].
Follow the [Quick Start guide](./installation/quick_start.md) to install Invoke.
## Downloading models and using existing models
The Model Manager tab in the UI provides a few ways to install models, including using your already-downloaded models. You'll see a popup directing you there on first startup. For more information, see the [model install docs].
## Missing models after updating to v4
## Missing models after updating from v3
If you find some models are missing after updating to v4, it's likely they weren't correctly registered before the update and didn't get picked up in the migration.
If you find some models are missing after updating from v3, it's likely they weren't correctly registered before the update and didn't get picked up in the migration.
You can use the `Scan Folder` tab in the Model Manager UI to fix this. The models will either be in the old, now-unused `autoimport` folder, or your `models` folder.
@@ -37,115 +29,27 @@ Follow the same steps to scan and import the missing models.
## Slow generation
- Check the [system requirements] to ensure that your system is capable of generating images.
- Check the `ram` setting in `invokeai.yaml`. This setting tells Invoke how much of your system RAM can be used to cache models. Having this too high or too low can slow things down. That said, it's generally safest to not set this at all and instead let Invoke manage it.
- Check the `vram` setting in `invokeai.yaml`. This setting tells Invoke how much of your GPU VRAM can be used to cache models. Counter-intuitively, if this setting is too high, Invoke will need to do a lot of shuffling of models as it juggles the VRAM cache and the currently-loaded model. The default value of 0.25 generally works well for GPUs with less than 16GB of VRAM; even on a 24GB card, the default works well (see the sketch after this list).
- Check that your generations are happening on your GPU (if you have one). InvokeAI will log what is being used for generation upon startup. If your GPU isn't used, re-install to ensure the correct versions of torch get installed.
- If you are on Windows, you may have exceeded your GPU's VRAM capacity and are using slower [shared GPU memory](#shared-gpu-memory-windows). There's a guide to opt out of this behaviour in the linked FAQ entry.
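A sketch of the two cache settings in `invokeai.yaml` (the values are illustrative; as noted above, omitting them and letting Invoke manage the caches is usually safest):
```yaml
# invokeai.yaml (sketch) - model cache sizes
ram: 12.0   # GB of system RAM used to cache models
vram: 0.25  # GB of VRAM used to keep models resident
```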
## Shared GPU Memory (Windows)
!!! tip "Nvidia GPUs with driver 536.40"
This only applies to current Nvidia cards with driver 536.40 or later, released in June 2023.
When the GPU doesn't have enough VRAM for a task, Windows is able to allocate some of its CPU RAM to the GPU. This is much slower than VRAM, but it does allow the system to generate when it otherwise might not have enough VRAM.
When shared GPU memory is used, generation slows down dramatically - but at least it doesn't crash.
If you'd like to opt out of this behavior and instead get an error when you exceed your GPU's VRAM, follow [this guide from Nvidia](https://nvidia.custhelp.com/app/answers/detail/a_id/5490).
Here's how to get the python path required in the linked guide:
- Run `invoke.bat`.
- Select option 2 for developer console.
- At least one python path will be printed. Copy the path that includes your invoke installation directory (typically the first).
## Installer cannot find python (Windows)
Ensure that you checked **Add python.exe to PATH** when installing Python. This can be found at the bottom of the Python Installer window. If you already have Python installed, you can re-run the python installer, choose the Modify option and check the box.
- Follow the [Low-VRAM mode guide](./features/low-vram.md) to optimize performance.
- Check that your generations are happening on your GPU (if you have one). Invoke will log what is being used for generation upon startup. If your GPU isn't used, re-install and ensure you select the appropriate GPU option.
- If you are on Windows with an Nvidia GPU, you may have exceeded your GPU's VRAM capacity and are triggering Nvidia's "sysmem fallback". There's a guide to opt out of this behaviour in the [Low-VRAM mode guide](./features/low-vram.md).
## Triton error on startup
This can be safely ignored. InvokeAI doesn't use Triton, but if you are on Linux and wish to dismiss the error, you can install Triton.
This can be safely ignored. Invoke doesn't use Triton, but if you are on Linux and wish to dismiss the error, you can install Triton.
## Updated to 3.4.0 and xformers can't load C++/CUDA
## Unable to Copy on Firefox
An issue occurred with your PyTorch update. Follow these steps to fix it:
Firefox does not allow Invoke to directly access the clipboard by default. As a result, you may be unable to use certain copy functions. You can fix this by configuring Firefox to allow write access to the clipboard:
1. Launch your invoke.bat / invoke.sh and select the option to open the developer console
2. Run:`pip install ".[xformers]" --upgrade --force-reinstall --extra-index-url https://download.pytorch.org/whl/cu121`
- If you run into an error with `typing_extensions`, re-open the developer console and run: `pip install -U typing-extensions`
Note that v3.4.0 is an old, unsupported version. Please upgrade to the [latest release].
## Install failed and says `pip` is out of date
An out of date `pip` typically won't cause an installation to fail. The cause of the error can likely be found above the message that says `pip` is out of date.
If you saw that warning but the install went well, don't worry about it (but you can update `pip` afterwards if you'd like).
- Go to `about:config` and click the Accept button
- Search for `dom.events.asyncClipboard.clipboardItem`
- Set it to `true` by clicking the toggle button
- Restart Firefox
## Replicate image found online
Most example images with prompts that you'll find on the internet have been generated using different software, so you can't expect to get identical results. In order to reproduce an image, you need to replicate the exact settings and processing steps, including (but not limited to) the model, the positive and negative prompts, the seed, the sampler, the exact image size, any upscaling steps, etc.
## OSErrors on Windows while installing dependencies
During a zip file installation or an update, installation stops with an error like this:
![broken-dependency-screenshot](./assets/troubleshooting/broken-dependency.png){:width="800px"}
To resolve this, re-install the application as described above.
## HuggingFace install failed due to invalid access token
Some HuggingFace models require you to authenticate using an [access token].
Invoke doesn't manage this token for you, but it's easy to set it up:
- Follow the instructions in the link above to create an access token. Copy it.
- Run the launcher script.
- Select option 2 (developer console).
- Paste the following command:
```sh
python -c "import huggingface_hub; huggingface_hub.login()"
```
- Paste your access token when prompted and press Enter. You won't see anything when you paste it.
- Type `n` if prompted about git credentials.
If you get an error, try the command again - maybe the token didn't paste correctly.
Once your token is set, start Invoke and try downloading the model again. The installer will automatically use the access token.
If the install still fails, you may not have access to the model.
## Stable Diffusion XL generation fails after trying to load UNet
InvokeAI is working in other respects, but when trying to generate
images with Stable Diffusion XL you get a "Server Error". The text log
in the launch window contains this log line above several more lines of
error messages:
`INFO --> Loading model:D:\LONG\PATH\TO\MODEL, type sdxl:main:unet`
This failure mode occurs when there is a network glitch while downloading the very large SDXL model.
To address this, first go to the Model Manager and delete the
Stable-Diffusion-XL-base-1.X model. Then, click the HuggingFace tab,
paste the Repo ID stabilityai/stable-diffusion-xl-base-1.0 and install
the model.
## Package dependency conflicts during installation or update
If you have previously installed InvokeAI or another Stable Diffusion
package, the installer may occasionally pick up outdated libraries and
either the installer or `invoke` will fail with complaints about
library conflicts.
To resolve this, re-install the application as described above.
## Invalid configuration file
Everything seems to install ok, you get a `ValidationError` when starting up the app.
@@ -154,64 +58,9 @@ This is caused by an invalid setting in the `invokeai.yaml` configuration file.
Check the [configuration docs] for more detail about the settings and how to specify them.
## `ModuleNotFoundError: No module named 'controlnet_aux'`
## Out of Memory Errors
`controlnet_aux` is a dependency of Invoke and appears to have been packaged or distributed strangely. Sometimes, it doesn't install correctly. This is outside our control.
If you encounter this error, the solution is to remove the package from the `pip` cache and re-run the Invoke installer so a fresh, working version of `controlnet_aux` can be downloaded and installed:
- Run the Invoke launcher
- Choose the developer console option
- Run this command: `pip cache remove controlnet_aux`
- Close the terminal window
- Download and run the [installer][latest release], selecting your current install location
## Out of Memory Issues
The models are large, VRAM is expensive, and you may find yourself
faced with Out of Memory errors when generating images. Here are some
tips to reduce the problem:
!!! info "Optimizing for GPU VRAM"
=== "4GB VRAM GPU"
This should be adequate for 512x512 pixel images using Stable Diffusion 1.5
and derived models, provided that you do not use the NSFW checker. It won't be loaded unless you go into the UI settings and turn it on.
If you are on a CUDA-enabled GPU, we will automatically use xformers or torch-sdp to reduce VRAM requirements, though you can explicitly configure this. See the [configuration docs].
=== "6GB VRAM GPU"
This is a borderline case. Using the SD 1.5 series you should be able to
generate images up to 640x640 with the NSFW checker enabled, and up to
1024x1024 with it disabled.
If you run into persistent memory issues, there is a set of
environment variables that you can set before launching InvokeAI to
alter how the PyTorch machine learning library manages memory. See
<https://pytorch.org/docs/stable/notes/cuda.html#memory-management> for
a list of these tweaks.
=== "12GB VRAM GPU"
This should be sufficient to generate larger images up to about 1280x1280.
## Checkpoint Models Load Slowly or Use Too Much RAM
The difference between diffusers models (a folder containing multiple
subfolders) and checkpoint models (a file ending with .safetensors or
.ckpt) is that InvokeAI is able to load diffusers models into memory
incrementally, while checkpoint models must be loaded all at
once. With very large models, or systems with limited RAM, you may
experience slowdowns and other memory-related issues when loading
checkpoint models.
To solve this, go to the Model Manager tab (the cube), select the
checkpoint model that's giving you trouble, and press the "Convert"
button in the upper right of your browser window. This will convert the
checkpoint into a diffusers model, after which loading should be
faster and less memory-intensive.
The models are large, VRAM is expensive, and you may find yourself faced with Out of Memory errors when generating images. Follow our [Low-VRAM mode guide](./features/low-vram.md) to configure Invoke to prevent these.
## Memory Leak (Linux)
@@ -253,8 +102,6 @@ Note the differences between memory allocated as chunks in an arena vs. memory a
[model install docs]: ./installation/models.md
[system requirements]: ./installation/requirements.md
[latest release]: https://github.com/invoke-ai/InvokeAI/releases/latest
[create an issue]: https://github.com/invoke-ai/InvokeAI/issues
[discord]: https://discord.gg/ZmtBAhwWhy
[configuration docs]: ./configuration.md
[access token]: https://huggingface.co/docs/hub/security-tokens#how-to-manage-user-access-tokens

docs/features/low-vram.md Normal file

@@ -0,0 +1,176 @@
---
title: Low-VRAM mode
---
As of v5.6.0, Invoke has a low-VRAM mode. It works on systems with dedicated GPUs (Nvidia GPUs on Windows/Linux and AMD GPUs on Linux).
This allows you to generate even if your GPU doesn't have enough VRAM to hold full models. Most users should be able to run even the beefiest models - like the ~24GB unquantised FLUX dev model.
## Enabling Low-VRAM mode
To enable Low-VRAM mode, add this line to your `invokeai.yaml` configuration file, then restart Invoke:
```yaml
enable_partial_loading: true
```
**Windows users should also [disable the Nvidia sysmem fallback](#disabling-nvidia-sysmem-fallback-windows-only)**.
It is possible to fine-tune the settings for best performance or if you still get out-of-memory errors (OOMs).
!!! tip "How to find `invokeai.yaml`"
The `invokeai.yaml` configuration file lives in your install directory. To access it, run the **Invoke Community Edition** launcher and click the install location. This will open your install directory in a file explorer window.
You'll see `invokeai.yaml` there and can edit it with any text editor. After making changes, restart Invoke.
If you don't see `invokeai.yaml`, launch Invoke once. It will create the file on its first startup.
## Details and fine-tuning
Low-VRAM mode involves five features, each of which can be configured or fine-tuned:
- Partial model loading (`enable_partial_loading`)
- PyTorch CUDA allocator config (`pytorch_cuda_alloc_conf`)
- Dynamic RAM and VRAM cache sizes (`max_cache_ram_gb`, `max_cache_vram_gb`)
- Working memory (`device_working_mem_gb`)
- Keeping a RAM weight copy (`keep_ram_copy_of_weights`)
Read on to learn about these features and understand how to fine-tune them for your system and use-cases.
### Partial model loading
Invoke's partial model loading works by streaming model "layers" between RAM and VRAM as they are needed.
When an operation needs layers that are not in VRAM, but there isn't enough room to load them, inactive layers are offloaded to RAM to make room.
#### Enabling partial model loading
As described above, you can enable partial model loading by adding this line to `invokeai.yaml`:
```yaml
enable_partial_loading: true
```
### PyTorch CUDA allocator config
The PyTorch CUDA allocator's behavior can be configured using the `pytorch_cuda_alloc_conf` config. Tuning the allocator configuration can help to reduce the peak reserved VRAM. The optimal configuration is dependent on many factors (e.g. device type, VRAM, CUDA driver version, etc.), but switching from PyTorch's native allocator to using CUDA's built-in allocator works well on many systems. To try this, add the following line to your `invokeai.yaml` file:
```yaml
pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"
```
A more complete explanation of the available configuration options is [here](https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf).
### Dynamic RAM and VRAM cache sizes
Loading models from disk is slow and can be a major bottleneck for performance. Invoke uses two model caches - RAM and VRAM - to reduce loading from disk to a minimum.
By default, Invoke manages these caches' sizes dynamically for best performance.
#### Fine-tuning cache sizes
Prior to v5.6.0, the cache sizes were static, and for best performance, many users needed to manually fine-tune the `ram` and `vram` settings in `invokeai.yaml`.
As of v5.6.0, the caches are dynamically sized. The `ram` and `vram` settings are no longer used, and new settings have been added to configure the caches.
**Most users will not need to fine-tune the cache sizes.**
However, if your GPU has enough VRAM to hold your models fully, you might get a performance boost by manually setting the cache sizes in `invokeai.yaml`:
```yaml
# The default max cache RAM size is logged on InvokeAI startup. It is determined based on your system RAM / VRAM.
# You can override the default value by setting `max_cache_ram_gb`.
# Increasing `max_cache_ram_gb` will increase the amount of RAM used to cache inactive models, resulting in faster model
# reloads for the cached models.
# As an example, if your system has 32GB of RAM and no other heavy processes, setting the `max_cache_ram_gb` to 28GB
# might be a good value to achieve aggressive model caching.
max_cache_ram_gb: 28
# The default max cache VRAM size is adjusted dynamically based on the amount of available VRAM (taking into
# consideration the VRAM used by other processes).
# You can override the default value by setting `max_cache_vram_gb`.
# CAUTION: Most users should not manually set this value. See warning below.
max_cache_vram_gb: 16
```
!!! warning "Max safe value for `max_cache_vram_gb`"
Most users should not manually configure the `max_cache_vram_gb`. This configuration value takes precedence over the `device_working_mem_gb` and any operations that explicitly reserve additional working memory (e.g. VAE decode). As such, manually configuring it increases the likelihood of encountering out-of-memory errors.
For users who wish to configure `max_cache_vram_gb`, the max safe value can be determined by subtracting `device_working_mem_gb` from your GPU's VRAM. As described below, the default for `device_working_mem_gb` is 3GB.
For example, if you have a 12GB GPU, the max safe value for `max_cache_vram_gb` is `12GB - 3GB = 9GB`.
If you had increased `device_working_mem_gb` to 4GB, then the max safe value for `max_cache_vram_gb` is `12GB - 4GB = 8GB`.
Most users who override `max_cache_vram_gb` are doing so because they wish to use significantly less VRAM, and should be setting `max_cache_vram_gb` to a value significantly less than the 'max safe value'.
### Working memory
Invoke cannot use _all_ of your VRAM for model caching and loading. It requires some VRAM to use as working memory for various operations.
Invoke reserves 3GB VRAM as working memory by default, which is enough for most use-cases. However, it is possible to fine-tune this setting if you still get OOMs.
#### Fine-tuning working memory
You can increase the working memory size in `invokeai.yaml` to prevent OOMs:
```yaml
# The default is 3GB - bump it up to 4GB to prevent OOMs.
device_working_mem_gb: 4
```
!!! tip "Operations may request more working memory"
For some operations, we can determine VRAM requirements in advance and allocate additional working memory to prevent OOMs.
VAE decoding is one such operation. This operation converts the generation process's output into an image. For large image outputs, this might use more than the default working memory size of 3GB.
During this decoding step, Invoke calculates how much VRAM will be required to decode and requests that much VRAM from the model manager. If the amount exceeds the working memory size, the model manager will offload cached model layers from VRAM until there's enough VRAM to decode.
Once decoding completes, the model manager "reclaims" the extra VRAM allocated as working memory for future model loading operations.
### Keeping a RAM weight copy
Invoke has the option of keeping a RAM copy of all model weights, even when they are loaded onto the GPU. This optimization is _on_ by default, and enables faster model switching and LoRA patching. Disabling this feature will reduce the average RAM load while running Invoke (peak RAM likely won't change), at the cost of slower model switching and LoRA patching. If you have limited RAM, you can disable this optimization:
```yaml
# Set to false to reduce the average RAM usage at the cost of slower model switching and LoRA patching.
keep_ram_copy_of_weights: false
```
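Putting the settings from this section together, a complete low-VRAM `invokeai.yaml` might look like the sketch below. Every key here is described above; the values are illustrative, and most users only need `enable_partial_loading`:
```yaml
# Illustrative combination of the low-VRAM settings described above.
enable_partial_loading: true
pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"
device_working_mem_gb: 4
keep_ram_copy_of_weights: false
```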
### Disabling Nvidia sysmem fallback (Windows only)
On Windows, Nvidia GPUs are able to use system RAM when their VRAM fills up via **sysmem fallback**. While it sounds like a good idea on the surface, in practice it causes massive slowdowns during generation.
It is strongly suggested to disable this feature:
- Open the **NVIDIA Control Panel** app.
- Expand **3D Settings** on the left panel.
- Click **Manage 3D Settings** in the left panel.
- Find **CUDA - Sysmem Fallback Policy** in the right panel and set it to **Prefer No Sysmem Fallback**.
![cuda-sysmem-fallback](./cuda-sysmem-fallback.png)
!!! tip "Invoke does the same thing, but better"
If the sysmem fallback feature sounds familiar, that's because Invoke's partial model loading strategy is conceptually very similar - use VRAM when there's room, else fall back to RAM.
Unfortunately, the Nvidia implementation is not optimized for applications like Invoke and does more harm than good.
## Troubleshooting
### Windows page file
Invoke has high virtual memory (a.k.a. 'committed memory') requirements. This can cause issues on Windows if the page file size limits are hit. (See this issue for the technical details on why this happens: https://github.com/invoke-ai/InvokeAI/issues/7563).
If you run out of page file space, InvokeAI may crash. Often, these crashes will happen with one of the following errors:
- InvokeAI exits with Windows error code `3221225477`
- InvokeAI crashes without an error, but `eventvwr.msc` reveals an error with code `0xc0000005` (the hex equivalent of `3221225477`)
If you are running out of page file space, try the following solutions:
- Make sure that you have sufficient disk space for the page file to grow. Watch your disk usage as Invoke runs. If it climbs near 100% leading up to the crash, then this is very likely the source of the issue. Clear out some disk space to resolve the issue.
- Make sure that your page file is set to "System managed size" (this is the default) rather than a custom size. Under the "System managed size" policy, the page file will grow dynamically as needed.
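If you're not sure whether the page file is the culprit, you can check its current size and usage with a standard Windows query from PowerShell (this is a generic Windows command, not an Invoke feature):
```ps
# Show the page file location, allocated size and usage, in MB.
Get-CimInstance Win32_PageFileUsage | Select-Object Name, AllocatedBaseSize, CurrentUsage, PeakUsage
```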


@@ -50,11 +50,9 @@ title: Invoke
## Installation
The [installer script](installation/installer.md) is the easiest way to install and update the application.
The [Invoke Launcher](installation/quick_start.md) is the easiest way to install, update and run Invoke on Windows, macOS and Linux.
You can also install Invoke as python package [via PyPI](installation/manual.md) or [docker](installation/docker.md).
See the [installation section](./installation/index.md) for more information.
You can also install Invoke as a [python package](installation/manual.md) or with [docker](installation/docker.md).
## Help


@@ -4,7 +4,7 @@ title: Docker
!!! warning "macOS users"
Docker can not access the GPU on macOS, so your generation speeds will be slow. Use the [installer](./installer.md) instead.
Docker can not access the GPU on macOS, so your generation speeds will be slow. Use the [launcher](./quick_start.md) instead.
!!! tip "Linux and Windows Users"


@@ -1,36 +0,0 @@
# Installation and Updating Overview
Before installing, review the [installation requirements](./requirements.md) to ensure your system is set up properly.
See the [FAQ](../faq.md) for frequently-encountered installation issues.
If you need more help, join our [discord](https://discord.gg/ZmtBAhwWhy) or [create a GitHub issue](https://github.com/invoke-ai/InvokeAI/issues).
## Automated Installer & Updates
✅ The automated [installer](./installer.md) is the best way to install Invoke.
⬆️ The same installer is also the best way to update Invoke - simply rerun it for the same folder you installed to.
The installation process simply installs the core libraries and application dependencies that run Invoke.
Models, images, or other assets in the Invoke root folder won't be affected by the installation process.
## Manual Install
If you are familiar with python and want more control over the packages that are installed, you can [install Invoke manually via PyPI](./manual.md).
Updates are managed by reinstalling the latest version through PyPI.
## Developer Install
If you want to contribute to InvokeAI, you'll need to set up a [dev environment](../contributing/dev-environment.md).
## Docker
Invoke publishes docker images. See the [docker installation guide](./docker.md) for details.
## Other Installation Guides
- [PyPatchMatch](./patchmatch.md)
- [Installing Models](./models.md)


@@ -1,115 +0,0 @@
# Automatic Install & Updates
!!! tip "Use the installer to update"
Using the installer for updates will not erase any of your data (images, models, boards, etc). It only updates the core libraries used to run Invoke.
Simply use the same path you installed to originally to update your existing installation.
Both release and pre-release versions can be installed using the installer. It also supports installation from a wheel if needed.
Be sure to review the [installation requirements] and ensure your system has everything it needs to install Invoke.
## Getting the Latest Installer
Download the `InvokeAI-installer-vX.Y.Z.zip` file from the [latest release] page. It is at the bottom of the page, under **Assets**.
After unzipping the installer, you should have a `InvokeAI-Installer` folder with some files inside, including `install.bat` and `install.sh`.
## Running the Installer
!!! tip
Windows users should first double-click the `WinLongPathsEnabled.reg` file to prevent a failed installation due to long file paths.
Double-click the install script:
=== "Windows"
```sh
install.bat
```
=== "Linux/macOS"
```sh
install.sh
```
!!! info "Running the Installer from the commandline"
You can also run the install script from cmd/powershell (Windows) or terminal (Linux/macOS).
!!! warning "Untrusted Publisher (Windows)"
You may get a popup saying the file comes from an `Untrusted Publisher`. Click `More Info` and `Run Anyway` to get past this.
The installation process is simple, with a few prompts:
- Select the version to install. Unless you have a specific reason to install a specific version, select the default (the latest version).
- Select location for the install. Be sure you have enough space in this folder for the base application, as described in the [installation requirements].
- Select a GPU device.
!!! info "Slow Installation"
The installer needs to download several GB of data and install it all. It may appear to get stuck at 99.9% when installing `pytorch` or during a step labeled "Installing collected packages".
If it is stuck for over 10 minutes, something has probably gone wrong and you should close the window and restart.
## Running the Application
Find the install location you selected earlier. Double-click the launcher script to run the app:
=== "Windows"
```sh
invoke.bat
```
=== "Linux/macOS"
```sh
invoke.sh
```
Choose the first option to run the UI. After a series of startup messages, you'll see something like this:
```sh
Uvicorn running on http://127.0.0.1:9090 (Press CTRL+C to quit)
```
Copy the URL into your browser and you should see the UI.
## Improved Outpainting with PatchMatch
PatchMatch is an extra add-on that can improve outpainting. Windows users are in luck - it works out of the box.
On macOS and Linux, a few extra steps are needed to set it up. See the [PatchMatch installation guide](./patchmatch.md).
## First-time Setup
You will need to [install some models] before you can generate.
Check the [configuration docs] for details on configuring the application.
## Updating
Updating is exactly the same as installing - download the latest installer, choose the latest version, enter your existing installation path, and the app will update. None of your data (images, models, boards, etc) will be erased.
!!! info "Dependency Resolution Issues"
We've found that pip's dependency resolution can cause issues when upgrading packages. One very common problem was pip "downgrading" torch from CUDA to CPU, but things broke in other novel ways.
The installer doesn't have this kind of problem, so we use it for updating as well.
## Installation Issues
If you have installation issues, please review the [FAQ]. You can also [create an issue] or ask for help on [discord].
[installation requirements]: ./requirements.md
[FAQ]: ../faq.md
[install some models]: ./models.md
[configuration docs]: ../configuration.md
[latest release]: https://github.com/invoke-ai/InvokeAI/releases/latest
[create an issue]: https://github.com/invoke-ai/InvokeAI/issues
[discord]: https://discord.gg/ZmtBAhwWhy


@@ -4,11 +4,11 @@
**Python experience is mandatory.**
If you want to use Invoke locally, you should probably use the [installer](./installer.md).
If you want to use Invoke locally, you should probably use the [launcher](./quick_start.md).
If you want to contribute to Invoke, instead follow the [dev environment](../contributing/dev-environment.md) guide.
If you want to contribute to Invoke or run the app on the latest dev branch, instead follow the [dev environment](../contributing/dev-environment.md) guide.
InvokeAI is distributed as a python package on PyPI, installable with `pip`. There are a few things that are handled by the installer and launcher that you'll need to manage manually, described in this guide.
InvokeAI is distributed as a python package on PyPI, installable with `pip`. There are a few things that are handled by the launcher that you'll need to manage manually, described in this guide.
## Requirements
@@ -16,43 +16,39 @@ Before you start, go through the [installation requirements](./requirements.md).
## Walkthrough
1. Create a directory to contain your InvokeAI library, configuration files, and models. This is known as the "runtime" or "root" directory, and typically lives in your home directory under the name `invokeai`.
We'll use [`uv`](https://github.com/astral-sh/uv) to install python and create a virtual environment, then install the `invokeai` package. `uv` is a modern, very fast alternative to `pip`.
The following commands vary depending on the version of Invoke being installed and the system onto which it is being installed.
1. Install `uv` as described in its [docs](https://docs.astral.sh/uv/getting-started/installation/#standalone-installer). We suggest using the standalone installer method.
Run `uv --version` to confirm that `uv` is installed and working. After installation, you may need to restart your terminal to get access to `uv`.
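For reference, the standalone installer is a one-liner on Linux/macOS - verify it against the `uv` docs linked above before running, and see those docs for the Windows equivalent:
```sh
# Standalone uv installer for Linux/macOS (check the uv docs for the current command).
curl -LsSf https://astral.sh/uv/install.sh | sh
```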
2. Create a directory for your installation, typically in your home directory (e.g. `~/invokeai` or `$Home/invokeai`):
=== "Linux/macOS"
```bash
mkdir ~/invokeai
cd ~/invokeai
```
=== "Windows (PowerShell)"
```bash
mkdir $Home/invokeai
cd $Home/invokeai
```
1. Enter the root directory and create a virtual Python environment within it named `.venv`.
!!! warning "Virtual Environment Location"
While you may create the virtual environment anywhere in the file system, we recommend that you create it within the root directory as shown here. This allows the application to automatically detect its data directories.
If you choose a different location for the venv, then you _must_ set the `INVOKEAI_ROOT` environment variable or specify the root directory using the `--root` CLI arg.
=== "Linux/macOS"
```bash
cd ~/invokeai
python3 -m venv .venv --prompt InvokeAI
```
=== "Windows (PowerShell)"
```bash
cd $Home/invokeai
python3 -m venv .venv --prompt InvokeAI
```
1. Activate the new environment:
3. Create a virtual environment in that directory:
```sh
uv venv --relocatable --prompt invoke --python 3.12 --python-preference only-managed .venv
```
This command creates a portable virtual environment at `.venv` complete with a portable python 3.12. It doesn't matter if your system has no python installed, or has a different version - `uv` will handle everything.
4. Activate the virtual environment:
=== "Linux/macOS"
@@ -60,41 +56,55 @@ Before you start, go through the [installation requirements](./requirements.md).
source .venv/bin/activate
```
=== "Windows"
=== "Windows (PowerShell)"
```ps
.venv\Scripts\activate
```
!!! info "Permissions Error (Windows)"
5. Choose a version to install. Review the [GitHub releases page](https://github.com/invoke-ai/InvokeAI/releases).
If you get a permissions error at this point, run this command and try again.
6. Determine the package specifier to use when installing. This is a performance optimization.
`Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser`
- If you have an Nvidia 20xx series GPU or older, use `invokeai[xformers]`.
- If you have an Nvidia 30xx series GPU or newer, or do not have an Nvidia GPU, use `invokeai`.
The command-line prompt should change to show `(InvokeAI)`, indicating the venv is active.
7. Determine the `PyPI` index URL to use for installation, if any. This is necessary to get the right version of torch installed.
1. Make sure that pip is installed in your virtual environment and up to date:
=== "Invoke v5.10.0 and later"
```bash
python3 -m pip install --upgrade pip
- If you are on Windows or Linux with an Nvidia GPU, use `https://download.pytorch.org/whl/cu126`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
- If you are on Linux with an AMD GPU, use `https://download.pytorch.org/whl/rocm6.2.4`.
- **In all other cases, do not use an index.**
=== "Invoke v5.0.0 to v5.9.1"
- If you are on Windows with an Nvidia GPU, use `https://download.pytorch.org/whl/cu124`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
- If you are on Linux with an AMD GPU, use `https://download.pytorch.org/whl/rocm6.1`.
- **In all other cases, do not use an index.**
=== "Invoke v4"
- If you are on Windows with an Nvidia GPU, use `https://download.pytorch.org/whl/cu124`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
- If you are on Linux with an AMD GPU, use `https://download.pytorch.org/whl/rocm5.2`.
- **In all other cases, do not use an index.**
8. Install the `invokeai` package. Substitute the package specifier and version.
```sh
uv pip install <PACKAGE_SPECIFIER>==<VERSION> --python 3.12 --python-preference only-managed --force-reinstall
```
1. Install the InvokeAI Package. The base command is `pip install InvokeAI --use-pep517`, but you may need to change this depending on your system and the desired features.
If you determined you needed to use a `PyPI` index URL in the previous step, you'll need to add `--index=<INDEX_URL>` like this:
- You may need to provide an [extra index URL](https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-extra-index-url). Select your platform configuration using [this tool on the PyTorch website](https://pytorch.org/get-started/locally/). Copy the `--extra-index-url` string from this and append it to your install command.
```sh
uv pip install <PACKAGE_SPECIFIER>==<VERSION> --python 3.12 --python-preference only-managed --index=<INDEX_URL> --force-reinstall
```
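As a concrete (illustrative) example: on Linux with a recent Nvidia GPU, installing a hypothetical version 5.11.0 with the cu126 index from the table above would look like this - substitute the specifier, version, and index URL you determined in the previous steps:
```sh
uv pip install invokeai==5.11.0 --python 3.12 --python-preference only-managed --index=https://download.pytorch.org/whl/cu126 --force-reinstall
```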
```bash
pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/cu121
```
- If you have a CUDA GPU and want to install with `xformers`, you need to add an option to the package name. Note that `xformers` is not strictly necessary. PyTorch includes an implementation of the SDP attention algorithm with similar performance for most GPUs.
```bash
pip install "InvokeAI[xformers]" --use-pep517
```
1. Deactivate and reactivate your venv so that the invokeai-specific commands become available in the environment:
9. Deactivate and reactivate your venv so that the invokeai-specific commands become available in the environment:
=== "Linux/macOS"
@@ -102,17 +112,31 @@ Before you start, go through the [installation requirements](./requirements.md).
deactivate && source .venv/bin/activate
```
=== "Windows"
=== "Windows (PowerShell)"
```ps
deactivate
.venv\Scripts\activate
```
1. Run the application:
10. Run the application, specifying the directory you created earlier as the root directory:
Run `invokeai-web` to start the UI. You must activate the virtual environment before running the app.
=== "Linux/macOS"
!!! warning
```bash
invokeai-web --root ~/invokeai
```
If the virtual environment is _not_ inside the root directory, then you _must_ specify the path to the root directory with `--root \path\to\invokeai` or the `INVOKEAI_ROOT` environment variable.
=== "Windows (PowerShell)"
```bash
invokeai-web --root $Home/invokeai
```
## Headless Install and Launch Scripts
If you run Invoke on a headless server, you might want to install and run Invoke on the command line.
We do not plan to maintain scripts to do this moving forward, instead focusing our dev resources on the GUI [launcher](../installation/quick_start.md).
You can create your own scripts for this by copying the handful of commands in this guide. `uv`'s [`pip` interface docs](https://docs.astral.sh/uv/reference/cli/#uv-pip-install) may be useful.
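As a starting point, a headless install-and-launch script might look like the sketch below. It simply strings together the commands from this guide; the paths and the index URL are assumptions for a Linux/Nvidia machine, so adjust them for your system:
```sh
#!/usr/bin/env bash
# Minimal headless install-and-launch sketch based on the commands in this guide.
set -euo pipefail

INVOKEAI_ROOT="$HOME/invokeai"
mkdir -p "$INVOKEAI_ROOT"
cd "$INVOKEAI_ROOT"

# Create a portable, uv-managed venv (same command as the walkthrough above).
uv venv --relocatable --prompt invoke --python 3.12 --python-preference only-managed .venv
source .venv/bin/activate

# Example for Linux + Nvidia; drop --index (or change it) for other platforms.
uv pip install invokeai --python 3.12 --python-preference only-managed \
  --index=https://download.pytorch.org/whl/cu126 --force-reinstall

# Start the UI, pointing it at the root directory created above.
invokeai-web --root "$INVOKEAI_ROOT"
```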


@@ -0,0 +1,127 @@
# Invoke Community Edition Quick Start
Welcome to Invoke! Follow these steps to install, update, and get started creating.
## Step 1: System Requirements
Invoke runs on Windows 10+, macOS 14+ and Linux (Ubuntu 20.04+ is well-tested).
Hardware requirements vary significantly depending on model and image output size. The requirements below are rough guidelines.
- All Apple Silicon (M1, M2, etc) Macs work, but 16GB+ memory is recommended.
- AMD GPUs are supported on Linux only. The VRAM requirements are the same as Nvidia GPUs.
!!! info "Hardware Requirements (Windows/Linux)"
=== "SD1.5 - 512×512"
- GPU: Nvidia 10xx series or later, 4GB+ VRAM.
- Memory: At least 8GB RAM.
- Disk: 10GB for base installation plus 30GB for models.
=== "SDXL - 1024×1024"
- GPU: Nvidia 20xx series or later, 8GB+ VRAM.
- Memory: At least 16GB RAM.
- Disk: 10GB for base installation plus 100GB for models.
=== "FLUX - 1024×1024"
- GPU: Nvidia 20xx series or later, 10GB+ VRAM.
- Memory: At least 32GB RAM.
- Disk: 10GB for base installation plus 200GB for models.
More detail on system requirements can be found [here](./requirements.md).
## Step 2: Download
Download the latest launcher for your operating system:
- [Download for Windows](https://download.invoke.ai/Invoke%20Community%20Edition.exe)
- [Download for macOS](https://download.invoke.ai/Invoke%20Community%20Edition.dmg)
- [Download for Linux](https://download.invoke.ai/Invoke%20Community%20Edition.AppImage)
## Step 3: Install or Update
Run the launcher you just downloaded, click **Install** and follow the instructions to get set up.
If you have an existing Invoke installation, you can select it and let the launcher manage the install. You'll be able to update or launch the installation.
!!! warning "Problem running the launcher on macOS"
macOS may not allow you to run the launcher. We are working to resolve this by signing the launcher executable. Until that is done, you can manually flag the launcher as safe:
- Open the **Invoke Community Edition.dmg** file.
- Drag the launcher to **Applications**.
- Open a terminal.
- Run `xattr -d 'com.apple.quarantine' /Applications/Invoke\ Community\ Edition.app`.
You should now be able to run the launcher.
## Step 4: Launch
Once installed, click **Finish**, then **Launch** to start Invoke.
The very first run after an installation or update will take a few extra moments to get ready.
!!! tip "Server Mode"
The launcher runs Invoke as a desktop application. You can enable **Server Mode** in the launcher's settings to disable this and instead access the UI through your web browser.
## Step 5: Install Models
With Invoke started up, you'll need to install some models.
The quickest way to get started is to install a **Starter Model** bundle. If you already have a model collection, Invoke can use it.
!!! info "Install Models"
=== "Install a Starter Model bundle"
1. Go to the **Models** tab.
2. Click **Starter Models** on the right.
3. Click one of the bundles to install its models. Refer to the [system requirements](#step-1-system-requirements) if you're unsure which model architecture will work for your system.
=== "Use my model collection"
1. Go to the **Models** tab.
2. Click **Scan Folder** on the right.
3. Paste the path to your models collection and click **Scan Folder**.
4. With **In-place install** enabled, Invoke will leave the model files where they are. If you disable this, **Invoke will move the models into its own folders**.
You're now ready to start creating!
## Step 6: Learn the Basics
We recommend watching our [Getting Started Playlist](https://www.youtube.com/playlist?list=PLvWK1Kc8iXGrQy8r9TYg6QdUuJ5MMx-ZO). It covers essential features and workflows, including:
- Generating your first image.
- Using control layers and reference guides.
- Refining images with advanced workflows.
## Troubleshooting
If installation fails, retrying the install in Repair Mode may fix it. There's a checkbox to enable this on the Review step of the install flow.
If that doesn't fix it, [clearing the `uv` cache](https://docs.astral.sh/uv/reference/cli/#uv-cache-clean) might do the trick:
- Open and start the dev console (button at the bottom-left of the launcher).
- Run `uv cache clean`.
- Retry the installation. Enable Repair Mode for good measure.
If you are still unable to install, try installing to a different location and see if that works.
If you still have problems, ask for help on the Invoke [discord](https://discord.gg/ZmtBAhwWhy).
## Other Installation Methods
- You can install the Invoke application as a python package. See our [manual install](./manual.md) docs.
- You can run Invoke with docker. See our [docker install](./docker.md) docs.
## Need Help?
- Visit our [Support Portal](https://support.invoke.ai).
- Watch the [Getting Started Playlist](https://www.youtube.com/playlist?list=PLvWK1Kc8iXGrQy8r9TYg6QdUuJ5MMx-ZO).
- Join the conversation on [Discord][discord link].
[discord link]: https://discord.gg/ZmtBAhwWhy


@@ -1,90 +1,35 @@
# Requirements
## GPU
Invoke runs on Windows 10+, macOS 14+ and Linux (Ubuntu 20.04+ is well-tested).
!!! warning "Problematic Nvidia GPUs"
## Hardware
We do not recommend these GPUs. They cannot operate with half precision, but have insufficient VRAM to generate 512x512 images at full precision.
Hardware requirements vary significantly depending on model and image output size.
- NVIDIA 10xx series cards such as the 1080 TI
- GTX 1650 series cards
- GTX 1660 series cards
The requirements below are rough guidelines for best performance. GPUs with less VRAM typically still work, if a bit slower. Follow the [Low-VRAM mode guide](./features/low-vram.md) to optimize performance.
Invoke runs best with a dedicated GPU, but will fall back to running on CPU, albeit much slower. You'll need a beefier GPU for SDXL.
- All Apple Silicon (M1, M2, etc) Macs work, but 16GB+ memory is recommended.
- AMD GPUs are supported on Linux only. The VRAM requirements are the same as Nvidia GPUs.
!!! example "Stable Diffusion 1.5"
!!! info "Hardware Requirements (Windows/Linux)"
=== "Nvidia"
=== "SD1.5 - 512×512"
```
Any GPU with at least 4GB VRAM.
```
- GPU: Nvidia 10xx series or later, 4GB+ VRAM.
- Memory: At least 8GB RAM.
- Disk: 10GB for base installation plus 30GB for models.
=== "AMD"
=== "SDXL - 1024×1024"
```
Any GPU with at least 4GB VRAM. Linux only.
```
- GPU: Nvidia 20xx series or later, 8GB+ VRAM.
- Memory: At least 16GB RAM.
- Disk: 10GB for base installation plus 100GB for models.
=== "Mac"
=== "FLUX - 1024×1024"
```
Any Apple Silicon Mac with at least 8GB memory.
```
!!! example "Stable Diffusion XL"
=== "Nvidia"
```
Any GPU with at least 8GB VRAM.
```
=== "AMD"
```
Any GPU with at least 16GB VRAM. Linux only.
```
=== "Mac"
```
Any Apple Silicon Mac with at least 16GB memory.
```
## RAM
At least 12GB of RAM.
## Disk
SSDs will, of course, offer the best performance.
The base application disk usage depends on the torch backend.
!!! example "Disk"
=== "Nvidia (CUDA)"
```
~6.5GB
```
=== "AMD (ROCm)"
```
~12GB
```
=== "Mac (MPS)"
```
~3.5GB
```
You'll need to set aside some space for images, depending on how much you generate. A couple GB is enough to get started.
You'll need a good chunk of space for models. Even if you only install the most popular models and the usual support models (ControlNet, IP Adapter, etc.), you will quickly hit 50GB of models.
- GPU: Nvidia 20xx series or later, 10GB+ VRAM.
- Memory: At least 32GB RAM.
- Disk: 10GB for base installation plus 200GB for models.
!!! info "`tmpfs` on Linux"
@@ -92,26 +37,32 @@ You'll need a good chunk of space for models. Even if you only install the most
## Python
Invoke requires python 3.10 or 3.11. If you don't already have one of these versions installed, we suggest installing 3.11, as it will be supported for longer.
!!! tip "The launcher installs python for you"
Check that your system has an up-to-date Python installed by running `python --version` in the terminal (Linux, macOS) or cmd/powershell (Windows).
You don't need to do this if you are installing with the [Invoke Launcher](./quick_start.md).
<h3>Installing Python (Windows)</h3>
Invoke requires python 3.10 through 3.12. If you don't already have one of these versions installed, we suggest installing 3.12, as it will be supported for longer.
- Install python 3.11 with [an official installer].
- The installer includes an option to add python to your PATH. Be sure to enable this. If you missed it, re-run the installer, choose to modify an existing installation, and tick that checkbox.
- You may need to install [Microsoft Visual C++ Redistributable].
Check that your system has an up-to-date Python installed by running `python3 --version` in the terminal (Linux, macOS) or cmd/powershell (Windows).
<h3>Installing Python (macOS)</h3>
!!! info "Installing Python"
- Install python 3.11 with [an official installer].
- If model installs fail with a certificate error, you may need to run this command (changing the python version to match what you have installed): `/Applications/Python\ 3.10/Install\ Certificates.command`
- If you haven't already, you will need to install the XCode CLI Tools by running `xcode-select --install` in a terminal.
=== "Windows"
<h3>Installing Python (Linux)</h3>
- Install python with [an official installer].
- The installer includes an option to add python to your PATH. Be sure to enable this. If you missed it, re-run the installer, choose to modify an existing installation, and tick that checkbox.
- You may need to install [Microsoft Visual C++ Redistributable].
- Follow the [linux install instructions], being sure to install python 3.11.
- You'll need to install `libglib2.0-0` and `libgl1-mesa-glx` for OpenCV to work. For example, on a Debian system: `sudo apt update && sudo apt install -y libglib2.0-0 libgl1-mesa-glx`
=== "macOS"
- Install python with [an official installer].
- If model installs fail with a certificate error, you may need to run this command (changing the python version to match what you have installed): `/Applications/Python\ 3.10/Install\ Certificates.command`
- If you haven't already, you will need to install the XCode CLI Tools by running `xcode-select --install` in a terminal.
=== "Linux"
- Installing python varies depending on your system. We recommend [using `uv` to manage your python installation](https://docs.astral.sh/uv/concepts/python-versions/#installing-a-python-version) (a one-line example follows this list).
- You'll need to install `libglib2.0-0` and `libgl1-mesa-glx` for OpenCV to work. For example, on a Debian system: `sudo apt update && sudo apt install -y libglib2.0-0 libgl1-mesa-glx`
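As an illustration of the `uv`-managed approach recommended above, installing a specific Python version is a single command once `uv` itself is installed:
```sh
# Install a uv-managed Python 3.12; see the uv docs for pinning it per-project.
uv python install 3.12
```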
## Drivers
@@ -175,7 +126,4 @@ An alternative to installing ROCm locally is to use a [ROCm docker container] to
[ROCm Documentation]: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/tutorial/quick-start.html
[cuDNN support matrix]: https://docs.nvidia.com/deeplearning/cudnn/support-matrix/index.html
[Nvidia Container Runtime]: https://developer.nvidia.com/container-runtime
[linux install instructions]: https://docs.python-guide.org/starting/install3/linux/
[Microsoft Visual C++ Redistributable]: https://learn.microsoft.com/en-US/cpp/windows/latest-supported-vc-redist?view=msvc-170
[an official installer]: https://www.python.org/downloads/
[CUDA Toolkit Downloads]: https://developer.nvidia.com/cuda-downloads


@@ -49,6 +49,7 @@ To use a community workflow, download the `.json` node graph file and load it in
+ [BriaAI Background Remove](#briaai-remove-background)
+ [Remove Background](#remove-background)
+ [Retroize](#retroize)
+ [Stereogram](#stereogram-nodes)
+ [Size Stepper Nodes](#size-stepper-nodes)
+ [Simple Skin Detection](#simple-skin-detection)
+ [Text font to Image](#text-font-to-image)
@@ -526,6 +527,16 @@ View:
<img src="https://github.com/Ar7ific1al/InvokeAI_nodes_retroize/assets/2306586/de8b4fa6-324c-4c2d-b36c-297600c73974" width="500" />
--------------------------------
### Stereogram Nodes
**Description:** A set of custom nodes for InvokeAI to create cross-view or parallel-view stereograms. Stereograms are 2D images that, when viewed properly, reveal a 3D scene. Check out [r/crossview](https://www.reddit.com/r/CrossView/) for tutorials.
**Node Link:** https://github.com/simonfuhrmann/invokeai-stereo
**Example Workflow and Output**
</br><img src="https://raw.githubusercontent.com/simonfuhrmann/invokeai-stereo/refs/heads/main/docs/example_promo_03.jpg" width="600" />
--------------------------------
### Simple Skin Detection


@@ -1,128 +0,0 @@
@echo off
setlocal EnableExtensions EnableDelayedExpansion
@rem This script requires the user to install Python 3.10 or higher. All other
@rem requirements are downloaded as needed.
@rem change to the script's directory
PUSHD "%~dp0"
set "no_cache_dir=--no-cache-dir"
if "%1" == "use-cache" (
set "no_cache_dir="
)
@rem Config
@rem The version in the next line is replaced by an up to date release number
@rem when create_installer.sh is run. Change the release number there.
set INSTRUCTIONS=https://invoke-ai.github.io/InvokeAI/installation/INSTALL_AUTOMATED/
set TROUBLESHOOTING=https://invoke-ai.github.io/InvokeAI/help/FAQ/
set PYTHON_URL=https://www.python.org/downloads/windows/
set MINIMUM_PYTHON_VERSION=3.10.0
set PYTHON_URL=https://www.python.org/downloads/release/python-3109/
set err_msg=An error has occurred and the script could not continue.
@rem --------------------------- Intro -------------------------------
echo This script will install InvokeAI and its dependencies.
echo.
echo BEFORE YOU START PLEASE MAKE SURE TO DO THE FOLLOWING
echo 1. Install python 3.10 or 3.11. Python version 3.9 is no longer supported.
echo 2. Double-click on the file WinLongPathsEnabled.reg in order to
echo enable long path support on your system.
echo 3. Install the Visual C++ core libraries.
echo Please download and install the libraries from:
echo https://learn.microsoft.com/en-US/cpp/windows/latest-supported-vc-redist?view=msvc-170
echo.
echo See %INSTRUCTIONS% for more details.
echo.
echo FOR THE BEST USER EXPERIENCE WE SUGGEST MAXIMIZING THIS WINDOW NOW.
pause
@rem ---------------------------- check Python version ---------------
echo ***** Checking and Updating Python *****
call python --version >.tmp1 2>.tmp2
if %errorlevel% == 1 (
set err_msg=Please install Python 3.10-11. See %INSTRUCTIONS% for details.
goto err_exit
)
for /f "tokens=2" %%i in (.tmp1) do set python_version=%%i
if "%python_version%" == "" (
set err_msg=No python was detected on your system. Please install Python version %MINIMUM_PYTHON_VERSION% or higher. We recommend Python 3.10.12 from %PYTHON_URL%
goto err_exit
)
call :compareVersions %MINIMUM_PYTHON_VERSION% %python_version%
if %errorlevel% == 1 (
set err_msg=Your version of Python is too low. You need at least %MINIMUM_PYTHON_VERSION% but you have %python_version%. We recommend Python 3.10.12 from %PYTHON_URL%
goto err_exit
)
@rem Cleanup
del /q .tmp1 .tmp2
@rem -------------- Install and Configure ---------------
call python .\lib\main.py
pause
exit /b
@rem ------------------------ Subroutines ---------------
@rem routine to do comparison of semantic version numbers
@rem found at https://stackoverflow.com/questions/15807762/compare-version-numbers-in-batch-file
:compareVersions
::
:: Compares two version numbers and returns the result in the ERRORLEVEL
::
:: Returns 1 if version1 > version2
:: 0 if version1 = version2
:: -1 if version1 < version2
::
:: The nodes must be delimited by . or , or -
::
:: Nodes are normally strictly numeric, without a 0 prefix. A letter suffix
:: is treated as a separate node
::
setlocal enableDelayedExpansion
set "v1=%~1"
set "v2=%~2"
call :divideLetters v1
call :divideLetters v2
:loop
call :parseNode "%v1%" n1 v1
call :parseNode "%v2%" n2 v2
if %n1% gtr %n2% exit /b 1
if %n1% lss %n2% exit /b -1
if not defined v1 if not defined v2 exit /b 0
if not defined v1 exit /b -1
if not defined v2 exit /b 1
goto :loop
:parseNode version nodeVar remainderVar
for /f "tokens=1* delims=.,-" %%A in ("%~1") do (
set "%~2=%%A"
set "%~3=%%B"
)
exit /b
:divideLetters versionVar
for %%C in (a b c d e f g h i j k l m n o p q r s t u v w x y z) do set "%~1=!%~1:%%C=.%%C!"
exit /b
:err_exit
echo %err_msg%
echo The installer will exit now.
pause
exit /b
pause
:Trim
SetLocal EnableDelayedExpansion
set Params=%*
for /f "tokens=1*" %%a in ("!Params!") do EndLocal & set %1=%%b
exit /b


@@ -1,40 +0,0 @@
#!/bin/bash
# make sure we are not already in a venv
# (don't need to check status)
deactivate >/dev/null 2>&1
scriptdir=$(dirname "$0")
cd $scriptdir
function version { echo "$@" | awk -F. '{ printf("%d%03d%03d%03d\n", $1,$2,$3,$4); }'; }
MINIMUM_PYTHON_VERSION=3.10.0
MAXIMUM_PYTHON_VERSION=3.11.100
PYTHON=""
for candidate in python3.11 python3.10 python3 python ; do
if ppath=`which $candidate 2>/dev/null`; then
# when using `pyenv`, the executable for an inactive Python version will exist but will not be operational
# we check that this found executable can actually run
if [ $($candidate --version &>/dev/null; echo ${PIPESTATUS}) -gt 0 ]; then continue; fi
python_version=$($ppath -V | awk '{ print $2 }')
if [ $(version $python_version) -ge $(version "$MINIMUM_PYTHON_VERSION") ]; then
if [ $(version $python_version) -le $(version "$MAXIMUM_PYTHON_VERSION") ]; then
PYTHON=$ppath
break
fi
fi
fi
done
if [ -z "$PYTHON" ]; then
echo "A suitable Python interpreter could not be found"
echo "Please install Python $MINIMUM_PYTHON_VERSION or higher (maximum $MAXIMUM_PYTHON_VERSION) before running this script. See instructions at $INSTRUCTIONS for help."
read -p "Press any key to exit"
exit -1
fi
echo "For the best user experience we suggest enlarging or maximizing this window now."
exec $PYTHON ./lib/main.py ${@}
read -p "Press any key to exit"


@@ -1,438 +0,0 @@
# Copyright (c) 2023 Eugene Brodsky (https://github.com/ebr)
"""
InvokeAI installer script
"""
import locale
import os
import platform
import re
import shutil
import subprocess
import sys
import venv
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Optional, Tuple
SUPPORTED_PYTHON = ">=3.10.0,<=3.11.100"
INSTALLER_REQS = ["rich", "semver", "requests", "plumbum", "prompt-toolkit"]
BOOTSTRAP_VENV_PREFIX = "invokeai-installer-tmp"
DOCS_URL = "https://invoke-ai.github.io/InvokeAI/"
DISCORD_URL = "https://discord.gg/ZmtBAhwWhy"
OS = platform.uname().system
ARCH = platform.uname().machine
VERSION = "latest"
def get_version_from_wheel_filename(wheel_filename: str) -> str:
match = re.search(r"-(\d+\.\d+\.\d+)", wheel_filename)
if match:
version = match.group(1)
return version
else:
raise ValueError(f"Could not extract version from wheel filename: {wheel_filename}")
class Installer:
"""
Deploys an InvokeAI installation into a given path
"""
reqs: list[str] = INSTALLER_REQS
def __init__(self) -> None:
if os.getenv("VIRTUAL_ENV") is not None:
print("A virtual environment is already activated. Please 'deactivate' before installation.")
sys.exit(-1)
self.bootstrap()
self.available_releases = get_github_releases()
def mktemp_venv(self) -> TemporaryDirectory[str]:
"""
Creates a temporary virtual environment for the installer itself
:return: path to the created virtual environment directory
:rtype: TemporaryDirectory
"""
# Cleaning up temporary directories on Windows results in a race condition
# and a stack trace.
# `ignore_cleanup_errors` was only added in Python 3.10
if OS == "Windows" and int(platform.python_version_tuple()[1]) >= 10:
venv_dir = TemporaryDirectory(prefix=BOOTSTRAP_VENV_PREFIX, ignore_cleanup_errors=True)
else:
venv_dir = TemporaryDirectory(prefix=BOOTSTRAP_VENV_PREFIX)
venv.create(venv_dir.name, with_pip=True)
self.venv_dir = venv_dir
set_sys_path(Path(venv_dir.name))
return venv_dir
def bootstrap(self, verbose: bool = False) -> TemporaryDirectory[str] | None:
"""
Bootstrap the installer venv with packages required at install time
"""
print("Initializing the installer. This may take a minute - please wait...")
venv_dir = self.mktemp_venv()
pip = get_pip_from_venv(Path(venv_dir.name))
cmd = [pip, "install", "--require-virtualenv", "--use-pep517"]
cmd.extend(self.reqs)
try:
# upgrade pip to the latest version to avoid a confusing message
res = upgrade_pip(Path(venv_dir.name))
if verbose:
print(res)
# run the install prerequisites installation
res = subprocess.check_output(cmd).decode()
if verbose:
print(res)
return venv_dir
except subprocess.CalledProcessError as e:
print(e)
def app_venv(self, venv_parent: Path) -> Path:
"""
Create a virtualenv for the InvokeAI installation
"""
venv_dir = venv_parent / ".venv"
# Prefer to copy python executables
# so that updates to system python don't break InvokeAI
try:
venv.create(venv_dir, with_pip=True)
# If installing over an existing environment previously created with symlinks,
# the executables will fail to copy. Keep symlinks in that case
except shutil.SameFileError:
venv.create(venv_dir, with_pip=True, symlinks=True)
return venv_dir
def install(
self,
root: str = "~/invokeai",
yes_to_all: bool = False,
find_links: Optional[str] = None,
wheel: Optional[Path] = None,
) -> None:
"""Install the InvokeAI application into the given runtime path
Args:
root: Destination path for the installation
yes_to_all: Accept defaults to all questions
find_links: A local directory to search for requirement wheels before going to remote indexes
wheel: A wheel file to install
"""
import messages
if wheel:
messages.installing_from_wheel(wheel.name)
version = get_version_from_wheel_filename(wheel.name)
else:
messages.welcome(self.available_releases)
version = messages.choose_version(self.available_releases)
auto_dest = Path(os.environ.get("INVOKEAI_ROOT", root)).expanduser().resolve()
destination = auto_dest if yes_to_all else messages.dest_path(root)
if destination is None:
print("Could not find or create the destination directory. Installation cancelled.")
sys.exit(0)
# create the venv for the app
self.venv = self.app_venv(venv_parent=destination)
self.instance = InvokeAiInstance(runtime=destination, venv=self.venv, version=version)
# install dependencies and the InvokeAI application
(extra_index_url, optional_modules) = get_torch_source() if not yes_to_all else (None, None)
self.instance.install(extra_index_url, optional_modules, find_links, wheel)
# install the launch/update scripts into the runtime directory
self.instance.install_user_scripts()
message = f"""
*** Installation Successful ***
To start the application, run:
{destination}/invoke.{"bat" if sys.platform == "win32" else "sh"}
For more information, troubleshooting and support, visit our docs at:
{DOCS_URL}
Join the community on Discord:
{DISCORD_URL}
"""
print(message)
class InvokeAiInstance:
"""
Manages an installed instance of InvokeAI, comprising a virtual environment and a runtime directory.
The virtual environment *may* reside within the runtime directory.
A single runtime directory *may* be shared by multiple virtual environments, though this isn't currently tested or supported.
"""
def __init__(self, runtime: Path, venv: Path, version: str = "stable") -> None:
self.runtime = runtime
self.venv = venv
self.pip = get_pip_from_venv(venv)
self.version = version
set_sys_path(venv)
os.environ["INVOKEAI_ROOT"] = str(self.runtime.expanduser().resolve())
os.environ["VIRTUAL_ENV"] = str(self.venv.expanduser().resolve())
upgrade_pip(venv)
def get(self) -> tuple[Path, Path]:
"""
Get the location of the virtualenv directory for this installation
:return: Paths of the runtime and the venv directory
:rtype: tuple[Path, Path]
"""
return (self.runtime, self.venv)
def install(
self,
extra_index_url: Optional[str] = None,
optional_modules: Optional[str] = None,
find_links: Optional[str] = None,
wheel: Optional[Path] = None,
):
"""Install the package from PyPi or a wheel, if provided.
Args:
extra_index_url: the "--extra-index-url ..." line for pip to look in extra indexes.
optional_modules: optional modules to install using "[module1,module2]" format.
find_links: path to a directory containing wheels to be searched prior to going to the internet
wheel: a wheel file to install
"""
import messages
# not currently used, but may be useful for "install most recent version" option
if self.version == "prerelease":
version = None
pre_flag = "--pre"
elif self.version == "stable":
version = None
pre_flag = None
else:
version = self.version
pre_flag = None
src = "invokeai"
if optional_modules:
src += optional_modules
if version:
src += f"=={version}"
messages.simple_banner("Installing the InvokeAI Application :art:")
from plumbum import FG, ProcessExecutionError, local
pip = local[self.pip]
# Uninstall xformers if it is present; the correct version of it will be reinstalled if needed
_ = pip["uninstall", "-yqq", "xformers"] & FG
pipeline = pip[
"install",
"--require-virtualenv",
"--force-reinstall",
"--use-pep517",
str(src) if not wheel else str(wheel),
"--find-links" if find_links is not None else None,
find_links,
"--extra-index-url" if extra_index_url is not None else None,
extra_index_url,
pre_flag if not wheel else None, # Ignore the flag if we are installing a wheel
]
try:
_ = pipeline & FG
except ProcessExecutionError as e:
print(f"Error: {e}")
print(
"Could not install InvokeAI. Please try downloading the latest version of the installer and install again."
)
sys.exit(1)
def install_user_scripts(self):
"""
Copy the launch and update scripts to the runtime dir
"""
ext = "bat" if OS == "Windows" else "sh"
scripts = ["invoke"]
for script in scripts:
src = Path(__file__).parent / ".." / "templates" / f"{script}.{ext}.in"
dest = self.runtime / f"{script}.{ext}"
shutil.copy(src, dest)
os.chmod(dest, 0o0755)
### Utility functions ###
def get_pip_from_venv(venv_path: Path) -> str:
"""
Given a path to a virtual environment, get the absolute path to the `pip` executable
in a cross-platform fashion. Does not validate that the pip executable
actually exists in the virtualenv.
:param venv_path: Path to the virtual environment
:type venv_path: Path
:return: Absolute path to the pip executable
:rtype: str
"""
pip = "Scripts\\pip.exe" if OS == "Windows" else "bin/pip"
return str(venv_path.expanduser().resolve() / pip)
def upgrade_pip(venv_path: Path) -> str | None:
"""
Upgrade the pip executable in the given virtual environment
"""
python = "Scripts\\python.exe" if OS == "Windows" else "bin/python"
python = str(venv_path.expanduser().resolve() / python)
try:
result = subprocess.check_output([python, "-m", "pip", "install", "--upgrade", "pip"]).decode(
encoding=locale.getpreferredencoding()
)
except subprocess.CalledProcessError as e:
print(e)
result = None
return result
def set_sys_path(venv_path: Path) -> None:
"""
Given a path to a virtual environment, set the sys.path, in a cross-platform fashion,
such that packages from the given venv may be imported in the current process.
Ensure that the packages from system environment are not visible (emulate
the virtual env 'activate' script) - this doesn't work on Windows yet.
:param venv_path: Path to the virtual environment
:type venv_path: Path
"""
# filter out any paths in sys.path that may be system- or user-wide
# but leave the temporary bootstrap virtualenv as it contains packages we
# temporarily need at install time
sys.path = list(filter(lambda p: not p.endswith("-packages") or p.find(BOOTSTRAP_VENV_PREFIX) != -1, sys.path))
# determine site-packages/lib directory location for the venv
lib = "Lib" if OS == "Windows" else f"lib/python{sys.version_info.major}.{sys.version_info.minor}"
# add the site-packages location to the venv
sys.path.append(str(Path(venv_path, lib, "site-packages").expanduser().resolve()))
def get_github_releases() -> tuple[list[str], list[str]] | None:
"""
Query Github for published (pre-)release versions.
Return a tuple where the first element is a list of stable releases and the second element is a list of pre-releases.
Return None if the query fails for any reason.
"""
import requests
## get latest releases using github api
url = "https://api.github.com/repos/invoke-ai/InvokeAI/releases"
releases: list[str] = []
pre_releases: list[str] = []
try:
res = requests.get(url)
res.raise_for_status()
tag_info = res.json()
for tag in tag_info:
if not tag["prerelease"]:
releases.append(tag["tag_name"].lstrip("v"))
else:
pre_releases.append(tag["tag_name"].lstrip("v"))
except requests.HTTPError as e:
print(f"Error: {e}")
print("Could not fetch version information from GitHub. Please check your network connection and try again.")
return
except Exception as e:
print(f"Error: {e}")
print("An unexpected error occurred while trying to fetch version information from GitHub. Please try again.")
return
releases.sort(reverse=True)
pre_releases.sort(reverse=True)
return releases, pre_releases
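A hedged sketch of how a caller might consume the result; the printed text is illustrative, and "latest" here simply means the first element of each reverse-sorted list returned above.
result = get_github_releases()
if result is not None:
    stable, pre = result
    # first element of each list is the most recent tag after the reverse sort above
    print(f"Latest stable: {stable[0] if stable else 'n/a'}")
    print(f"Latest pre-release: {pre[0] if pre else 'n/a'}")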
def get_torch_source() -> Tuple[str | None, str | None]:
"""
Determine the extra index URL for pip to use for torch installation.
This depends on the OS and the graphics accelerator in use.
This is only applicable to Windows and Linux, since PyTorch does not
offer accelerated builds for macOS.
Prefer CUDA-enabled wheels if the user wasn't sure of their GPU, as the CUDA build will fall back to CPU if needed.
A None URL means pip should just use the default PyPI index.
:return: tuple consisting of (extra index url or None, optional modules to load or None)
:rtype: tuple
"""
from messages import GpuType, select_gpu
# device can be one of: "cuda", "rocm", "cpu", "cuda_and_dml", "autodetect"
device = select_gpu()
# The correct extra index URLs for torch are inconsistent, see https://pytorch.org/get-started/locally/#start-locally
url = None
optional_modules: str | None = None
if OS == "Linux":
if device == GpuType.ROCM:
url = "https://download.pytorch.org/whl/rocm6.1"
elif device == GpuType.CPU:
url = "https://download.pytorch.org/whl/cpu"
elif device == GpuType.CUDA:
url = "https://download.pytorch.org/whl/cu124"
optional_modules = "[onnx-cuda]"
elif device == GpuType.CUDA_WITH_XFORMERS:
url = "https://download.pytorch.org/whl/cu124"
optional_modules = "[xformers,onnx-cuda]"
elif OS == "Windows":
if device == GpuType.CUDA:
url = "https://download.pytorch.org/whl/cu124"
optional_modules = "[onnx-cuda]"
elif device == GpuType.CUDA_WITH_XFORMERS:
url = "https://download.pytorch.org/whl/cu124"
optional_modules = "[xformers,onnx-cuda]"
elif device.value == "cpu":
# CPU uses the default PyPI index, no optional modules
pass
elif OS == "Darwin":
# macOS uses the default PyPI index, no optional modules
pass
# Fall back to defaults
return (url, optional_modules)
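The tuple returned here is ultimately translated into pip arguments; a minimal sketch, assuming the package spec shown is how the installer consumes it (the exact invocation lives elsewhere in the installer):
url, optional_modules = get_torch_source()
cmd = ["pip", "install", f"InvokeAI{optional_modules or ''}"]
if url is not None:
    cmd.append(f"--extra-index-url={url}")
# e.g. pip install InvokeAI[xformers,onnx-cuda] --extra-index-url=https://download.pytorch.org/whl/cu124
print(" ".join(cmd))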

View File

@@ -1,57 +0,0 @@
"""
InvokeAI Installer
"""
import argparse
import os
from pathlib import Path
from installer import Installer
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"-r",
"--root",
dest="root",
type=str,
help="Destination path for installation",
default=os.environ.get("INVOKEAI_ROOT") or "~/invokeai",
)
parser.add_argument(
"-y",
"--yes",
"--yes-to-all",
dest="yes_to_all",
action="store_true",
help="Assume default answers to all questions",
default=False,
)
parser.add_argument(
"--find-links",
dest="find_links",
help="Specifies a directory of local wheel files to be searched prior to searching the online repositories.",
type=Path,
default=None,
)
parser.add_argument(
"--wheel",
dest="wheel",
help="Specifies a wheel for the InvokeAI package. Used for troubleshooting or testing prereleases.",
type=Path,
default=None,
)
args = parser.parse_args()
inst = Installer()
try:
inst.install(**args.__dict__)
except KeyboardInterrupt:
print("\n")
print("Ctrl-C pressed. Aborting.")
print("Come back soon!")

View File

@@ -1,342 +0,0 @@
# Copyright (c) 2023 Eugene Brodsky (https://github.com/ebr)
"""
Installer user interaction
"""
import os
import platform
from enum import Enum
from pathlib import Path
from typing import Optional
from prompt_toolkit import prompt
from prompt_toolkit.completion import FuzzyWordCompleter, PathCompleter
from prompt_toolkit.validation import Validator
from rich import box, print
from rich.console import Console, Group, group
from rich.panel import Panel
from rich.prompt import Confirm
from rich.style import Style
from rich.syntax import Syntax
from rich.text import Text
OS = platform.uname().system
ARCH = platform.uname().machine
if OS == "Windows":
# Windows terminals look better without a background colour
console = Console(style=Style(color="grey74"))
else:
console = Console(style=Style(color="grey74", bgcolor="grey19"))
def welcome(available_releases: tuple[list[str], list[str]] | None = None) -> None:
@group()
def text():
if (platform_specific := _platform_specific_help()) is not None:
yield platform_specific
yield ""
yield Text.from_markup(
"Some of the installation steps take a long time to run. Please be patient. If the script appears to hang for more than 10 minutes, please interrupt with [i]Control-C[/] and retry.",
justify="center",
)
if available_releases is not None:
latest_stable = available_releases[0][0]
last_pre = available_releases[1][0]
yield ""
yield Text.from_markup(
f"[red3]🠶[/] Latest stable release (recommended): [b bright_white]{latest_stable}", justify="center"
)
yield Text.from_markup(
f"[red3]🠶[/] Last published pre-release version: [b bright_white]{last_pre}", justify="center"
)
console.rule()
print(
Panel(
title="[bold wheat1]Welcome to the InvokeAI Installer",
renderable=text(),
box=box.DOUBLE,
expand=True,
padding=(1, 2),
style=Style(bgcolor="grey23", color="orange1"),
subtitle=f"[bold grey39]{OS}-{ARCH}",
)
)
console.line()
def installing_from_wheel(wheel_filename: str) -> None:
"""Display a message about installing from a wheel"""
@group()
def text():
yield Text.from_markup(f"You are installing from a wheel file: [bold]{wheel_filename}\n")
yield Text.from_markup(
"[bold orange3]If you are not sure why you are doing this, you should cancel and install InvokeAI normally."
)
console.print(
Panel(
title="Installing from Wheel",
renderable=text(),
box=box.DOUBLE,
expand=True,
padding=(1, 2),
)
)
should_proceed = Confirm.ask("Do you want to proceed?")
if not should_proceed:
console.print("Installation cancelled.")
exit()
def choose_version(available_releases: tuple[list[str], list[str]] | None = None) -> str:
"""
Prompt the user to choose an Invoke version to install
"""
# short circuit if we couldn't get a version list
# still try to install the latest stable version
if available_releases is None:
return "stable"
console.print(":grey_question: [orange3]Please choose an Invoke version to install.")
choices = available_releases[0] + available_releases[1]
response = prompt(
message=f" <Enter> to install the recommended release ({choices[0]}). <Tab> or type to pick a version: ",
complete_while_typing=True,
completer=FuzzyWordCompleter(choices),
)
console.print(f" Version {choices[0] if response == '' else response} will be installed.")
console.line()
return "stable" if response == "" else response
def confirm_install(dest: Path) -> bool:
if dest.exists():
print(f":stop_sign: Directory {dest} already exists!")
print(" Is this location correct?")
default = False
else:
print(f":file_folder: InvokeAI will be installed in {dest}")
default = True
dest_confirmed = Confirm.ask(" Please confirm:", default=default)
console.line()
return dest_confirmed
def dest_path(dest: Optional[str | Path] = None) -> Path | None:
"""
Prompt the user for the destination path and create the path
:param dest: a filesystem path, defaults to None
:type dest: str, optional
:return: absolute path to the created installation directory
:rtype: Path
"""
if dest is not None:
dest = Path(dest).expanduser().resolve()
else:
dest = Path.cwd().expanduser().resolve()
prev_dest = init_path = dest
dest_confirmed = False
while not dest_confirmed:
browse_start = (dest or Path.cwd()).expanduser().resolve()
path_completer = PathCompleter(
only_directories=True,
expanduser=True,
get_paths=lambda: [str(browse_start)], # noqa: B023
# get_paths=lambda: [".."].extend(list(browse_start.iterdir()))
)
console.line()
console.print(f":grey_question: [orange3]Please select the install destination:[/] \\[{browse_start}]: ")
selected = prompt(
">>> ",
complete_in_thread=True,
completer=path_completer,
default=str(browse_start) + os.sep,
vi_mode=True,
complete_while_typing=True,
# Test that this is not needed on Windows
# complete_style=CompleteStyle.READLINE_LIKE,
)
prev_dest = dest
dest = Path(selected)
console.line()
dest_confirmed = confirm_install(dest.expanduser().resolve())
if not dest_confirmed:
dest = prev_dest
dest = dest.expanduser().resolve()
try:
dest.mkdir(exist_ok=True, parents=True)
return dest
except PermissionError:
console.print(
f"Failed to create directory {dest} due to insufficient permissions",
style=Style(color="red"),
highlight=True,
)
except OSError:
console.print_exception()
if Confirm.ask("Would you like to try again?"):
dest_path(init_path)
else:
console.rule("Goodbye!")
class GpuType(Enum):
CUDA_WITH_XFORMERS = "xformers"
CUDA = "cuda"
ROCM = "rocm"
CPU = "cpu"
def select_gpu() -> GpuType:
"""
Prompt the user to select the GPU driver
"""
if ARCH == "arm64" and OS != "Darwin":
print(f"Only CPU acceleration is available on {ARCH} architecture. Proceeding with that.")
return GpuType.CPU
nvidia = (
"an [gold1 b]NVIDIA[/] RTX 3060 or newer GPU using CUDA",
GpuType.CUDA,
)
vintage_nvidia = (
"an [gold1 b]NVIDIA[/] RTX 20xx or older GPU using CUDA+xFormers",
GpuType.CUDA_WITH_XFORMERS,
)
amd = (
"an [gold1 b]AMD[/] GPU using ROCm",
GpuType.ROCM,
)
cpu = (
"Do not install any GPU support, use CPU for generation (slow)",
GpuType.CPU,
)
options = []
if OS == "Windows":
options = [nvidia, vintage_nvidia, cpu]
if OS == "Linux":
options = [nvidia, vintage_nvidia, amd, cpu]
elif OS == "Darwin":
options = [cpu]
if len(options) == 1:
return options[0][1]
options = {str(i): opt for i, opt in enumerate(options, 1)}
console.rule(":space_invader: GPU (Graphics Card) selection :space_invader:")
console.print(
Panel(
Group(
"\n".join(
[
f"Detected the [gold1]{OS}-{ARCH}[/] platform",
"",
"See [deep_sky_blue1]https://invoke-ai.github.io/InvokeAI/installation/requirements/[/] to ensure your system meets the minimum requirements.",
"",
"[red3]🠶[/] [b]Your GPU drivers must be correctly installed before using InvokeAI![/] [red3]🠴[/]",
]
),
"",
"Please select the type of GPU installed in your computer.",
Panel(
"\n".join([f"[dark_goldenrod b i]{i}[/] [dark_red]🢒[/]{opt[0]}" for (i, opt) in options.items()]),
box=box.MINIMAL,
),
),
box=box.MINIMAL,
padding=(1, 1),
)
)
choice = prompt(
"Please make your selection: ",
validator=Validator.from_callable(
lambda n: n in options.keys(), error_message="Please select one of the above options"
),
)
return options[choice][1]
def simple_banner(message: str) -> None:
"""
A simple banner with a message, defined here for styling consistency
:param message: The message to display
:type message: str
"""
console.rule(message)
# TODO this does not yet work correctly
def windows_long_paths_registry() -> None:
"""
Display a message about applying the Windows long paths registry fix
"""
with open(str(Path(__file__).parent / "WinLongPathsEnabled.reg"), "r", encoding="utf-16le") as code:
syntax = Syntax(code.read(), line_numbers=True, lexer="regedit")
console.print(
Panel(
Group(
"\n".join(
[
"We will now apply a registry fix to enable long paths on Windows. InvokeAI needs this to function correctly. We are asking your permission to modify the Windows Registry on your behalf.",
"",
"This is the change that will be applied:",
str(syntax),
]
)
),
title="Windows Long Paths registry fix",
box=box.HORIZONTALS,
padding=(1, 1),
)
)
def _platform_specific_help() -> Text | None:
if OS == "Darwin":
text = Text.from_markup(
"""[b wheat1]macOS Users![/]\n\nPlease be sure you have the [b wheat1]Xcode command-line tools[/] installed before continuing.\nIf not, cancel with [i]Control-C[/] and follow the Xcode install instructions at [deep_sky_blue1]https://www.freecodecamp.org/news/install-xcode-command-line-tools/[/]."""
)
elif OS == "Windows":
text = Text.from_markup(
"""[b wheat1]Windows Users![/]\n\nBefore you start, please do the following:
1. Double-click on the file [b wheat1]WinLongPathsEnabled.reg[/] in order to
enable long path support on your system.
2. Make sure you have the [b wheat1]Visual C++ core libraries[/] installed. If not, install from
[deep_sky_blue1]https://learn.microsoft.com/en-US/cpp/windows/latest-supported-vc-redist?view=msvc-170[/]"""
)
else:
return
return text

View File

@@ -1,52 +0,0 @@
InvokeAI
Project homepage: https://github.com/invoke-ai/InvokeAI
Preparations:
You will need to install Python 3.10 or higher for this installer
to work. Instructions are given here:
https://invoke-ai.github.io/InvokeAI/installation/INSTALL_AUTOMATED/
Before you start the installer, please open up your system's command
line window (Terminal or Command) and type the commands:
python --version
If all is well, it will print "Python 3.X.X", where the version number
is at least 3.10.*, and not higher than 3.11.*.
If this works, check the version of the Python package manager, pip:
pip --version
You should get a message that indicates that the pip package
installer was derived from Python 3.10 or 3.11. For example:
"pip 22.0.1 from /usr/bin/pip (python 3.10)"
Long Paths on Windows:
If you are on Windows, you will need to enable Windows Long Paths to
run InvokeAI successfully. If you're not sure what this is, you
almost certainly need to do this.
Simply double-click the "WinLongPathsEnabled.reg" file located in
this directory, and approve the Windows warnings. Note that you will
need to have admin privileges in order to do this.
Launching the installer:
Windows: double-click the 'install.bat' file (while keeping it inside
the InvokeAI-Installer folder).
Linux and Mac: Please open the terminal application and run
'./install.sh' (while keeping it inside the InvokeAI-Installer
folder).
The installer will create a directory of your choice and install the
InvokeAI application within it. This directory contains everything you need to run
InvokeAI. Once InvokeAI is up and running, you may delete the
InvokeAI-Installer folder at your convenience.
For more information, please see
https://invoke-ai.github.io/InvokeAI/installation/INSTALL_AUTOMATED/

View File

@@ -1,54 +0,0 @@
@echo off
PUSHD "%~dp0"
setlocal
call .venv\Scripts\activate.bat
set INVOKEAI_ROOT=.
:start
echo Desired action:
echo 1. Generate images with the browser-based interface
echo 2. Open the developer console
echo 3. Command-line help
echo Q - Quit
echo.
echo To update, download and run the installer from https://github.com/invoke-ai/InvokeAI/releases/latest
echo.
set /P choice="Please enter 1-3, Q: [1] "
if not defined choice set choice=1
IF /I "%choice%" == "1" (
echo Starting the InvokeAI browser-based UI..
python .venv\Scripts\invokeai-web.exe %*
) ELSE IF /I "%choice%" == "2" (
echo Developer Console
echo Python command is:
where python
echo Python version is:
python --version
echo *************************
echo You are now in the system shell, with the local InvokeAI Python virtual environment activated,
echo so that you can troubleshoot this InvokeAI installation as necessary.
echo *************************
echo *** Type `exit` to quit this shell and deactivate the Python virtual environment ***
call cmd /k
) ELSE IF /I "%choice%" == "3" (
echo Displaying command line help...
python .venv\Scripts\invokeai-web.exe --help %*
pause
exit /b
) ELSE IF /I "%choice%" == "q" (
echo Goodbye!
goto ending
) ELSE (
echo Invalid selection
pause
exit /b
)
goto start
endlocal
pause
:ending
exit /b

View File

@@ -1,87 +0,0 @@
#!/bin/bash
# MIT License
# Coauthored by Lincoln Stein, Eugene Brodsky and Joshua Kimsey
# Copyright 2023, The InvokeAI Development Team
####
# This launch script assumes that:
# 1. it is located in the runtime directory,
# 2. the .venv is also located in the runtime directory and is named exactly that
#
# If either of the above is not true, this script will likely not work as intended.
# Activate the virtual environment and run `invoke.py` directly.
####
set -eu
# Ensure we're in the correct folder in case user's CWD is somewhere else
scriptdir=$(dirname $(readlink -f "$0"))
cd "$scriptdir"
. .venv/bin/activate
export INVOKEAI_ROOT="$scriptdir"
# Stash the CLI args - when we prompt for user input, `$@` is overwritten
PARAMS=$@
# This setting allows torch to fall back to CPU for operations that are not supported by MPS on macOS.
if [ "$(uname -s)" == "Darwin" ]; then
export PYTORCH_ENABLE_MPS_FALLBACK=1
fi
# Primary function for the case statement to determine user input
do_choice() {
case $1 in
1)
clear
printf "Generate images with a browser-based interface\n"
invokeai-web $PARAMS
;;
2)
clear
printf "Open the developer console\n"
file_name=$(basename "${BASH_SOURCE[0]}")
bash --init-file "$file_name"
;;
3)
clear
printf "Command-line help\n"
invokeai-web --help
;;
*)
clear
printf "Exiting...\n"
exit
;;
esac
clear
}
# Command-line interface for launching Invoke functions
do_line_input() {
clear
printf "What would you like to do?\n"
printf "1: Generate images using the browser-based interface\n"
printf "2: Open the developer console\n"
printf "3: Command-line help\n"
printf "Q: Quit\n\n"
printf "To update, download and run the installer from https://github.com/invoke-ai/InvokeAI/releases/latest\n\n"
read -p "Please enter 1-3, Q: [1] " yn
choice=${yn:='1'}
do_choice $choice
clear
}
# Main IF statement for launching Invoke, and for checking if the user is in the developer console
if [ "$0" != "bash" ]; then
while true; do
do_line_input
done
else # in developer console
python --version
printf "Press ^D to exit\n"
export PS1="(InvokeAI) \u@\h \w> "
fi

View File

@@ -36,7 +36,15 @@ from invokeai.app.services.style_preset_images.style_preset_images_disk import S
from invokeai.app.services.style_preset_records.style_preset_records_sqlite import SqliteStylePresetRecordsStorage
from invokeai.app.services.urls.urls_default import LocalUrlService
from invokeai.app.services.workflow_records.workflow_records_sqlite import SqliteWorkflowRecordsStorage
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import ConditioningFieldData
from invokeai.app.services.workflow_thumbnails.workflow_thumbnails_disk import WorkflowThumbnailFileStorageDisk
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import (
BasicConditioningInfo,
CogView4ConditioningInfo,
ConditioningFieldData,
FLUXConditioningInfo,
SD3ConditioningInfo,
SDXLConditioningInfo,
)
from invokeai.backend.util.logging import InvokeAILogger
from invokeai.version.invokeai_version import __version__
@@ -83,6 +91,7 @@ class ApiDependencies:
model_images_folder = config.models_path
style_presets_folder = config.style_presets_path
workflow_thumbnails_folder = config.workflow_thumbnails_path
db = init_db(config=config, logger=logger, image_files=image_files)
@@ -99,10 +108,25 @@ class ApiDependencies:
images = ImageService()
invocation_cache = MemoryInvocationCache(max_cache_size=config.node_cache_size)
tensors = ObjectSerializerForwardCache(
ObjectSerializerDisk[torch.Tensor](output_folder / "tensors", ephemeral=True)
ObjectSerializerDisk[torch.Tensor](
output_folder / "tensors",
safe_globals=[torch.Tensor],
ephemeral=True,
),
)
conditioning = ObjectSerializerForwardCache(
ObjectSerializerDisk[ConditioningFieldData](output_folder / "conditioning", ephemeral=True)
ObjectSerializerDisk[ConditioningFieldData](
output_folder / "conditioning",
safe_globals=[
ConditioningFieldData,
BasicConditioningInfo,
SDXLConditioningInfo,
FLUXConditioningInfo,
SD3ConditioningInfo,
CogView4ConditioningInfo,
],
ephemeral=True,
),
)
download_queue_service = DownloadQueueService(app_config=configuration, event_bus=events)
model_images_service = ModelImageFileStorageDisk(model_images_folder / "model_images")
@@ -120,6 +144,7 @@ class ApiDependencies:
workflow_records = SqliteWorkflowRecordsStorage(db=db)
style_preset_records = SqliteStylePresetRecordsStorage(db=db)
style_preset_image_files = StylePresetImageFileStorageDisk(style_presets_folder / "images")
workflow_thumbnails = WorkflowThumbnailFileStorageDisk(workflow_thumbnails_folder)
services = InvocationServices(
board_image_records=board_image_records,
@@ -147,6 +172,7 @@ class ApiDependencies:
conditioning=conditioning,
style_preset_records=style_preset_records,
style_preset_image_files=style_preset_image_files,
workflow_thumbnails=workflow_thumbnails,
)
ApiDependencies.invoker = Invoker(services)
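The safe_globals lists above mirror PyTorch's allowlisting for weights-only deserialization; a rough sketch of the same idea in recent PyTorch versions (the class here is hypothetical and version support varies):
import torch

class ExampleConditioning:  # hypothetical stand-in for the conditioning classes above
    pass

# allowlist the class so torch.load(..., weights_only=True) will accept instances of it
torch.serialization.add_safe_globals([ExampleConditioning])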

View File

@@ -0,0 +1,124 @@
import json
import logging
from dataclasses import dataclass
from PIL import Image
from invokeai.app.services.workflow_records.workflow_records_common import WorkflowWithoutIDValidator
@dataclass
class ExtractedMetadata:
invokeai_metadata: str | None
invokeai_workflow: str | None
invokeai_graph: str | None
def extract_metadata_from_image(
pil_image: Image.Image,
invokeai_metadata_override: str | None,
invokeai_workflow_override: str | None,
invokeai_graph_override: str | None,
logger: logging.Logger,
) -> ExtractedMetadata:
"""
Extracts the "invokeai_metadata", "invokeai_workflow", and "invokeai_graph" data embedded in the PIL Image.
These items are stored as stringified JSON in the image file's metadata, so we need to do some parsing to validate
them. Once parsed, the values are returned as they came (as strings), or None if they are not present or invalid.
In some situations, we may prefer to override the values extracted from the image file with some other values.
For example, when uploading an image via API, the client can optionally provide the metadata directly in the request,
as opposed to embedding it in the image file. In this case, the client-provided metadata will be used instead of the
metadata embedded in the image file.
Args:
pil_image: The PIL Image object.
invokeai_metadata_override: The metadata override provided by the client.
invokeai_workflow_override: The workflow override provided by the client.
invokeai_graph_override: The graph override provided by the client.
logger: The logger to use for debug logging.
Returns:
ExtractedMetadata: The extracted metadata, workflow, and graph.
"""
# The fallback value for metadata is None.
stringified_metadata: str | None = None
# Use the metadata override if provided, else attempt to extract it from the image file.
metadata_raw = invokeai_metadata_override or pil_image.info.get("invokeai_metadata", None)
# If the metadata is present in the image file, we will attempt to parse it as JSON. When we create images,
# we always store metadata as a stringified JSON dict. So, we expect it to be a string here.
if isinstance(metadata_raw, str):
try:
# Must be a JSON string
metadata_parsed = json.loads(metadata_raw)
# Must be a dict
if isinstance(metadata_parsed, dict):
# Looks good, overwrite the fallback value
stringified_metadata = metadata_raw
except Exception as e:
logger.debug(f"Failed to parse metadata for uploaded image, {e}")
pass
# We expect the workflow, if embedded in the image, to be a JSON-stringified WorkflowWithoutID. We will store it
# as a string.
workflow_raw: str | None = invokeai_workflow_override or pil_image.info.get("invokeai_workflow", None)
# The fallback value for workflow is None.
stringified_workflow: str | None = None
# If the workflow is present in the image file, we will attempt to parse it as JSON. When we create images, we
# always store workflows as a stringified JSON WorkflowWithoutID. So, we expect it to be a string here.
if isinstance(workflow_raw, str):
try:
# Validate the workflow JSON before storing it
WorkflowWithoutIDValidator.validate_json(workflow_raw)
# Looks good, overwrite the fallback value
stringified_workflow = workflow_raw
except Exception:
logger.debug("Failed to parse workflow for uploaded image")
pass
# We expect the workflow, if embedded in the image, to be a JSON-stringified Graph. We will store it as a
# string.
graph_raw: str | None = invokeai_graph_override or pil_image.info.get("invokeai_graph", None)
# The fallback value for graph is None.
stringified_graph: str | None = None
# If the graph is present in the image file, we will attempt to parse it as JSON. When we create images, we
# always store graphs as a stringified JSON Graph. So, we expect it to be a string here.
if isinstance(graph_raw, str):
try:
# TODO(psyche): Due to pydantic's handling of None values, it is possible for the graph to fail validation,
# even if it is a direct dump of a valid graph. Node fields in the graph are allowed to be unset if
# they have incoming connections, but something about the ser/de process cannot adequately handle this.
#
# In lieu of fixing the graph validation, we will just do a simple check here to see if the graph is dict
# with the correct keys. This is not a perfect solution, but it should be good enough for now.
# FIX ME: Validate the graph JSON before storing it
# Graph.model_validate_json(graph_raw)
# Crappy workaround to validate JSON
graph_parsed = json.loads(graph_raw)
if not isinstance(graph_parsed, dict):
raise ValueError("Not a dict")
if not isinstance(graph_parsed.get("nodes", None), dict):
raise ValueError("'nodes' is not a dict")
if not isinstance(graph_parsed.get("edges", None), list):
raise ValueError("'edges' is not a list")
# Looks good, overwrite the fallback value
stringified_graph = graph_raw
except Exception as e:
logger.debug(f"Failed to parse graph for uploaded image, {e}")
pass
return ExtractedMetadata(
invokeai_metadata=stringified_metadata, invokeai_workflow=stringified_workflow, invokeai_graph=stringified_graph
)
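A minimal usage sketch of the helper (the file name and logger are placeholders):
import logging
from PIL import Image

logger = logging.getLogger("uploads")
with Image.open("upload.png") as im:
    extracted = extract_metadata_from_image(
        pil_image=im,
        invokeai_metadata_override=None,
        invokeai_workflow_override=None,
        invokeai_graph_override=None,
        logger=logger,
    )
# each attribute is either the original stringified JSON or None
print(extracted.invokeai_metadata, extracted.invokeai_workflow, extracted.invokeai_graph)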

View File

@@ -12,6 +12,7 @@ from pydantic import BaseModel, Field
from invokeai.app.api.dependencies import ApiDependencies
from invokeai.app.invocations.upscale import ESRGAN_MODELS
from invokeai.app.services.config.config_default import InvokeAIAppConfig, get_config
from invokeai.app.services.invocation_cache.invocation_cache_common import InvocationCacheStatus
from invokeai.backend.image_util.infill_methods.patchmatch import PatchMatch
from invokeai.backend.util.logging import logging
@@ -99,7 +100,7 @@ async def get_app_deps() -> AppDependencyVersions:
@app_router.get("/config", operation_id="get_config", status_code=200, response_model=AppConfig)
async def get_config() -> AppConfig:
async def get_config_() -> AppConfig:
infill_methods = ["lama", "tile", "cv2", "color"] # TODO: add mosaic back
if PatchMatch.patchmatch_available():
infill_methods.append("patchmatch")
@@ -121,6 +122,21 @@ async def get_config() -> AppConfig:
)
class InvokeAIAppConfigWithSetFields(BaseModel):
"""InvokeAI App Config with model fields set"""
set_fields: set[str] = Field(description="The set fields")
config: InvokeAIAppConfig = Field(description="The InvokeAI App Config")
@app_router.get(
"/runtime_config", operation_id="get_runtime_config", status_code=200, response_model=InvokeAIAppConfigWithSetFields
)
async def get_runtime_config() -> InvokeAIAppConfigWithSetFields:
config = get_config()
return InvokeAIAppConfigWithSetFields(set_fields=config.model_fields_set, config=config)
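A hedged client-side sketch of the new endpoint; the host, port, and /api/v1/app prefix are assumptions not shown in this diff.
import requests

resp = requests.get("http://localhost:9090/api/v1/app/runtime_config")
data = resp.json()
print(data["set_fields"])  # fields explicitly set by the user, per InvokeAIAppConfigWithSetFields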
@app_router.get(
"/logging",
operation_id="get_log_level",

View File

@@ -7,6 +7,7 @@ from pydantic import BaseModel, Field
from invokeai.app.api.dependencies import ApiDependencies
from invokeai.app.services.board_records.board_records_common import BoardChanges, BoardRecordOrderBy
from invokeai.app.services.boards.boards_common import BoardDTO
from invokeai.app.services.image_records.image_records_common import ImageCategory
from invokeai.app.services.shared.pagination import OffsetPaginatedResults
from invokeai.app.services.shared.sqlite.sqlite_common import SQLiteDirection
@@ -31,7 +32,7 @@ class DeleteBoardResult(BaseModel):
response_model=BoardDTO,
)
async def create_board(
board_name: str = Query(description="The name of the board to create"),
board_name: str = Query(description="The name of the board to create", max_length=300),
is_private: bool = Query(default=False, description="Whether the board is private"),
) -> BoardDTO:
"""Creates a board"""
@@ -87,7 +88,9 @@ async def delete_board(
try:
if include_images is True:
deleted_images = ApiDependencies.invoker.services.board_images.get_all_board_image_names_for_board(
board_id=board_id
board_id=board_id,
categories=None,
is_intermediate=None,
)
ApiDependencies.invoker.services.images.delete_images_on_board(board_id=board_id)
ApiDependencies.invoker.services.boards.delete(board_id=board_id)
@@ -98,7 +101,9 @@ async def delete_board(
)
else:
deleted_board_images = ApiDependencies.invoker.services.board_images.get_all_board_image_names_for_board(
board_id=board_id
board_id=board_id,
categories=None,
is_intermediate=None,
)
ApiDependencies.invoker.services.boards.delete(board_id=board_id)
return DeleteBoardResult(
@@ -142,10 +147,14 @@ async def list_boards(
)
async def list_all_board_image_names(
board_id: str = Path(description="The id of the board"),
categories: list[ImageCategory] | None = Query(default=None, description="The categories of image to include."),
is_intermediate: bool | None = Query(default=None, description="Whether to list intermediate images."),
) -> list[str]:
"""Gets a list of images for a board"""
image_names = ApiDependencies.invoker.services.board_images.get_all_board_image_names_for_board(
board_id,
categories,
is_intermediate,
)
return image_names
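A hedged sketch of a client exercising the new filters; the route prefix, board id, and category value are assumptions.
import requests

resp = requests.get(
    "http://localhost:9090/api/v1/boards/<board_id>/image_names",  # placeholder board id
    params={"categories": ["general"], "is_intermediate": False},
)
print(resp.json())  # list of image names matching the filters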

View File

@@ -6,9 +6,10 @@ from fastapi import BackgroundTasks, Body, HTTPException, Path, Query, Request,
from fastapi.responses import FileResponse
from fastapi.routing import APIRouter
from PIL import Image
from pydantic import BaseModel, Field, JsonValue
from pydantic import BaseModel, Field
from invokeai.app.api.dependencies import ApiDependencies
from invokeai.app.api.extract_metadata_from_image import extract_metadata_from_image
from invokeai.app.invocations.fields import MetadataField
from invokeai.app.services.image_records.image_records_common import (
ImageCategory,
@@ -45,18 +46,16 @@ async def upload_image(
board_id: Optional[str] = Query(default=None, description="The board to add this image to, if any"),
session_id: Optional[str] = Query(default=None, description="The session ID associated with this upload, if any"),
crop_visible: Optional[bool] = Query(default=False, description="Whether to crop the image"),
metadata: Optional[JsonValue] = Body(
default=None, description="The metadata to associate with the image", embed=True
metadata: Optional[str] = Body(
default=None,
description="The metadata to associate with the image, must be a stringified JSON dict",
embed=True,
),
) -> ImageDTO:
"""Uploads an image"""
if not file.content_type or not file.content_type.startswith("image"):
raise HTTPException(status_code=415, detail="Not an image")
_metadata = None
_workflow = None
_graph = None
contents = await file.read()
try:
pil_image = Image.open(io.BytesIO(contents))
@@ -67,30 +66,13 @@ async def upload_image(
ApiDependencies.invoker.services.logger.error(traceback.format_exc())
raise HTTPException(status_code=415, detail="Failed to read image")
# TODO: retain non-invokeai metadata on upload?
# attempt to parse metadata from image
metadata_raw = metadata if isinstance(metadata, str) else pil_image.info.get("invokeai_metadata", None)
if isinstance(metadata_raw, str):
_metadata = metadata_raw
else:
ApiDependencies.invoker.services.logger.debug("Failed to parse metadata for uploaded image")
pass
# attempt to parse workflow from image
workflow_raw = pil_image.info.get("invokeai_workflow", None)
if isinstance(workflow_raw, str):
_workflow = workflow_raw
else:
ApiDependencies.invoker.services.logger.debug("Failed to parse workflow for uploaded image")
pass
# attempt to extract graph from image
graph_raw = pil_image.info.get("invokeai_graph", None)
if isinstance(graph_raw, str):
_graph = graph_raw
else:
ApiDependencies.invoker.services.logger.debug("Failed to parse graph for uploaded image")
pass
extracted_metadata = extract_metadata_from_image(
pil_image=pil_image,
invokeai_metadata_override=metadata,
invokeai_workflow_override=None,
invokeai_graph_override=None,
logger=ApiDependencies.invoker.services.logger,
)
try:
image_dto = ApiDependencies.invoker.services.images.create(
@@ -99,9 +81,9 @@ async def upload_image(
image_category=image_category,
session_id=session_id,
board_id=board_id,
metadata=_metadata,
workflow=_workflow,
graph=_graph,
metadata=extracted_metadata.invokeai_metadata,
workflow=extracted_metadata.invokeai_workflow,
graph=extracted_metadata.invokeai_graph,
is_intermediate=is_intermediate,
)
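Because metadata is now passed as a stringified JSON dict rather than arbitrary JSON, a client upload might look roughly like this; the URL, query parameters, and file are placeholders.
import json
import requests

metadata = json.dumps({"positive_prompt": "a lighthouse at dusk"})  # must be a stringified dict
with open("image.png", "rb") as f:
    resp = requests.post(
        "http://localhost:9090/api/v1/images/upload",
        params={"image_category": "general", "is_intermediate": False},
        files={"file": ("image.png", f, "image/png")},
        data={"metadata": metadata},
    )
print(resp.status_code)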
@@ -114,6 +96,22 @@ async def upload_image(
raise HTTPException(status_code=500, detail="Failed to create image")
class ImageUploadEntry(BaseModel):
image_dto: ImageDTO = Body(description="The image DTO")
presigned_url: str = Body(description="The URL to get the presigned URL for the image upload")
@images_router.post("/", operation_id="create_image_upload_entry")
async def create_image_upload_entry(
width: int = Body(description="The width of the image"),
height: int = Body(description="The height of the image"),
board_id: Optional[str] = Body(default=None, description="The board to add this image to, if any"),
) -> ImageUploadEntry:
"""Uploads an image from a URL, not implemented"""
raise HTTPException(status_code=501, detail="Not implemented")
@images_router.delete("/i/{image_name}", operation_id="delete_image")
async def delete_image(
image_name: str = Path(description="The name of the image to delete"),

View File

@@ -4,7 +4,6 @@
import contextlib
import io
import pathlib
import shutil
import traceback
from copy import deepcopy
from enum import Enum
@@ -21,7 +20,6 @@ from starlette.exceptions import HTTPException
from typing_extensions import Annotated
from invokeai.app.api.dependencies import ApiDependencies
from invokeai.app.services.config import get_config
from invokeai.app.services.model_images.model_images_common import ModelImageFileNotFoundException
from invokeai.app.services.model_install.model_install_common import ModelInstallJob
from invokeai.app.services.model_records import (
@@ -30,14 +28,12 @@ from invokeai.app.services.model_records import (
UnknownModelException,
)
from invokeai.app.util.suppress_output import SuppressOutput
from invokeai.backend.model_manager import BaseModelType, ModelFormat, ModelType
from invokeai.backend.model_manager.config import (
AnyModelConfig,
BaseModelType,
MainCheckpointConfig,
ModelFormat,
ModelType,
)
from invokeai.backend.model_manager.load.model_cache.model_cache_base import CacheStats
from invokeai.backend.model_manager.load.model_cache.cache_stats import CacheStats
from invokeai.backend.model_manager.metadata.fetch.huggingface import HuggingFaceMetadataFetch
from invokeai.backend.model_manager.metadata.metadata_base import ModelMetadataWithFiles, UnknownMetadataException
from invokeai.backend.model_manager.search import ModelSearch
@@ -89,6 +85,7 @@ example_model_config = {
"config_path": "string",
"key": "string",
"hash": "string",
"file_size": 1,
"description": "string",
"source": "string",
"converted_at": 0,
@@ -848,74 +845,6 @@ async def get_starter_models() -> StarterModelResponse:
return StarterModelResponse(starter_models=starter_models, starter_bundles=starter_bundles)
@model_manager_router.get(
"/model_cache",
operation_id="get_cache_size",
response_model=float,
summary="Get maximum size of model manager RAM or VRAM cache.",
)
async def get_cache_size(cache_type: CacheType = Query(description="The cache type", default=CacheType.RAM)) -> float:
"""Return the current RAM or VRAM cache size setting (in GB)."""
cache = ApiDependencies.invoker.services.model_manager.load.ram_cache
value = 0.0
if cache_type == CacheType.RAM:
value = cache.max_cache_size
elif cache_type == CacheType.VRAM:
value = cache.max_vram_cache_size
return value
@model_manager_router.put(
"/model_cache",
operation_id="set_cache_size",
response_model=float,
summary="Set maximum size of model manager RAM or VRAM cache, optionally writing new value out to invokeai.yaml config file.",
)
async def set_cache_size(
value: float = Query(description="The new value for the maximum cache size"),
cache_type: CacheType = Query(description="The cache type", default=CacheType.RAM),
persist: bool = Query(description="Write new value out to invokeai.yaml", default=False),
) -> float:
"""Set the current RAM or VRAM cache size setting (in GB). ."""
cache = ApiDependencies.invoker.services.model_manager.load.ram_cache
app_config = get_config()
# Record initial state.
vram_old = app_config.vram
ram_old = app_config.ram
# Prepare target state.
vram_new = vram_old
ram_new = ram_old
if cache_type == CacheType.RAM:
ram_new = value
elif cache_type == CacheType.VRAM:
vram_new = value
else:
raise ValueError(f"Unexpected {cache_type=}.")
config_path = app_config.config_file_path
new_config_path = config_path.with_suffix(".yaml.new")
try:
# Try to apply the target state.
cache.max_vram_cache_size = vram_new
cache.max_cache_size = ram_new
app_config.ram = ram_new
app_config.vram = vram_new
if persist:
app_config.write_file(new_config_path)
shutil.move(new_config_path, config_path)
except Exception as e:
# If there was a failure, restore the initial state.
cache.max_cache_size = ram_old
cache.max_vram_cache_size = vram_old
app_config.ram = ram_old
app_config.vram = vram_old
raise RuntimeError("Failed to update cache size") from e
return value
@model_manager_router.get(
"/stats",
operation_id="get_stats",
@@ -928,6 +857,18 @@ async def get_stats() -> Optional[CacheStats]:
return ApiDependencies.invoker.services.model_manager.load.ram_cache.stats
@model_manager_router.post(
"/empty_model_cache",
operation_id="empty_model_cache",
status_code=200,
)
async def empty_model_cache() -> None:
"""Drop all models from the model cache to free RAM/VRAM. 'Locked' models that are in active use will not be dropped."""
# Request 1000GB of room in order to force the cache to drop all models.
ApiDependencies.invoker.services.logger.info("Emptying model cache.")
ApiDependencies.invoker.services.model_manager.load.ram_cache.make_room(1000 * 2**30)
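make_room(1000 * 2**30) asks for roughly a terabyte of headroom, which forces every unlocked model out of the cache. A hedged client sketch; the /api/v2/models prefix is an assumption.
import requests

requests.post("http://localhost:9090/api/v2/models/empty_model_cache")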
class HFTokenStatus(str, Enum):
VALID = "valid"
INVALID = "invalid"
@@ -952,6 +893,12 @@ class HFTokenHelper:
huggingface_hub.login(token=token, add_to_git_credential=False)
return cls.get_status()
@classmethod
def reset_token(cls) -> HFTokenStatus:
with SuppressOutput(), contextlib.suppress(Exception):
huggingface_hub.logout()
return cls.get_status()
@model_manager_router.get("/hf_login", operation_id="get_hf_login_status", response_model=HFTokenStatus)
async def get_hf_login_status() -> HFTokenStatus:
@@ -974,3 +921,8 @@ async def do_hf_login(
ApiDependencies.invoker.services.logger.warning("Unable to verify HF token")
return token_status
@model_manager_router.delete("/hf_login", operation_id="reset_hf_token", response_model=HFTokenStatus)
async def reset_hf_token() -> HFTokenStatus:
return HFTokenHelper.reset_token()
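This pairs with the new UI control for resetting the stored Hugging Face token; a hedged client sketch (route prefix assumed).
import requests

resp = requests.delete("http://localhost:9090/api/v2/models/hf_login")
print(resp.json())  # the resulting HFTokenStatus value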

View File

@@ -2,7 +2,7 @@ from typing import Optional
from fastapi import Body, Path, Query
from fastapi.routing import APIRouter
from pydantic import BaseModel
from pydantic import BaseModel, Field
from invokeai.app.api.dependencies import ApiDependencies
from invokeai.app.services.session_processor.session_processor_common import SessionProcessorStatus
@@ -10,11 +10,14 @@ from invokeai.app.services.session_queue.session_queue_common import (
QUEUE_ITEM_STATUS,
Batch,
BatchStatus,
CancelAllExceptCurrentResult,
CancelByBatchIDsResult,
CancelByDestinationResult,
ClearResult,
EnqueueBatchResult,
FieldIdentifier,
PruneResult,
RetryItemsResult,
SessionQueueCountsByDestination,
SessionQueueItem,
SessionQueueItemDTO,
@@ -32,6 +35,12 @@ class SessionQueueAndProcessorStatus(BaseModel):
processor: SessionProcessorStatus
class ValidationRunData(BaseModel):
workflow_id: str = Field(description="The id of the workflow being published.")
input_fields: list[FieldIdentifier] = Body(description="The input fields for the published workflow")
output_fields: list[FieldIdentifier] = Body(description="The output fields for the published workflow")
@session_queue_router.post(
"/{queue_id}/enqueue_batch",
operation_id="enqueue_batch",
@@ -43,10 +52,16 @@ async def enqueue_batch(
queue_id: str = Path(description="The queue id to perform this operation on"),
batch: Batch = Body(description="Batch to process"),
prepend: bool = Body(default=False, description="Whether or not to prepend this batch in the queue"),
validation_run_data: Optional[ValidationRunData] = Body(
default=None,
description="The validation run data to use for this batch. This is only used if this is a validation run.",
),
) -> EnqueueBatchResult:
"""Processes a batch and enqueues the output graphs for execution."""
return ApiDependencies.invoker.services.session_queue.enqueue_batch(queue_id=queue_id, batch=batch, prepend=prepend)
return await ApiDependencies.invoker.services.session_queue.enqueue_batch(
queue_id=queue_id, batch=batch, prepend=prepend
)
@session_queue_router.get(
@@ -94,6 +109,18 @@ async def Pause(
return ApiDependencies.invoker.services.session_processor.pause()
@session_queue_router.put(
"/{queue_id}/cancel_all_except_current",
operation_id="cancel_all_except_current",
responses={200: {"model": CancelAllExceptCurrentResult}},
)
async def cancel_all_except_current(
queue_id: str = Path(description="The queue id to perform this operation on"),
) -> CancelAllExceptCurrentResult:
"""Immediately cancels all queue items except in-processing items"""
return ApiDependencies.invoker.services.session_queue.cancel_all_except_current(queue_id=queue_id)
@session_queue_router.put(
"/{queue_id}/cancel_by_batch_ids",
operation_id="cancel_by_batch_ids",
@@ -110,7 +137,7 @@ async def cancel_by_batch_ids(
@session_queue_router.put(
"/{queue_id}/cancel_by_destination",
operation_id="cancel_by_destination",
responses={200: {"model": CancelByBatchIDsResult}},
responses={200: {"model": CancelByDestinationResult}},
)
async def cancel_by_destination(
queue_id: str = Path(description="The queue id to perform this operation on"),
@@ -122,6 +149,19 @@ async def cancel_by_destination(
)
@session_queue_router.put(
"/{queue_id}/retry_items_by_id",
operation_id="retry_items_by_id",
responses={200: {"model": RetryItemsResult}},
)
async def retry_items_by_id(
queue_id: str = Path(description="The queue id to perform this operation on"),
item_ids: list[int] = Body(description="The queue item ids to retry"),
) -> RetryItemsResult:
"""Immediately cancels all queue items with the given origin"""
return ApiDependencies.invoker.services.session_queue.retry_items_by_id(queue_id=queue_id, item_ids=item_ids)
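A hedged sketch of retrying specific queue items; the queue id, route prefix, and the bare-list body shape are assumptions.
import requests

resp = requests.put(
    "http://localhost:9090/api/v1/queue/default/retry_items_by_id",
    json=[12, 13],  # queue item ids to retry; body shape assumed
)
print(resp.json())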
@session_queue_router.put(
"/{queue_id}/clear",
operation_id="clear",

View File

@@ -25,6 +25,7 @@ async def parse_dynamicprompts(
prompt: str = Body(description="The prompt to parse with dynamicprompts"),
max_prompts: int = Body(ge=1, le=10000, default=1000, description="The max number of prompts to generate"),
combinatorial: bool = Body(default=True, description="Whether to use the combinatorial generator"),
seed: int | None = Body(None, description="The seed to use for random generation. Only used if not combinatorial"),
) -> DynamicPromptsResponse:
"""Creates a batch process"""
max_prompts = min(max_prompts, 10000)
@@ -35,7 +36,7 @@ async def parse_dynamicprompts(
generator = CombinatorialPromptGenerator()
prompts = generator.generate(prompt, max_prompts=max_prompts)
else:
generator = RandomPromptGenerator()
generator = RandomPromptGenerator(seed=seed)
prompts = generator.generate(prompt, num_images=max_prompts)
except ParseException as e:
prompts = [prompt]

View File

@@ -1,6 +1,10 @@
import io
import traceback
from typing import Optional
from fastapi import APIRouter, Body, HTTPException, Path, Query
from fastapi import APIRouter, Body, File, HTTPException, Path, Query, UploadFile
from fastapi.responses import FileResponse
from PIL import Image
from invokeai.app.api.dependencies import ApiDependencies
from invokeai.app.services.shared.pagination import PaginatedResults
@@ -10,11 +14,14 @@ from invokeai.app.services.workflow_records.workflow_records_common import (
WorkflowCategory,
WorkflowNotFoundError,
WorkflowRecordDTO,
WorkflowRecordListItemDTO,
WorkflowRecordListItemWithThumbnailDTO,
WorkflowRecordOrderBy,
WorkflowRecordWithThumbnailDTO,
WorkflowWithoutID,
)
from invokeai.app.services.workflow_thumbnails.workflow_thumbnails_common import WorkflowThumbnailFileNotFoundException
IMAGE_MAX_AGE = 31536000
workflows_router = APIRouter(prefix="/v1/workflows", tags=["workflows"])
@@ -22,15 +29,17 @@ workflows_router = APIRouter(prefix="/v1/workflows", tags=["workflows"])
"/i/{workflow_id}",
operation_id="get_workflow",
responses={
200: {"model": WorkflowRecordDTO},
200: {"model": WorkflowRecordWithThumbnailDTO},
},
)
async def get_workflow(
workflow_id: str = Path(description="The workflow to get"),
) -> WorkflowRecordDTO:
) -> WorkflowRecordWithThumbnailDTO:
"""Gets a workflow"""
try:
return ApiDependencies.invoker.services.workflow_records.get(workflow_id)
thumbnail_url = ApiDependencies.invoker.services.workflow_thumbnails.get_url(workflow_id)
workflow = ApiDependencies.invoker.services.workflow_records.get(workflow_id)
return WorkflowRecordWithThumbnailDTO(thumbnail_url=thumbnail_url, **workflow.model_dump())
except WorkflowNotFoundError:
raise HTTPException(status_code=404, detail="Workflow not found")
@@ -57,6 +66,11 @@ async def delete_workflow(
workflow_id: str = Path(description="The workflow to delete"),
) -> None:
"""Deletes a workflow"""
try:
ApiDependencies.invoker.services.workflow_thumbnails.delete(workflow_id)
except WorkflowThumbnailFileNotFoundException:
# It's OK if the workflow has no thumbnail file. We can still delete the workflow.
pass
ApiDependencies.invoker.services.workflow_records.delete(workflow_id)
@@ -78,7 +92,7 @@ async def create_workflow(
"/",
operation_id="list_workflows",
responses={
200: {"model": PaginatedResults[WorkflowRecordListItemDTO]},
200: {"model": PaginatedResults[WorkflowRecordListItemWithThumbnailDTO]},
},
)
async def list_workflows(
@@ -88,10 +102,160 @@ async def list_workflows(
default=WorkflowRecordOrderBy.Name, description="The attribute to order by"
),
direction: SQLiteDirection = Query(default=SQLiteDirection.Ascending, description="The direction to order by"),
category: WorkflowCategory = Query(default=WorkflowCategory.User, description="The category of workflow to get"),
categories: Optional[list[WorkflowCategory]] = Query(default=None, description="The categories of workflow to get"),
tags: Optional[list[str]] = Query(default=None, description="The tags of workflow to get"),
query: Optional[str] = Query(default=None, description="The text to query by (matches name and description)"),
) -> PaginatedResults[WorkflowRecordListItemDTO]:
has_been_opened: Optional[bool] = Query(default=None, description="Whether to include/exclude recent workflows"),
is_published: Optional[bool] = Query(default=None, description="Whether to include/exclude published workflows"),
) -> PaginatedResults[WorkflowRecordListItemWithThumbnailDTO]:
"""Gets a page of workflows"""
return ApiDependencies.invoker.services.workflow_records.get_many(
order_by=order_by, direction=direction, page=page, per_page=per_page, query=query, category=category
workflows_with_thumbnails: list[WorkflowRecordListItemWithThumbnailDTO] = []
workflows = ApiDependencies.invoker.services.workflow_records.get_many(
order_by=order_by,
direction=direction,
page=page,
per_page=per_page,
query=query,
categories=categories,
tags=tags,
has_been_opened=has_been_opened,
is_published=is_published,
)
for workflow in workflows.items:
workflows_with_thumbnails.append(
WorkflowRecordListItemWithThumbnailDTO(
thumbnail_url=ApiDependencies.invoker.services.workflow_thumbnails.get_url(workflow.workflow_id),
**workflow.model_dump(),
)
)
return PaginatedResults[WorkflowRecordListItemWithThumbnailDTO](
items=workflows_with_thumbnails,
total=workflows.total,
page=workflows.page,
pages=workflows.pages,
per_page=workflows.per_page,
)
@workflows_router.put(
"/i/{workflow_id}/thumbnail",
operation_id="set_workflow_thumbnail",
responses={
200: {"model": WorkflowRecordDTO},
},
)
async def set_workflow_thumbnail(
workflow_id: str = Path(description="The workflow to update"),
image: UploadFile = File(description="The image file to upload"),
):
"""Sets a workflow's thumbnail image"""
try:
ApiDependencies.invoker.services.workflow_records.get(workflow_id)
except WorkflowNotFoundError:
raise HTTPException(status_code=404, detail="Workflow not found")
if not image.content_type or not image.content_type.startswith("image"):
raise HTTPException(status_code=415, detail="Not an image")
contents = await image.read()
try:
pil_image = Image.open(io.BytesIO(contents))
except Exception:
ApiDependencies.invoker.services.logger.error(traceback.format_exc())
raise HTTPException(status_code=415, detail="Failed to read image")
try:
ApiDependencies.invoker.services.workflow_thumbnails.save(workflow_id, pil_image)
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
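A hedged sketch of uploading a thumbnail for an existing workflow; the /api mount prefix, workflow id, and file are placeholders.
import requests

with open("thumb.png", "rb") as f:
    resp = requests.put(
        "http://localhost:9090/api/v1/workflows/i/<workflow_id>/thumbnail",  # placeholder workflow id
        files={"image": ("thumb.png", f, "image/png")},
    )
print(resp.status_code)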
@workflows_router.delete(
"/i/{workflow_id}/thumbnail",
operation_id="delete_workflow_thumbnail",
responses={
200: {"model": WorkflowRecordDTO},
},
)
async def delete_workflow_thumbnail(
workflow_id: str = Path(description="The workflow to update"),
):
"""Removes a workflow's thumbnail image"""
try:
ApiDependencies.invoker.services.workflow_records.get(workflow_id)
except WorkflowNotFoundError:
raise HTTPException(status_code=404, detail="Workflow not found")
try:
ApiDependencies.invoker.services.workflow_thumbnails.delete(workflow_id)
except ValueError as e:
raise HTTPException(status_code=500, detail=str(e))
@workflows_router.get(
"/i/{workflow_id}/thumbnail",
operation_id="get_workflow_thumbnail",
responses={
200: {
"description": "The workflow thumbnail was fetched successfully",
},
400: {"description": "Bad request"},
404: {"description": "The workflow thumbnail could not be found"},
},
status_code=200,
)
async def get_workflow_thumbnail(
workflow_id: str = Path(description="The id of the workflow thumbnail to get"),
) -> FileResponse:
"""Gets a workflow's thumbnail image"""
try:
path = ApiDependencies.invoker.services.workflow_thumbnails.get_path(workflow_id)
response = FileResponse(
path,
media_type="image/png",
filename=workflow_id + ".png",
content_disposition_type="inline",
)
response.headers["Cache-Control"] = f"max-age={IMAGE_MAX_AGE}"
return response
except Exception:
raise HTTPException(status_code=404)
@workflows_router.get("/counts_by_tag", operation_id="get_counts_by_tag")
async def get_counts_by_tag(
tags: list[str] = Query(description="The tags to get counts for"),
categories: Optional[list[WorkflowCategory]] = Query(default=None, description="The categories to include"),
has_been_opened: Optional[bool] = Query(default=None, description="Whether to include/exclude recent workflows"),
) -> dict[str, int]:
"""Counts workflows by tag"""
return ApiDependencies.invoker.services.workflow_records.counts_by_tag(
tags=tags, categories=categories, has_been_opened=has_been_opened
)
@workflows_router.get("/counts_by_category", operation_id="counts_by_category")
async def counts_by_category(
categories: list[WorkflowCategory] = Query(description="The categories to include"),
has_been_opened: Optional[bool] = Query(default=None, description="Whether to include/exclude recent workflows"),
) -> dict[str, int]:
"""Counts workflows by category"""
return ApiDependencies.invoker.services.workflow_records.counts_by_category(
categories=categories, has_been_opened=has_been_opened
)
@workflows_router.put(
"/i/{workflow_id}/opened_at",
operation_id="update_opened_at",
)
async def update_opened_at(
workflow_id: str = Path(description="The workflow to update"),
) -> None:
"""Updates the opened_at field of a workflow"""
ApiDependencies.invoker.services.workflow_records.update_opened_at(workflow_id)

View File

@@ -1,12 +1,8 @@
import asyncio
import logging
import mimetypes
import socket
from contextlib import asynccontextmanager
from pathlib import Path
import torch
import uvicorn
from fastapi import FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.gzip import GZipMiddleware
@@ -15,11 +11,7 @@ from fastapi.responses import HTMLResponse, RedirectResponse
from fastapi_events.handlers.local import local_handler
from fastapi_events.middleware import EventHandlerASGIMiddleware
from starlette.middleware.base import BaseHTTPMiddleware, RequestResponseEndpoint
from torch.backends.mps import is_available as is_mps_available
# for PyCharm:
# noinspection PyUnresolvedReferences
import invokeai.backend.util.hotfixes # noqa: F401 (monkeypatching on import)
import invokeai.frontend.web as web_dir
from invokeai.app.api.dependencies import ApiDependencies
from invokeai.app.api.no_cache_staticfiles import NoCacheStaticFiles
@@ -38,24 +30,10 @@ from invokeai.app.api.routers import (
from invokeai.app.api.sockets import SocketIO
from invokeai.app.services.config.config_default import get_config
from invokeai.app.util.custom_openapi import get_openapi_func
from invokeai.backend.util.devices import TorchDevice
from invokeai.backend.util.logging import InvokeAILogger
app_config = get_config()
if is_mps_available():
import invokeai.backend.util.mps_fixes # noqa: F401 (monkeypatching on import)
logger = InvokeAILogger.get_logger(config=app_config)
# fix for windows mimetypes registry entries being borked
# see https://github.com/invoke-ai/InvokeAI/discussions/3684#discussioncomment-6391352
mimetypes.add_type("application/javascript", ".js")
mimetypes.add_type("text/css", ".css")
torch_device_name = TorchDevice.get_torch_device_name()
logger.info(f"Using torch device: {torch_device_name}")
loop = asyncio.new_event_loop()
@@ -64,6 +42,23 @@ loop = asyncio.new_event_loop()
async def lifespan(app: FastAPI):
# Add startup event to load dependencies
ApiDependencies.initialize(config=app_config, event_handler_id=event_handler_id, loop=loop, logger=logger)
# Log the server address when it starts - in case the network log level is not high enough to see the startup log
proto = "https" if app_config.ssl_certfile else "http"
msg = f"Invoke running on {proto}://{app_config.host}:{app_config.port} (Press CTRL+C to quit)"
# Logging this way ignores the logger's log level and _always_ logs the message
record = logger.makeRecord(
name=logger.name,
level=logging.INFO,
fn="",
lno=0,
msg=msg,
args=(),
exc_info=None,
)
logger.handle(record)
yield
# Shut down threads
ApiDependencies.shutdown()
@@ -165,73 +160,3 @@ except RuntimeError:
app.mount(
"/static", NoCacheStaticFiles(directory=Path(web_root_path, "static/")), name="static"
) # docs favicon is in here
def check_cudnn(logger: logging.Logger) -> None:
"""Check for cuDNN issues that could be causing degraded performance."""
if torch.backends.cudnn.is_available():
try:
# Note: At the time of writing (torch 2.2.1), torch.backends.cudnn.version() only raises an error the first
# time it is called. Subsequent calls will return the version number without complaining about a mismatch.
cudnn_version = torch.backends.cudnn.version()
logger.info(f"cuDNN version: {cudnn_version}")
except RuntimeError as e:
logger.warning(
"Encountered a cuDNN version issue. This may result in degraded performance. This issue is usually "
"caused by an incompatible cuDNN version installed in your python environment, or on the host "
f"system. Full error message:\n{e}"
)
def invoke_api() -> None:
def find_port(port: int) -> int:
"""Find a port not in use starting at given port"""
# Taken from https://waylonwalker.com/python-find-available-port/, thanks Waylon!
# https://github.com/WaylonWalker
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.settimeout(1)
if s.connect_ex(("localhost", port)) == 0:
return find_port(port=port + 1)
else:
return port
if app_config.dev_reload:
try:
import jurigged
except ImportError as e:
logger.error(
'Can\'t start `--dev_reload` because jurigged is not found; `pip install -e ".[dev]"` to include development dependencies.',
exc_info=e,
)
else:
jurigged.watch(logger=InvokeAILogger.get_logger(name="jurigged").info)
port = find_port(app_config.port)
if port != app_config.port:
logger.warn(f"Port {app_config.port} in use, using port {port}")
check_cudnn(logger)
config = uvicorn.Config(
app=app,
host=app_config.host,
port=port,
loop="asyncio",
log_level=app_config.log_level,
ssl_certfile=app_config.ssl_certfile,
ssl_keyfile=app_config.ssl_keyfile,
)
server = uvicorn.Server(config)
# replace uvicorn's loggers with InvokeAI's for consistent appearance
for logname in ["uvicorn.access", "uvicorn"]:
log = InvokeAILogger.get_logger(logname)
log.handlers.clear()
for ch in logger.handlers:
log.addHandler(ch)
loop.run_until_complete(server.serve())
if __name__ == "__main__":
invoke_api()

View File

@@ -1,28 +1,5 @@
import shutil
import sys
from importlib.util import module_from_spec, spec_from_file_location
from pathlib import Path
from invokeai.app.services.config.config_default import get_config
custom_nodes_path = Path(get_config().custom_nodes_path)
custom_nodes_path.mkdir(parents=True, exist_ok=True)
custom_nodes_init_path = str(custom_nodes_path / "__init__.py")
custom_nodes_readme_path = str(custom_nodes_path / "README.md")
# copy our custom nodes __init__.py to the custom nodes directory
shutil.copy(Path(__file__).parent / "custom_nodes/init.py", custom_nodes_init_path)
shutil.copy(Path(__file__).parent / "custom_nodes/README.md", custom_nodes_readme_path)
# Import custom nodes, see https://docs.python.org/3/library/importlib.html#importing-programmatically
spec = spec_from_file_location("custom_nodes", custom_nodes_init_path)
if spec is None or spec.loader is None:
raise RuntimeError(f"Could not load custom nodes from {custom_nodes_init_path}")
module = module_from_spec(spec)
sys.modules[spec.name] = module
spec.loader.exec_module(module)
# add core nodes to __all__
python_files = filter(lambda f: not f.name.startswith("_"), Path(__file__).parent.glob("*.py"))
__all__ = [f.stem for f in python_files] # type: ignore
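For reference, a minimal sketch of the programmatic-import pattern used above, with a throwaway module written to a temporary directory (paths and names are illustrative):

import sys
import tempfile
from importlib.util import module_from_spec, spec_from_file_location
from pathlib import Path

# Write a throwaway module to import programmatically.
mod_path = Path(tempfile.mkdtemp()) / "example_nodes.py"
mod_path.write_text("VALUE = 42\n")

spec = spec_from_file_location("example_nodes", str(mod_path))
if spec is None or spec.loader is None:
    raise RuntimeError(f"Could not load module from {mod_path}")
module = module_from_spec(spec)
sys.modules[spec.name] = module
spec.loader.exec_module(module)

print(module.VALUE)  # 42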


@@ -8,6 +8,7 @@ import sys
import warnings
from abc import ABC, abstractmethod
from enum import Enum
from functools import lru_cache
from inspect import signature
from typing import (
TYPE_CHECKING,
@@ -24,10 +25,9 @@ from typing import (
)
import semver
from pydantic import BaseModel, ConfigDict, Field, TypeAdapter, create_model
from pydantic import BaseModel, ConfigDict, Field, JsonValue, TypeAdapter, create_model
from pydantic.fields import FieldInfo
from pydantic_core import PydanticUndefined
from typing_extensions import TypeAliasType
from invokeai.app.invocations.fields import (
FieldKind,
@@ -44,8 +44,6 @@ if TYPE_CHECKING:
logger = InvokeAILogger.get_logger()
CUSTOM_NODE_PACK_SUFFIX = "__invokeai-custom-node"
class InvalidVersionError(ValueError):
pass
@@ -80,7 +78,7 @@ class UIConfigBase(BaseModel):
This is used internally by the @invocation decorator logic. Do not use this directly.
"""
tags: Optional[list[str]] = Field(default_factory=None, description="The node's tags")
tags: Optional[list[str]] = Field(default=None, description="The node's tags")
title: Optional[str] = Field(default=None, description="The node's display name")
category: Optional[str] = Field(default=None, description="The node's category")
version: str = Field(
@@ -102,36 +100,11 @@ class BaseInvocationOutput(BaseModel):
All invocation outputs must use the `@invocation_output` decorator to provide their unique type.
"""
_output_classes: ClassVar[set[BaseInvocationOutput]] = set()
_typeadapter: ClassVar[Optional[TypeAdapter[Any]]] = None
_typeadapter_needs_update: ClassVar[bool] = False
@classmethod
def register_output(cls, output: BaseInvocationOutput) -> None:
"""Registers an invocation output."""
cls._output_classes.add(output)
cls._typeadapter_needs_update = True
@classmethod
def get_outputs(cls) -> Iterable[BaseInvocationOutput]:
"""Gets all invocation outputs."""
return cls._output_classes
@classmethod
def get_typeadapter(cls) -> TypeAdapter[Any]:
"""Gets a pydantc TypeAdapter for the union of all invocation output types."""
if not cls._typeadapter or cls._typeadapter_needs_update:
AnyInvocationOutput = TypeAliasType(
"AnyInvocationOutput", Annotated[Union[tuple(cls._output_classes)], Field(discriminator="type")]
)
cls._typeadapter = TypeAdapter(AnyInvocationOutput)
cls._typeadapter_needs_update = False
return cls._typeadapter
@classmethod
def get_output_types(cls) -> Iterable[str]:
"""Gets all invocation output types."""
return (i.get_type() for i in BaseInvocationOutput.get_outputs())
output_meta: Optional[dict[str, JsonValue]] = Field(
default=None,
description="Optional dictionary of metadata for the invocation output, unrelated to the invocation's actual output value. This is not exposed as an output field.",
json_schema_extra={"field_kind": FieldKind.NodeAttribute},
)
@staticmethod
def json_schema_extra(schema: dict[str, Any], model_class: Type[BaseInvocationOutput]) -> None:
@@ -175,66 +148,11 @@ class BaseInvocation(ABC, BaseModel):
All invocations must use the `@invocation` decorator to provide their unique type.
"""
_invocation_classes: ClassVar[set[BaseInvocation]] = set()
_typeadapter: ClassVar[Optional[TypeAdapter[Any]]] = None
_typeadapter_needs_update: ClassVar[bool] = False
@classmethod
def get_type(cls) -> str:
"""Gets the invocation's type, as provided by the `@invocation` decorator."""
return cls.model_fields["type"].default
@classmethod
def register_invocation(cls, invocation: BaseInvocation) -> None:
"""Registers an invocation."""
cls._invocation_classes.add(invocation)
cls._typeadapter_needs_update = True
@classmethod
def get_typeadapter(cls) -> TypeAdapter[Any]:
"""Gets a pydantc TypeAdapter for the union of all invocation types."""
if not cls._typeadapter or cls._typeadapter_needs_update:
AnyInvocation = TypeAliasType(
"AnyInvocation", Annotated[Union[tuple(cls.get_invocations())], Field(discriminator="type")]
)
cls._typeadapter = TypeAdapter(AnyInvocation)
cls._typeadapter_needs_update = False
return cls._typeadapter
@classmethod
def invalidate_typeadapter(cls) -> None:
"""Invalidates the typeadapter, forcing it to be rebuilt on next access. If the invocation allowlist or
denylist is changed, this should be called to ensure the typeadapter is updated and validation respects
the updated allowlist and denylist."""
cls._typeadapter_needs_update = True
@classmethod
def get_invocations(cls) -> Iterable[BaseInvocation]:
"""Gets all invocations, respecting the allowlist and denylist."""
app_config = get_config()
allowed_invocations: set[BaseInvocation] = set()
for sc in cls._invocation_classes:
invocation_type = sc.get_type()
is_in_allowlist = (
invocation_type in app_config.allow_nodes if isinstance(app_config.allow_nodes, list) else True
)
is_in_denylist = (
invocation_type in app_config.deny_nodes if isinstance(app_config.deny_nodes, list) else False
)
if is_in_allowlist and not is_in_denylist:
allowed_invocations.add(sc)
return allowed_invocations
@classmethod
def get_invocations_map(cls) -> dict[str, BaseInvocation]:
"""Gets a map of all invocation types to their invocation classes."""
return {i.get_type(): i for i in BaseInvocation.get_invocations()}
@classmethod
def get_invocation_types(cls) -> Iterable[str]:
"""Gets all invocation types."""
return (i.get_type() for i in BaseInvocation.get_invocations())
@classmethod
def get_output_annotation(cls) -> BaseInvocationOutput:
"""Gets the invocation's output annotation (i.e. the return annotation of its `invoke()` method)."""
@@ -337,6 +255,144 @@ class BaseInvocation(ABC, BaseModel):
TBaseInvocation = TypeVar("TBaseInvocation", bound=BaseInvocation)
class InvocationRegistry:
_invocation_classes: ClassVar[set[type[BaseInvocation]]] = set()
_output_classes: ClassVar[set[type[BaseInvocationOutput]]] = set()
@classmethod
def register_invocation(cls, invocation: type[BaseInvocation]) -> None:
"""Registers an invocation."""
invocation_type = invocation.get_type()
node_pack = invocation.UIConfig.node_pack
# Log a warning when an existing invocation is being clobbered by the one we are registering
clobbered_invocation = InvocationRegistry.get_invocation_for_type(invocation_type)
if clobbered_invocation is not None:
# This should always be true - we just checked if the invocation type was in the set
clobbered_node_pack = clobbered_invocation.UIConfig.node_pack
if clobbered_node_pack == "invokeai":
# The invocation being clobbered is a core invocation
logger.warning(f'Overriding core node "{invocation_type}" with node from "{node_pack}"')
else:
# The invocation being clobbered is a custom invocation
logger.warning(
f'Overriding node "{invocation_type}" from "{clobbered_node_pack}" with node from "{node_pack}"'
)
cls._invocation_classes.remove(clobbered_invocation)
cls._invocation_classes.add(invocation)
cls.invalidate_invocation_typeadapter()
@classmethod
@lru_cache(maxsize=1)
def get_invocation_typeadapter(cls) -> TypeAdapter[Any]:
"""Gets a pydantic TypeAdapter for the union of all invocation types.
This is used to parse serialized invocations into the correct invocation class.
This method is cached to avoid rebuilding the TypeAdapter on every access. If the invocation allowlist or
denylist is changed, the cache should be cleared to ensure the TypeAdapter is updated and validation respects
the updated allowlist and denylist.
@see https://docs.pydantic.dev/latest/concepts/type_adapter/
"""
return TypeAdapter(Annotated[Union[tuple(cls.get_invocation_classes())], Field(discriminator="type")])
@classmethod
def invalidate_invocation_typeadapter(cls) -> None:
"""Invalidates the cached invocation type adapter."""
cls.get_invocation_typeadapter.cache_clear()
@classmethod
def get_invocation_classes(cls) -> Iterable[type[BaseInvocation]]:
"""Gets all invocations, respecting the allowlist and denylist."""
app_config = get_config()
allowed_invocations: set[type[BaseInvocation]] = set()
for sc in cls._invocation_classes:
invocation_type = sc.get_type()
is_in_allowlist = (
invocation_type in app_config.allow_nodes if isinstance(app_config.allow_nodes, list) else True
)
is_in_denylist = (
invocation_type in app_config.deny_nodes if isinstance(app_config.deny_nodes, list) else False
)
if is_in_allowlist and not is_in_denylist:
allowed_invocations.add(sc)
return allowed_invocations
@classmethod
def get_invocations_map(cls) -> dict[str, type[BaseInvocation]]:
"""Gets a map of all invocation types to their invocation classes."""
return {i.get_type(): i for i in cls.get_invocation_classes()}
@classmethod
def get_invocation_types(cls) -> Iterable[str]:
"""Gets all invocation types."""
return (i.get_type() for i in cls.get_invocation_classes())
@classmethod
def get_invocation_for_type(cls, invocation_type: str) -> type[BaseInvocation] | None:
"""Gets the invocation class for a given invocation type."""
return cls.get_invocations_map().get(invocation_type)
@classmethod
def register_output(cls, output: "type[TBaseInvocationOutput]") -> None:
"""Registers an invocation output."""
output_type = output.get_type()
# Log a warning when an existing invocation is being clobbered by the one we are registering
clobbered_output = InvocationRegistry.get_output_for_type(output_type)
if clobbered_output is not None:
# TODO(psyche): We do not record the node pack of the output, so we cannot log it here
logger.warning(f'Overriding invocation output "{output_type}"')
cls._output_classes.remove(clobbered_output)
cls._output_classes.add(output)
cls.invalidate_output_typeadapter()
@classmethod
def get_output_classes(cls) -> Iterable[type[BaseInvocationOutput]]:
"""Gets all invocation outputs."""
return cls._output_classes
@classmethod
def get_outputs_map(cls) -> dict[str, type[BaseInvocationOutput]]:
"""Gets a map of all output types to their output classes."""
return {i.get_type(): i for i in cls.get_output_classes()}
@classmethod
@lru_cache(maxsize=1)
def get_output_typeadapter(cls) -> TypeAdapter[Any]:
"""Gets a pydantic TypeAdapter for the union of all invocation output types.
This is used to parse serialized invocation outputs into the correct invocation output class.
This method is cached to avoid rebuilding the TypeAdapter on every access. If the invocation allowlist or
denylist is changed, the cache should be cleared to ensure the TypeAdapter is updated and validation respects
the updated allowlist and denylist.
@see https://docs.pydantic.dev/latest/concepts/type_adapter/
"""
return TypeAdapter(Annotated[Union[tuple(cls._output_classes)], Field(discriminator="type")])
@classmethod
def invalidate_output_typeadapter(cls) -> None:
"""Invalidates the cached invocation output type adapter."""
cls.get_output_typeadapter.cache_clear()
@classmethod
def get_output_types(cls) -> Iterable[str]:
"""Gets all invocation output types."""
return (i.get_type() for i in cls.get_output_classes())
@classmethod
def get_output_for_type(cls, output_type: str) -> type[BaseInvocationOutput] | None:
"""Gets the output class for a given output type."""
return cls.get_outputs_map().get(output_type)
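For reference, a self-contained sketch of the discriminated-union TypeAdapter pattern the registry relies on, using toy models rather than real invocation classes:

from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter

class AddInvocation(BaseModel):
    type: Literal["add"] = "add"
    a: int = 0
    b: int = 0

class MultiplyInvocation(BaseModel):
    type: Literal["multiply"] = "multiply"
    a: int = 1
    b: int = 1

# The "type" field is the discriminator, so a serialized dict is routed straight
# to the matching class instead of being validated against every union member.
adapter = TypeAdapter(Annotated[Union[AddInvocation, MultiplyInvocation], Field(discriminator="type")])

node = adapter.validate_python({"type": "multiply", "a": 3, "b": 4})
print(type(node).__name__)  # MultiplyInvocation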
RESERVED_NODE_ATTRIBUTE_FIELD_NAMES = {
"id",
"is_intermediate",
@@ -347,7 +403,7 @@ RESERVED_NODE_ATTRIBUTE_FIELD_NAMES = {
RESERVED_INPUT_FIELD_NAMES = {"metadata", "board"}
RESERVED_OUTPUT_FIELD_NAMES = {"type"}
RESERVED_OUTPUT_FIELD_NAMES = {"type", "output_meta"}
class _Model(BaseModel):
@@ -414,7 +470,7 @@ def validate_fields(model_fields: dict[str, FieldInfo], model_type: str) -> None
ui_type = field.json_schema_extra.get("ui_type", None)
if isinstance(ui_type, str) and ui_type.startswith("DEPRECATED_"):
logger.warn(f"\"UIType.{ui_type.split('_')[-1]}\" is deprecated, ignoring")
logger.warn(f'"UIType.{ui_type.split("_")[-1]}" is deprecated, ignoring')
field.json_schema_extra.pop("ui_type")
return None
@@ -446,8 +502,8 @@ def invocation(
if re.compile(r"^\S+$").match(invocation_type) is None:
raise ValueError(f'"invocation_type" must consist of non-whitespace characters, got "{invocation_type}"')
if invocation_type in BaseInvocation.get_invocation_types():
raise ValueError(f'Invocation type "{invocation_type}" already exists')
# The node pack is the module name - will be "invokeai" for built-in nodes
node_pack = cls.__module__.split(".")[0]
validate_fields(cls.model_fields, invocation_type)
@@ -457,8 +513,7 @@ def invocation(
uiconfig["tags"] = tags
uiconfig["category"] = category
uiconfig["classification"] = classification
# The node pack is the module name - will be "invokeai" for built-in nodes
uiconfig["node_pack"] = cls.__module__.split(".")[0]
uiconfig["node_pack"] = node_pack
if version is not None:
try:
@@ -518,8 +573,7 @@ def invocation(
)
cls.__doc__ = docstring
# TODO: how to type this correctly? it's typed as ModelMetaclass, a private class in pydantic
BaseInvocation.register_invocation(cls) # type: ignore
InvocationRegistry.register_invocation(cls)
return cls
@@ -544,13 +598,9 @@ def invocation_output(
if re.compile(r"^\S+$").match(output_type) is None:
raise ValueError(f'"output_type" must consist of non-whitespace characters, got "{output_type}"')
if output_type in BaseInvocationOutput.get_output_types():
raise ValueError(f'Invocation type "{output_type}" already exists')
validate_fields(cls.model_fields, output_type)
# Add the output type to the model.
output_type_annotation = Literal[output_type] # type: ignore
output_type_field = Field(
title="type", default=output_type, json_schema_extra={"field_kind": FieldKind.NodeAttribute}
@@ -565,7 +615,7 @@ def invocation_output(
)
cls.__doc__ = docstring
BaseInvocationOutput.register_output(cls) # type: ignore # TODO: how to type this correctly?
InvocationRegistry.register_output(cls)
return cls
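For reference, a minimal sketch of how a node could be declared against the decorators above; this assumes an installed InvokeAI environment, and the class and field names are illustrative:

from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import InputField
from invokeai.app.invocations.primitives import StringOutput
from invokeai.app.services.shared.invocation_context import InvocationContext

@invocation("example_uppercase", title="Uppercase", tags=["example", "string"], category="string", version="1.0.0")
class UppercaseInvocation(BaseInvocation):
    """Upper-cases the input string."""

    text: str = InputField(default="", description="The text to upper-case")

    def invoke(self, context: InvocationContext) -> StringOutput:
        return StringOutput(value=self.text.upper())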


@@ -0,0 +1,274 @@
from typing import Literal
from pydantic import BaseModel
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import (
ImageField,
Input,
InputField,
OutputField,
)
from invokeai.app.invocations.primitives import (
FloatOutput,
ImageOutput,
IntegerOutput,
StringOutput,
)
from invokeai.app.services.shared.invocation_context import InvocationContext
BATCH_GROUP_IDS = Literal[
"None",
"Group 1",
"Group 2",
"Group 3",
"Group 4",
"Group 5",
]
class NotExecutableNodeError(Exception):
def __init__(self, message: str = "This class should never be executed or instantiated directly."):
super().__init__(message)
pass
class BaseBatchInvocation(BaseInvocation):
batch_group_id: BATCH_GROUP_IDS = InputField(
default="None",
description="The ID of this batch node's group. If provided, all batch nodes in with the same ID will be 'zipped' before execution, and all nodes' collections must be of the same size.",
input=Input.Direct,
title="Batch Group",
)
def __init__(self):
raise NotExecutableNodeError()
@invocation(
"image_batch",
title="Image Batch",
tags=["primitives", "image", "batch", "special"],
category="primitives",
version="1.0.0",
classification=Classification.Special,
)
class ImageBatchInvocation(BaseBatchInvocation):
"""Create a batched generation, where the workflow is executed once for each image in the batch."""
images: list[ImageField] = InputField(
default=[],
min_length=1,
description="The images to batch over",
)
def invoke(self, context: InvocationContext) -> ImageOutput:
raise NotExecutableNodeError()
@invocation_output("image_generator_output")
class ImageGeneratorOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of boards"""
images: list[ImageField] = OutputField(description="The generated images")
class ImageGeneratorField(BaseModel):
pass
@invocation(
"image_generator",
title="Image Generator",
tags=["primitives", "board", "image", "batch", "special"],
category="primitives",
version="1.0.0",
classification=Classification.Special,
)
class ImageGenerator(BaseInvocation):
"""Generated a collection of images for use in a batched generation"""
generator: ImageGeneratorField = InputField(
description="The image generator.",
input=Input.Direct,
title="Generator Type",
)
def __init__(self):
raise NotExecutableNodeError()
def invoke(self, context: InvocationContext) -> ImageGeneratorOutput:
raise NotExecutableNodeError()
@invocation(
"string_batch",
title="String Batch",
tags=["primitives", "string", "batch", "special"],
category="primitives",
version="1.0.0",
classification=Classification.Special,
)
class StringBatchInvocation(BaseBatchInvocation):
"""Create a batched generation, where the workflow is executed once for each string in the batch."""
strings: list[str] = InputField(
default=[],
min_length=1,
description="The strings to batch over",
)
def invoke(self, context: InvocationContext) -> StringOutput:
raise NotExecutableNodeError()
@invocation_output("string_generator_output")
class StringGeneratorOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of strings"""
strings: list[str] = OutputField(description="The generated strings")
class StringGeneratorField(BaseModel):
pass
@invocation(
"string_generator",
title="String Generator",
tags=["primitives", "string", "number", "batch", "special"],
category="primitives",
version="1.0.0",
classification=Classification.Special,
)
class StringGenerator(BaseInvocation):
"""Generated a range of strings for use in a batched generation"""
generator: StringGeneratorField = InputField(
description="The string generator.",
input=Input.Direct,
title="Generator Type",
)
def __init__(self):
raise NotExecutableNodeError()
def invoke(self, context: InvocationContext) -> StringGeneratorOutput:
raise NotExecutableNodeError()
@invocation(
"integer_batch",
title="Integer Batch",
tags=["primitives", "integer", "number", "batch", "special"],
category="primitives",
version="1.0.0",
classification=Classification.Special,
)
class IntegerBatchInvocation(BaseBatchInvocation):
"""Create a batched generation, where the workflow is executed once for each integer in the batch."""
integers: list[int] = InputField(
default=[],
min_length=1,
description="The integers to batch over",
)
def invoke(self, context: InvocationContext) -> IntegerOutput:
raise NotExecutableNodeError()
@invocation_output("integer_generator_output")
class IntegerGeneratorOutput(BaseInvocationOutput):
integers: list[int] = OutputField(description="The generated integers")
class IntegerGeneratorField(BaseModel):
pass
@invocation(
"integer_generator",
title="Integer Generator",
tags=["primitives", "int", "number", "batch", "special"],
category="primitives",
version="1.0.0",
classification=Classification.Special,
)
class IntegerGenerator(BaseInvocation):
"""Generated a range of integers for use in a batched generation"""
generator: IntegerGeneratorField = InputField(
description="The integer generator.",
input=Input.Direct,
title="Generator Type",
)
def __init__(self):
raise NotExecutableNodeError()
def invoke(self, context: InvocationContext) -> IntegerGeneratorOutput:
raise NotExecutableNodeError()
@invocation(
"float_batch",
title="Float Batch",
tags=["primitives", "float", "number", "batch", "special"],
category="primitives",
version="1.0.0",
classification=Classification.Special,
)
class FloatBatchInvocation(BaseBatchInvocation):
"""Create a batched generation, where the workflow is executed once for each float in the batch."""
floats: list[float] = InputField(
default=[],
min_length=1,
description="The floats to batch over",
)
def invoke(self, context: InvocationContext) -> FloatOutput:
raise NotExecutableNodeError()
@invocation_output("float_generator_output")
class FloatGeneratorOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of floats"""
floats: list[float] = OutputField(description="The generated floats")
class FloatGeneratorField(BaseModel):
pass
@invocation(
"float_generator",
title="Float Generator",
tags=["primitives", "float", "number", "batch", "special"],
category="primitives",
version="1.0.0",
classification=Classification.Special,
)
class FloatGenerator(BaseInvocation):
"""Generated a range of floats for use in a batched generation"""
generator: FloatGeneratorField = InputField(
description="The float generator.",
input=Input.Direct,
title="Generator Type",
)
def __init__(self):
raise NotExecutableNodeError()
def invoke(self, context: InvocationContext) -> FloatGeneratorOutput:
raise NotExecutableNodeError()
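As a toy illustration of the 'zipped' batch-group semantics described by batch_group_id above (plain Python, not InvokeAI code): collections in the same group are paired element-wise, which is why they must all be the same size.

prompts = ["a cat", "a dog", "a fox"]
seeds = [101, 102, 103]

# strict=True (Python 3.10+) raises if the zipped collections differ in length.
for prompt, seed in zip(prompts, seeds, strict=True):
    print(f"queue one graph with prompt={prompt!r} and seed={seed}")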


@@ -1,98 +1,120 @@
from typing import Any, Union
from typing import Optional, Union
import numpy as np
import numpy.typing as npt
import torch
import torchvision.transforms as T
from PIL import Image
from torchvision.transforms.functional import resize as tv_resize
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, LatentsField
from invokeai.app.invocations.fields import FieldDescriptions, ImageField, Input, InputField, LatentsField
from invokeai.app.invocations.primitives import LatentsOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.stable_diffusion.diffusers_pipeline import image_resized_to_grid_as_tensor
from invokeai.backend.util.devices import TorchDevice
def slerp(
t: Union[float, np.ndarray],
v0: Union[torch.Tensor, np.ndarray],
v1: Union[torch.Tensor, np.ndarray],
device: torch.device,
DOT_THRESHOLD: float = 0.9995,
):
"""
Spherical linear interpolation
Args:
t (float/np.ndarray): Float value between 0.0 and 1.0
v0 (np.ndarray): Starting vector
v1 (np.ndarray): Final vector
DOT_THRESHOLD (float): Threshold for considering the two vectors as
collinear. Not recommended to alter this.
Returns:
v2 (np.ndarray): Interpolation vector between v0 and v1
"""
inputs_are_torch = False
if not isinstance(v0, np.ndarray):
inputs_are_torch = True
v0 = v0.detach().cpu().numpy()
if not isinstance(v1, np.ndarray):
inputs_are_torch = True
v1 = v1.detach().cpu().numpy()
dot = np.sum(v0 * v1 / (np.linalg.norm(v0) * np.linalg.norm(v1)))
if np.abs(dot) > DOT_THRESHOLD:
v2 = (1 - t) * v0 + t * v1
else:
theta_0 = np.arccos(dot)
sin_theta_0 = np.sin(theta_0)
theta_t = theta_0 * t
sin_theta_t = np.sin(theta_t)
s0 = np.sin(theta_0 - theta_t) / sin_theta_0
s1 = sin_theta_t / sin_theta_0
v2 = s0 * v0 + s1 * v1
if inputs_are_torch:
v2 = torch.from_numpy(v2).to(device)
return v2
@invocation(
"lblend",
title="Blend Latents",
tags=["latents", "blend"],
tags=["latents", "blend", "mask"],
category="latents",
version="1.0.3",
version="1.1.0",
)
class BlendLatentsInvocation(BaseInvocation):
"""Blend two latents using a given alpha. Latents must have same size."""
"""Blend two latents using a given alpha. If a mask is provided, the second latents will be masked before blending.
Latents must have same size. Masking functionality added by @dwringer."""
latents_a: LatentsField = InputField(
description=FieldDescriptions.latents,
input=Input.Connection,
)
latents_b: LatentsField = InputField(
description=FieldDescriptions.latents,
input=Input.Connection,
)
alpha: float = InputField(default=0.5, description=FieldDescriptions.blend_alpha)
latents_a: LatentsField = InputField(description=FieldDescriptions.latents, input=Input.Connection)
latents_b: LatentsField = InputField(description=FieldDescriptions.latents, input=Input.Connection)
mask: Optional[ImageField] = InputField(default=None, description="Mask for blending in latents B")
alpha: float = InputField(ge=0, default=0.5, description=FieldDescriptions.blend_alpha)
def prep_mask_tensor(self, mask_image: Image.Image) -> torch.Tensor:
if mask_image.mode != "L":
mask_image = mask_image.convert("L")
mask_tensor = image_resized_to_grid_as_tensor(mask_image, normalize=False)
if mask_tensor.dim() == 3:
mask_tensor = mask_tensor.unsqueeze(0)
return mask_tensor
def replace_tensor_from_masked_tensor(
self, tensor: torch.Tensor, other_tensor: torch.Tensor, mask_tensor: torch.Tensor
):
output = tensor.clone()
mask_tensor = mask_tensor.expand(output.shape)
if output.dtype != torch.float16:
output = torch.add(output, mask_tensor * torch.sub(other_tensor, tensor))
else:
output = torch.add(output, mask_tensor.half() * torch.sub(other_tensor, tensor))
return output
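A small numeric illustration of the masked replacement performed by replace_tensor_from_masked_tensor above; where the mask is 1.0 the other tensor's values are taken, where it is 0.0 the original values are kept:

import torch

tensor = torch.zeros(1, 1, 2, 2)
other_tensor = torch.ones(1, 1, 2, 2)
mask = torch.tensor([[0.0, 1.0],
                     [0.0, 1.0]])

output = tensor + mask.expand(tensor.shape) * (other_tensor - tensor)
print(output)  # left column stays 0.0, right column becomes 1.0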
def invoke(self, context: InvocationContext) -> LatentsOutput:
latents_a = context.tensors.load(self.latents_a.latents_name)
latents_b = context.tensors.load(self.latents_b.latents_name)
if self.mask is None:
mask_tensor = torch.zeros(latents_a.shape[-2:])
else:
mask_tensor = self.prep_mask_tensor(context.images.get_pil(self.mask.image_name))
mask_tensor = tv_resize(mask_tensor, latents_a.shape[-2:], T.InterpolationMode.BILINEAR, antialias=False)
latents_b = self.replace_tensor_from_masked_tensor(latents_b, latents_a, mask_tensor)
if latents_a.shape != latents_b.shape:
raise Exception("Latents to blend must be the same size.")
raise ValueError("Latents to blend must be the same size.")
device = TorchDevice.choose_torch_device()
def slerp(
t: Union[float, npt.NDArray[Any]], # FIXME: maybe use np.float32 here?
v0: Union[torch.Tensor, npt.NDArray[Any]],
v1: Union[torch.Tensor, npt.NDArray[Any]],
DOT_THRESHOLD: float = 0.9995,
) -> Union[torch.Tensor, npt.NDArray[Any]]:
"""
Spherical linear interpolation
Args:
t (float/np.ndarray): Float value between 0.0 and 1.0
v0 (np.ndarray): Starting vector
v1 (np.ndarray): Final vector
DOT_THRESHOLD (float): Threshold for considering the two vectors as
collinear. Not recommended to alter this.
Returns:
v2 (np.ndarray): Interpolation vector between v0 and v1
"""
inputs_are_torch = False
if not isinstance(v0, np.ndarray):
inputs_are_torch = True
v0 = v0.detach().cpu().numpy()
if not isinstance(v1, np.ndarray):
inputs_are_torch = True
v1 = v1.detach().cpu().numpy()
dot = np.sum(v0 * v1 / (np.linalg.norm(v0) * np.linalg.norm(v1)))
if np.abs(dot) > DOT_THRESHOLD:
v2 = (1 - t) * v0 + t * v1
else:
theta_0 = np.arccos(dot)
sin_theta_0 = np.sin(theta_0)
theta_t = theta_0 * t
sin_theta_t = np.sin(theta_t)
s0 = np.sin(theta_0 - theta_t) / sin_theta_0
s1 = sin_theta_t / sin_theta_0
v2 = s0 * v0 + s1 * v1
if inputs_are_torch:
v2_torch: torch.Tensor = torch.from_numpy(v2).to(device)
return v2_torch
else:
assert isinstance(v2, np.ndarray)
return v2
# blend
bl = slerp(self.alpha, latents_a, latents_b)
assert isinstance(bl, torch.Tensor)
blended_latents: torch.Tensor = bl # for type checking convenience
blended_latents = slerp(self.alpha, latents_a, latents_b, device)
# https://discuss.huggingface.co/t/memory-usage-by-later-pipeline-stages/23699
blended_latents = blended_latents.to("cpu")
TorchDevice.empty_cache()
torch.cuda.empty_cache()
name = context.tensors.save(tensor=blended_latents)
return LatentsOutput.build(latents_name=name, latents=blended_latents, seed=self.latents_a.seed)
return LatentsOutput.build(latents_name=name, latents=blended_latents)
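For reference, a numpy-only sketch of the spherical interpolation used above, with a worked value (not a drop-in replacement for the node's implementation):

import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, dot_threshold: float = 0.9995) -> np.ndarray:
    dot = np.sum(v0 * v1 / (np.linalg.norm(v0) * np.linalg.norm(v1)))
    if np.abs(dot) > dot_threshold:
        return (1 - t) * v0 + t * v1  # nearly collinear: plain lerp is fine
    theta_0 = np.arccos(dot)
    theta_t = theta_0 * t
    s0 = np.sin(theta_0 - theta_t) / np.sin(theta_0)
    s1 = np.sin(theta_t) / np.sin(theta_0)
    return s0 * v0 + s1 * v1

v0 = np.array([1.0, 0.0])
v1 = np.array([0.0, 1.0])
print(slerp(0.5, v0, v1))  # [0.70710678 0.70710678] -- stays on the unit circle, unlike a plain lerp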


@@ -0,0 +1,363 @@
from typing import Callable, Optional
import torch
import torchvision.transforms as tv_transforms
from diffusers.models.transformers.transformer_cogview4 import CogView4Transformer2DModel
from torchvision.transforms.functional import resize as tv_resize
from tqdm import tqdm
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import (
CogView4ConditioningField,
DenoiseMaskField,
FieldDescriptions,
Input,
InputField,
LatentsField,
WithBoard,
WithMetadata,
)
from invokeai.app.invocations.model import TransformerField
from invokeai.app.invocations.primitives import LatentsOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.flux.sampling_utils import clip_timestep_schedule_fractional
from invokeai.backend.model_manager.config import BaseModelType
from invokeai.backend.rectified_flow.rectified_flow_inpaint_extension import RectifiedFlowInpaintExtension
from invokeai.backend.stable_diffusion.diffusers_pipeline import PipelineIntermediateState
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import CogView4ConditioningInfo
from invokeai.backend.util.devices import TorchDevice
@invocation(
"cogview4_denoise",
title="Denoise - CogView4",
tags=["image", "cogview4"],
category="image",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4DenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Run the denoising process with a CogView4 model."""
# If latents is provided, this means we are doing image-to-image.
latents: Optional[LatentsField] = InputField(
default=None, description=FieldDescriptions.latents, input=Input.Connection
)
# denoise_mask is used for image-to-image inpainting. Only the masked region is modified.
denoise_mask: Optional[DenoiseMaskField] = InputField(
default=None, description=FieldDescriptions.denoise_mask, input=Input.Connection
)
denoising_start: float = InputField(default=0.0, ge=0, le=1, description=FieldDescriptions.denoising_start)
denoising_end: float = InputField(default=1.0, ge=0, le=1, description=FieldDescriptions.denoising_end)
transformer: TransformerField = InputField(
description=FieldDescriptions.cogview4_model, input=Input.Connection, title="Transformer"
)
positive_conditioning: CogView4ConditioningField = InputField(
description=FieldDescriptions.positive_cond, input=Input.Connection
)
negative_conditioning: CogView4ConditioningField = InputField(
description=FieldDescriptions.negative_cond, input=Input.Connection
)
cfg_scale: float | list[float] = InputField(default=3.5, description=FieldDescriptions.cfg_scale, title="CFG Scale")
width: int = InputField(default=1024, multiple_of=32, description="Width of the generated image.")
height: int = InputField(default=1024, multiple_of=32, description="Height of the generated image.")
steps: int = InputField(default=25, gt=0, description=FieldDescriptions.steps)
seed: int = InputField(default=0, description="Randomness seed for reproducibility.")
@torch.no_grad()
def invoke(self, context: InvocationContext) -> LatentsOutput:
latents = self._run_diffusion(context)
latents = latents.detach().to("cpu")
name = context.tensors.save(tensor=latents)
return LatentsOutput.build(latents_name=name, latents=latents, seed=None)
def _prep_inpaint_mask(self, context: InvocationContext, latents: torch.Tensor) -> torch.Tensor | None:
"""Prepare the inpaint mask.
- Loads the mask
- Resizes if necessary
- Casts to same device/dtype as latents
Args:
context (InvocationContext): The invocation context, for loading the inpaint mask.
latents (torch.Tensor): A latent image tensor. Used to determine the target shape, device, and dtype for the
inpaint mask.
Returns:
torch.Tensor | None: Inpaint mask. Values of 0.0 represent the regions to be fully denoised, and 1.0
represent the regions to be preserved.
"""
if self.denoise_mask is None:
return None
mask = context.tensors.load(self.denoise_mask.mask_name)
# The input denoise_mask contains values in [0, 1], where 0.0 represents the regions to be fully denoised, and
# 1.0 represents the regions to be preserved.
# We invert the mask so that the regions to be preserved are 0.0 and the regions to be denoised are 1.0.
mask = 1.0 - mask
_, _, latent_height, latent_width = latents.shape
mask = tv_resize(
img=mask,
size=[latent_height, latent_width],
interpolation=tv_transforms.InterpolationMode.BILINEAR,
antialias=False,
)
mask = mask.to(device=latents.device, dtype=latents.dtype)
return mask
def _load_text_conditioning(
self,
context: InvocationContext,
conditioning_name: str,
dtype: torch.dtype,
device: torch.device,
) -> torch.Tensor:
# Load the conditioning data.
cond_data = context.conditioning.load(conditioning_name)
assert len(cond_data.conditionings) == 1
cogview4_conditioning = cond_data.conditionings[0]
assert isinstance(cogview4_conditioning, CogView4ConditioningInfo)
cogview4_conditioning = cogview4_conditioning.to(dtype=dtype, device=device)
return cogview4_conditioning.glm_embeds
def _get_noise(
self,
batch_size: int,
num_channels_latents: int,
height: int,
width: int,
dtype: torch.dtype,
device: torch.device,
seed: int,
) -> torch.Tensor:
# We always generate noise on the same device and dtype then cast to ensure consistency across devices/dtypes.
rand_device = "cpu"
rand_dtype = torch.float16
return torch.randn(
batch_size,
num_channels_latents,
int(height) // LATENT_SCALE_FACTOR,
int(width) // LATENT_SCALE_FACTOR,
device=rand_device,
dtype=rand_dtype,
generator=torch.Generator(device=rand_device).manual_seed(seed),
).to(device=device, dtype=dtype)
def _prepare_cfg_scale(self, num_timesteps: int) -> list[float]:
"""Prepare the CFG scale list.
Args:
num_timesteps (int): The number of timesteps in the scheduler. Could be different from num_steps depending
on the scheduler used (e.g. higher order schedulers).
Returns:
list[float]: The CFG scale to apply at each timestep.
"""
if isinstance(self.cfg_scale, float):
cfg_scale = [self.cfg_scale] * num_timesteps
elif isinstance(self.cfg_scale, list):
assert len(self.cfg_scale) == num_timesteps
cfg_scale = self.cfg_scale
else:
raise ValueError(f"Invalid CFG scale type: {type(self.cfg_scale)}")
return cfg_scale
def _convert_timesteps_to_sigmas(self, image_seq_len: int, timesteps: torch.Tensor) -> list[float]:
# The logic to prepare the timestep / sigma schedule is based on:
# https://github.com/huggingface/diffusers/blob/b38450d5d2e5b87d5ff7088ee5798c85587b9635/src/diffusers/pipelines/cogview4/pipeline_cogview4.py#L575-L595
# The default FlowMatchEulerDiscreteScheduler configs are based on:
# https://huggingface.co/THUDM/CogView4-6B/blob/fb6f57289c73ac6d139e8d81bd5a4602d1877847/scheduler/scheduler_config.json
# This implementation differs slightly from the original for the sake of simplicity (differs in terminal value
# handling, not quantizing timesteps to integers, etc.).
def calculate_timestep_shift(
image_seq_len: int, base_seq_len: int = 256, base_shift: float = 0.25, max_shift: float = 0.75
) -> float:
m = (image_seq_len / base_seq_len) ** 0.5
mu = m * max_shift + base_shift
return mu
def time_shift_linear(mu: float, sigma: float, t: torch.Tensor) -> torch.Tensor:
return mu / (mu + (1 / t - 1) ** sigma)
mu = calculate_timestep_shift(image_seq_len)
sigmas = time_shift_linear(mu, 1.0, timesteps)
return sigmas.tolist()
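A standalone numeric check of the shift schedule above, using illustrative values for a 1024x1024 image with an 8x latent scale factor and patch size 2:

import torch

def calculate_timestep_shift(image_seq_len: int, base_seq_len: int = 256, base_shift: float = 0.25, max_shift: float = 0.75) -> float:
    m = (image_seq_len / base_seq_len) ** 0.5
    return m * max_shift + base_shift

def time_shift_linear(mu: float, sigma: float, t: torch.Tensor) -> torch.Tensor:
    return mu / (mu + (1 / t - 1) ** sigma)

image_seq_len = ((1024 // 8) * (1024 // 8)) // (2**2)  # 4096 patches
mu = calculate_timestep_shift(image_seq_len)           # 4.0 * 0.75 + 0.25 = 3.25
timesteps = torch.tensor([1.0, 0.75, 0.5, 0.25])
print(time_shift_linear(mu, 1.0, timesteps))           # ~[1.000, 0.907, 0.765, 0.520]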
def _run_diffusion(
self,
context: InvocationContext,
):
inference_dtype = torch.bfloat16
device = TorchDevice.choose_torch_device()
transformer_info = context.models.load(self.transformer.transformer)
assert isinstance(transformer_info.model, CogView4Transformer2DModel)
# Load/process the conditioning data.
# TODO(ryand): Make CFG optional.
do_classifier_free_guidance = True
pos_prompt_embeds = self._load_text_conditioning(
context=context,
conditioning_name=self.positive_conditioning.conditioning_name,
dtype=inference_dtype,
device=device,
)
neg_prompt_embeds = self._load_text_conditioning(
context=context,
conditioning_name=self.negative_conditioning.conditioning_name,
dtype=inference_dtype,
device=device,
)
# Prepare misc. conditioning variables.
# TODO(ryand): We could expose these as params (like with SDXL). But, we should experiment to see if they are
# useful first.
original_size = torch.tensor([(self.height, self.width)], dtype=pos_prompt_embeds.dtype, device=device)
target_size = torch.tensor([(self.height, self.width)], dtype=pos_prompt_embeds.dtype, device=device)
crops_coords_top_left = torch.tensor([(0, 0)], dtype=pos_prompt_embeds.dtype, device=device)
# Prepare the timestep / sigma schedule.
patch_size = transformer_info.model.config.patch_size # type: ignore
assert isinstance(patch_size, int)
image_seq_len = ((self.height // LATENT_SCALE_FACTOR) * (self.width // LATENT_SCALE_FACTOR)) // (patch_size**2)
# We add an extra step to the end to account for the final timestep of 0.0.
timesteps: list[float] = torch.linspace(1, 0, self.steps + 1).tolist()
# Clip the timesteps schedule based on denoising_start and denoising_end.
timesteps = clip_timestep_schedule_fractional(timesteps, self.denoising_start, self.denoising_end)
sigmas = self._convert_timesteps_to_sigmas(image_seq_len, torch.tensor(timesteps))
total_steps = len(timesteps) - 1
# Prepare the CFG scale list.
cfg_scale = self._prepare_cfg_scale(total_steps)
# Load the input latents, if provided.
init_latents = context.tensors.load(self.latents.latents_name) if self.latents else None
if init_latents is not None:
init_latents = init_latents.to(device=device, dtype=inference_dtype)
# Generate initial latent noise.
num_channels_latents = transformer_info.model.config.in_channels # type: ignore
assert isinstance(num_channels_latents, int)
noise = self._get_noise(
batch_size=1,
num_channels_latents=num_channels_latents,
height=self.height,
width=self.width,
dtype=inference_dtype,
device=device,
seed=self.seed,
)
# Prepare input latent image.
if init_latents is not None:
# Noise the init_latents by the appropriate amount for the first timestep.
s_0 = sigmas[0]
latents = s_0 * noise + (1.0 - s_0) * init_latents
else:
# init_latents are not provided, so we are not doing image-to-image (i.e. we are starting from pure noise).
if self.denoising_start > 1e-5:
raise ValueError("denoising_start should be 0 when initial latents are not provided.")
latents = noise
# If len(timesteps) == 1, then short-circuit. We are just noising the input latents, but not taking any
# denoising steps.
if len(timesteps) <= 1:
return latents
# Prepare inpaint extension.
inpaint_mask = self._prep_inpaint_mask(context, latents)
inpaint_extension: RectifiedFlowInpaintExtension | None = None
if inpaint_mask is not None:
assert init_latents is not None
inpaint_extension = RectifiedFlowInpaintExtension(
init_latents=init_latents,
inpaint_mask=inpaint_mask,
noise=noise,
)
step_callback = self._build_step_callback(context)
step_callback(
PipelineIntermediateState(
step=0,
order=1,
total_steps=total_steps,
timestep=int(timesteps[0]),
latents=latents,
),
)
with transformer_info.model_on_device() as (_, transformer):
assert isinstance(transformer, CogView4Transformer2DModel)
# Denoising loop
for step_idx in tqdm(range(total_steps)):
t_curr = timesteps[step_idx]
sigma_curr = sigmas[step_idx]
sigma_prev = sigmas[step_idx + 1]
# Expand the timestep to match the latent model input.
# Multiply by 1000 to match the default FlowMatchEulerDiscreteScheduler num_train_timesteps.
timestep = torch.tensor([t_curr * 1000], device=device).expand(latents.shape[0])
# TODO(ryand): Support both sequential and batched CFG inference.
noise_pred_cond = transformer(
hidden_states=latents,
encoder_hidden_states=pos_prompt_embeds,
timestep=timestep,
original_size=original_size,
target_size=target_size,
crop_coords=crops_coords_top_left,
return_dict=False,
)[0]
# Apply CFG.
if do_classifier_free_guidance:
noise_pred_uncond = transformer(
hidden_states=latents,
encoder_hidden_states=neg_prompt_embeds,
timestep=timestep,
original_size=original_size,
target_size=target_size,
crop_coords=crops_coords_top_left,
return_dict=False,
)[0]
noise_pred = noise_pred_uncond + cfg_scale[step_idx] * (noise_pred_cond - noise_pred_uncond)
else:
noise_pred = noise_pred_cond
# Compute the previous noisy sample x_t -> x_t-1.
latents_dtype = latents.dtype
# TODO(ryand): Is casting to float32 necessary for precision/stability? I copied this from SD3.
latents = latents.to(dtype=torch.float32)
latents = latents + (sigma_prev - sigma_curr) * noise_pred
latents = latents.to(dtype=latents_dtype)
if inpaint_extension is not None:
latents = inpaint_extension.merge_intermediate_latents_with_init_latents(latents, sigma_prev)
step_callback(
PipelineIntermediateState(
step=step_idx + 1,
order=1,
total_steps=total_steps,
timestep=int(t_curr),
latents=latents,
),
)
return latents
def _build_step_callback(self, context: InvocationContext) -> Callable[[PipelineIntermediateState], None]:
def step_callback(state: PipelineIntermediateState) -> None:
context.util.sd_step_callback(state, BaseModelType.CogView4)
return step_callback
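A toy illustration of the classifier-free guidance combination applied inside the denoising loop above:

import torch

noise_pred_cond = torch.tensor([1.0, 2.0, 3.0])    # prediction with the positive prompt
noise_pred_uncond = torch.tensor([0.5, 0.5, 0.5])  # prediction with the negative prompt
cfg_scale = 3.5

noise_pred = noise_pred_uncond + cfg_scale * (noise_pred_cond - noise_pred_uncond)
print(noise_pred)  # tensor([2.2500, 5.7500, 9.2500])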


@@ -0,0 +1,69 @@
import einops
import torch
from diffusers.models.autoencoders.autoencoder_kl import AutoencoderKL
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.fields import (
FieldDescriptions,
ImageField,
Input,
InputField,
WithBoard,
WithMetadata,
)
from invokeai.app.invocations.model import VAEField
from invokeai.app.invocations.primitives import LatentsOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager.load.load_base import LoadedModel
from invokeai.backend.stable_diffusion.diffusers_pipeline import image_resized_to_grid_as_tensor
from invokeai.backend.util.devices import TorchDevice
# TODO(ryand): This is effectively a copy of SD3ImageToLatentsInvocation and a subset of ImageToLatentsInvocation. We
# should refactor to avoid this duplication.
@invocation(
"cogview4_i2l",
title="Image to Latents - CogView4",
tags=["image", "latents", "vae", "i2l", "cogview4"],
category="image",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4ImageToLatentsInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Generates latents from an image."""
image: ImageField = InputField(description="The image to encode.")
vae: VAEField = InputField(description=FieldDescriptions.vae, input=Input.Connection)
@staticmethod
def vae_encode(vae_info: LoadedModel, image_tensor: torch.Tensor) -> torch.Tensor:
with vae_info as vae:
assert isinstance(vae, AutoencoderKL)
vae.disable_tiling()
image_tensor = image_tensor.to(device=TorchDevice.choose_torch_device(), dtype=vae.dtype)
with torch.inference_mode():
image_tensor_dist = vae.encode(image_tensor).latent_dist
# TODO: Use seed to make sampling reproducible.
latents: torch.Tensor = image_tensor_dist.sample().to(dtype=vae.dtype)
latents = vae.config.scaling_factor * latents
return latents
@torch.no_grad()
def invoke(self, context: InvocationContext) -> LatentsOutput:
image = context.images.get_pil(self.image.image_name)
image_tensor = image_resized_to_grid_as_tensor(image.convert("RGB"))
if image_tensor.dim() == 3:
image_tensor = einops.rearrange(image_tensor, "c h w -> 1 c h w")
vae_info = context.models.load(self.vae.vae)
latents = self.vae_encode(vae_info=vae_info, image_tensor=image_tensor)
latents = latents.to("cpu")
name = context.tensors.save(tensor=latents)
return LatentsOutput.build(latents_name=name, latents=latents, seed=None)
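A small sketch of the batch-dimension rearrange used above when a single image tensor is encoded:

import torch
from einops import rearrange

image_tensor = torch.rand(3, 64, 64)                   # c h w
batched = rearrange(image_tensor, "c h w -> 1 c h w")  # add a batch dimension
print(batched.shape)                                   # torch.Size([1, 3, 64, 64])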


@@ -0,0 +1,86 @@
from contextlib import nullcontext
import torch
from diffusers.models.autoencoders.autoencoder_kl import AutoencoderKL
from einops import rearrange
from PIL import Image
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import (
FieldDescriptions,
Input,
InputField,
LatentsField,
WithBoard,
WithMetadata,
)
from invokeai.app.invocations.model import VAEField
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.stable_diffusion.extensions.seamless import SeamlessExt
from invokeai.backend.util.devices import TorchDevice
# TODO(ryand): This is effectively a copy of SD3LatentsToImageInvocation and a subset of LatentsToImageInvocation. We
# should refactor to avoid this duplication.
@invocation(
"cogview4_l2i",
title="Latents to Image - CogView4",
tags=["latents", "image", "vae", "l2i", "cogview4"],
category="latents",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4LatentsToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Generates an image from latents."""
latents: LatentsField = InputField(description=FieldDescriptions.latents, input=Input.Connection)
vae: VAEField = InputField(description=FieldDescriptions.vae, input=Input.Connection)
def _estimate_working_memory(self, latents: torch.Tensor, vae: AutoencoderKL) -> int:
"""Estimate the working memory required by the invocation in bytes."""
out_h = LATENT_SCALE_FACTOR * latents.shape[-2]
out_w = LATENT_SCALE_FACTOR * latents.shape[-1]
element_size = next(vae.parameters()).element_size()
scaling_constant = 2200 # Determined experimentally.
working_memory = out_h * out_w * element_size * scaling_constant
return int(working_memory)
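A quick back-of-the-envelope check of the estimate above, assuming a 128x128 latent (1024x1024 output) decoded with float16 weights:

LATENT_SCALE_FACTOR = 8
out_h = LATENT_SCALE_FACTOR * 128  # 1024
out_w = LATENT_SCALE_FACTOR * 128  # 1024
element_size = 2                   # bytes per float16 element
scaling_constant = 2200

working_memory = out_h * out_w * element_size * scaling_constant
print(f"{working_memory / 1e9:.1f} GB")  # ~4.6 GB reserved for the decode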
@torch.no_grad()
def invoke(self, context: InvocationContext) -> ImageOutput:
latents = context.tensors.load(self.latents.latents_name)
vae_info = context.models.load(self.vae.vae)
assert isinstance(vae_info.model, (AutoencoderKL))
estimated_working_memory = self._estimate_working_memory(latents, vae_info.model)
with (
SeamlessExt.static_patch_model(vae_info.model, self.vae.seamless_axes),
vae_info.model_on_device(working_mem_bytes=estimated_working_memory) as (_, vae),
):
context.util.signal_progress("Running VAE")
assert isinstance(vae, (AutoencoderKL))
latents = latents.to(TorchDevice.choose_torch_device())
vae.disable_tiling()
tiling_context = nullcontext()
# clear memory as vae decode can request a lot
TorchDevice.empty_cache()
with torch.inference_mode(), tiling_context:
# copied from diffusers pipeline
latents = latents / vae.config.scaling_factor
img = vae.decode(latents, return_dict=False)[0]
img = img.clamp(-1, 1)
img = rearrange(img[0], "c h w -> h w c") # noqa: F821
img_pil = Image.fromarray((127.5 * (img + 1.0)).byte().cpu().numpy())
TorchDevice.empty_cache()
image_dto = context.images.save(image=img_pil)
return ImageOutput.build(image_dto)
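A tiny numeric check of the [-1, 1] to [0, 255] mapping used above when converting the decoded tensor to a PIL image:

import torch

img = torch.tensor([-1.0, 0.0, 1.0])
print((127.5 * (img + 1.0)).byte())  # tensor([  0, 127, 255], dtype=torch.uint8)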


@@ -0,0 +1,55 @@
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, OutputField, UIType
from invokeai.app.invocations.model import (
GlmEncoderField,
ModelIdentifierField,
TransformerField,
VAEField,
)
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager.config import SubModelType
@invocation_output("cogview4_model_loader_output")
class CogView4ModelLoaderOutput(BaseInvocationOutput):
"""CogView4 base model loader output."""
transformer: TransformerField = OutputField(description=FieldDescriptions.transformer, title="Transformer")
glm_encoder: GlmEncoderField = OutputField(description=FieldDescriptions.glm_encoder, title="GLM Encoder")
vae: VAEField = OutputField(description=FieldDescriptions.vae, title="VAE")
@invocation(
"cogview4_model_loader",
title="Main Model - CogView4",
tags=["model", "cogview4"],
category="model",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4ModelLoaderInvocation(BaseInvocation):
"""Loads a CogView4 base model, outputting its submodels."""
model: ModelIdentifierField = InputField(
description=FieldDescriptions.cogview4_model,
ui_type=UIType.CogView4MainModel,
input=Input.Direct,
)
def invoke(self, context: InvocationContext) -> CogView4ModelLoaderOutput:
transformer = self.model.model_copy(update={"submodel_type": SubModelType.Transformer})
vae = self.model.model_copy(update={"submodel_type": SubModelType.VAE})
glm_tokenizer = self.model.model_copy(update={"submodel_type": SubModelType.Tokenizer})
glm_encoder = self.model.model_copy(update={"submodel_type": SubModelType.TextEncoder})
return CogView4ModelLoaderOutput(
transformer=TransformerField(transformer=transformer, loras=[]),
glm_encoder=GlmEncoderField(tokenizer=glm_tokenizer, text_encoder=glm_encoder),
vae=VAEField(vae=vae),
)
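For reference, a minimal sketch of the pydantic model_copy(update=...) pattern the loader uses to point the same model identifier at different submodels (toy model, illustrative field names):

from typing import Optional

from pydantic import BaseModel

class ModelRef(BaseModel):
    key: str
    submodel_type: Optional[str] = None

base = ModelRef(key="cogview4-main")
vae_ref = base.model_copy(update={"submodel_type": "vae"})
transformer_ref = base.model_copy(update={"submodel_type": "transformer"})
print(vae_ref, transformer_ref, sep="\n")
# key='cogview4-main' submodel_type='vae'
# key='cogview4-main' submodel_type='transformer'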


@@ -0,0 +1,92 @@
import torch
from transformers import GlmModel, PreTrainedTokenizerFast
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, UIComponent
from invokeai.app.invocations.model import GlmEncoderField
from invokeai.app.invocations.primitives import CogView4ConditioningOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import (
CogView4ConditioningInfo,
ConditioningFieldData,
)
from invokeai.backend.util.devices import TorchDevice
# The CogView4 GLM Text Encoder max sequence length set based on the default in diffusers.
COGVIEW4_GLM_MAX_SEQ_LEN = 1024
@invocation(
"cogview4_text_encoder",
title="Prompt - CogView4",
tags=["prompt", "conditioning", "cogview4"],
category="conditioning",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4TextEncoderInvocation(BaseInvocation):
"""Encodes and preps a prompt for a cogview4 image."""
prompt: str = InputField(description="Text prompt to encode.", ui_component=UIComponent.Textarea)
glm_encoder: GlmEncoderField = InputField(
title="GLM Encoder",
description=FieldDescriptions.glm_encoder,
input=Input.Connection,
)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> CogView4ConditioningOutput:
glm_embeds = self._glm_encode(context, max_seq_len=COGVIEW4_GLM_MAX_SEQ_LEN)
conditioning_data = ConditioningFieldData(conditionings=[CogView4ConditioningInfo(glm_embeds=glm_embeds)])
conditioning_name = context.conditioning.save(conditioning_data)
return CogView4ConditioningOutput.build(conditioning_name)
def _glm_encode(self, context: InvocationContext, max_seq_len: int) -> torch.Tensor:
prompt = [self.prompt]
# TODO(ryand): Add model inputs to the invocation rather than hard-coding.
with (
context.models.load(self.glm_encoder.text_encoder).model_on_device() as (_, glm_text_encoder),
context.models.load(self.glm_encoder.tokenizer).model_on_device() as (_, glm_tokenizer),
):
context.util.signal_progress("Running GLM text encoder")
assert isinstance(glm_text_encoder, GlmModel)
assert isinstance(glm_tokenizer, PreTrainedTokenizerFast)
text_inputs = glm_tokenizer(
prompt,
padding="longest",
max_length=max_seq_len,
truncation=True,
add_special_tokens=True,
return_tensors="pt",
)
text_input_ids = text_inputs.input_ids
untruncated_ids = glm_tokenizer(prompt, padding="longest", return_tensors="pt").input_ids
assert isinstance(text_input_ids, torch.Tensor)
assert isinstance(untruncated_ids, torch.Tensor)
if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not torch.equal(
text_input_ids, untruncated_ids
):
removed_text = glm_tokenizer.batch_decode(untruncated_ids[:, max_seq_len - 1 : -1])
context.logger.warning(
"The following part of your input was truncated because `max_sequence_length` is set to "
f" {max_seq_len} tokens: {removed_text}"
)
current_length = text_input_ids.shape[1]
pad_length = (16 - (current_length % 16)) % 16
if pad_length > 0:
pad_ids = torch.full(
(text_input_ids.shape[0], pad_length),
fill_value=glm_tokenizer.pad_token_id,
dtype=text_input_ids.dtype,
device=text_input_ids.device,
)
text_input_ids = torch.cat([pad_ids, text_input_ids], dim=1)
prompt_embeds = glm_text_encoder(
text_input_ids.to(TorchDevice.choose_torch_device()), output_hidden_states=True
).hidden_states[-2]
assert isinstance(prompt_embeds, torch.Tensor)
return prompt_embeds
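A small standalone check of the left-padding-to-a-multiple-of-16 step above, with an illustrative pad token id and sequence length:

import torch

pad_token_id = 0                                   # illustrative; the real id comes from the tokenizer
text_input_ids = torch.randint(1, 100, (1, 37))    # pretend tokenized prompt of length 37
current_length = text_input_ids.shape[1]
pad_length = (16 - (current_length % 16)) % 16     # 11, so the padded length is 48

pad_ids = torch.full((1, pad_length), pad_token_id, dtype=text_input_ids.dtype)
padded = torch.cat([pad_ids, text_input_ids], dim=1)  # padding goes on the left
print(padded.shape)  # torch.Size([1, 48])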


@@ -19,9 +19,9 @@ from invokeai.app.invocations.model import CLIPField
from invokeai.app.invocations.primitives import ConditioningOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.ti_utils import generate_ti_list
from invokeai.backend.lora.lora_model_raw import LoRAModelRaw
from invokeai.backend.lora.lora_patcher import LoRAPatcher
from invokeai.backend.model_patcher import ModelPatcher
from invokeai.backend.patches.layer_patcher import LayerPatcher
from invokeai.backend.patches.model_patch_raw import ModelPatchRaw
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import (
BasicConditioningInfo,
ConditioningFieldData,
@@ -40,10 +40,10 @@ from invokeai.backend.util.devices import TorchDevice
@invocation(
"compel",
title="Prompt",
title="Prompt - SD1.5",
tags=["prompt", "compel"],
category="conditioning",
version="1.2.0",
version="1.2.1",
)
class CompelInvocation(BaseInvocation):
"""Parse prompt using compel package to conditioning."""
@@ -63,29 +63,28 @@ class CompelInvocation(BaseInvocation):
@torch.no_grad()
def invoke(self, context: InvocationContext) -> ConditioningOutput:
tokenizer_info = context.models.load(self.clip.tokenizer)
text_encoder_info = context.models.load(self.clip.text_encoder)
def _lora_loader() -> Iterator[Tuple[LoRAModelRaw, float]]:
def _lora_loader() -> Iterator[Tuple[ModelPatchRaw, float]]:
for lora in self.clip.loras:
lora_info = context.models.load(lora.lora)
assert isinstance(lora_info.model, LoRAModelRaw)
assert isinstance(lora_info.model, ModelPatchRaw)
yield (lora_info.model, lora.weight)
del lora_info
return
# loras = [(context.models.get(**lora.dict(exclude={"weight"})).context.model, lora.weight) for lora in self.clip.loras]
text_encoder_info = context.models.load(self.clip.text_encoder)
ti_list = generate_ti_list(self.prompt, text_encoder_info.config.base, context)
with (
# apply all patches while the model is on the target device
text_encoder_info.model_on_device() as (cached_weights, text_encoder),
tokenizer_info as tokenizer,
LoRAPatcher.apply_lora_patches(
context.models.load(self.clip.tokenizer) as tokenizer,
LayerPatcher.apply_smart_model_patches(
model=text_encoder,
patches=_lora_loader(),
prefix="lora_te_",
dtype=text_encoder.dtype,
cached_weights=cached_weights,
),
# Apply CLIP Skip after LoRA to prevent LoRA application from failing on skipped layers.
@@ -104,6 +103,7 @@ class CompelInvocation(BaseInvocation):
textual_inversion_manager=ti_manager,
dtype_for_device_getter=TorchDevice.choose_torch_dtype,
truncate_long_prompts=False,
device=TorchDevice.choose_torch_device(),
)
conjunction = Compel.parse_prompt_string(self.prompt)
@@ -138,9 +138,7 @@ class SDXLPromptInvocationBase:
lora_prefix: str,
zero_on_empty: bool,
) -> Tuple[torch.Tensor, Optional[torch.Tensor]]:
tokenizer_info = context.models.load(clip_field.tokenizer)
text_encoder_info = context.models.load(clip_field.text_encoder)
# return zero on empty
if prompt == "" and zero_on_empty:
cpu_text_encoder = text_encoder_info.model
@@ -162,11 +160,11 @@ class SDXLPromptInvocationBase:
c_pooled = None
return c, c_pooled
def _lora_loader() -> Iterator[Tuple[LoRAModelRaw, float]]:
def _lora_loader() -> Iterator[Tuple[ModelPatchRaw, float]]:
for lora in clip_field.loras:
lora_info = context.models.load(lora.lora)
lora_model = lora_info.model
assert isinstance(lora_model, LoRAModelRaw)
assert isinstance(lora_model, ModelPatchRaw)
yield (lora_model, lora.weight)
del lora_info
return
@@ -178,11 +176,12 @@ class SDXLPromptInvocationBase:
with (
# apply all patches while the model is on the target device
text_encoder_info.model_on_device() as (cached_weights, text_encoder),
tokenizer_info as tokenizer,
LoRAPatcher.apply_lora_patches(
text_encoder,
context.models.load(clip_field.tokenizer) as tokenizer,
LayerPatcher.apply_smart_model_patches(
model=text_encoder,
patches=_lora_loader(),
prefix=lora_prefix,
dtype=text_encoder.dtype,
cached_weights=cached_weights,
),
# Apply CLIP Skip after LoRA to prevent LoRA application from failing on skipped layers.
@@ -205,6 +204,7 @@ class SDXLPromptInvocationBase:
truncate_long_prompts=False, # TODO:
returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED, # TODO: clip skip
requires_pooled=get_pooled,
device=TorchDevice.choose_torch_device(),
)
conjunction = Compel.parse_prompt_string(prompt)
@@ -222,7 +222,6 @@ class SDXLPromptInvocationBase:
del tokenizer
del text_encoder
del tokenizer_info
del text_encoder_info
c = c.detach().to("cpu")
@@ -234,10 +233,10 @@ class SDXLPromptInvocationBase:
@invocation(
"sdxl_compel_prompt",
title="SDXL Prompt",
title="Prompt - SDXL",
tags=["sdxl", "compel", "prompt"],
category="conditioning",
version="1.2.0",
version="1.2.1",
)
class SDXLCompelPromptInvocation(BaseInvocation, SDXLPromptInvocationBase):
"""Parse prompt using compel package to conditioning."""
@@ -328,10 +327,10 @@ class SDXLCompelPromptInvocation(BaseInvocation, SDXLPromptInvocationBase):
@invocation(
"sdxl_refiner_compel_prompt",
title="SDXL Refiner Prompt",
title="Prompt - SDXL Refiner",
tags=["sdxl", "compel", "prompt"],
category="conditioning",
version="1.1.1",
version="1.1.2",
)
class SDXLRefinerCompelPromptInvocation(BaseInvocation, SDXLPromptInvocationBase):
"""Parse prompt using compel package to conditioning."""
@@ -377,10 +376,10 @@ class CLIPSkipInvocationOutput(BaseInvocationOutput):
@invocation(
"clip_skip",
title="CLIP Skip",
title="Apply CLIP Skip - SD1.5, SDXL",
tags=["clipskip", "clip", "skip"],
category="conditioning",
version="1.1.0",
version="1.1.1",
)
class CLIPSkipInvocation(BaseInvocation):
"""Skip layers in clip text_encoder model."""
@@ -514,7 +513,7 @@ def log_tokenization_for_text(
usedTokens += 1
if usedTokens > 0:
print(f'\n>> [TOKENLOG] Tokens {display_label or ""} ({usedTokens}):')
print(f"\n>> [TOKENLOG] Tokens {display_label or ''} ({usedTokens}):")
print(f"{tokenized}\x1b[0m")
if discarded != "":

File diff suppressed because it is too large


@@ -1,7 +1,5 @@
from typing import Literal
from invokeai.backend.util.devices import TorchDevice
LATENT_SCALE_FACTOR = 8
"""
HACK: Many nodes are currently hard-coded to use a fixed latent scale factor of 8. This is fragile, and will need to
@@ -12,5 +10,3 @@ The ratio of image:latent dimensions is LATENT_SCALE_FACTOR:1, or 8:1.
IMAGE_MODES = Literal["L", "RGB", "RGBA", "CMYK", "YCbCr", "LAB", "HSV", "I", "F"]
"""A literal type for PIL image modes supported by Invoke"""
DEFAULT_PRECISION = TorchDevice.choose_torch_dtype()
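A quick worked example of the 8:1 image:latent ratio described in the docstring above (numbers are illustrative only):

# LATENT_SCALE_FACTOR arithmetic, as per the docstring above.
image_width, image_height = 1024, 768
latent_width = image_width // LATENT_SCALE_FACTOR    # 128
latent_height = image_height // LATENT_SCALE_FACTOR  # 96
# Conversely, a latent tensor of spatial size 64x64 decodes to a 512x512 image.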


@@ -0,0 +1,128 @@
# Invocations for ControlNet image preprocessors
# initial implementation by Gregg Helt, 2023
from typing import List, Union
from pydantic import BaseModel, Field, field_validator, model_validator
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import (
FieldDescriptions,
ImageField,
InputField,
OutputField,
UIType,
)
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.invocations.util import validate_begin_end_step, validate_weights
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.controlnet_utils import CONTROLNET_MODE_VALUES, CONTROLNET_RESIZE_VALUES, heuristic_resize
from invokeai.backend.image_util.util import np_to_pil, pil_to_np
class ControlField(BaseModel):
image: ImageField = Field(description="The control image")
control_model: ModelIdentifierField = Field(description="The ControlNet model to use")
control_weight: Union[float, List[float]] = Field(default=1, description="The weight given to the ControlNet")
begin_step_percent: float = Field(
default=0, ge=0, le=1, description="When the ControlNet is first applied (% of total steps)"
)
end_step_percent: float = Field(
default=1, ge=0, le=1, description="When the ControlNet is last applied (% of total steps)"
)
control_mode: CONTROLNET_MODE_VALUES = Field(default="balanced", description="The control mode to use")
resize_mode: CONTROLNET_RESIZE_VALUES = Field(default="just_resize", description="The resize mode to use")
@field_validator("control_weight")
@classmethod
def validate_control_weight(cls, v):
validate_weights(v)
return v
@model_validator(mode="after")
def validate_begin_end_step_percent(self):
validate_begin_end_step(self.begin_step_percent, self.end_step_percent)
return self
@invocation_output("control_output")
class ControlOutput(BaseInvocationOutput):
"""node output for ControlNet info"""
# Outputs
control: ControlField = OutputField(description=FieldDescriptions.control)
@invocation("controlnet", title="ControlNet - SD1.5, SDXL", tags=["controlnet"], category="controlnet", version="1.1.3")
class ControlNetInvocation(BaseInvocation):
"""Collects ControlNet info to pass to other nodes"""
image: ImageField = InputField(description="The control image")
control_model: ModelIdentifierField = InputField(
description=FieldDescriptions.controlnet_model, ui_type=UIType.ControlNetModel
)
control_weight: Union[float, List[float]] = InputField(
default=1.0, ge=-1, le=2, description="The weight given to the ControlNet"
)
begin_step_percent: float = InputField(
default=0, ge=0, le=1, description="When the ControlNet is first applied (% of total steps)"
)
end_step_percent: float = InputField(
default=1, ge=0, le=1, description="When the ControlNet is last applied (% of total steps)"
)
control_mode: CONTROLNET_MODE_VALUES = InputField(default="balanced", description="The control mode used")
resize_mode: CONTROLNET_RESIZE_VALUES = InputField(default="just_resize", description="The resize mode used")
@field_validator("control_weight")
@classmethod
def validate_control_weight(cls, v):
validate_weights(v)
return v
@model_validator(mode="after")
def validate_begin_end_step_percent(self) -> "ControlNetInvocation":
validate_begin_end_step(self.begin_step_percent, self.end_step_percent)
return self
def invoke(self, context: InvocationContext) -> ControlOutput:
return ControlOutput(
control=ControlField(
image=self.image,
control_model=self.control_model,
control_weight=self.control_weight,
begin_step_percent=self.begin_step_percent,
end_step_percent=self.end_step_percent,
control_mode=self.control_mode,
resize_mode=self.resize_mode,
),
)
@invocation(
"heuristic_resize",
title="Heuristic Resize",
tags=["image, controlnet"],
category="image",
version="1.0.1",
classification=Classification.Prototype,
)
class HeuristicResizeInvocation(BaseInvocation):
"""Resize an image using a heuristic method. Preserves edge maps."""
image: ImageField = InputField(description="The image to resize")
width: int = InputField(default=512, ge=1, description="The width to resize to (px)")
height: int = InputField(default=512, ge=1, description="The height to resize to (px)")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name, "RGB")
np_img = pil_to_np(image)
np_resized = heuristic_resize(np_img, (self.width, self.height))
resized = np_to_pil(np_resized)
image_dto = context.images.save(image=resized)
return ImageOutput.build(image_dto)


@@ -1,716 +0,0 @@
# Invocations for ControlNet image preprocessors
# initial implementation by Gregg Helt, 2023
# heavily leverages controlnet_aux package: https://github.com/patrickvonplaten/controlnet_aux
from builtins import bool, float
from pathlib import Path
from typing import Dict, List, Literal, Union
import cv2
import numpy as np
from controlnet_aux import (
ContentShuffleDetector,
LeresDetector,
MediapipeFaceDetector,
MidasDetector,
MLSDdetector,
NormalBaeDetector,
PidiNetDetector,
SamDetector,
ZoeDetector,
)
from controlnet_aux.util import HWC3, ade_palette
from PIL import Image
from pydantic import BaseModel, Field, field_validator, model_validator
from transformers import pipeline
from transformers.pipelines import DepthEstimationPipeline
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import (
FieldDescriptions,
ImageField,
InputField,
OutputField,
UIType,
WithBoard,
WithMetadata,
)
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.invocations.util import validate_begin_end_step, validate_weights
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.controlnet_utils import CONTROLNET_MODE_VALUES, CONTROLNET_RESIZE_VALUES, heuristic_resize
from invokeai.backend.image_util.canny import get_canny_edges
from invokeai.backend.image_util.depth_anything.depth_anything_pipeline import DepthAnythingPipeline
from invokeai.backend.image_util.dw_openpose import DWPOSE_MODELS, DWOpenposeDetector
from invokeai.backend.image_util.hed import HEDProcessor
from invokeai.backend.image_util.lineart import LineartProcessor
from invokeai.backend.image_util.lineart_anime import LineartAnimeProcessor
from invokeai.backend.image_util.util import np_to_pil, pil_to_np
class ControlField(BaseModel):
image: ImageField = Field(description="The control image")
control_model: ModelIdentifierField = Field(description="The ControlNet model to use")
control_weight: Union[float, List[float]] = Field(default=1, description="The weight given to the ControlNet")
begin_step_percent: float = Field(
default=0, ge=0, le=1, description="When the ControlNet is first applied (% of total steps)"
)
end_step_percent: float = Field(
default=1, ge=0, le=1, description="When the ControlNet is last applied (% of total steps)"
)
control_mode: CONTROLNET_MODE_VALUES = Field(default="balanced", description="The control mode to use")
resize_mode: CONTROLNET_RESIZE_VALUES = Field(default="just_resize", description="The resize mode to use")
@field_validator("control_weight")
@classmethod
def validate_control_weight(cls, v):
validate_weights(v)
return v
@model_validator(mode="after")
def validate_begin_end_step_percent(self):
validate_begin_end_step(self.begin_step_percent, self.end_step_percent)
return self
@invocation_output("control_output")
class ControlOutput(BaseInvocationOutput):
"""node output for ControlNet info"""
# Outputs
control: ControlField = OutputField(description=FieldDescriptions.control)
@invocation("controlnet", title="ControlNet", tags=["controlnet"], category="controlnet", version="1.1.2")
class ControlNetInvocation(BaseInvocation):
"""Collects ControlNet info to pass to other nodes"""
image: ImageField = InputField(description="The control image")
control_model: ModelIdentifierField = InputField(
description=FieldDescriptions.controlnet_model, ui_type=UIType.ControlNetModel
)
control_weight: Union[float, List[float]] = InputField(
default=1.0, ge=-1, le=2, description="The weight given to the ControlNet"
)
begin_step_percent: float = InputField(
default=0, ge=0, le=1, description="When the ControlNet is first applied (% of total steps)"
)
end_step_percent: float = InputField(
default=1, ge=0, le=1, description="When the ControlNet is last applied (% of total steps)"
)
control_mode: CONTROLNET_MODE_VALUES = InputField(default="balanced", description="The control mode used")
resize_mode: CONTROLNET_RESIZE_VALUES = InputField(default="just_resize", description="The resize mode used")
@field_validator("control_weight")
@classmethod
def validate_control_weight(cls, v):
validate_weights(v)
return v
@model_validator(mode="after")
def validate_begin_end_step_percent(self) -> "ControlNetInvocation":
validate_begin_end_step(self.begin_step_percent, self.end_step_percent)
return self
def invoke(self, context: InvocationContext) -> ControlOutput:
return ControlOutput(
control=ControlField(
image=self.image,
control_model=self.control_model,
control_weight=self.control_weight,
begin_step_percent=self.begin_step_percent,
end_step_percent=self.end_step_percent,
control_mode=self.control_mode,
resize_mode=self.resize_mode,
),
)
# This invocation exists for other invocations to subclass it - do not register with @invocation!
class ImageProcessorInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Base class for invocations that preprocess images for ControlNet"""
image: ImageField = InputField(description="The image to process")
def run_processor(self, image: Image.Image) -> Image.Image:
# superclass just passes through image without processing
return image
def load_image(self, context: InvocationContext) -> Image.Image:
# allows override for any special formatting specific to the preprocessor
return context.images.get_pil(self.image.image_name, "RGB")
def invoke(self, context: InvocationContext) -> ImageOutput:
self._context = context
raw_image = self.load_image(context)
# image type should be PIL.PngImagePlugin.PngImageFile ?
processed_image = self.run_processor(raw_image)
# currently can't see processed image in node UI without a showImage node,
# so for now setting image_type to RESULT instead of INTERMEDIATE so will get saved in gallery
image_dto = context.images.save(image=processed_image)
"""Builds an ImageOutput and its ImageField"""
processed_image_field = ImageField(image_name=image_dto.image_name)
return ImageOutput(
image=processed_image_field,
# width=processed_image.width,
width=image_dto.width,
# height=processed_image.height,
height=image_dto.height,
# mode=processed_image.mode,
)
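As a sketch of the subclassing pattern the comment above describes: a processor only needs to override run_processor() (and optionally load_image()); the base class handles loading the input and saving the result. The node id, title, and logic below are invented for illustration and do not correspond to an existing invocation:

@invocation(
    "example_invert_processor",
    title="Invert Processor (example)",
    tags=["controlnet"],
    category="controlnet",
    version="1.0.0",
)
class ExampleInvertProcessorInvocation(ImageProcessorInvocation):
    """Inverts the control image (illustrative only)."""

    def run_processor(self, image: Image.Image) -> Image.Image:
        np_img = np.array(image, dtype=np.uint8)
        return Image.fromarray(255 - np_img)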
@invocation(
"canny_image_processor",
title="Canny Processor",
tags=["controlnet", "canny"],
category="controlnet",
version="1.3.3",
classification=Classification.Deprecated,
)
class CannyImageProcessorInvocation(ImageProcessorInvocation):
"""Canny edge detection for ControlNet"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
low_threshold: int = InputField(
default=100, ge=0, le=255, description="The low threshold of the Canny pixel gradient (0-255)"
)
high_threshold: int = InputField(
default=200, ge=0, le=255, description="The high threshold of the Canny pixel gradient (0-255)"
)
def load_image(self, context: InvocationContext) -> Image.Image:
# Keep alpha channel for Canny processing to detect edges of transparent areas
return context.images.get_pil(self.image.image_name, "RGBA")
def run_processor(self, image: Image.Image) -> Image.Image:
processed_image = get_canny_edges(
image,
self.low_threshold,
self.high_threshold,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
)
return processed_image
@invocation(
"hed_image_processor",
title="HED (softedge) Processor",
tags=["controlnet", "hed", "softedge"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class HedImageProcessorInvocation(ImageProcessorInvocation):
"""Applies HED edge detection to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
# safe not supported in controlnet_aux v0.0.3
# safe: bool = InputField(default=False, description=FieldDescriptions.safe_mode)
scribble: bool = InputField(default=False, description=FieldDescriptions.scribble_mode)
def run_processor(self, image: Image.Image) -> Image.Image:
hed_processor = HEDProcessor()
processed_image = hed_processor.run(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
# safe not supported in controlnet_aux v0.0.3
# safe=self.safe,
scribble=self.scribble,
)
return processed_image
@invocation(
"lineart_image_processor",
title="Lineart Processor",
tags=["controlnet", "lineart"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class LineartImageProcessorInvocation(ImageProcessorInvocation):
"""Applies line art processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
coarse: bool = InputField(default=False, description="Whether to use coarse mode")
def run_processor(self, image: Image.Image) -> Image.Image:
lineart_processor = LineartProcessor()
processed_image = lineart_processor.run(
image, detect_resolution=self.detect_resolution, image_resolution=self.image_resolution, coarse=self.coarse
)
return processed_image
@invocation(
"lineart_anime_image_processor",
title="Lineart Anime Processor",
tags=["controlnet", "lineart", "anime"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class LineartAnimeImageProcessorInvocation(ImageProcessorInvocation):
"""Applies line art anime processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
processor = LineartAnimeProcessor()
processed_image = processor.run(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
)
return processed_image
@invocation(
"midas_depth_image_processor",
title="Midas Depth Processor",
tags=["controlnet", "midas"],
category="controlnet",
version="1.2.4",
classification=Classification.Deprecated,
)
class MidasDepthImageProcessorInvocation(ImageProcessorInvocation):
"""Applies Midas depth processing to image"""
a_mult: float = InputField(default=2.0, ge=0, description="Midas parameter `a_mult` (a = a_mult * PI)")
bg_th: float = InputField(default=0.1, ge=0, description="Midas parameter `bg_th`")
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
# depth_and_normal not supported in controlnet_aux v0.0.3
# depth_and_normal: bool = InputField(default=False, description="whether to use depth and normal mode")
def run_processor(self, image: Image.Image) -> Image.Image:
# TODO: replace from_pretrained() calls with context.models.download_and_cache() (or similar)
midas_processor = MidasDetector.from_pretrained("lllyasviel/Annotators")
processed_image = midas_processor(
image,
a=np.pi * self.a_mult,
bg_th=self.bg_th,
image_resolution=self.image_resolution,
detect_resolution=self.detect_resolution,
# dept_and_normal not supported in controlnet_aux v0.0.3
# depth_and_normal=self.depth_and_normal,
)
return processed_image
@invocation(
"normalbae_image_processor",
title="Normal BAE Processor",
tags=["controlnet"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class NormalbaeImageProcessorInvocation(ImageProcessorInvocation):
"""Applies NormalBae processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
normalbae_processor = NormalBaeDetector.from_pretrained("lllyasviel/Annotators")
processed_image = normalbae_processor(
image, detect_resolution=self.detect_resolution, image_resolution=self.image_resolution
)
return processed_image
@invocation(
"mlsd_image_processor",
title="MLSD Processor",
tags=["controlnet", "mlsd"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class MlsdImageProcessorInvocation(ImageProcessorInvocation):
"""Applies MLSD processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
thr_v: float = InputField(default=0.1, ge=0, description="MLSD parameter `thr_v`")
thr_d: float = InputField(default=0.1, ge=0, description="MLSD parameter `thr_d`")
def run_processor(self, image: Image.Image) -> Image.Image:
mlsd_processor = MLSDdetector.from_pretrained("lllyasviel/Annotators")
processed_image = mlsd_processor(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
thr_v=self.thr_v,
thr_d=self.thr_d,
)
return processed_image
@invocation(
"pidi_image_processor",
title="PIDI Processor",
tags=["controlnet", "pidi"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class PidiImageProcessorInvocation(ImageProcessorInvocation):
"""Applies PIDI processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
safe: bool = InputField(default=False, description=FieldDescriptions.safe_mode)
scribble: bool = InputField(default=False, description=FieldDescriptions.scribble_mode)
def run_processor(self, image: Image.Image) -> Image.Image:
pidi_processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
processed_image = pidi_processor(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
safe=self.safe,
scribble=self.scribble,
)
return processed_image
@invocation(
"content_shuffle_image_processor",
title="Content Shuffle Processor",
tags=["controlnet", "contentshuffle"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class ContentShuffleImageProcessorInvocation(ImageProcessorInvocation):
"""Applies content shuffle processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
h: int = InputField(default=512, ge=0, description="Content shuffle `h` parameter")
w: int = InputField(default=512, ge=0, description="Content shuffle `w` parameter")
f: int = InputField(default=256, ge=0, description="Content shuffle `f` parameter")
def run_processor(self, image: Image.Image) -> Image.Image:
content_shuffle_processor = ContentShuffleDetector()
processed_image = content_shuffle_processor(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
h=self.h,
w=self.w,
f=self.f,
)
return processed_image
# should work with controlnet_aux >= 0.0.4 and timm <= 0.6.13
@invocation(
"zoe_depth_image_processor",
title="Zoe (Depth) Processor",
tags=["controlnet", "zoe", "depth"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class ZoeDepthImageProcessorInvocation(ImageProcessorInvocation):
"""Applies Zoe depth processing to image"""
def run_processor(self, image: Image.Image) -> Image.Image:
zoe_depth_processor = ZoeDetector.from_pretrained("lllyasviel/Annotators")
processed_image = zoe_depth_processor(image)
return processed_image
@invocation(
"mediapipe_face_processor",
title="Mediapipe Face Processor",
tags=["controlnet", "mediapipe", "face"],
category="controlnet",
version="1.2.4",
classification=Classification.Deprecated,
)
class MediapipeFaceProcessorInvocation(ImageProcessorInvocation):
"""Applies mediapipe face processing to image"""
max_faces: int = InputField(default=1, ge=1, description="Maximum number of faces to detect")
min_confidence: float = InputField(default=0.5, ge=0, le=1, description="Minimum confidence for face detection")
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
mediapipe_face_processor = MediapipeFaceDetector()
processed_image = mediapipe_face_processor(
image,
max_faces=self.max_faces,
min_confidence=self.min_confidence,
image_resolution=self.image_resolution,
detect_resolution=self.detect_resolution,
)
return processed_image
@invocation(
"leres_image_processor",
title="Leres (Depth) Processor",
tags=["controlnet", "leres", "depth"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class LeresImageProcessorInvocation(ImageProcessorInvocation):
"""Applies leres processing to image"""
thr_a: float = InputField(default=0, description="Leres parameter `thr_a`")
thr_b: float = InputField(default=0, description="Leres parameter `thr_b`")
boost: bool = InputField(default=False, description="Whether to use boost mode")
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
leres_processor = LeresDetector.from_pretrained("lllyasviel/Annotators")
processed_image = leres_processor(
image,
thr_a=self.thr_a,
thr_b=self.thr_b,
boost=self.boost,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
)
return processed_image
@invocation(
"tile_image_processor",
title="Tile Resample Processor",
tags=["controlnet", "tile"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class TileResamplerProcessorInvocation(ImageProcessorInvocation):
"""Tile resampler processor"""
# res: int = InputField(default=512, ge=0, le=1024, description="The pixel resolution for each tile")
down_sampling_rate: float = InputField(default=1.0, ge=1.0, le=8.0, description="Down sampling rate")
# tile_resample copied from sd-webui-controlnet/scripts/processor.py
def tile_resample(
self,
np_img: np.ndarray,
res=512, # never used?
down_sampling_rate=1.0,
):
np_img = HWC3(np_img)
if down_sampling_rate < 1.1:
return np_img
H, W, C = np_img.shape
H = int(float(H) / float(down_sampling_rate))
W = int(float(W) / float(down_sampling_rate))
np_img = cv2.resize(np_img, (W, H), interpolation=cv2.INTER_AREA)
return np_img
def run_processor(self, image: Image.Image) -> Image.Image:
np_img = np.array(image, dtype=np.uint8)
processed_np_image = self.tile_resample(
np_img,
# res=self.tile_size,
down_sampling_rate=self.down_sampling_rate,
)
processed_image = Image.fromarray(processed_np_image)
return processed_image
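For intuition, the down-sampling arithmetic above works out as follows (values are illustrative):

# Rates below 1.1 are treated as a no-op; otherwise both dimensions shrink.
H, W = 768, 1024
down_sampling_rate = 2.0
new_H = int(float(H) / float(down_sampling_rate))  # 384
new_W = int(float(W) / float(down_sampling_rate))  # 512
# cv2.resize(np_img, (new_W, new_H), interpolation=cv2.INTER_AREA) then yields a 512x384 (width x height) image.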
@invocation(
"segment_anything_processor",
title="Segment Anything Processor",
tags=["controlnet", "segmentanything"],
category="controlnet",
version="1.2.4",
classification=Classification.Deprecated,
)
class SegmentAnythingProcessorInvocation(ImageProcessorInvocation):
"""Applies segment anything processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
# segment_anything_processor = SamDetector.from_pretrained("ybelkada/segment-anything", subfolder="checkpoints")
segment_anything_processor = SamDetectorReproducibleColors.from_pretrained(
"ybelkada/segment-anything", subfolder="checkpoints"
)
np_img = np.array(image, dtype=np.uint8)
processed_image = segment_anything_processor(
np_img, image_resolution=self.image_resolution, detect_resolution=self.detect_resolution
)
return processed_image
class SamDetectorReproducibleColors(SamDetector):
# overriding SamDetector.show_anns() method to use reproducible colors for segmentation image
# base class show_anns() method randomizes colors,
# which seems to also lead to non-reproducible image generation
# so using ADE20k color palette instead
def show_anns(self, anns: List[Dict]):
if len(anns) == 0:
return
sorted_anns = sorted(anns, key=(lambda x: x["area"]), reverse=True)
h, w = anns[0]["segmentation"].shape
final_img = Image.fromarray(np.zeros((h, w, 3), dtype=np.uint8), mode="RGB")
palette = ade_palette()
for i, ann in enumerate(sorted_anns):
m = ann["segmentation"]
img = np.empty((m.shape[0], m.shape[1], 3), dtype=np.uint8)
# doing modulo just in case number of annotated regions exceeds number of colors in palette
ann_color = palette[i % len(palette)]
img[:, :] = ann_color
final_img.paste(Image.fromarray(img, mode="RGB"), (0, 0), Image.fromarray(np.uint8(m * 255)))
return np.array(final_img, dtype=np.uint8)
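A minimal sketch of the deterministic coloring idea used above: each mask is colored by its index into a fixed palette, so the same segmentation always renders identically. The palette values and region names here are placeholders:

palette = [(120, 120, 120), (180, 120, 120), (6, 230, 230)]  # placeholder colors
regions = ["sky", "building", "road", "tree"]                # e.g. four annotated masks
for i, name in enumerate(regions):
    color = palette[i % len(palette)]  # modulo guards against running out of palette entries
    print(name, color)                 # "tree" wraps around to the first color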
@invocation(
"color_map_image_processor",
title="Color Map Processor",
tags=["controlnet"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class ColorMapImageProcessorInvocation(ImageProcessorInvocation):
"""Generates a color map from the provided image"""
color_map_tile_size: int = InputField(default=64, ge=1, description=FieldDescriptions.tile_size)
def run_processor(self, image: Image.Image) -> Image.Image:
np_image = np.array(image, dtype=np.uint8)
height, width = np_image.shape[:2]
width_tile_size = min(self.color_map_tile_size, width)
height_tile_size = min(self.color_map_tile_size, height)
color_map = cv2.resize(
np_image,
(width // width_tile_size, height // height_tile_size),
interpolation=cv2.INTER_CUBIC,
)
color_map = cv2.resize(color_map, (width, height), interpolation=cv2.INTER_NEAREST)
color_map = Image.fromarray(color_map)
return color_map
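The color-map technique above can also be reproduced standalone with OpenCV; a hedged sketch (the function name and default tile size are assumptions, the two-step resize mirrors the node):

import cv2
import numpy as np

def color_map(np_image: np.ndarray, tile_size: int = 64) -> np.ndarray:
    """Average colors over coarse tiles, then blow the tiles back up as flat blocks."""
    height, width = np_image.shape[:2]
    w_tile = min(tile_size, width)
    h_tile = min(tile_size, height)
    small = cv2.resize(np_image, (width // w_tile, height // h_tile), interpolation=cv2.INTER_CUBIC)
    return cv2.resize(small, (width, height), interpolation=cv2.INTER_NEAREST)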
DEPTH_ANYTHING_MODEL_SIZES = Literal["large", "base", "small", "small_v2"]
# DepthAnything V2 Small model is licensed under Apache 2.0 but not the base and large models.
DEPTH_ANYTHING_MODELS = {
"large": "LiheYoung/depth-anything-large-hf",
"base": "LiheYoung/depth-anything-base-hf",
"small": "LiheYoung/depth-anything-small-hf",
"small_v2": "depth-anything/Depth-Anything-V2-Small-hf",
}
@invocation(
"depth_anything_image_processor",
title="Depth Anything Processor",
tags=["controlnet", "depth", "depth anything"],
category="controlnet",
version="1.1.3",
classification=Classification.Deprecated,
)
class DepthAnythingImageProcessorInvocation(ImageProcessorInvocation):
"""Generates a depth map based on the Depth Anything algorithm"""
model_size: DEPTH_ANYTHING_MODEL_SIZES = InputField(
default="small_v2", description="The size of the depth model to use"
)
resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
def load_depth_anything(model_path: Path):
depth_anything_pipeline = pipeline(model=str(model_path), task="depth-estimation", local_files_only=True)
assert isinstance(depth_anything_pipeline, DepthEstimationPipeline)
return DepthAnythingPipeline(depth_anything_pipeline)
with self._context.models.load_remote_model(
source=DEPTH_ANYTHING_MODELS[self.model_size], loader=load_depth_anything
) as depth_anything_detector:
assert isinstance(depth_anything_detector, DepthAnythingPipeline)
depth_map = depth_anything_detector.generate_depth(image)
# Resizing to user target specified size
new_height = int(image.size[1] * (self.resolution / image.size[0]))
depth_map = depth_map.resize((self.resolution, new_height))
return depth_map
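The resize above preserves aspect ratio by deriving the new height from the requested width; a quick worked example:

# With a 1024x768 source image and resolution=512:
orig_w, orig_h = 1024, 768
resolution = 512
new_height = int(orig_h * (resolution / orig_w))  # 384
# depth_map.resize((512, 384)) keeps the source's 4:3 aspect ratio.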
@invocation(
"dw_openpose_image_processor",
title="DW Openpose Image Processor",
tags=["controlnet", "dwpose", "openpose"],
category="controlnet",
version="1.1.1",
classification=Classification.Deprecated,
)
class DWOpenposeImageProcessorInvocation(ImageProcessorInvocation):
"""Generates an openpose pose from an image using DWPose"""
draw_body: bool = InputField(default=True)
draw_face: bool = InputField(default=False)
draw_hands: bool = InputField(default=False)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
onnx_det = self._context.models.download_and_cache_model(DWPOSE_MODELS["yolox_l.onnx"])
onnx_pose = self._context.models.download_and_cache_model(DWPOSE_MODELS["dw-ll_ucoco_384.onnx"])
dw_openpose = DWOpenposeDetector(onnx_det=onnx_det, onnx_pose=onnx_pose)
processed_image = dw_openpose(
image,
draw_face=self.draw_face,
draw_hands=self.draw_hands,
draw_body=self.draw_body,
resolution=self.image_resolution,
)
return processed_image
@invocation(
"heuristic_resize",
title="Heuristic Resize",
tags=["image, controlnet"],
category="image",
version="1.0.1",
classification=Classification.Prototype,
)
class HeuristicResizeInvocation(BaseInvocation):
"""Resize an image using a heuristic method. Preserves edge maps."""
image: ImageField = InputField(description="The image to resize")
width: int = InputField(default=512, ge=1, description="The width to resize to (px)")
height: int = InputField(default=512, ge=1, description="The height to resize to (px)")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name, "RGB")
np_img = pil_to_np(image)
np_resized = heuristic_resize(np_img, (self.width, self.height))
resized = np_to_pil(np_resized)
image_dto = context.images.save(image=resized)
return ImageOutput.build(image_dto)


@@ -6,7 +6,6 @@ from PIL import Image
from torchvision.transforms.functional import resize as tv_resize
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.constants import DEFAULT_PRECISION
from invokeai.app.invocations.fields import FieldDescriptions, ImageField, Input, InputField
from invokeai.app.invocations.image_to_latents import ImageToLatentsInvocation
from invokeai.app.invocations.model import VAEField
@@ -29,11 +28,7 @@ class CreateDenoiseMaskInvocation(BaseInvocation):
image: Optional[ImageField] = InputField(default=None, description="Image which will be masked", ui_order=1)
mask: ImageField = InputField(description="The mask to use when pasting", ui_order=2)
tiled: bool = InputField(default=False, description=FieldDescriptions.tiled, ui_order=3)
fp32: bool = InputField(
default=DEFAULT_PRECISION == torch.float32,
description=FieldDescriptions.fp32,
ui_order=4,
)
fp32: bool = InputField(default=False, description=FieldDescriptions.fp32, ui_order=4)
def prep_mask_tensor(self, mask_image: Image.Image) -> torch.Tensor:
if mask_image.mode != "L":


@@ -7,7 +7,6 @@ from PIL import Image, ImageFilter
from torchvision.transforms.functional import resize as tv_resize
from invokeai.app.invocations.baseinvocation import BaseInvocation, BaseInvocationOutput, invocation, invocation_output
from invokeai.app.invocations.constants import DEFAULT_PRECISION
from invokeai.app.invocations.fields import (
DenoiseMaskField,
FieldDescriptions,
@@ -20,7 +19,8 @@ from invokeai.app.invocations.image_to_latents import ImageToLatentsInvocation
from invokeai.app.invocations.model import UNetField, VAEField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager import LoadedModel
from invokeai.backend.model_manager.config import MainConfigBase, ModelVariantType
from invokeai.backend.model_manager.config import MainConfigBase
from invokeai.backend.model_manager.taxonomy import ModelVariantType
from invokeai.backend.stable_diffusion.diffusers_pipeline import image_resized_to_grid_as_tensor
@@ -76,11 +76,7 @@ class CreateGradientMaskInvocation(BaseInvocation):
ui_order=7,
)
tiled: bool = InputField(default=False, description=FieldDescriptions.tiled, ui_order=8)
fp32: bool = InputField(
default=DEFAULT_PRECISION == torch.float32,
description=FieldDescriptions.fp32,
ui_order=9,
)
fp32: bool = InputField(default=False, description=FieldDescriptions.fp32, ui_order=9)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> GradientMaskOutput:


@@ -1,58 +0,0 @@
"""
Invoke-managed custom node loader. See README.md for more information.
"""
import sys
import traceback
from importlib.util import module_from_spec, spec_from_file_location
from pathlib import Path
from invokeai.backend.util.logging import InvokeAILogger
logger = InvokeAILogger.get_logger()
loaded_count = 0
for d in Path(__file__).parent.iterdir():
# skip files
if not d.is_dir():
continue
# skip hidden directories
if d.name.startswith("_") or d.name.startswith("."):
continue
# skip directories without an `__init__.py`
init = d / "__init__.py"
if not init.exists():
continue
module_name = init.parent.stem
# skip if already imported
if module_name in globals():
continue
# load the module, adding a suffix to identify it as a custom node pack
spec = spec_from_file_location(module_name, init.absolute())
if spec is None or spec.loader is None:
logger.warn(f"Could not load {init}")
continue
logger.info(f"Loading node pack {module_name}")
try:
module = module_from_spec(spec)
sys.modules[spec.name] = module
spec.loader.exec_module(module)
loaded_count += 1
except Exception:
full_error = traceback.format_exc()
logger.error(f"Failed to load node pack {module_name}:\n{full_error}")
del init, module_name
if loaded_count > 0:
logger.info(f"Loaded {loaded_count} node packs from {Path(__file__).parent}")


@@ -10,7 +10,9 @@ import torchvision.transforms as T
from diffusers.configuration_utils import ConfigMixin
from diffusers.models.adapter import T2IAdapter
from diffusers.models.unets.unet_2d_condition import UNet2DConditionModel
from diffusers.schedulers.scheduling_dpmsolver_multistep import DPMSolverMultistepScheduler
from diffusers.schedulers.scheduling_dpmsolver_sde import DPMSolverSDEScheduler
from diffusers.schedulers.scheduling_dpmsolver_singlestep import DPMSolverSinglestepScheduler
from diffusers.schedulers.scheduling_tcd import TCDScheduler
from diffusers.schedulers.scheduling_utils import SchedulerMixin as Scheduler
from PIL import Image
@@ -20,7 +22,7 @@ from transformers import CLIPVisionModelWithProjection
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.controlnet_image_processors import ControlField
from invokeai.app.invocations.controlnet import ControlField
from invokeai.app.invocations.fields import (
ConditioningField,
DenoiseMaskField,
@@ -37,10 +39,11 @@ from invokeai.app.invocations.t2i_adapter import T2IAdapterField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.controlnet_utils import prepare_control_image
from invokeai.backend.ip_adapter.ip_adapter import IPAdapter
from invokeai.backend.lora.lora_model_raw import LoRAModelRaw
from invokeai.backend.lora.lora_patcher import LoRAPatcher
from invokeai.backend.model_manager import BaseModelType, ModelVariantType
from invokeai.backend.model_manager.config import AnyModelConfig
from invokeai.backend.model_manager.taxonomy import BaseModelType, ModelVariantType
from invokeai.backend.model_patcher import ModelPatcher
from invokeai.backend.patches.layer_patcher import LayerPatcher
from invokeai.backend.patches.model_patch_raw import ModelPatchRaw
from invokeai.backend.stable_diffusion import PipelineIntermediateState
from invokeai.backend.stable_diffusion.denoise_context import DenoiseContext, DenoiseInputs
from invokeai.backend.stable_diffusion.diffusers_pipeline import (
@@ -83,12 +86,14 @@ def get_scheduler(
scheduler_info: ModelIdentifierField,
scheduler_name: str,
seed: int,
unet_config: AnyModelConfig,
) -> Scheduler:
"""Load a scheduler and apply some scheduler-specific overrides."""
# TODO(ryand): Silently falling back to ddim seems like a bad idea. Look into why this was added and remove if
# possible.
scheduler_class, scheduler_extra_config = SCHEDULER_MAP.get(scheduler_name, SCHEDULER_MAP["ddim"])
orig_scheduler_info = context.models.load(scheduler_info)
with orig_scheduler_info as orig_scheduler:
scheduler_config = orig_scheduler.config
@@ -100,10 +105,17 @@ def get_scheduler(
"_backup": scheduler_config,
}
if hasattr(unet_config, "prediction_type"):
scheduler_config["prediction_type"] = unet_config.prediction_type
# make dpmpp_sde reproducable(seed can be passed only in initializer)
if scheduler_class is DPMSolverSDEScheduler:
scheduler_config["noise_sampler_seed"] = seed
if scheduler_class is DPMSolverMultistepScheduler or scheduler_class is DPMSolverSinglestepScheduler:
if scheduler_config["_class_name"] == "DEISMultistepScheduler" and scheduler_config["algorithm_type"] == "deis":
scheduler_config["algorithm_type"] = "dpmsolver++"
scheduler = scheduler_class.from_config(scheduler_config)
# hack copied over from generate.py
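A compact sketch of the override flow in the hunk above, lifted out of the node (class names and config keys are taken on faith from the diff, not re-verified):

# Look up the scheduler class, silently falling back to ddim, then patch the
# config before instantiating.
scheduler_class, extra_config = SCHEDULER_MAP.get("dpmpp_sde", SCHEDULER_MAP["ddim"])
scheduler_config = {**orig_scheduler.config, **extra_config, "_backup": orig_scheduler.config}
if scheduler_class is DPMSolverSDEScheduler:
    scheduler_config["noise_sampler_seed"] = seed  # only settable at construction time
scheduler = scheduler_class.from_config(scheduler_config)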
@@ -115,10 +127,10 @@ def get_scheduler(
@invocation(
"denoise_latents",
title="Denoise Latents",
title="Denoise - SD1.5, SDXL",
tags=["latents", "denoise", "txt2img", "t2i", "t2l", "img2img", "i2i", "l2l"],
category="latents",
version="1.5.3",
version="1.5.4",
)
class DenoiseLatentsInvocation(BaseInvocation):
"""Denoises noisy latents to decodable images"""
@@ -411,6 +423,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
context: InvocationContext,
control_input: ControlField | list[ControlField] | None,
latents_shape: List[int],
device: torch.device,
exit_stack: ExitStack,
do_classifier_free_guidance: bool = True,
) -> list[ControlNetData] | None:
@@ -452,7 +465,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
height=control_height_resize,
# batch_size=batch_size * num_images_per_prompt,
# num_images_per_prompt=num_images_per_prompt,
device=control_model.device,
device=device,
dtype=control_model.dtype,
control_mode=control_info.control_mode,
resize_mode=control_info.resize_mode,
@@ -547,7 +560,6 @@ class DenoiseLatentsInvocation(BaseInvocation):
for single_ip_adapter in ip_adapters:
with context.models.load(single_ip_adapter.ip_adapter_model) as ip_adapter_model:
assert isinstance(ip_adapter_model, IPAdapter)
image_encoder_model_info = context.models.load(single_ip_adapter.image_encoder_model)
# `single_ip_adapter.image` could be a list or a single ImageField. Normalize to a list here.
single_ipa_image_fields = single_ip_adapter.image
if not isinstance(single_ipa_image_fields, list):
@@ -556,7 +568,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
single_ipa_images = [
context.images.get_pil(image.image_name, mode="RGB") for image in single_ipa_image_fields
]
with image_encoder_model_info as image_encoder_model:
with context.models.load(single_ip_adapter.image_encoder_model) as image_encoder_model:
assert isinstance(image_encoder_model, CLIPVisionModelWithProjection)
# Get image embeddings from CLIP and ImageProjModel.
image_prompt_embeds, uncond_image_prompt_embeds = ip_adapter_model.get_image_embeds(
@@ -606,6 +618,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
context: InvocationContext,
t2i_adapter: Optional[Union[T2IAdapterField, list[T2IAdapterField]]],
latents_shape: list[int],
device: torch.device,
do_classifier_free_guidance: bool,
) -> Optional[list[T2IAdapterData]]:
if t2i_adapter is None:
@@ -621,7 +634,6 @@ class DenoiseLatentsInvocation(BaseInvocation):
t2i_adapter_data = []
for t2i_adapter_field in t2i_adapter:
t2i_adapter_model_config = context.models.get_config(t2i_adapter_field.t2i_adapter_model.key)
t2i_adapter_loaded_model = context.models.load(t2i_adapter_field.t2i_adapter_model)
image = context.images.get_pil(t2i_adapter_field.image.image_name, mode="RGB")
# The max_unet_downscale is the maximum amount that the UNet model downscales the latent image internally.
@@ -637,7 +649,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
raise ValueError(f"Unexpected T2I-Adapter base model type: '{t2i_adapter_model_config.base}'.")
t2i_adapter_model: T2IAdapter
with t2i_adapter_loaded_model as t2i_adapter_model:
with context.models.load(t2i_adapter_field.t2i_adapter_model) as t2i_adapter_model:
total_downscale_factor = t2i_adapter_model.total_downscale_factor
# Note: We have hard-coded `do_classifier_free_guidance=False`. This is because we only want to prepare
@@ -657,7 +669,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
width=control_width_resize,
height=control_height_resize,
num_channels=t2i_adapter_model.config["in_channels"], # mypy treats this as a FrozenDict
device=t2i_adapter_model.device,
device=device,
dtype=t2i_adapter_model.dtype,
resize_mode=t2i_adapter_field.resize_mode,
)
@@ -822,6 +834,9 @@ class DenoiseLatentsInvocation(BaseInvocation):
seed, noise, latents = self.prepare_noise_and_latents(context, self.noise, self.latents)
_, _, latent_height, latent_width = latents.shape
# get the unet's config so that we can pass the base to sd_step_callback()
unet_config = context.models.get_config(self.unet.unet.key)
conditioning_data = self.get_conditioning_data(
context=context,
positive_conditioning_field=self.positive_conditioning,
@@ -841,6 +856,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
scheduler_info=self.unet.scheduler,
scheduler_name=self.scheduler,
seed=seed,
unet_config=unet_config,
)
timesteps, init_timestep, scheduler_step_kwargs = self.init_scheduler(
@@ -852,9 +868,6 @@ class DenoiseLatentsInvocation(BaseInvocation):
denoising_end=self.denoising_end,
)
# get the unet's config so that we can pass the base to sd_step_callback()
unet_config = context.models.get_config(self.unet.unet.key)
### preview
def step_callback(state: PipelineIntermediateState) -> None:
context.util.sd_step_callback(state, unet_config.base)
@@ -885,7 +898,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
### inpaint
mask, masked_latents, is_gradient_mask = self.prep_inpaint_mask(context, latents)
# NOTE: We used to identify inpainting models by inpecting the shape of the loaded UNet model weights. Now we
# NOTE: We used to identify inpainting models by inspecting the shape of the loaded UNet model weights. Now we
# use the ModelVariantType config. During testing, there was a report of a user with models that had an
# incorrect ModelVariantType value. Re-installing the model fixed the issue. If this issue turns out to be
# prevalent, we will have to revisit how we initialize the inpainting extensions.
@@ -926,10 +939,8 @@ class DenoiseLatentsInvocation(BaseInvocation):
# ext: t2i/ip adapter
ext_manager.run_callback(ExtensionCallbackType.SETUP, denoise_ctx)
unet_info = context.models.load(self.unet.unet)
assert isinstance(unet_info.model, UNet2DConditionModel)
with (
unet_info.model_on_device() as (cached_weights, unet),
context.models.load(self.unet.unet).model_on_device() as (cached_weights, unet),
ModelPatcher.patch_unet_attention_processor(unet, denoise_ctx.inputs.attention_processor_cls),
# ext: controlnet
ext_manager.patch_extensions(denoise_ctx),
@@ -950,6 +961,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
@torch.no_grad()
@SilenceWarnings() # This quenches the NSFW nag from diffusers.
def _old_invoke(self, context: InvocationContext) -> LatentsOutput:
device = TorchDevice.choose_torch_device()
seed, noise, latents = self.prepare_noise_and_latents(context, self.noise, self.latents)
mask, masked_latents, gradient_mask = self.prep_inpaint_mask(context, latents)
@@ -964,6 +976,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
context,
self.t2i_adapter,
latents.shape,
device=device,
do_classifier_free_guidance=True,
)
@@ -987,43 +1000,43 @@ class DenoiseLatentsInvocation(BaseInvocation):
def step_callback(state: PipelineIntermediateState) -> None:
context.util.sd_step_callback(state, unet_config.base)
def _lora_loader() -> Iterator[Tuple[LoRAModelRaw, float]]:
def _lora_loader() -> Iterator[Tuple[ModelPatchRaw, float]]:
for lora in self.unet.loras:
lora_info = context.models.load(lora.lora)
assert isinstance(lora_info.model, LoRAModelRaw)
assert isinstance(lora_info.model, ModelPatchRaw)
yield (lora_info.model, lora.weight)
del lora_info
return
unet_info = context.models.load(self.unet.unet)
assert isinstance(unet_info.model, UNet2DConditionModel)
with (
ExitStack() as exit_stack,
unet_info.model_on_device() as (cached_weights, unet),
context.models.load(self.unet.unet).model_on_device() as (cached_weights, unet),
ModelPatcher.apply_freeu(unet, self.unet.freeu_config),
SeamlessExt.static_patch_model(unet, self.unet.seamless_axes), # FIXME
# Apply the LoRA after unet has been moved to its target device for faster patching.
LoRAPatcher.apply_lora_patches(
LayerPatcher.apply_smart_model_patches(
model=unet,
patches=_lora_loader(),
prefix="lora_unet_",
dtype=unet.dtype,
cached_weights=cached_weights,
),
):
assert isinstance(unet, UNet2DConditionModel)
latents = latents.to(device=unet.device, dtype=unet.dtype)
latents = latents.to(device=device, dtype=unet.dtype)
if noise is not None:
noise = noise.to(device=unet.device, dtype=unet.dtype)
noise = noise.to(device=device, dtype=unet.dtype)
if mask is not None:
mask = mask.to(device=unet.device, dtype=unet.dtype)
mask = mask.to(device=device, dtype=unet.dtype)
if masked_latents is not None:
masked_latents = masked_latents.to(device=unet.device, dtype=unet.dtype)
masked_latents = masked_latents.to(device=device, dtype=unet.dtype)
scheduler = get_scheduler(
context=context,
scheduler_info=self.unet.scheduler,
scheduler_name=self.scheduler,
seed=seed,
unet_config=unet_config,
)
pipeline = self.create_pipeline(unet, scheduler)
@@ -1033,7 +1046,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
context=context,
positive_conditioning_field=self.positive_conditioning,
negative_conditioning_field=self.negative_conditioning,
device=unet.device,
device=device,
dtype=unet.dtype,
latent_height=latent_height,
latent_width=latent_width,
@@ -1046,6 +1059,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
context=context,
control_input=self.control,
latents_shape=latents.shape,
device=device,
# do_classifier_free_guidance=(self.cfg_scale >= 1.0))
do_classifier_free_guidance=True,
exit_stack=exit_stack,
@@ -1063,7 +1077,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
timesteps, init_timestep, scheduler_step_kwargs = self.init_scheduler(
scheduler,
device=unet.device,
device=device,
steps=self.steps,
denoising_start=self.denoising_start,
denoising_end=self.denoising_end,


@@ -4,7 +4,7 @@ from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import ImageField, InputField, WithBoard, WithMetadata
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.image_util.dw_openpose import DWOpenposeDetector2
from invokeai.backend.image_util.dw_openpose import DWOpenposeDetector
@invocation(
@@ -25,20 +25,20 @@ class DWOpenposeDetectionInvocation(BaseInvocation, WithMetadata, WithBoard):
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name, "RGB")
onnx_det_path = context.models.download_and_cache_model(DWOpenposeDetector2.get_model_url_det())
onnx_pose_path = context.models.download_and_cache_model(DWOpenposeDetector2.get_model_url_pose())
onnx_det_path = context.models.download_and_cache_model(DWOpenposeDetector.get_model_url_det())
onnx_pose_path = context.models.download_and_cache_model(DWOpenposeDetector.get_model_url_pose())
loaded_session_det = context.models.load_local_model(
onnx_det_path, DWOpenposeDetector2.create_onnx_inference_session
onnx_det_path, DWOpenposeDetector.create_onnx_inference_session
)
loaded_session_pose = context.models.load_local_model(
onnx_pose_path, DWOpenposeDetector2.create_onnx_inference_session
onnx_pose_path, DWOpenposeDetector.create_onnx_inference_session
)
with loaded_session_det as session_det, loaded_session_pose as session_pose:
assert isinstance(session_det, ort.InferenceSession)
assert isinstance(session_pose, ort.InferenceSession)
detector = DWOpenposeDetector2(session_det=session_det, session_pose=session_pose)
detector = DWOpenposeDetector(session_det=session_det, session_pose=session_pose)
detected_image = detector.run(
image,
draw_face=self.draw_face,


@@ -40,6 +40,7 @@ class UIType(str, Enum, metaclass=MetaEnum):
# region Model Field Types
MainModel = "MainModelField"
CogView4MainModel = "CogView4MainModelField"
FluxMainModel = "FluxMainModelField"
SD3MainModel = "SD3MainModelField"
SDXLMainModel = "SDXLMainModelField"
@@ -56,6 +57,10 @@ class UIType(str, Enum, metaclass=MetaEnum):
CLIPLEmbedModel = "CLIPLEmbedModelField"
CLIPGEmbedModel = "CLIPGEmbedModelField"
SpandrelImageToImageModel = "SpandrelImageToImageModelField"
ControlLoRAModel = "ControlLoRAModelField"
SigLipModel = "SigLipModelField"
FluxReduxModel = "FluxReduxModelField"
LlavaOnevisionModel = "LLaVAModelField"
# endregion
# region Misc Field Types
@@ -133,6 +138,7 @@ class FieldDescriptions:
noise = "Noise tensor"
clip = "CLIP (tokenizer, text encoder, LoRAs) and skipped layer count"
t5_encoder = "T5 tokenizer and text encoder"
glm_encoder = "GLM (THUDM) tokenizer and text encoder"
clip_embed_model = "CLIP Embed loader"
clip_g_model = "CLIP-G Embed loader"
unet = "UNet (scheduler, LoRAs)"
@@ -143,13 +149,16 @@ class FieldDescriptions:
controlnet_model = "ControlNet model to load"
vae_model = "VAE model to load"
lora_model = "LoRA model to load"
control_lora_model = "Control LoRA model to load"
main_model = "Main model (UNet, VAE, CLIP) to load"
flux_model = "Flux model (Transformer) to load"
sd3_model = "SD3 model (MMDiTX) to load"
cogview4_model = "CogView4 model (Transformer) to load"
sdxl_main_model = "SDXL Main model (UNet, VAE, CLIP1, CLIP2) to load"
sdxl_refiner_model = "SDXL Refiner Main Modde (UNet, VAE, CLIP2) to load"
onnx_main_model = "ONNX Main model (UNet, VAE, CLIP) to load"
spandrel_image_to_image_model = "Image-to-Image model"
vllm_model = "VLLM model"
lora_weight = "The weight at which the LoRA is applied to each model"
compel_prompt = "Prompt to be parsed by Compel to create a conditioning tensor"
raw_prompt = "Raw prompt text (no parsing)"
@@ -199,6 +208,9 @@ class FieldDescriptions:
freeu_b1 = "Scaling factor for stage 1 to amplify the contributions of backbone features."
freeu_b2 = "Scaling factor for stage 2 to amplify the contributions of backbone features."
instantx_control_mode = "The control mode for InstantX ControlNet union models. Ignored for other ControlNet models. The standard mapping is: canny (0), tile (1), depth (2), blur (3), pose (4), gray (5), low quality (6). Negative values will be treated as 'None'."
flux_redux_conditioning = "FLUX Redux conditioning tensor"
vllm_model = "The VLLM model to use"
flux_fill_conditioning = "FLUX Fill conditioning tensor"
class ImageField(BaseModel):
@@ -250,6 +262,29 @@ class FluxConditioningField(BaseModel):
"""A conditioning tensor primitive value"""
conditioning_name: str = Field(description="The name of conditioning tensor")
mask: Optional[TensorField] = Field(
default=None,
description="The mask associated with this conditioning tensor. Excluded regions should be set to False, "
"included regions should be set to True.",
)
class FluxReduxConditioningField(BaseModel):
"""A FLUX Redux conditioning tensor primitive value"""
conditioning: TensorField = Field(description="The Redux image conditioning tensor.")
mask: Optional[TensorField] = Field(
default=None,
description="The mask associated with this conditioning tensor. Excluded regions should be set to False, "
"included regions should be set to True.",
)
class FluxFillConditioningField(BaseModel):
"""A FLUX Fill conditioning field."""
image: ImageField = Field(description="The FLUX Fill reference image.")
mask: TensorField = Field(description="The FLUX Fill inpaint mask.")
class SD3ConditioningField(BaseModel):
@@ -258,6 +293,12 @@ class SD3ConditioningField(BaseModel):
conditioning_name: str = Field(description="The name of conditioning tensor")
class CogView4ConditioningField(BaseModel):
"""A conditioning tensor primitive value"""
conditioning_name: str = Field(description="The name of conditioning tensor")
class ConditioningField(BaseModel):
"""A conditioning tensor primitive value"""
@@ -293,6 +334,13 @@ class BoundingBoxField(BaseModel):
raise ValueError(f"y_min ({self.y_min}) is greater than y_max ({self.y_max}).")
return self
def tuple(self) -> Tuple[int, int, int, int]:
"""
Returns the bounding box as a tuple suitable for use with PIL's `Image.crop()` method.
This method returns a tuple of the form (left, upper, right, lower) == (x_min, y_min, x_max, y_max).
"""
return (self.x_min, self.y_min, self.x_max, self.y_max)
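A usage sketch for the tuple() helper documented above (the coordinates and image path are made up; any other optional fields on the model are omitted):

from PIL import Image

bbox = BoundingBoxField(x_min=32, y_min=48, x_max=288, y_max=304)
image = Image.open("example.png")   # hypothetical input
cropped = image.crop(bbox.tuple())  # (left, upper, right, lower)
assert cropped.size == (256, 256)   # x_max - x_min, y_max - y_min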
class MetadataField(RootModel[dict[str, Any]]):
"""


@@ -0,0 +1,47 @@
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import FieldDescriptions, ImageField, InputField, OutputField, UIType
from invokeai.app.invocations.model import ControlLoRAField, ModelIdentifierField
from invokeai.app.services.shared.invocation_context import InvocationContext
@invocation_output("flux_control_lora_loader_output")
class FluxControlLoRALoaderOutput(BaseInvocationOutput):
"""Flux Control LoRA Loader Output"""
control_lora: ControlLoRAField = OutputField(
title="Flux Control LoRA", description="Control LoRAs to apply on model loading", default=None
)
@invocation(
"flux_control_lora_loader",
title="Control LoRA - FLUX",
tags=["lora", "model", "flux"],
category="model",
version="1.1.1",
)
class FluxControlLoRALoaderInvocation(BaseInvocation):
"""LoRA model and Image to use with FLUX transformer generation."""
lora: ModelIdentifierField = InputField(
description=FieldDescriptions.control_lora_model, title="Control LoRA", ui_type=UIType.ControlLoRAModel
)
image: ImageField = InputField(description="The image to encode.")
weight: float = InputField(description="The weight of the LoRA.", default=1.0)
def invoke(self, context: InvocationContext) -> FluxControlLoRALoaderOutput:
if not context.models.exists(self.lora.key):
raise ValueError(f"Unknown lora: {self.lora.key}!")
return FluxControlLoRALoaderOutput(
control_lora=ControlLoRAField(
lora=self.lora,
img=self.image,
weight=self.weight,
)
)


@@ -3,7 +3,6 @@ from pydantic import BaseModel, Field, field_validator, model_validator
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
@@ -52,7 +51,6 @@ class FluxControlNetOutput(BaseInvocationOutput):
tags=["controlnet", "flux"],
category="controlnet",
version="1.0.0",
classification=Classification.Prototype,
)
class FluxControlNetInvocation(BaseInvocation):
"""Collect FLUX ControlNet info to pass to other nodes."""


@@ -1,18 +1,22 @@
from contextlib import ExitStack
from typing import Callable, Iterator, Optional, Tuple
from typing import Callable, Iterator, Optional, Tuple, Union
import einops
import numpy as np
import numpy.typing as npt
import torch
import torchvision.transforms as tv_transforms
from PIL import Image
from torchvision.transforms.functional import resize as tv_resize
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import (
DenoiseMaskField,
FieldDescriptions,
FluxConditioningField,
FluxFillConditioningField,
FluxReduxConditioningField,
ImageField,
Input,
InputField,
@@ -21,15 +25,16 @@ from invokeai.app.invocations.fields import (
WithMetadata,
)
from invokeai.app.invocations.flux_controlnet import FluxControlNetField
from invokeai.app.invocations.flux_vae_encode import FluxVaeEncodeInvocation
from invokeai.app.invocations.ip_adapter import IPAdapterField
from invokeai.app.invocations.model import TransformerField, VAEField
from invokeai.app.invocations.model import ControlLoRAField, LoRAField, TransformerField, VAEField
from invokeai.app.invocations.primitives import LatentsOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.flux.controlnet.instantx_controlnet_flux import InstantXControlNetFlux
from invokeai.backend.flux.controlnet.xlabs_controlnet_flux import XLabsControlNetFlux
from invokeai.backend.flux.denoise import denoise
from invokeai.backend.flux.extensions.inpaint_extension import InpaintExtension
from invokeai.backend.flux.extensions.instantx_controlnet_extension import InstantXControlNetExtension
from invokeai.backend.flux.extensions.regional_prompting_extension import RegionalPromptingExtension
from invokeai.backend.flux.extensions.xlabs_controlnet_extension import XLabsControlNetExtension
from invokeai.backend.flux.extensions.xlabs_ip_adapter_extension import XLabsIPAdapterExtension
from invokeai.backend.flux.ip_adapter.xlabs_ip_adapter_flux import XlabsIpAdapterFlux
@@ -42,10 +47,12 @@ from invokeai.backend.flux.sampling_utils import (
pack,
unpack,
)
from invokeai.backend.lora.conversions.flux_lora_constants import FLUX_LORA_TRANSFORMER_PREFIX
from invokeai.backend.lora.lora_model_raw import LoRAModelRaw
from invokeai.backend.lora.lora_patcher import LoRAPatcher
from invokeai.backend.model_manager.config import ModelFormat
from invokeai.backend.flux.text_conditioning import FluxReduxConditioning, FluxTextConditioning
from invokeai.backend.model_manager.taxonomy import ModelFormat, ModelVariantType
from invokeai.backend.patches.layer_patcher import LayerPatcher
from invokeai.backend.patches.lora_conversions.flux_lora_constants import FLUX_LORA_TRANSFORMER_PREFIX
from invokeai.backend.patches.model_patch_raw import ModelPatchRaw
from invokeai.backend.rectified_flow.rectified_flow_inpaint_extension import RectifiedFlowInpaintExtension
from invokeai.backend.stable_diffusion.diffusers_pipeline import PipelineIntermediateState
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import FLUXConditioningInfo
from invokeai.backend.util.devices import TorchDevice
@@ -56,8 +63,7 @@ from invokeai.backend.util.devices import TorchDevice
title="FLUX Denoise",
tags=["image", "flux"],
category="image",
version="3.2.1",
classification=Classification.Prototype,
version="3.3.0",
)
class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Run denoising process with a FLUX transformer model."""
@@ -87,14 +93,27 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
input=Input.Connection,
title="Transformer",
)
positive_text_conditioning: FluxConditioningField = InputField(
control_lora: Optional[ControlLoRAField] = InputField(
description=FieldDescriptions.control_lora_model, input=Input.Connection, title="Control LoRA", default=None
)
positive_text_conditioning: FluxConditioningField | list[FluxConditioningField] = InputField(
description=FieldDescriptions.positive_cond, input=Input.Connection
)
negative_text_conditioning: FluxConditioningField | None = InputField(
negative_text_conditioning: FluxConditioningField | list[FluxConditioningField] | None = InputField(
default=None,
description="Negative conditioning tensor. Can be None if cfg_scale is 1.0.",
input=Input.Connection,
)
redux_conditioning: FluxReduxConditioningField | list[FluxReduxConditioningField] | None = InputField(
default=None,
description="FLUX Redux conditioning tensor.",
input=Input.Connection,
)
fill_conditioning: FluxFillConditioningField | None = InputField(
default=None,
description="FLUX Fill conditioning.",
input=Input.Connection,
)
cfg_scale: float | list[float] = InputField(default=1.0, description=FieldDescriptions.cfg_scale, title="CFG Scale")
cfg_scale_start_step: int = InputField(
default=0,
@@ -139,36 +158,12 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
name = context.tensors.save(tensor=latents)
return LatentsOutput.build(latents_name=name, latents=latents, seed=None)
def _load_text_conditioning(
self, context: InvocationContext, conditioning_name: str, dtype: torch.dtype
) -> Tuple[torch.Tensor, torch.Tensor]:
# Load the conditioning data.
cond_data = context.conditioning.load(conditioning_name)
assert len(cond_data.conditionings) == 1
flux_conditioning = cond_data.conditionings[0]
assert isinstance(flux_conditioning, FLUXConditioningInfo)
flux_conditioning = flux_conditioning.to(dtype=dtype)
t5_embeddings = flux_conditioning.t5_embeds
clip_embeddings = flux_conditioning.clip_embeds
return t5_embeddings, clip_embeddings
def _run_diffusion(
self,
context: InvocationContext,
):
inference_dtype = torch.bfloat16
# Load the conditioning data.
pos_t5_embeddings, pos_clip_embeddings = self._load_text_conditioning(
context, self.positive_text_conditioning.conditioning_name, inference_dtype
)
neg_t5_embeddings: torch.Tensor | None = None
neg_clip_embeddings: torch.Tensor | None = None
if self.negative_text_conditioning is not None:
neg_t5_embeddings, neg_clip_embeddings = self._load_text_conditioning(
context, self.negative_text_conditioning.conditioning_name, inference_dtype
)
# Load the input latents, if provided.
init_latents = context.tensors.load(self.latents.latents_name) if self.latents else None
if init_latents is not None:
@@ -183,15 +178,57 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
dtype=inference_dtype,
seed=self.seed,
)
b, _c, latent_h, latent_w = noise.shape
packed_h = latent_h // 2
packed_w = latent_w // 2
transformer_info = context.models.load(self.transformer.transformer)
is_schnell = "schnell" in transformer_info.config.config_path
# Load the conditioning data.
pos_text_conditionings = self._load_text_conditioning(
context=context,
cond_field=self.positive_text_conditioning,
packed_height=packed_h,
packed_width=packed_w,
dtype=inference_dtype,
device=TorchDevice.choose_torch_device(),
)
neg_text_conditionings: list[FluxTextConditioning] | None = None
if self.negative_text_conditioning is not None:
neg_text_conditionings = self._load_text_conditioning(
context=context,
cond_field=self.negative_text_conditioning,
packed_height=packed_h,
packed_width=packed_w,
dtype=inference_dtype,
device=TorchDevice.choose_torch_device(),
)
redux_conditionings: list[FluxReduxConditioning] = self._load_redux_conditioning(
context=context,
redux_cond_field=self.redux_conditioning,
packed_height=packed_h,
packed_width=packed_w,
device=TorchDevice.choose_torch_device(),
dtype=inference_dtype,
)
pos_regional_prompting_extension = RegionalPromptingExtension.from_text_conditioning(
text_conditioning=pos_text_conditionings,
redux_conditioning=redux_conditionings,
img_seq_len=packed_h * packed_w,
)
neg_regional_prompting_extension = (
RegionalPromptingExtension.from_text_conditioning(
text_conditioning=neg_text_conditionings, redux_conditioning=[], img_seq_len=packed_h * packed_w
)
if neg_text_conditionings
else None
)
transformer_config = context.models.get_config(self.transformer.transformer)
is_schnell = "schnell" in getattr(transformer_config, "config_path", "")
# Calculate the timestep schedule.
image_seq_len = noise.shape[-1] * noise.shape[-2] // 4
timesteps = get_schedule(
num_steps=self.num_steps,
image_seq_len=image_seq_len,
image_seq_len=packed_h * packed_w,
shift=not is_schnell,
)
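For orientation, a short numeric sketch (assuming the usual 8x VAE downsampling and 2x2 latent packing, with a hypothetical 1024x1024 output) of how packed_h, packed_w, and the image sequence length relate:

height, width = 1024, 1024
latent_h, latent_w = height // 8, width // 8        # VAE latents: 128 x 128
packed_h, packed_w = latent_h // 2, latent_w // 2   # 2x2 packing: 64 x 64
image_seq_len = packed_h * packed_w                 # 4096 packed tokens fed to the transformer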
@@ -226,36 +263,42 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
if len(timesteps) <= 1:
return x
if is_schnell and self.control_lora:
raise ValueError("Control LoRAs cannot be used with FLUX Schnell")
# Prepare the extra image conditioning tensor (img_cond) for either FLUX structural control or FLUX Fill.
img_cond: torch.Tensor | None = None
is_flux_fill = transformer_config.variant == ModelVariantType.Inpaint # type: ignore
if is_flux_fill:
img_cond = self._prep_flux_fill_img_cond(
context, device=TorchDevice.choose_torch_device(), dtype=inference_dtype
)
else:
if self.fill_conditioning is not None:
raise ValueError("fill_conditioning was provided, but the model is not a FLUX Fill model.")
if self.control_lora is not None:
img_cond = self._prep_structural_control_img_cond(context)
inpaint_mask = self._prep_inpaint_mask(context, x)
b, _c, latent_h, latent_w = x.shape
img_ids = generate_img_ids(h=latent_h, w=latent_w, batch_size=b, device=x.device, dtype=x.dtype)
pos_bs, pos_t5_seq_len, _ = pos_t5_embeddings.shape
pos_txt_ids = torch.zeros(
pos_bs, pos_t5_seq_len, 3, dtype=inference_dtype, device=TorchDevice.choose_torch_device()
)
neg_txt_ids: torch.Tensor | None = None
if neg_t5_embeddings is not None:
neg_bs, neg_t5_seq_len, _ = neg_t5_embeddings.shape
neg_txt_ids = torch.zeros(
neg_bs, neg_t5_seq_len, 3, dtype=inference_dtype, device=TorchDevice.choose_torch_device()
)
# Pack all latent tensors.
init_latents = pack(init_latents) if init_latents is not None else None
inpaint_mask = pack(inpaint_mask) if inpaint_mask is not None else None
noise = pack(noise)
x = pack(x)
# Now that we have 'packed' the latent tensors, verify that we calculated the image_seq_len correctly.
assert image_seq_len == x.shape[1]
# Now that we have 'packed' the latent tensors, verify that we calculated the image_seq_len, packed_h, and
# packed_w correctly.
assert packed_h * packed_w == x.shape[1]
# Prepare inpaint extension.
inpaint_extension: InpaintExtension | None = None
inpaint_extension: RectifiedFlowInpaintExtension | None = None
if inpaint_mask is not None:
assert init_latents is not None
inpaint_extension = InpaintExtension(
inpaint_extension = RectifiedFlowInpaintExtension(
init_latents=init_latents,
inpaint_mask=inpaint_mask,
noise=noise,
@@ -266,7 +309,7 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
# TODO(ryand): We should really do this in a separate invocation to benefit from caching.
ip_adapter_fields = self._normalize_ip_adapter_fields()
pos_image_prompt_clip_embeds, neg_image_prompt_clip_embeds = self._prep_ip_adapter_image_prompt_clip_embeds(
ip_adapter_fields, context
ip_adapter_fields, context, device=x.device
)
cfg_scale = self.prep_cfg_scale(
@@ -289,41 +332,40 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
)
# Load the transformer model.
(cached_weights, transformer) = exit_stack.enter_context(transformer_info.model_on_device())
(cached_weights, transformer) = exit_stack.enter_context(
context.models.load(self.transformer.transformer).model_on_device()
)
assert isinstance(transformer, Flux)
config = transformer_info.config
config = transformer_config
assert config is not None
# Apply LoRA models to the transformer.
# Note: We apply the LoRA after the transformer has been moved to its target device for faster patching.
# Determine if the model is quantized.
# If the model is quantized, then we need to apply the LoRA weights as sidecar layers. This results in
# slower inference than direct patching, but is agnostic to the quantization format.
if config.format in [ModelFormat.Checkpoint]:
# The model is non-quantized, so we can apply the LoRA weights directly into the model.
exit_stack.enter_context(
LoRAPatcher.apply_lora_patches(
model=transformer,
patches=self._lora_iterator(context),
prefix=FLUX_LORA_TRANSFORMER_PREFIX,
cached_weights=cached_weights,
)
)
model_is_quantized = False
elif config.format in [
ModelFormat.BnbQuantizedLlmInt8b,
ModelFormat.BnbQuantizednf4b,
ModelFormat.GGUFQuantized,
]:
# The model is quantized, so apply the LoRA weights as sidecar layers. This results in slower inference
# than directly patching the weights, but is agnostic to the quantization format.
exit_stack.enter_context(
LoRAPatcher.apply_lora_sidecar_patches(
model=transformer,
patches=self._lora_iterator(context),
prefix=FLUX_LORA_TRANSFORMER_PREFIX,
dtype=inference_dtype,
)
)
model_is_quantized = True
else:
raise ValueError(f"Unsupported model format: {config.format}")
# Apply LoRA models to the transformer.
# Note: We apply the LoRA after the transformer has been moved to its target device for faster patching.
exit_stack.enter_context(
LayerPatcher.apply_smart_model_patches(
model=transformer,
patches=self._lora_iterator(context),
prefix=FLUX_LORA_TRANSFORMER_PREFIX,
dtype=inference_dtype,
cached_weights=cached_weights,
force_sidecar_patching=model_is_quantized,
)
)
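As an aside, a toy sketch (not the repository's LayerPatcher; names and shapes are made up) of the difference between merging a LoRA delta directly into a weight and applying it as a sidecar at runtime:

import torch

# Toy LoRA delta for an 8x8 linear layer.
lora_a, lora_b = torch.randn(8, 4), torch.randn(4, 8)
delta = lora_a @ lora_b  # (8, 8)

# Direct patching: merge the delta into the weight. Only possible when the weight
# is a plain (non-quantized) tensor that can be written to and restored later.
direct = torch.nn.Linear(8, 8, bias=False)
direct.weight.data += delta

# Sidecar patching: leave the base weight untouched (it may be quantized) and add
# the LoRA contribution to the layer's output at runtime instead.
class Sidecar(torch.nn.Module):
    def __init__(self, wrapped: torch.nn.Module, delta: torch.Tensor):
        super().__init__()
        self.wrapped = wrapped
        self.register_buffer("delta", delta)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.wrapped(x) + x @ self.delta.T

sidecar = Sidecar(torch.nn.Linear(8, 8, bias=False), delta)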
# Prepare IP-Adapter extensions.
pos_ip_adapter_extensions, neg_ip_adapter_extensions = self._prep_ip_adapter_extensions(
pos_image_prompt_clip_embeds=pos_image_prompt_clip_embeds,
@@ -338,12 +380,8 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
model=transformer,
img=x,
img_ids=img_ids,
txt=pos_t5_embeddings,
txt_ids=pos_txt_ids,
vec=pos_clip_embeddings,
neg_txt=neg_t5_embeddings,
neg_txt_ids=neg_txt_ids,
neg_vec=neg_clip_embeddings,
pos_regional_prompting_extension=pos_regional_prompting_extension,
neg_regional_prompting_extension=neg_regional_prompting_extension,
timesteps=timesteps,
step_callback=self._build_step_callback(context),
guidance=self.guidance,
@@ -352,11 +390,85 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
controlnet_extensions=controlnet_extensions,
pos_ip_adapter_extensions=pos_ip_adapter_extensions,
neg_ip_adapter_extensions=neg_ip_adapter_extensions,
img_cond=img_cond,
)
x = unpack(x.float(), self.height, self.width)
return x
def _load_text_conditioning(
self,
context: InvocationContext,
cond_field: FluxConditioningField | list[FluxConditioningField],
packed_height: int,
packed_width: int,
dtype: torch.dtype,
device: torch.device,
) -> list[FluxTextConditioning]:
"""Load text conditioning data from a FluxConditioningField or a list of FluxConditioningFields."""
# Normalize to a list of FluxConditioningFields.
cond_list = [cond_field] if isinstance(cond_field, FluxConditioningField) else cond_field
text_conditionings: list[FluxTextConditioning] = []
for cond_field in cond_list:
# Load the text embeddings.
cond_data = context.conditioning.load(cond_field.conditioning_name)
assert len(cond_data.conditionings) == 1
flux_conditioning = cond_data.conditionings[0]
assert isinstance(flux_conditioning, FLUXConditioningInfo)
flux_conditioning = flux_conditioning.to(dtype=dtype, device=device)
t5_embeddings = flux_conditioning.t5_embeds
clip_embeddings = flux_conditioning.clip_embeds
# Load the mask, if provided.
mask: Optional[torch.Tensor] = None
if cond_field.mask is not None:
mask = context.tensors.load(cond_field.mask.tensor_name)
mask = mask.to(device=device)
mask = RegionalPromptingExtension.preprocess_regional_prompt_mask(
mask, packed_height, packed_width, dtype, device
)
text_conditionings.append(FluxTextConditioning(t5_embeddings, clip_embeddings, mask))
return text_conditionings
def _load_redux_conditioning(
self,
context: InvocationContext,
redux_cond_field: FluxReduxConditioningField | list[FluxReduxConditioningField] | None,
packed_height: int,
packed_width: int,
device: torch.device,
dtype: torch.dtype,
) -> list[FluxReduxConditioning]:
# Normalize to a list of FluxReduxConditioningFields.
if redux_cond_field is None:
return []
redux_cond_list = (
[redux_cond_field] if isinstance(redux_cond_field, FluxReduxConditioningField) else redux_cond_field
)
redux_conditionings: list[FluxReduxConditioning] = []
for redux_cond_field in redux_cond_list:
# Load the Redux conditioning tensor.
redux_cond_data = context.tensors.load(redux_cond_field.conditioning.tensor_name)
redux_cond_data.to(device=device, dtype=dtype)
# Load the mask, if provided.
mask: Optional[torch.Tensor] = None
if redux_cond_field.mask is not None:
mask = context.tensors.load(redux_cond_field.mask.tensor_name)
mask = mask.to(device=device)
mask = RegionalPromptingExtension.preprocess_regional_prompt_mask(
mask, packed_height, packed_width, dtype, device
)
redux_conditionings.append(FluxReduxConditioning(redux_embeddings=redux_cond_data, mask=mask))
return redux_conditionings
@classmethod
def prep_cfg_scale(
cls, cfg_scale: float | list[float], timesteps: list[float], cfg_scale_start_step: int, cfg_scale_end_step: int
@@ -471,15 +583,18 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
# before loading the models. Then make sure that all VAE encoding is done before loading the ControlNets to
# minimize peak memory.
# First, load the ControlNet models so that we can determine the ControlNet types.
controlnet_models = [context.models.load(controlnet.control_model) for controlnet in controlnets]
# Calculate the controlnet conditioning tensors.
# We do this before loading the ControlNet models because it may require running the VAE, and we are trying to
# keep peak memory down.
controlnet_conds: list[torch.Tensor] = []
for controlnet, controlnet_model in zip(controlnets, controlnet_models, strict=True):
for controlnet in controlnets:
image = context.images.get_pil(controlnet.image.image_name)
# HACK(ryand): We have to load the ControlNet model to determine whether the VAE needs to be run. We really
# shouldn't have to load the model here. There's a risk that the model will be dropped from the model cache
# before we load it into VRAM and thus we'll have to load it again (context:
# https://github.com/invoke-ai/InvokeAI/issues/7513).
controlnet_model = context.models.load(controlnet.control_model)
if isinstance(controlnet_model.model, InstantXControlNetFlux):
if self.controlnet_vae is None:
raise ValueError("A ControlNet VAE is required when using an InstantX FLUX ControlNet.")
@@ -509,10 +624,8 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
# Finally, load the ControlNet models and initialize the ControlNet extensions.
controlnet_extensions: list[XLabsControlNetExtension | InstantXControlNetExtension] = []
for controlnet, controlnet_cond, controlnet_model in zip(
controlnets, controlnet_conds, controlnet_models, strict=True
):
model = exit_stack.enter_context(controlnet_model)
for controlnet, controlnet_cond in zip(controlnets, controlnet_conds, strict=True):
model = exit_stack.enter_context(context.models.load(controlnet.control_model))
if isinstance(model, XLabsControlNetFlux):
controlnet_extensions.append(
@@ -545,6 +658,92 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
return controlnet_extensions
def _prep_structural_control_img_cond(self, context: InvocationContext) -> torch.Tensor | None:
if self.control_lora is None:
return None
if not self.controlnet_vae:
raise ValueError("controlnet_vae must be set when using a FLUX Control LoRA.")
# Load the conditioning image and resize it to the target image size.
cond_img = context.images.get_pil(self.control_lora.img.image_name)
cond_img = cond_img.convert("RGB")
cond_img = cond_img.resize((self.width, self.height), Image.Resampling.BICUBIC)
cond_img = np.array(cond_img)
# Normalize the conditioning image to the range [-1, 1].
# This normalization is based on the original implementations here:
# https://github.com/black-forest-labs/flux/blob/805da8571a0b49b6d4043950bd266a65328c243b/src/flux/modules/image_embedders.py#L34
# https://github.com/black-forest-labs/flux/blob/805da8571a0b49b6d4043950bd266a65328c243b/src/flux/modules/image_embedders.py#L60
img_cond = torch.from_numpy(cond_img).float() / 127.5 - 1.0
img_cond = einops.rearrange(img_cond, "h w c -> 1 c h w")
vae_info = context.models.load(self.controlnet_vae.vae)
img_cond = FluxVaeEncodeInvocation.vae_encode(vae_info=vae_info, image_tensor=img_cond)
return pack(img_cond)
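A quick standalone check (not part of the invocation) of the [-1, 1] normalization used above:

import numpy
import torch

pixels = numpy.array([0.0, 127.5, 255.0], dtype=numpy.float32)
print(torch.from_numpy(pixels) / 127.5 - 1.0)  # tensor([-1., 0., 1.])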
def _prep_flux_fill_img_cond(
self, context: InvocationContext, device: torch.device, dtype: torch.dtype
) -> torch.Tensor:
"""Prepare the FLUX Fill conditioning. This method should be called iff the model is a FLUX Fill model.
This logic is based on:
https://github.com/black-forest-labs/flux/blob/716724eb276d94397be99710a0a54d352664e23b/src/flux/sampling.py#L107-L157
"""
# Validate inputs.
if self.fill_conditioning is None:
raise ValueError("A FLUX Fill model is being used without fill_conditioning.")
# TODO(ryand): We should probably rename controlnet_vae. It's used for more than just ControlNets.
if self.controlnet_vae is None:
raise ValueError("A FLUX Fill model is being used without controlnet_vae.")
if self.control_lora is not None:
raise ValueError(
"A FLUX Fill model is being used, but a control_lora was provided. Control LoRAs are not compatible with FLUX Fill models."
)
# Log input warnings related to FLUX Fill usage.
if self.denoise_mask is not None:
context.logger.warning(
"Both fill_conditioning and a denoise_mask were provided. You probably meant to use one or the other."
)
if self.guidance < 25.0:
context.logger.warning("A guidance value of ~30.0 is recommended for FLUX Fill models.")
# Load the conditioning image and resize it to the target image size.
cond_img = context.images.get_pil(self.fill_conditioning.image.image_name, mode="RGB")
cond_img = cond_img.resize((self.width, self.height), Image.Resampling.BICUBIC)
cond_img = np.array(cond_img)
cond_img = torch.from_numpy(cond_img).float() / 127.5 - 1.0
cond_img = einops.rearrange(cond_img, "h w c -> 1 c h w")
cond_img = cond_img.to(device=device, dtype=dtype)
# Load the mask and resize it to the target image size.
mask = context.tensors.load(self.fill_conditioning.mask.tensor_name)
# We expect mask to be a bool tensor with shape [1, H, W].
assert mask.dtype == torch.bool
assert mask.dim() == 3
assert mask.shape[0] == 1
mask = tv_resize(mask, size=[self.height, self.width], interpolation=tv_transforms.InterpolationMode.NEAREST)
mask = mask.to(device=device, dtype=dtype)
mask = einops.rearrange(mask, "1 h w -> 1 1 h w")
# Prepare image conditioning.
cond_img = cond_img * (1 - mask)
vae_info = context.models.load(self.controlnet_vae.vae)
cond_img = FluxVaeEncodeInvocation.vae_encode(vae_info=vae_info, image_tensor=cond_img)
cond_img = pack(cond_img)
# Prepare mask conditioning.
mask = mask[:, 0, :, :]
# Rearrange the mask into an 8x8-sub-pixel (64-channel) representation whose spatial shape matches the VAE-encoded latent space.
mask = einops.rearrange(mask, "b (h ph) (w pw) -> b (ph pw) h w", ph=8, pw=8)
mask = pack(mask)
# Merge image and mask conditioning.
img_cond = torch.cat((cond_img, mask), dim=-1)
return img_cond
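To make the tensor shapes concrete, a standalone sketch (hypothetical 1024x1024 target; the local pack helper mirrors FLUX's 2x2 patch packing) of how the Fill conditioning is assembled:

import torch
from einops import rearrange

cond_latent = torch.zeros(1, 16, 128, 128)   # VAE-encoded conditioning image (16 latent channels)
mask = torch.zeros(1, 1024, 1024)            # inpaint mask at image resolution

def pack(x: torch.Tensor) -> torch.Tensor:
    # Local stand-in for the repository's pack(): (b, c, h, w) -> (b, h/2 * w/2, c * 4).
    return rearrange(x, "b c (h ph) (w pw) -> b (h w) (c ph pw)", ph=2, pw=2)

mask64 = rearrange(mask, "b (h ph) (w pw) -> b (ph pw) h w", ph=8, pw=8)  # (1, 64, 128, 128)
img_cond = torch.cat((pack(cond_latent), pack(mask64)), dim=-1)
print(img_cond.shape)  # torch.Size([1, 4096, 320]): 64 image channels + 256 mask channels per token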
def _normalize_ip_adapter_fields(self) -> list[IPAdapterField]:
if self.ip_adapter is None:
return []
@@ -559,6 +758,7 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
self,
ip_adapter_fields: list[IPAdapterField],
context: InvocationContext,
device: torch.device,
) -> tuple[list[torch.Tensor], list[torch.Tensor]]:
"""Run the IPAdapter CLIPVisionModel, returning image prompt embeddings."""
clip_image_processor = CLIPImageProcessor()
@@ -598,11 +798,11 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
assert isinstance(image_encoder_model, CLIPVisionModelWithProjection)
clip_image: torch.Tensor = clip_image_processor(images=pos_images, return_tensors="pt").pixel_values
clip_image = clip_image.to(device=image_encoder_model.device, dtype=image_encoder_model.dtype)
clip_image = clip_image.to(device=device, dtype=image_encoder_model.dtype)
pos_clip_image_embeds = image_encoder_model(clip_image).image_embeds
clip_image = clip_image_processor(images=neg_images, return_tensors="pt").pixel_values
clip_image = clip_image.to(device=image_encoder_model.device, dtype=image_encoder_model.dtype)
clip_image = clip_image.to(device=device, dtype=image_encoder_model.dtype)
neg_clip_image_embeds = image_encoder_model(clip_image).image_embeds
pos_image_prompt_clip_embeds.append(pos_clip_image_embeds)
@@ -651,10 +851,15 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
return pos_ip_adapter_extensions, neg_ip_adapter_extensions
def _lora_iterator(self, context: InvocationContext) -> Iterator[Tuple[LoRAModelRaw, float]]:
for lora in self.transformer.loras:
def _lora_iterator(self, context: InvocationContext) -> Iterator[Tuple[ModelPatchRaw, float]]:
loras: list[Union[LoRAField, ControlLoRAField]] = [*self.transformer.loras]
if self.control_lora:
# Note: Since FLUX structural control LoRAs modify the shape of some weights, it is important that they are
# applied last.
loras.append(self.control_lora)
for lora in loras:
lora_info = context.models.load(lora.lora)
assert isinstance(lora_info.model, LoRAModelRaw)
assert isinstance(lora_info.model, ModelPatchRaw)
yield (lora_info.model, lora.weight)
del lora_info

View File

@@ -0,0 +1,46 @@
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import (
FieldDescriptions,
FluxFillConditioningField,
InputField,
OutputField,
TensorField,
)
from invokeai.app.invocations.primitives import ImageField
from invokeai.app.services.shared.invocation_context import InvocationContext
@invocation_output("flux_fill_output")
class FluxFillOutput(BaseInvocationOutput):
"""The conditioning output of a FLUX Fill invocation."""
fill_cond: FluxFillConditioningField = OutputField(
description=FieldDescriptions.flux_redux_conditioning, title="Conditioning"
)
@invocation(
"flux_fill",
title="FLUX Fill Conditioning",
tags=["inpaint"],
category="inpaint",
version="1.0.0",
classification=Classification.Beta,
)
class FluxFillInvocation(BaseInvocation):
"""Prepare the FLUX Fill conditioning data."""
image: ImageField = InputField(description="The FLUX Fill reference image.")
mask: TensorField = InputField(
description="The bool inpainting mask. Excluded regions should be set to "
"False, included regions should be set to True.",
)
def invoke(self, context: InvocationContext) -> FluxFillOutput:
return FluxFillOutput(fill_cond=FluxFillConditioningField(image=self.image, mask=self.mask))

View File

@@ -4,7 +4,7 @@ from typing import List, Literal, Union
from pydantic import field_validator, model_validator
from typing_extensions import Self
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import InputField, UIType
from invokeai.app.invocations.ip_adapter import (
CLIP_VISION_MODEL_MAP,
@@ -28,7 +28,6 @@ from invokeai.backend.model_manager.config import (
tags=["ip_adapter", "control"],
category="ip_adapter",
version="1.0.0",
classification=Classification.Prototype,
)
class FluxIPAdapterInvocation(BaseInvocation):
"""Collects FLUX IP-Adapter info to pass to other nodes."""

View File

@@ -3,14 +3,13 @@ from typing import Optional
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, OutputField, UIType
from invokeai.app.invocations.model import CLIPField, LoRAField, ModelIdentifierField, TransformerField
from invokeai.app.invocations.model import CLIPField, LoRAField, ModelIdentifierField, T5EncoderField, TransformerField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager.config import BaseModelType
from invokeai.backend.model_manager.taxonomy import BaseModelType
@invocation_output("flux_lora_loader_output")
@@ -21,15 +20,17 @@ class FluxLoRALoaderOutput(BaseInvocationOutput):
default=None, description=FieldDescriptions.transformer, title="FLUX Transformer"
)
clip: Optional[CLIPField] = OutputField(default=None, description=FieldDescriptions.clip, title="CLIP")
t5_encoder: Optional[T5EncoderField] = OutputField(
default=None, description=FieldDescriptions.t5_encoder, title="T5 Encoder"
)
@invocation(
"flux_lora_loader",
title="FLUX LoRA",
title="Apply LoRA - FLUX",
tags=["lora", "model", "flux"],
category="model",
version="1.1.0",
classification=Classification.Prototype,
version="1.2.1",
)
class FluxLoRALoaderInvocation(BaseInvocation):
"""Apply a LoRA model to a FLUX transformer and/or text encoder."""
@@ -50,6 +51,12 @@ class FluxLoRALoaderInvocation(BaseInvocation):
description=FieldDescriptions.clip,
input=Input.Connection,
)
t5_encoder: T5EncoderField | None = InputField(
default=None,
title="T5 Encoder",
description=FieldDescriptions.t5_encoder,
input=Input.Connection,
)
def invoke(self, context: InvocationContext) -> FluxLoRALoaderOutput:
lora_key = self.lora.key
@@ -62,6 +69,8 @@ class FluxLoRALoaderInvocation(BaseInvocation):
raise ValueError(f'LoRA "{lora_key}" already applied to transformer.')
if self.clip and any(lora.lora.key == lora_key for lora in self.clip.loras):
raise ValueError(f'LoRA "{lora_key}" already applied to CLIP encoder.')
if self.t5_encoder and any(lora.lora.key == lora_key for lora in self.t5_encoder.loras):
raise ValueError(f'LoRA "{lora_key}" already applied to T5 encoder.')
output = FluxLoRALoaderOutput()
@@ -82,23 +91,30 @@ class FluxLoRALoaderInvocation(BaseInvocation):
weight=self.weight,
)
)
if self.t5_encoder is not None:
output.t5_encoder = self.t5_encoder.model_copy(deep=True)
output.t5_encoder.loras.append(
LoRAField(
lora=self.lora,
weight=self.weight,
)
)
return output
@invocation(
"flux_lora_collection_loader",
title="FLUX LoRA Collection Loader",
title="Apply LoRA Collection - FLUX",
tags=["lora", "model", "flux"],
category="model",
version="1.1.0",
classification=Classification.Prototype,
version="1.3.1",
)
class FLUXLoRACollectionLoader(BaseInvocation):
"""Applies a collection of LoRAs to a FLUX transformer."""
loras: LoRAField | list[LoRAField] = InputField(
description="LoRA models and weights. May be a single LoRA or collection.", title="LoRAs"
loras: Optional[LoRAField | list[LoRAField]] = InputField(
default=None, description="LoRA models and weights. May be a single LoRA or collection.", title="LoRAs"
)
transformer: Optional[TransformerField] = InputField(
@@ -113,13 +129,30 @@ class FLUXLoRACollectionLoader(BaseInvocation):
description=FieldDescriptions.clip,
input=Input.Connection,
)
t5_encoder: T5EncoderField | None = InputField(
default=None,
title="T5 Encoder",
description=FieldDescriptions.t5_encoder,
input=Input.Connection,
)
def invoke(self, context: InvocationContext) -> FluxLoRALoaderOutput:
output = FluxLoRALoaderOutput()
loras = self.loras if isinstance(self.loras, list) else [self.loras]
added_loras: list[str] = []
if self.transformer is not None:
output.transformer = self.transformer.model_copy(deep=True)
if self.clip is not None:
output.clip = self.clip.model_copy(deep=True)
if self.t5_encoder is not None:
output.t5_encoder = self.t5_encoder.model_copy(deep=True)
for lora in loras:
if lora is None:
continue
if lora.lora.key in added_loras:
continue
@@ -130,14 +163,13 @@ class FLUXLoRACollectionLoader(BaseInvocation):
added_loras.append(lora.lora.key)
if self.transformer is not None:
if output.transformer is None:
output.transformer = self.transformer.model_copy(deep=True)
if self.transformer is not None and output.transformer is not None:
output.transformer.loras.append(lora)
if self.clip is not None:
if output.clip is None:
output.clip = self.clip.model_copy(deep=True)
if self.clip is not None and output.clip is not None:
output.clip.loras.append(lora)
if self.t5_encoder is not None and output.t5_encoder is not None:
output.t5_encoder.loras.append(lora)
return output

View File

@@ -3,18 +3,21 @@ from typing import Literal
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, OutputField, UIType
from invokeai.app.invocations.model import CLIPField, ModelIdentifierField, T5EncoderField, TransformerField, VAEField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.t5_model_identifier import (
preprocess_t5_encoder_model_identifier,
preprocess_t5_tokenizer_model_identifier,
)
from invokeai.backend.flux.util import max_seq_lengths
from invokeai.backend.model_manager.config import (
CheckpointConfigBase,
SubModelType,
)
from invokeai.backend.model_manager.taxonomy import SubModelType
@invocation_output("flux_model_loader_output")
@@ -33,11 +36,10 @@ class FluxModelLoaderOutput(BaseInvocationOutput):
@invocation(
"flux_model_loader",
title="Flux Main Model",
title="Main Model - FLUX",
tags=["model", "flux"],
category="model",
version="1.0.4",
classification=Classification.Prototype,
version="1.0.6",
)
class FluxModelLoaderInvocation(BaseInvocation):
"""Loads a flux base model, outputting its submodels."""
@@ -74,8 +76,8 @@ class FluxModelLoaderInvocation(BaseInvocation):
tokenizer = self.clip_embed_model.model_copy(update={"submodel_type": SubModelType.Tokenizer})
clip_encoder = self.clip_embed_model.model_copy(update={"submodel_type": SubModelType.TextEncoder})
tokenizer2 = self.t5_encoder_model.model_copy(update={"submodel_type": SubModelType.Tokenizer2})
t5_encoder = self.t5_encoder_model.model_copy(update={"submodel_type": SubModelType.TextEncoder2})
tokenizer2 = preprocess_t5_tokenizer_model_identifier(self.t5_encoder_model)
t5_encoder = preprocess_t5_encoder_model_identifier(self.t5_encoder_model)
transformer_config = context.models.get_config(transformer)
assert isinstance(transformer_config, CheckpointConfigBase)
@@ -83,7 +85,7 @@ class FluxModelLoaderInvocation(BaseInvocation):
return FluxModelLoaderOutput(
transformer=TransformerField(transformer=transformer, loras=[]),
clip=CLIPField(tokenizer=tokenizer, text_encoder=clip_encoder, loras=[], skipped_layers=0),
t5_encoder=T5EncoderField(tokenizer=tokenizer2, text_encoder=t5_encoder),
t5_encoder=T5EncoderField(tokenizer=tokenizer2, text_encoder=t5_encoder, loras=[]),
vae=VAEField(vae=vae),
max_seq_len=max_seq_lengths[transformer_config.config_path],
)

View File

@@ -0,0 +1,166 @@
import math
from typing import Literal, Optional
import torch
from PIL import Image
from transformers import SiglipImageProcessor, SiglipVisionModel
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import (
FieldDescriptions,
FluxReduxConditioningField,
InputField,
OutputField,
TensorField,
UIType,
)
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.invocations.primitives import ImageField
from invokeai.app.services.model_records.model_records_base import ModelRecordChanges
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.flux.redux.flux_redux_model import FluxReduxModel
from invokeai.backend.model_manager import BaseModelType, ModelType
from invokeai.backend.model_manager.config import AnyModelConfig
from invokeai.backend.model_manager.starter_models import siglip
from invokeai.backend.sig_lip.sig_lip_pipeline import SigLipPipeline
from invokeai.backend.util.devices import TorchDevice
@invocation_output("flux_redux_output")
class FluxReduxOutput(BaseInvocationOutput):
"""The conditioning output of a FLUX Redux invocation."""
redux_cond: FluxReduxConditioningField = OutputField(
description=FieldDescriptions.flux_redux_conditioning, title="Conditioning"
)
DOWNSAMPLING_FUNCTIONS = Literal["nearest", "bilinear", "bicubic", "area", "nearest-exact"]
@invocation(
"flux_redux",
title="FLUX Redux",
tags=["ip_adapter", "control"],
category="ip_adapter",
version="2.1.0",
classification=Classification.Beta,
)
class FluxReduxInvocation(BaseInvocation):
"""Runs a FLUX Redux model to generate a conditioning tensor."""
image: ImageField = InputField(description="The FLUX Redux image prompt.")
mask: Optional[TensorField] = InputField(
default=None,
description="The bool mask associated with this FLUX Redux image prompt. Excluded regions should be set to "
"False, included regions should be set to True.",
)
redux_model: ModelIdentifierField = InputField(
description="The FLUX Redux model to use.",
title="FLUX Redux Model",
ui_type=UIType.FluxReduxModel,
)
downsampling_factor: int = InputField(
ge=1,
le=9,
default=1,
description="Redux Downsampling Factor (1-9)",
)
downsampling_function: DOWNSAMPLING_FUNCTIONS = InputField(
default="area",
description="Redux Downsampling Function",
)
weight: float = InputField(
ge=0,
le=1,
default=1.0,
description="Redux weight (0.0-1.0)",
)
def invoke(self, context: InvocationContext) -> FluxReduxOutput:
image = context.images.get_pil(self.image.image_name, "RGB")
encoded_x = self._siglip_encode(context, image)
redux_conditioning = self._flux_redux_encode(context, encoded_x)
if self.downsampling_factor > 1 or self.weight != 1.0:
redux_conditioning = self._downsample_weight(context, redux_conditioning)
tensor_name = context.tensors.save(redux_conditioning)
return FluxReduxOutput(
redux_cond=FluxReduxConditioningField(conditioning=TensorField(tensor_name=tensor_name), mask=self.mask)
)
@torch.no_grad()
def _downsample_weight(self, context: InvocationContext, redux_conditioning: torch.Tensor) -> torch.Tensor:
# Downsampling derived from https://github.com/kaibioinfo/ComfyUI_AdvancedRefluxControl
(b, t, h) = redux_conditioning.shape
m = int(math.sqrt(t))
if self.downsampling_factor > 1:
redux_conditioning = redux_conditioning.view(b, m, m, h)
redux_conditioning = torch.nn.functional.interpolate(
redux_conditioning.transpose(1, -1),
size=(m // self.downsampling_factor, m // self.downsampling_factor),
mode=self.downsampling_function,
)
redux_conditioning = redux_conditioning.transpose(1, -1).reshape(b, -1, h)
if self.weight != 1.0:
redux_conditioning = redux_conditioning * self.weight * self.weight
return redux_conditioning
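A standalone shape sketch of the grid interpolation above (hypothetical 729-token, 4096-dim Redux embedding and a downsampling factor of 3; the dimensions are illustrative):

import math
import torch

cond = torch.zeros(1, 729, 4096)   # (batch, tokens, features); 729 = 27 x 27 token grid
b, t, h = cond.shape
m = int(math.sqrt(t))              # 27
factor = 3

grid = cond.view(b, m, m, h).transpose(1, -1)                                        # (1, 4096, 27, 27)
grid = torch.nn.functional.interpolate(grid, size=(m // factor, m // factor), mode="area")
cond = grid.transpose(1, -1).reshape(b, -1, h)                                       # (1, 81, 4096)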
@torch.no_grad()
def _siglip_encode(self, context: InvocationContext, image: Image.Image) -> torch.Tensor:
siglip_model_config = self._get_siglip_model(context)
with context.models.load(siglip_model_config.key).model_on_device() as (_, model):
assert isinstance(model, SiglipVisionModel)
model_abs_path = context.models.get_absolute_path(siglip_model_config)
processor = SiglipImageProcessor.from_pretrained(model_abs_path, local_files_only=True)
assert isinstance(processor, SiglipImageProcessor)
siglip_pipeline = SigLipPipeline(processor, model)
return siglip_pipeline.encode_image(
x=image, device=TorchDevice.choose_torch_device(), dtype=TorchDevice.choose_torch_dtype()
)
@torch.no_grad()
def _flux_redux_encode(self, context: InvocationContext, encoded_x: torch.Tensor) -> torch.Tensor:
with context.models.load(self.redux_model).model_on_device() as (_, flux_redux):
assert isinstance(flux_redux, FluxReduxModel)
dtype = next(flux_redux.parameters()).dtype
encoded_x = encoded_x.to(dtype=dtype)
return flux_redux(encoded_x)
def _get_siglip_model(self, context: InvocationContext) -> AnyModelConfig:
siglip_models = context.models.search_by_attrs(name=siglip.name, base=BaseModelType.Any, type=ModelType.SigLIP)
if not len(siglip_models) > 0:
context.logger.warning(
f"The SigLIP model required by FLUX Redux ({siglip.name}) is not installed. Downloading and installing now. This may take a while."
)
# TODO(psyche): Can the probe reliably determine the type of the model? Just hardcoding it bc I don't want to experiment now
config_overrides = ModelRecordChanges(name=siglip.name, type=ModelType.SigLIP)
# Queue the job
job = context._services.model_manager.install.heuristic_import(siglip.source, config=config_overrides)
# Wait for up to 10 minutes - model is ~3.5GB
context._services.model_manager.install.wait_for_job(job, timeout=600)
siglip_models = context.models.search_by_attrs(
name=siglip.name,
base=BaseModelType.Any,
type=ModelType.SigLIP,
)
if len(siglip_models) == 0:
context.logger.error("Error while fetching SigLIP for FLUX Redux")
assert len(siglip_models) == 1
return siglip_models[0]

View File

@@ -1,29 +1,35 @@
from contextlib import ExitStack
from typing import Iterator, Literal, Tuple
from typing import Iterator, Literal, Optional, Tuple
import torch
from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5Tokenizer
from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5Tokenizer, T5TokenizerFast
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import (
FieldDescriptions,
FluxConditioningField,
Input,
InputField,
TensorField,
UIComponent,
)
from invokeai.app.invocations.model import CLIPField, T5EncoderField
from invokeai.app.invocations.primitives import FluxConditioningOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.flux.modules.conditioner import HFEncoder
from invokeai.backend.lora.conversions.flux_lora_constants import FLUX_LORA_CLIP_PREFIX
from invokeai.backend.lora.lora_model_raw import LoRAModelRaw
from invokeai.backend.lora.lora_patcher import LoRAPatcher
from invokeai.backend.model_manager.config import ModelFormat
from invokeai.backend.model_manager import ModelFormat
from invokeai.backend.patches.layer_patcher import LayerPatcher
from invokeai.backend.patches.lora_conversions.flux_lora_constants import FLUX_LORA_CLIP_PREFIX, FLUX_LORA_T5_PREFIX
from invokeai.backend.patches.model_patch_raw import ModelPatchRaw
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import ConditioningFieldData, FLUXConditioningInfo
@invocation(
"flux_text_encoder",
title="FLUX Text Encoding",
title="Prompt - FLUX",
tags=["prompt", "conditioning", "flux"],
category="conditioning",
version="1.1.0",
classification=Classification.Prototype,
version="1.1.2",
)
class FluxTextEncoderInvocation(BaseInvocation):
"""Encodes and preps a prompt for a flux image."""
@@ -41,7 +47,10 @@ class FluxTextEncoderInvocation(BaseInvocation):
t5_max_seq_len: Literal[256, 512] = InputField(
description="Max sequence length for the T5 encoder. Expected to be 256 for FLUX schnell models and 512 for FLUX dev models."
)
prompt: str = InputField(description="Text prompt to encode.")
prompt: str = InputField(description="Text prompt to encode.", ui_component=UIComponent.Textarea)
mask: Optional[TensorField] = InputField(
default=None, description="A mask defining the region that this conditioning prompt applies to."
)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> FluxConditioningOutput:
@@ -54,20 +63,51 @@ class FluxTextEncoderInvocation(BaseInvocation):
)
conditioning_name = context.conditioning.save(conditioning_data)
return FluxConditioningOutput.build(conditioning_name)
return FluxConditioningOutput(
conditioning=FluxConditioningField(conditioning_name=conditioning_name, mask=self.mask)
)
def _t5_encode(self, context: InvocationContext) -> torch.Tensor:
t5_tokenizer_info = context.models.load(self.t5_encoder.tokenizer)
t5_text_encoder_info = context.models.load(self.t5_encoder.text_encoder)
prompt = [self.prompt]
t5_encoder_info = context.models.load(self.t5_encoder.text_encoder)
t5_encoder_config = t5_encoder_info.config
assert t5_encoder_config is not None
with (
t5_text_encoder_info as t5_text_encoder,
t5_tokenizer_info as t5_tokenizer,
t5_encoder_info.model_on_device() as (cached_weights, t5_text_encoder),
context.models.load(self.t5_encoder.tokenizer) as t5_tokenizer,
ExitStack() as exit_stack,
):
assert isinstance(t5_text_encoder, T5EncoderModel)
assert isinstance(t5_tokenizer, T5Tokenizer)
assert isinstance(t5_tokenizer, (T5Tokenizer, T5TokenizerFast))
# Determine if the model is quantized.
# If the model is quantized, then we need to apply the LoRA weights as sidecar layers. This results in
# slower inference than direct patching, but is agnostic to the quantization format.
if t5_encoder_config.format in [ModelFormat.T5Encoder, ModelFormat.Diffusers]:
model_is_quantized = False
elif t5_encoder_config.format in [
ModelFormat.BnbQuantizedLlmInt8b,
ModelFormat.BnbQuantizednf4b,
ModelFormat.GGUFQuantized,
]:
model_is_quantized = True
else:
raise ValueError(f"Unsupported model format: {t5_encoder_config.format}")
# Apply LoRA models to the T5 encoder.
# Note: We apply the LoRA after the encoder has been moved to its target device for faster patching.
exit_stack.enter_context(
LayerPatcher.apply_smart_model_patches(
model=t5_text_encoder,
patches=self._t5_lora_iterator(context),
prefix=FLUX_LORA_T5_PREFIX,
dtype=t5_text_encoder.dtype,
cached_weights=cached_weights,
force_sidecar_patching=model_is_quantized,
)
)
t5_encoder = HFEncoder(t5_text_encoder, t5_tokenizer, False, self.t5_max_seq_len)
@@ -78,31 +118,30 @@ class FluxTextEncoderInvocation(BaseInvocation):
return prompt_embeds
def _clip_encode(self, context: InvocationContext) -> torch.Tensor:
clip_tokenizer_info = context.models.load(self.clip.tokenizer)
clip_text_encoder_info = context.models.load(self.clip.text_encoder)
prompt = [self.prompt]
clip_text_encoder_info = context.models.load(self.clip.text_encoder)
clip_text_encoder_config = clip_text_encoder_info.config
assert clip_text_encoder_config is not None
with (
clip_text_encoder_info.model_on_device() as (cached_weights, clip_text_encoder),
clip_tokenizer_info as clip_tokenizer,
context.models.load(self.clip.tokenizer) as clip_tokenizer,
ExitStack() as exit_stack,
):
assert isinstance(clip_text_encoder, CLIPTextModel)
assert isinstance(clip_tokenizer, CLIPTokenizer)
clip_text_encoder_config = clip_text_encoder_info.config
assert clip_text_encoder_config is not None
# Apply LoRA models to the CLIP encoder.
# Note: We apply the LoRA after the transformer has been moved to its target device for faster patching.
if clip_text_encoder_config.format in [ModelFormat.Diffusers]:
# The model is non-quantized, so we can apply the LoRA weights directly into the model.
exit_stack.enter_context(
LoRAPatcher.apply_lora_patches(
LayerPatcher.apply_smart_model_patches(
model=clip_text_encoder,
patches=self._clip_lora_iterator(context),
prefix=FLUX_LORA_CLIP_PREFIX,
dtype=clip_text_encoder.dtype,
cached_weights=cached_weights,
)
)
@@ -118,9 +157,16 @@ class FluxTextEncoderInvocation(BaseInvocation):
assert isinstance(pooled_prompt_embeds, torch.Tensor)
return pooled_prompt_embeds
def _clip_lora_iterator(self, context: InvocationContext) -> Iterator[Tuple[LoRAModelRaw, float]]:
def _clip_lora_iterator(self, context: InvocationContext) -> Iterator[Tuple[ModelPatchRaw, float]]:
for lora in self.clip.loras:
lora_info = context.models.load(lora.lora)
assert isinstance(lora_info.model, LoRAModelRaw)
assert isinstance(lora_info.model, ModelPatchRaw)
yield (lora_info.model, lora.weight)
del lora_info
def _t5_lora_iterator(self, context: InvocationContext) -> Iterator[Tuple[ModelPatchRaw, float]]:
for lora in self.t5_encoder.loras:
lora_info = context.models.load(lora.lora)
assert isinstance(lora_info.model, ModelPatchRaw)
yield (lora_info.model, lora.weight)
del lora_info

View File

@@ -3,6 +3,7 @@ from einops import rearrange
from PIL import Image
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import (
FieldDescriptions,
Input,
@@ -21,10 +22,10 @@ from invokeai.backend.util.devices import TorchDevice
@invocation(
"flux_vae_decode",
title="FLUX Latents to Image",
title="Latents to Image - FLUX",
tags=["latents", "image", "vae", "l2i", "flux"],
category="latents",
version="1.0.0",
version="1.0.2",
)
class FluxVaeDecodeInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Generates an image from latents."""
@@ -38,8 +39,18 @@ class FluxVaeDecodeInvocation(BaseInvocation, WithMetadata, WithBoard):
input=Input.Connection,
)
def _estimate_working_memory(self, latents: torch.Tensor, vae: AutoEncoder) -> int:
"""Estimate the working memory required by the invocation in bytes."""
out_h = LATENT_SCALE_FACTOR * latents.shape[-2]
out_w = LATENT_SCALE_FACTOR * latents.shape[-1]
element_size = next(vae.parameters()).element_size()
scaling_constant = 2200 # Determined experimentally.
working_memory = out_h * out_w * element_size * scaling_constant
return int(working_memory)
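As a rough sanity check (hypothetical 128x128 latent and a bf16 VAE, so element_size == 2), the estimate works out to a few GiB:

out_h = out_w = 8 * 128                    # LATENT_SCALE_FACTOR (8) times the latent size -> 1024 px output
working_memory = out_h * out_w * 2 * 2200  # element_size * experimentally determined constant
print(working_memory / 1024**3)            # ~4.3 GiB reserved for the decode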
def _vae_decode(self, vae_info: LoadedModel, latents: torch.Tensor) -> Image.Image:
with vae_info as vae:
estimated_working_memory = self._estimate_working_memory(latents, vae_info.model)
with vae_info.model_on_device(working_mem_bytes=estimated_working_memory) as (_, vae):
assert isinstance(vae, AutoEncoder)
vae_dtype = next(iter(vae.parameters())).dtype
latents = latents.to(device=TorchDevice.choose_torch_device(), dtype=vae_dtype)

View File

@@ -19,10 +19,10 @@ from invokeai.backend.util.devices import TorchDevice
@invocation(
"flux_vae_encode",
title="FLUX Image to Latents",
title="Image to Latents - FLUX",
tags=["latents", "image", "vae", "i2l", "flux"],
category="latents",
version="1.0.0",
version="1.0.1",
)
class FluxVaeEncodeInvocation(BaseInvocation):
"""Encodes an image into latents."""

View File

@@ -6,7 +6,7 @@ from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import FieldDescriptions, InputField, OutputField
from invokeai.app.invocations.model import UNetField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager.config import BaseModelType
from invokeai.backend.model_manager.taxonomy import BaseModelType
@invocation_output("ideal_size_output")
@@ -19,9 +19,9 @@ class IdealSizeOutput(BaseInvocationOutput):
@invocation(
"ideal_size",
title="Ideal Size",
title="Ideal Size - SD1.5, SDXL",
tags=["latents", "math", "ideal_size"],
version="1.0.3",
version="1.0.5",
)
class IdealSizeInvocation(BaseInvocation):
"""Calculates the ideal size for generation to avoid duplication"""
@@ -41,11 +41,16 @@ class IdealSizeInvocation(BaseInvocation):
def invoke(self, context: InvocationContext) -> IdealSizeOutput:
unet_config = context.models.get_config(self.unet.unet.key)
aspect = self.width / self.height
dimension: float = 512
if unet_config.base == BaseModelType.StableDiffusion2:
if unet_config.base == BaseModelType.StableDiffusion1:
dimension = 512
elif unet_config.base == BaseModelType.StableDiffusion2:
dimension = 768
elif unet_config.base == BaseModelType.StableDiffusionXL:
elif unet_config.base in (BaseModelType.StableDiffusionXL, BaseModelType.Flux, BaseModelType.StableDiffusion3):
dimension = 1024
else:
raise ValueError(f"Unsupported model type: {unet_config.base}")
dimension = dimension * self.multiplier
min_dimension = math.floor(dimension * 0.5)
model_area = dimension * dimension # hardcoded for now since all models are trained on square images

View File

@@ -13,6 +13,7 @@ from invokeai.app.invocations.baseinvocation import (
)
from invokeai.app.invocations.constants import IMAGE_MODES
from invokeai.app.invocations.fields import (
BoundingBoxField,
ColorField,
FieldDescriptions,
ImageField,
@@ -23,6 +24,7 @@ from invokeai.app.invocations.fields import (
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.services.image_records.image_records_common import ImageCategory
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.misc import SEED_MAX
from invokeai.backend.image_util.invisible_watermark import InvisibleWatermark
from invokeai.backend.image_util.safety_checker import SafetyChecker
@@ -161,12 +163,12 @@ class ImagePasteInvocation(BaseInvocation, WithMetadata, WithBoard):
crop: bool = InputField(default=False, description="Crop to base image dimensions")
def invoke(self, context: InvocationContext) -> ImageOutput:
base_image = context.images.get_pil(self.base_image.image_name)
image = context.images.get_pil(self.image.image_name)
base_image = context.images.get_pil(self.base_image.image_name, mode="RGBA")
image = context.images.get_pil(self.image.image_name, mode="RGBA")
mask = None
if self.mask is not None:
mask = context.images.get_pil(self.mask.image_name)
mask = ImageOps.invert(mask.convert("L"))
mask = context.images.get_pil(self.mask.image_name, mode="L")
mask = ImageOps.invert(mask)
# TODO: probably shouldn't invert mask here... should user be required to do it?
min_x = min(0, self.x)
@@ -176,7 +178,11 @@ class ImagePasteInvocation(BaseInvocation, WithMetadata, WithBoard):
new_image = Image.new(mode="RGBA", size=(max_x - min_x, max_y - min_y), color=(0, 0, 0, 0))
new_image.paste(base_image, (abs(min_x), abs(min_y)))
new_image.paste(image, (max(0, self.x), max(0, self.y)), mask=mask)
# Create a temporary image to paste the image with transparency
temp_image = Image.new("RGBA", new_image.size)
temp_image.paste(image, (max(0, self.x), max(0, self.y)), mask=mask)
new_image = Image.alpha_composite(new_image, temp_image)
if self.crop:
base_w, base_h = base_image.size
@@ -301,14 +307,44 @@ class ImageBlurInvocation(BaseInvocation, WithMetadata, WithBoard):
blur_type: Literal["gaussian", "box"] = InputField(default="gaussian", description="The type of blur")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name)
image = context.images.get_pil(self.image.image_name, mode="RGBA")
# Split the image into RGBA channels
r, g, b, a = image.split()
# Premultiply RGB channels by alpha
premultiplied_image = ImageChops.multiply(image, a.convert("RGBA"))
premultiplied_image.putalpha(a)
# Apply the blur
blur = (
ImageFilter.GaussianBlur(self.radius) if self.blur_type == "gaussian" else ImageFilter.BoxBlur(self.radius)
)
blur_image = image.filter(blur)
blurred_image = premultiplied_image.filter(blur)
image_dto = context.images.save(image=blur_image)
# Split the blurred image into RGBA channels
r, g, b, a_orig = blurred_image.split()
# Convert to float using NumPy. float32/64 division is much faster than float16
r = numpy.array(r, dtype=numpy.float32)
g = numpy.array(g, dtype=numpy.float32)
b = numpy.array(b, dtype=numpy.float32)
a = numpy.array(a_orig, dtype=numpy.float32) / 255.0 # Normalize alpha to [0, 1]
# Unpremultiply RGB channels by alpha
r /= a + 1e-6 # Add a small epsilon to avoid division by zero
g /= a + 1e-6
b /= a + 1e-6
# Convert back to PIL images
r = Image.fromarray(numpy.uint8(numpy.clip(r, 0, 255)))
g = Image.fromarray(numpy.uint8(numpy.clip(g, 0, 255)))
b = Image.fromarray(numpy.uint8(numpy.clip(b, 0, 255)))
# Merge back into a single image
result_image = Image.merge("RGBA", (r, g, b, a_orig))
image_dto = context.images.save(image=result_image)
return ImageOutput.build(image_dto)
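For reference, a standalone sketch of the same premultiply -> blur -> unpremultiply round trip on a dummy RGBA image (values are illustrative):

import numpy
from PIL import Image, ImageChops, ImageFilter

image = Image.new("RGBA", (64, 64), (255, 0, 0, 128))
r, g, b, a = image.split()

premultiplied = ImageChops.multiply(image, a.convert("RGBA"))  # scale RGB by alpha so transparent
premultiplied.putalpha(a)                                      # pixels don't bleed color into the blur
blurred = premultiplied.filter(ImageFilter.GaussianBlur(4))

r, g, b, a_orig = blurred.split()
alpha = numpy.array(a_orig, dtype=numpy.float32) / 255.0
rgb = [numpy.array(c, dtype=numpy.float32) / (alpha + 1e-6) for c in (r, g, b)]  # unpremultiply
rgb = [Image.fromarray(numpy.uint8(numpy.clip(c, 0, 255))) for c in rgb]
result = Image.merge("RGBA", (*rgb, a_orig))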
@@ -319,7 +355,6 @@ class ImageBlurInvocation(BaseInvocation, WithMetadata, WithBoard):
tags=["image", "unsharp_mask"],
category="image",
version="1.2.2",
classification=Classification.Beta,
)
class UnsharpMaskInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Applies an unsharp mask filter to an image"""
@@ -807,7 +842,7 @@ CHANNEL_FORMATS = {
"value",
],
category="image",
version="1.2.2",
version="1.2.3",
)
class ImageChannelOffsetInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Add or subtract a value from a specific color channel of an image."""
@@ -817,18 +852,22 @@ class ImageChannelOffsetInvocation(BaseInvocation, WithMetadata, WithBoard):
offset: int = InputField(default=0, ge=-255, le=255, description="The amount to adjust the channel by")
def invoke(self, context: InvocationContext) -> ImageOutput:
pil_image = context.images.get_pil(self.image.image_name)
image = context.images.get_pil(self.image.image_name, "RGBA")
# extract the channel and mode from the input and reference tuple
mode = CHANNEL_FORMATS[self.channel][0]
channel_number = CHANNEL_FORMATS[self.channel][1]
# Convert PIL image to new format
converted_image = numpy.array(pil_image.convert(mode)).astype(int)
converted_image = numpy.array(image.convert(mode)).astype(int)
image_channel = converted_image[:, :, channel_number]
# Adjust the value, clipping to 0..255
image_channel = numpy.clip(image_channel + self.offset, 0, 255)
if self.channel == "Hue (HSV)":
# loop around the values because hue is special
image_channel = (image_channel + self.offset) % 256
else:
# Adjust the value, clipping to 0..255
image_channel = numpy.clip(image_channel + self.offset, 0, 255)
# Put the channel back into the image
converted_image[:, :, channel_number] = image_channel
@@ -836,6 +875,10 @@ class ImageChannelOffsetInvocation(BaseInvocation, WithMetadata, WithBoard):
# Convert back to RGBA format and output
pil_image = Image.fromarray(converted_image.astype(numpy.uint8), mode=mode).convert("RGBA")
# restore the alpha channel
if self.channel != "Alpha (RGBA)":
pil_image.putalpha(image.getchannel("A"))
image_dto = context.images.save(image=pil_image)
return ImageOutput.build(image_dto)
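The channel-offset change treats hue as a cyclic quantity: in PIL's 8-bit HSV representation an offset wraps modulo 256 instead of clipping, while every other channel still saturates at 0..255. A small illustrative sketch (names are not InvokeAI's):

import numpy

def offset_channel(channel: numpy.ndarray, offset: int, cyclic: bool) -> numpy.ndarray:
    values = channel.astype(int)
    if cyclic:
        # Hue is an angle: 250 + 20 should wrap around to 14, not clip to 255.
        return ((values + offset) % 256).astype(numpy.uint8)
    return numpy.clip(values + offset, 0, 255).astype(numpy.uint8)

hue = numpy.array([[250, 10]], dtype=numpy.uint8)
print(offset_channel(hue, 20, cyclic=True))   # [[14 30]]
print(offset_channel(hue, 20, cyclic=False))  # [[255 30]]  (clipped)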
@@ -863,7 +906,7 @@ class ImageChannelOffsetInvocation(BaseInvocation, WithMetadata, WithBoard):
"value",
],
category="image",
version="1.2.2",
version="1.2.3",
)
class ImageChannelMultiplyInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Scale a specific color channel of an image."""
@@ -874,14 +917,14 @@ class ImageChannelMultiplyInvocation(BaseInvocation, WithMetadata, WithBoard):
invert_channel: bool = InputField(default=False, description="Invert the channel after scaling")
def invoke(self, context: InvocationContext) -> ImageOutput:
pil_image = context.images.get_pil(self.image.image_name)
image = context.images.get_pil(self.image.image_name, "RGBA")
# extract the channel and mode from the input and reference tuple
mode = CHANNEL_FORMATS[self.channel][0]
channel_number = CHANNEL_FORMATS[self.channel][1]
# Convert PIL image to new format
converted_image = numpy.array(pil_image.convert(mode)).astype(float)
converted_image = numpy.array(image.convert(mode)).astype(float)
image_channel = converted_image[:, :, channel_number]
# Adjust the value, clipping to 0..255
@@ -897,6 +940,10 @@ class ImageChannelMultiplyInvocation(BaseInvocation, WithMetadata, WithBoard):
# Convert back to RGBA format and output
pil_image = Image.fromarray(converted_image.astype(numpy.uint8), mode=mode).convert("RGBA")
# restore the alpha channel
if self.channel != "Alpha (RGBA)":
pil_image.putalpha(image.getchannel("A"))
image_dto = context.images.save(image=pil_image)
return ImageOutput.build(image_dto)
@@ -962,10 +1009,10 @@ class CanvasPasteBackInvocation(BaseInvocation, WithMetadata, WithBoard):
@invocation(
"mask_from_id",
title="Mask from ID",
title="Mask from Segmented Image",
tags=["image", "mask", "id"],
category="image",
version="1.0.0",
version="1.0.1",
)
class MaskFromIDInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Generate a mask for a particular color in an ID Map"""
@@ -975,40 +1022,24 @@ class MaskFromIDInvocation(BaseInvocation, WithMetadata, WithBoard):
threshold: int = InputField(default=100, description="Threshold for color detection")
invert: bool = InputField(default=False, description="Whether or not to invert the mask")
def rgba_to_hex(self, rgba_color: tuple[int, int, int, int]):
r, g, b, a = rgba_color
hex_code = "#{:02X}{:02X}{:02X}{:02X}".format(r, g, b, int(a * 255))
return hex_code
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name, mode="RGBA")
def id_to_mask(self, id_mask: Image.Image, color: tuple[int, int, int, int], threshold: int = 100):
if id_mask.mode != "RGB":
id_mask = id_mask.convert("RGB")
# Can directly just use the tuple, but I'll leave this rgba_to_hex here
# in case anyone prefers using hex codes directly instead of the color picker
hex_color_str = self.rgba_to_hex(color)
rgb_color = numpy.array([int(hex_color_str[i : i + 2], 16) for i in (1, 3, 5)])
np_color = numpy.array(self.color.tuple())
# Maybe there's a faster way to calculate this distance but I can't think of any right now.
color_distance = numpy.linalg.norm(id_mask - rgb_color, axis=-1)
color_distance = numpy.linalg.norm(image - np_color, axis=-1)
# Create a mask based on the threshold and the distance calculated above
binary_mask = (color_distance < threshold).astype(numpy.uint8) * 255
binary_mask = (color_distance < self.threshold).astype(numpy.uint8) * 255
# Convert the mask back to PIL
binary_mask_pil = Image.fromarray(binary_mask)
return binary_mask_pil
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name)
mask = self.id_to_mask(image, self.color.tuple(), self.threshold)
if self.invert:
mask = ImageOps.invert(mask)
binary_mask_pil = ImageOps.invert(binary_mask_pil)
image_dto = context.images.save(image=mask, image_category=ImageCategory.MASK)
image_dto = context.images.save(image=binary_mask_pil, image_category=ImageCategory.MASK)
return ImageOutput.build(image_dto)
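The refactored Mask from Segmented Image node computes, for every pixel, the Euclidean distance to the target color and thresholds it into a binary mask. A minimal sketch of that distance test, assuming an RGBA NumPy array and an RGBA target color (helper names are illustrative):

import numpy
from PIL import Image

def mask_from_color(rgba: numpy.ndarray, color: tuple[int, int, int, int], threshold: int = 100) -> Image.Image:
    # Per-pixel Euclidean distance to the target color across all channels.
    distance = numpy.linalg.norm(rgba.astype(numpy.float32) - numpy.array(color, dtype=numpy.float32), axis=-1)
    # Pixels closer than the threshold become white (255) in the mask.
    binary = (distance < threshold).astype(numpy.uint8) * 255
    return Image.fromarray(binary, mode="L")

# mask = mask_from_color(numpy.array(Image.open("ids.png").convert("RGBA")), (255, 0, 0, 255))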
@@ -1019,7 +1050,7 @@ class MaskFromIDInvocation(BaseInvocation, WithMetadata, WithBoard):
tags=["image", "mask", "id"],
category="image",
version="1.0.0",
classification=Classification.Internal,
classification=Classification.Deprecated,
)
class CanvasV2MaskAndCropInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Handles Canvas V2 image output masking and cropping"""
@@ -1055,3 +1086,246 @@ class CanvasV2MaskAndCropInvocation(BaseInvocation, WithMetadata, WithBoard):
image_dto = context.images.save(image=generated_image)
return ImageOutput.build(image_dto)
@invocation(
"expand_mask_with_fade", title="Expand Mask with Fade", tags=["image", "mask"], category="image", version="1.0.1"
)
class ExpandMaskWithFadeInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Expands a mask with a fade effect. The mask uses black to indicate areas to keep from the generated image and white for areas to discard.
The mask is thresholded to create a binary mask, and then a distance transform is applied to create a fade effect.
The fade size is specified in pixels, and the mask is expanded by that amount. The result is a mask with a smooth transition from black to white.
If the fade size is 0, the mask is returned as-is.
"""
mask: ImageField = InputField(description="The mask to expand")
threshold: int = InputField(default=0, ge=0, le=255, description="The threshold for the binary mask (0-255)")
fade_size_px: int = InputField(default=32, ge=0, description="The size of the fade in pixels")
def invoke(self, context: InvocationContext) -> ImageOutput:
pil_mask = context.images.get_pil(self.mask.image_name, mode="L")
if self.fade_size_px == 0:
# If the fade size is 0, just return the mask as-is.
image_dto = context.images.save(image=pil_mask, image_category=ImageCategory.MASK)
return ImageOutput.build(image_dto)
np_mask = numpy.array(pil_mask)
# Threshold the mask to create a binary mask - 0 for black, 255 for white
# If we don't threshold we can get some weird artifacts
np_mask = numpy.where(np_mask > self.threshold, 255, 0).astype(numpy.uint8)
# Create a mask for the black region (1 where black, 0 otherwise)
black_mask = (np_mask == 0).astype(numpy.uint8)
# Invert the black region
bg_mask = 1 - black_mask
# Create a distance transform of the inverted mask
dist = cv2.distanceTransform(bg_mask, cv2.DIST_L2, 5)
# Normalize distances so that pixels <fade_size_px become a linear gradient (0 to 1)
d_norm = numpy.clip(dist / self.fade_size_px, 0, 1)
# Control points: x values (normalized distance) and corresponding fade pct y values.
# There are some magic numbers here that are used to create a smooth transition:
# - The first point is at 0% of fade size from edge of mask (meaning the edge of the mask), and is 0% fade (black)
# - The second point is 1px from the edge of the mask and also has 0% fade, effectively expanding the mask
# by 1px. This fixes an issue where artifacts can occur at the edge of the mask
# - The third point is at 20% of the fade size from the edge of the mask and has 20% fade
# - The fourth point is at 80% of the fade size from the edge of the mask and has 90% fade
# - The last point is at 100% of the fade size from the edge of the mask and has 100% fade (white)
# x values: 0 = mask edge, 1 = fade_size_px from edge
x_control = numpy.array([0.0, 1.0 / self.fade_size_px, 0.2, 0.8, 1.0])
# y values: 0 = black, 1 = white
y_control = numpy.array([0.0, 0.0, 0.2, 0.9, 1.0])
# Fit a cubic polynomial that smoothly passes through the control points
coeffs = numpy.polyfit(x_control, y_control, 3)
poly = numpy.poly1d(coeffs)
# Evaluate the polynomial
feather = poly(d_norm)
# The polynomial fit isn't perfect. Points beyond the fade distance are likely to be slightly less than 1.0,
# even though the control points indicate that they should be exactly 1.0. This is due to the nature of the
# polynomial fit, which is a best approximation of the control points but not an exact match.
# When this occurs, the area outside the mask and fade-out will not be 100% transparent. For example, it may
# have an alpha value of 1 instead of 0. So we must force pixels at or beyond the fade distance to exactly 1.0.
# Force pixels at or beyond the fade distance to exactly 1.0
feather = numpy.where(d_norm >= 1.0, 1.0, feather)
# Clip any other values to ensure they're in the valid range [0,1]
feather = numpy.clip(feather, 0, 1)
# Build final image.
np_result = numpy.where(black_mask == 1, 0, (feather * 255).astype(numpy.uint8))
# Convert back to PIL, grayscale
pil_result = Image.fromarray(np_result.astype(numpy.uint8), mode="L")
image_dto = context.images.save(image=pil_result, image_category=ImageCategory.MASK)
return ImageOutput.build(image_dto)
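The fade works by taking a distance transform outward from the kept (black) region, normalizing by fade_size_px, mapping the normalized distance through a cubic fitted to the control points described above, and forcing everything at or beyond the fade distance to pure white. A standalone sketch with OpenCV and NumPy, assuming a grayscale uint8 mask and the same early return when fade_size_px is 0:

import cv2
import numpy

def expand_mask_with_fade(mask: numpy.ndarray, fade_size_px: int, threshold: int = 0) -> numpy.ndarray:
    # Binary mask: 0 = keep (black), 255 = discard (white).
    binary = numpy.where(mask > threshold, 255, 0).astype(numpy.uint8)
    if fade_size_px == 0:
        return binary
    black = (binary == 0).astype(numpy.uint8)
    # Distance from every non-black pixel to the nearest black pixel.
    dist = cv2.distanceTransform(1 - black, cv2.DIST_L2, 5)
    d_norm = numpy.clip(dist / fade_size_px, 0, 1)
    # Control points: hold black for ~1 px past the mask edge, then ease up to white.
    x = numpy.array([0.0, 1.0 / fade_size_px, 0.2, 0.8, 1.0])
    y = numpy.array([0.0, 0.0, 0.2, 0.9, 1.0])
    poly = numpy.poly1d(numpy.polyfit(x, y, 3))
    fade = numpy.clip(poly(d_norm), 0, 1)
    fade = numpy.where(d_norm >= 1.0, 1.0, fade)  # force pure white past the fade distance
    return numpy.where(black == 1, 0, (fade * 255)).astype(numpy.uint8)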
@invocation(
"apply_mask_to_image",
title="Apply Mask to Image",
tags=["image", "mask", "blend"],
category="image",
version="1.0.0",
)
class ApplyMaskToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
"""
Extracts a region from a generated image using a mask and blends it seamlessly onto a source image.
The mask uses black to indicate areas to keep from the generated image and white for areas to discard.
"""
image: ImageField = InputField(description="The image from which to extract the masked region")
mask: ImageField = InputField(description="The mask defining the region (black=keep, white=discard)")
invert_mask: bool = InputField(
default=False,
description="Whether to invert the mask before applying it",
)
def invoke(self, context: InvocationContext) -> ImageOutput:
# Load images
image = context.images.get_pil(self.image.image_name, mode="RGBA")
mask = context.images.get_pil(self.mask.image_name, mode="L")
if self.invert_mask:
# Invert the mask if requested
mask = ImageOps.invert(mask.copy())
# Combine the mask as the alpha channel of the image
r, g, b, _ = image.split() # Split the image into RGB and alpha channels
result_image = Image.merge("RGBA", (r, g, b, mask)) # Use the mask as the new alpha channel
# Save the resulting image
image_dto = context.images.save(image=result_image)
return ImageOutput.build(image_dto)
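Apply Mask to Image is effectively an alpha swap: the (optionally inverted) mask replaces the image's alpha channel. A minimal sketch, assuming a PIL image and an L-mode mask of the same size:

from PIL import Image, ImageOps

def apply_mask(image: Image.Image, mask: Image.Image, invert: bool = False) -> Image.Image:
    if invert:
        mask = ImageOps.invert(mask)
    r, g, b, _ = image.convert("RGBA").split()
    # The mask becomes the new alpha channel of the result.
    return Image.merge("RGBA", (r, g, b, mask.convert("L")))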
@invocation(
"img_noise",
title="Add Image Noise",
tags=["image", "noise"],
category="image",
version="1.0.1",
)
class ImageNoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Add noise to an image"""
image: ImageField = InputField(description="The image to add noise to")
seed: int = InputField(
default=0,
ge=0,
le=SEED_MAX,
description=FieldDescriptions.seed,
)
noise_type: Literal["gaussian", "salt_and_pepper"] = InputField(
default="gaussian",
description="The type of noise to add",
)
amount: float = InputField(default=0.1, ge=0, le=1, description="The amount of noise to add")
noise_color: bool = InputField(default=True, description="Whether to add colored noise")
size: int = InputField(default=1, ge=1, description="The size of the noise points")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name, mode="RGBA")
# Save out the alpha channel
alpha = image.getchannel("A")
# Set the seed for numpy random
rs = numpy.random.RandomState(numpy.random.MT19937(numpy.random.SeedSequence(self.seed)))
if self.noise_type == "gaussian":
if self.noise_color:
noise = rs.normal(0, 1, (image.height // self.size, image.width // self.size, 3)) * 255
else:
noise = rs.normal(0, 1, (image.height // self.size, image.width // self.size)) * 255
noise = numpy.stack([noise] * 3, axis=-1)
elif self.noise_type == "salt_and_pepper":
if self.noise_color:
noise = rs.choice(
[0, 255], (image.height // self.size, image.width // self.size, 3), p=[1 - self.amount, self.amount]
)
else:
noise = rs.choice(
[0, 255], (image.height // self.size, image.width // self.size), p=[1 - self.amount, self.amount]
)
noise = numpy.stack([noise] * 3, axis=-1)
noise = Image.fromarray(noise.astype(numpy.uint8), mode="RGB").resize(
(image.width, image.height), Image.Resampling.NEAREST
)
noisy_image = Image.blend(image.convert("RGB"), noise, self.amount).convert("RGBA")
# Paste back the alpha channel
noisy_image.putalpha(alpha)
image_dto = context.images.save(image=noisy_image)
return ImageOutput.build(image_dto)
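The noise node generates noise at a reduced resolution (height // size by width // size), upscales it with nearest-neighbour resampling so each noise point stays a size-by-size block, blends it with the RGB image, and then restores the original alpha. A sketch of the colored Gaussian case, mirroring the seeding used above (function name is illustrative):

import numpy
from PIL import Image

def add_gaussian_noise(image: Image.Image, seed: int = 0, amount: float = 0.1, size: int = 1) -> Image.Image:
    image = image.convert("RGBA")
    alpha = image.getchannel("A")
    rs = numpy.random.RandomState(numpy.random.MT19937(numpy.random.SeedSequence(seed)))
    # Generate noise at a reduced resolution so each noise "point" covers a size x size block.
    noise = rs.normal(0, 1, (image.height // size, image.width // size, 3)) * 255
    noise_img = Image.fromarray(noise.astype(numpy.uint8), mode="RGB").resize(
        (image.width, image.height), Image.Resampling.NEAREST
    )
    noisy = Image.blend(image.convert("RGB"), noise_img, amount).convert("RGBA")
    noisy.putalpha(alpha)  # keep the original transparency
    return noisy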
@invocation(
"crop_image_to_bounding_box",
title="Crop Image to Bounding Box",
category="image",
version="1.0.0",
tags=["image", "crop"],
)
class CropImageToBoundingBoxInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Crop an image to the given bounding box. If the bounding box is omitted, the image is cropped to the non-transparent pixels."""
image: ImageField = InputField(description="The image to crop")
bounding_box: BoundingBoxField | None = InputField(
default=None, description="The bounding box to crop the image to"
)
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name)
bounding_box = self.bounding_box.tuple() if self.bounding_box is not None else image.getbbox()
cropped_image = image.crop(bounding_box)
image_dto = context.images.save(image=cropped_image)
return ImageOutput.build(image_dto)
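When no bounding box is supplied, the crop falls back to PIL's getbbox(), which returns the bounding box of the non-zero region (for an RGBA image, everything outside the fully transparent black border). A minimal sketch of that fallback:

from PIL import Image

def crop_to_bbox(image: Image.Image, bbox: tuple[int, int, int, int] | None = None) -> Image.Image:
    box = bbox if bbox is not None else image.getbbox()
    # getbbox() returns None for a completely empty image; leave the image unchanged then.
    return image.crop(box) if box is not None else image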
@invocation(
"paste_image_into_bounding_box",
title="Paste Image into Bounding Box",
category="image",
version="1.0.0",
tags=["image", "crop"],
)
class PasteImageIntoBoundingBoxInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Paste the source image into the target image at the given bounding box.
The source image must be the same size as the bounding box, and the bounding box must fit within the target image."""
source_image: ImageField = InputField(description="The image to paste")
target_image: ImageField = InputField(description="The image to paste into")
bounding_box: BoundingBoxField = InputField(description="The bounding box to paste the image into")
def invoke(self, context: InvocationContext) -> ImageOutput:
source_image = context.images.get_pil(self.source_image.image_name, mode="RGBA")
target_image = context.images.get_pil(self.target_image.image_name, mode="RGBA")
bounding_box = self.bounding_box.tuple()
target_image.paste(source_image, bounding_box, source_image)
image_dto = context.images.save(image=target_image)
return ImageOutput.build(image_dto)


@@ -0,0 +1,59 @@
from pydantic import ValidationInfo, field_validator
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import InputField, OutputField
from invokeai.app.services.shared.invocation_context import InvocationContext
@invocation_output("image_panel_coordinate_output")
class ImagePanelCoordinateOutput(BaseInvocationOutput):
x_left: int = OutputField(description="The left x-coordinate of the panel.")
y_top: int = OutputField(description="The top y-coordinate of the panel.")
width: int = OutputField(description="The width of the panel.")
height: int = OutputField(description="The height of the panel.")
@invocation(
"image_panel_layout",
title="Image Panel Layout",
tags=["image", "panel", "layout"],
category="image",
version="1.0.0",
classification=Classification.Prototype,
)
class ImagePanelLayoutInvocation(BaseInvocation):
"""Get the coordinates of a single panel in a grid. (If the full image shape cannot be divided evenly into panels,
then the grid may not cover the entire image.)
"""
width: int = InputField(description="The width of the entire grid.")
height: int = InputField(description="The height of the entire grid.")
num_cols: int = InputField(ge=1, default=1, description="The number of columns in the grid.")
num_rows: int = InputField(ge=1, default=1, description="The number of rows in the grid.")
panel_col_idx: int = InputField(ge=0, default=0, description="The column index of the panel to be processed.")
panel_row_idx: int = InputField(ge=0, default=0, description="The row index of the panel to be processed.")
@field_validator("panel_col_idx")
def validate_panel_col_idx(cls, v: int, info: ValidationInfo) -> int:
if v < 0 or v >= info.data["num_cols"]:
raise ValueError(f"panel_col_idx must be between 0 and {info.data['num_cols'] - 1}")
return v
@field_validator("panel_row_idx")
def validate_panel_row_idx(cls, v: int, info: ValidationInfo) -> int:
if v < 0 or v >= info.data["num_rows"]:
raise ValueError(f"panel_row_idx must be between 0 and {info.data['num_rows'] - 1}")
return v
def invoke(self, context: InvocationContext) -> ImagePanelCoordinateOutput:
x_left = self.panel_col_idx * (self.width // self.num_cols)
y_top = self.panel_row_idx * (self.height // self.num_rows)
width = self.width // self.num_cols
height = self.height // self.num_rows
return ImagePanelCoordinateOutput(x_left=x_left, y_top=y_top, width=width, height=height)
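The panel coordinates are plain integer arithmetic: each panel is width // num_cols by height // num_rows, and its top-left corner is the column/row index times that size, so a grid that does not divide evenly leaves a few uncovered pixels on the right and bottom edges. A worked example with assumed dimensions:

# Assumed 1000x800 grid split into 3 columns x 2 rows.
width, height, num_cols, num_rows = 1000, 800, 3, 2
panel_w, panel_h = width // num_cols, height // num_rows  # 333 x 400
for row in range(num_rows):
    for col in range(num_cols):
        x_left, y_top = col * panel_w, row * panel_h
        print(f"panel ({row},{col}): x={x_left}, y={y_top}, w={panel_w}, h={panel_h}")
# Columns start at x = 0, 333, 666 and span 333 px each, so the final pixel
# column (x = 999) is never covered, matching the docstring caveat above.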


@@ -13,7 +13,7 @@ from diffusers.models.autoencoders.autoencoder_kl import AutoencoderKL
from diffusers.models.autoencoders.autoencoder_tiny import AutoencoderTiny
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.constants import DEFAULT_PRECISION, LATENT_SCALE_FACTOR
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import (
FieldDescriptions,
ImageField,
@@ -26,14 +26,15 @@ from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager import LoadedModel
from invokeai.backend.stable_diffusion.diffusers_pipeline import image_resized_to_grid_as_tensor
from invokeai.backend.stable_diffusion.vae_tiling import patch_vae_tiling_params
from invokeai.backend.util.devices import TorchDevice
@invocation(
"i2l",
title="Image to Latents",
title="Image to Latents - SD1.5, SDXL",
tags=["latents", "image", "vae", "i2l"],
category="latents",
version="1.1.0",
version="1.1.1",
)
class ImageToLatentsInvocation(BaseInvocation):
"""Encodes an image into latents."""
@@ -49,7 +50,7 @@ class ImageToLatentsInvocation(BaseInvocation):
# NOTE: tile_size = 0 is a special value. We use this rather than `int | None`, because the workflow UI does not
# offer a way to directly set None values.
tile_size: int = InputField(default=0, multiple_of=8, description=FieldDescriptions.vae_tile_size)
fp32: bool = InputField(default=DEFAULT_PRECISION == torch.float32, description=FieldDescriptions.fp32)
fp32: bool = InputField(default=False, description=FieldDescriptions.fp32)
@staticmethod
def vae_encode(
@@ -98,7 +99,7 @@ class ImageToLatentsInvocation(BaseInvocation):
)
# non_noised_latents_from_image
image_tensor = image_tensor.to(device=vae.device, dtype=vae.dtype)
image_tensor = image_tensor.to(device=TorchDevice.choose_torch_device(), dtype=vae.dtype)
with torch.inference_mode(), tiling_context:
latents = ImageToLatentsInvocation._encode_to_tensor(vae, image_tensor)


@@ -127,13 +127,16 @@ class InfillPatchMatchInvocation(InfillImageProcessorInvocation):
return infilled
LAMA_MODEL_URL = "https://github.com/Sanster/models/releases/download/add_big_lama/big-lama.pt"
@invocation("infill_lama", title="LaMa Infill", tags=["image", "inpaint"], category="inpaint", version="1.2.2")
class LaMaInfillInvocation(InfillImageProcessorInvocation):
"""Infills transparent areas of an image using the LaMa model"""
def infill(self, image: Image.Image):
with self._context.models.load_remote_model(
source="https://github.com/Sanster/models/releases/download/add_big_lama/big-lama.pt",
source=LAMA_MODEL_URL,
loader=LaMA.load_jit_model,
) as model:
lama = LaMA(model)


@@ -13,10 +13,8 @@ from invokeai.app.services.model_records.model_records_base import ModelRecordCh
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager.config import (
AnyModelConfig,
BaseModelType,
IPAdapterCheckpointConfig,
IPAdapterInvokeAIConfig,
ModelType,
)
from invokeai.backend.model_manager.starter_models import (
StarterModel,
@@ -24,6 +22,7 @@ from invokeai.backend.model_manager.starter_models import (
ip_adapter_sd_image_encoder,
ip_adapter_sdxl_image_encoder,
)
from invokeai.backend.model_manager.taxonomy import BaseModelType, ModelType
class IPAdapterField(BaseModel):
@@ -69,7 +68,13 @@ CLIP_VISION_MODEL_MAP: dict[Literal["ViT-L", "ViT-H", "ViT-G"], StarterModel] =
}
@invocation("ip_adapter", title="IP-Adapter", tags=["ip_adapter", "control"], category="ip_adapter", version="1.5.0")
@invocation(
"ip_adapter",
title="IP-Adapter - SD1.5, SDXL",
tags=["ip_adapter", "control"],
category="ip_adapter",
version="1.5.1",
)
class IPAdapterInvocation(BaseInvocation):
"""Collects IP-Adapter info to pass to other nodes."""


@@ -12,7 +12,7 @@ from diffusers.models.autoencoders.autoencoder_kl import AutoencoderKL
from diffusers.models.autoencoders.autoencoder_tiny import AutoencoderTiny
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.constants import DEFAULT_PRECISION, LATENT_SCALE_FACTOR
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import (
FieldDescriptions,
Input,
@@ -31,10 +31,10 @@ from invokeai.backend.util.devices import TorchDevice
@invocation(
"l2i",
title="Latents to Image",
title="Latents to Image - SD1.5, SDXL",
tags=["latents", "image", "vae", "l2i"],
category="latents",
version="1.3.0",
version="1.3.2",
)
class LatentsToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Generates an image from latents."""
@@ -51,18 +51,58 @@ class LatentsToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
# NOTE: tile_size = 0 is a special value. We use this rather than `int | None`, because the workflow UI does not
# offer a way to directly set None values.
tile_size: int = InputField(default=0, multiple_of=8, description=FieldDescriptions.vae_tile_size)
fp32: bool = InputField(default=DEFAULT_PRECISION == torch.float32, description=FieldDescriptions.fp32)
fp32: bool = InputField(default=False, description=FieldDescriptions.fp32)
def _estimate_working_memory(
self, latents: torch.Tensor, use_tiling: bool, vae: AutoencoderKL | AutoencoderTiny
) -> int:
"""Estimate the working memory required by the invocation in bytes."""
# It was found experimentally that the peak working memory scales linearly with the number of pixels and the
# element size (precision). This estimate is accurate for both SD1 and SDXL.
element_size = 4 if self.fp32 else 2
scaling_constant = 2200 # Determined experimentally.
if use_tiling:
tile_size = self.tile_size
if tile_size == 0:
tile_size = vae.tile_sample_min_size
assert isinstance(tile_size, int)
out_h = tile_size
out_w = tile_size
working_memory = out_h * out_w * element_size * scaling_constant
# We add 25% to the working memory estimate when tiling is enabled to account for factors like tile overlap
# and number of tiles. We could make this more precise in the future, but this should be good enough for
# most use cases.
working_memory = working_memory * 1.25
else:
out_h = LATENT_SCALE_FACTOR * latents.shape[-2]
out_w = LATENT_SCALE_FACTOR * latents.shape[-1]
working_memory = out_h * out_w * element_size * scaling_constant
if self.fp32:
# If we are running in FP32, then we should account for the likely increase in model size (~250MB).
working_memory += 250 * 2**20
return int(working_memory)
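Plugging assumed numbers into the estimate above makes the scale concrete: a 1024x1024 fp16 decode without tiling comes out at roughly 4.3 GiB of working memory.

# Worked example of the working-memory estimate (assumed values: a 1024x1024
# decode, i.e. 128x128 latents with LATENT_SCALE_FACTOR = 8, no tiling).
LATENT_SCALE_FACTOR = 8
scaling_constant = 2200          # determined experimentally, per the comment above
latent_h = latent_w = 128
element_size = 2                 # fp16
out_h = LATENT_SCALE_FACTOR * latent_h   # 1024
out_w = LATENT_SCALE_FACTOR * latent_w   # 1024
working_memory = out_h * out_w * element_size * scaling_constant
print(f"{working_memory / 2**30:.1f} GiB")   # ~4.3 GiB
# With tiling enabled, the same formula uses the tile dimensions instead and
# adds 25%; fp32 doubles element_size and adds a further 250 MiB.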
@torch.no_grad()
def invoke(self, context: InvocationContext) -> ImageOutput:
latents = context.tensors.load(self.latents.latents_name)
use_tiling = self.tiled or context.config.get().force_tiled_decode
vae_info = context.models.load(self.vae.vae)
assert isinstance(vae_info.model, (AutoencoderKL, AutoencoderTiny))
with SeamlessExt.static_patch_model(vae_info.model, self.vae.seamless_axes), vae_info as vae:
estimated_working_memory = self._estimate_working_memory(latents, use_tiling, vae_info.model)
with (
SeamlessExt.static_patch_model(vae_info.model, self.vae.seamless_axes),
vae_info.model_on_device(working_mem_bytes=estimated_working_memory) as (_, vae),
):
context.util.signal_progress("Running VAE decoder")
assert isinstance(vae, (AutoencoderKL, AutoencoderTiny))
latents = latents.to(vae.device)
latents = latents.to(TorchDevice.choose_torch_device())
if self.fp32:
vae.to(dtype=torch.float32)
@@ -88,7 +128,7 @@ class LatentsToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
vae.to(dtype=torch.float16)
latents = latents.half()
if self.tiled or context.config.get().force_tiled_decode:
if use_tiling:
vae.enable_tiling()
else:
vae.disable_tiling()


@@ -0,0 +1,75 @@
from typing import Any
import torch
from PIL.Image import Image
from pydantic import field_validator
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration, LlavaOnevisionProcessor
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.fields import FieldDescriptions, ImageField, InputField, UIComponent, UIType
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.invocations.primitives import StringOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.llava_onevision_pipeline import LlavaOnevisionPipeline
from invokeai.backend.util.devices import TorchDevice
@invocation(
"llava_onevision_vllm",
title="LLaVA OneVision VLLM",
tags=["vllm"],
category="vllm",
version="1.0.0",
classification=Classification.Beta,
)
class LlavaOnevisionVllmInvocation(BaseInvocation):
"""Run a LLaVA OneVision VLLM model."""
images: list[ImageField] | ImageField | None = InputField(default=None, max_length=3, description="Input image.")
prompt: str = InputField(
default="",
description="Input text prompt.",
ui_component=UIComponent.Textarea,
)
vllm_model: ModelIdentifierField = InputField(
title="LLaVA Model Type",
description=FieldDescriptions.vllm_model,
ui_type=UIType.LlavaOnevisionModel,
)
@field_validator("images", mode="before")
def listify_images(cls, v: Any) -> list:
if v is None:
return v
if not isinstance(v, list):
return [v]
return v
def _get_images(self, context: InvocationContext) -> list[Image]:
if self.images is None:
return []
image_fields = self.images if isinstance(self.images, list) else [self.images]
return [context.images.get_pil(image_field.image_name, "RGB") for image_field in image_fields]
@torch.no_grad()
def invoke(self, context: InvocationContext) -> StringOutput:
images = self._get_images(context)
model_config = context.models.get_config(self.vllm_model)
with context.models.load(self.vllm_model).model_on_device() as (_, model):
assert isinstance(model, LlavaOnevisionForConditionalGeneration)
model_abs_path = context.models.get_absolute_path(model_config)
processor = AutoProcessor.from_pretrained(model_abs_path, local_files_only=True)
assert isinstance(processor, LlavaOnevisionProcessor)
model = LlavaOnevisionPipeline(model, processor)
output = model.run(
prompt=self.prompt,
images=images,
device=TorchDevice.choose_torch_device(),
dtype=TorchDevice.choose_torch_dtype(),
)
return StringOutput(value=output)
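The images field accepts either a single ImageField or a list; the mode="before" validator normalizes a scalar into a one-element list before Pydantic checks max_length. A minimal Pydantic sketch of the same pattern (field names are illustrative):

from typing import Any
from pydantic import BaseModel, field_validator

class Example(BaseModel):
    images: list[str] | str | None = None

    @field_validator("images", mode="before")
    def listify(cls, v: Any) -> Any:
        # Wrap a bare scalar in a list so downstream validation sees list[str].
        if v is None or isinstance(v, list):
            return v
        return [v]

print(Example(images="cat.png").images)            # ['cat.png']
print(Example(images=["a.png", "b.png"]).images)   # ['a.png', 'b.png']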


@@ -0,0 +1,83 @@
import logging
import shutil
import sys
import traceback
from importlib.util import module_from_spec, spec_from_file_location
from pathlib import Path
def load_custom_nodes(custom_nodes_path: Path, logger: logging.Logger):
"""
Loads all custom nodes from the custom_nodes_path directory.
If custom_nodes_path does not exist, it creates it.
It also copies the custom_nodes/README.md file to the custom_nodes_path directory. Because this file may change,
it is _always_ copied to the custom_nodes_path directory.
Then, it crawls the custom_nodes_path directory and imports all top-level directories as python modules.
If the directory does not contain an __init__.py file or starts with an `_` or `.`, it is skipped.
"""
# create the custom nodes directory if it does not exist
custom_nodes_path.mkdir(parents=True, exist_ok=True)
# Copy the README file to the custom nodes directory
source_custom_nodes_readme_path = Path(__file__).parent / "custom_nodes/README.md"
target_custom_nodes_readme_path = Path(custom_nodes_path) / "README.md"
# copy our custom nodes README to the custom nodes directory
shutil.copy(source_custom_nodes_readme_path, target_custom_nodes_readme_path)
loaded_packs: list[str] = []
failed_packs: list[str] = []
# Import custom nodes, see https://docs.python.org/3/library/importlib.html#importing-programmatically
for d in custom_nodes_path.iterdir():
# skip files
if not d.is_dir():
continue
# skip hidden directories
if d.name.startswith("_") or d.name.startswith("."):
continue
# skip directories without an `__init__.py`
init = d / "__init__.py"
if not init.exists():
continue
module_name = init.parent.stem
# skip if already imported
if module_name in globals():
continue
# load the module
spec = spec_from_file_location(module_name, init.absolute())
if spec is None or spec.loader is None:
logger.warning(f"Could not load {init}")
continue
logger.info(f"Loading node pack {module_name}")
try:
module = module_from_spec(spec)
sys.modules[spec.name] = module
spec.loader.exec_module(module)
loaded_packs.append(module_name)
except Exception:
failed_packs.append(module_name)
full_error = traceback.format_exc()
logger.error(f"Failed to load node pack {module_name} (may have partially loaded):\n{full_error}")
del init, module_name
loaded_count = len(loaded_packs)
if loaded_count > 0:
logger.info(
f"Loaded {loaded_count} node pack{'s' if loaded_count != 1 else ''} from {custom_nodes_path}: {', '.join(loaded_packs)}"
)
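The loader follows the standard importlib recipe: build a spec from the pack's __init__.py, create a module from it, register it in sys.modules, then execute it, recording failures per pack. A minimal sketch of importing a single pack by path (the directory name is illustrative):

import sys
from importlib.util import module_from_spec, spec_from_file_location
from pathlib import Path

def import_pack(pack_dir: Path):
    init = pack_dir / "__init__.py"
    spec = spec_from_file_location(pack_dir.stem, init.absolute())
    if spec is None or spec.loader is None:
        raise ImportError(f"Could not load {init}")
    module = module_from_spec(spec)
    sys.modules[spec.name] = module  # register before executing, as importlib expects
    spec.loader.exec_module(module)
    return module

# module = import_pack(Path("~/invokeai/nodes/my_pack").expanduser())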

Some files were not shown because too many files have changed in this diff.