Compare commits

386 Commits

Author SHA1 Message Date
psychedelicious
ac012721b0 feat(ui): iterate on simple tab 2025-05-14 17:58:17 +10:00
psychedelicious
9706df02d4 feat(ui): rough out simple generation tab state (wip) 2025-05-14 10:46:38 +10:00
Riku
7722f479e8 translationBot(ui): update translation (German)
Currently translated at 64.9% (1236 of 1902 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2025-05-14 10:32:24 +10:00
Linos
3ad4072183 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1904 of 1904 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1902 of 1902 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-05-14 10:32:24 +10:00
Hosted Weblate
6dfb9a1906 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-05-14 10:32:24 +10:00
RyoKoba
ad2924350d translationBot(ui): update translation (Japanese)
Currently translated at 67.1% (1279 of 1904 strings)

translationBot(ui): update translation (Japanese)

Currently translated at 64.9% (1231 of 1895 strings)

translationBot(ui): update translation (Japanese)

Currently translated at 60.2% (1141 of 1895 strings)

translationBot(ui): update translation (Japanese)

Currently translated at 56.7% (1075 of 1895 strings)

Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
2025-05-14 10:32:24 +10:00
Linos
3bf51ee0c2 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1896 of 1896 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1895 of 1895 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1886 of 1886 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-05-14 10:32:24 +10:00
Hosted Weblate
fce5051dcc translationBot(ui): update translation files
Updated by "Remove blank strings" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-05-14 10:32:24 +10:00
Riccardo Giovanetti
446d8818b9 translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1883 of 1904 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.8% (1882 of 1903 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.8% (1881 of 1902 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.8% (1878 of 1899 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.8% (1874 of 1895 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.8% (1873 of 1895 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.8% (1864 of 1886 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-05-14 10:32:24 +10:00
psychedelicious
1566e29c19 feat(nodes): tidy some type annotations in baseinvocation 2025-05-14 06:55:15 +10:00
psychedelicious
6a2e35f2c4 feat(nodes): store original field annotation & FieldInfo in invocations 2025-05-14 06:55:15 +10:00
psychedelicious
b6d58774f4 feat(nodes): improved error messages for invalid defaults 2025-05-14 06:55:15 +10:00
psychedelicious
758f94d3c6 chore(ui): typegen 2025-05-14 06:55:15 +10:00
psychedelicious
9df0871754 fix(nodes): do not provide invalid defaults for batch nodes 2025-05-14 06:55:15 +10:00
psychedelicious
3011150a3a feat(nodes): validate default values for all fields
This prevents issues where the node is defined with an invalid default value, which would guarantee an error during a ser/de roundtrip.

- Upstream issue requesting this functionality be built-in to pydantic: https://github.com/pydantic/pydantic/issues/8722
- Upstream PR that implements the functionality: https://github.com/pydantic/pydantic-core/pull/1593
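
As a rough illustration of the idea, and not the implementation in this commit (which relies on pydantic), a definition-time check might look like the sketch below. `check_default` and the `steps` field are hypothetical, and the check only handles plain classes as annotations.

```python
from typing import Any


def check_default(field_name: str, annotation: type, default: Any) -> None:
    """Raise at definition time if a default does not match its annotation.

    Hypothetical helper for illustration; handles only plain classes.
    """
    if not isinstance(default, annotation):
        raise TypeError(
            f"Field {field_name!r}: default {default!r} "
            f"is not a valid {annotation.__name__}"
        )


check_default("steps", int, 30)  # valid default, passes silently

try:
    check_default("steps", int, "thirty")  # invalid default
    invalid_default_caught = False
except TypeError:
    invalid_default_caught = True
```

Failing when the node is defined surfaces the problem immediately, rather than later during a ser/de roundtrip.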
2025-05-14 06:55:15 +10:00
psychedelicious
05aa1fce71 chore(ui): typegen 2025-05-14 06:55:15 +10:00
psychedelicious
df81f3274a feat(nodes): improved pydantic type annotation massaging
When we do our field type overrides to allow invocations to be instantiated without all required fields, we were not modifying the annotation of the field but did set the default value of the field to `None`.

This results in an error when doing a ser/de round trip. Here's what we end up doing:

```py
from pydantic import BaseModel, Field

class MyModel(BaseModel):
    foo: str = Field(default=None)
```

And here is a simple round-trip, which should not error but which does:

```py
MyModel(**MyModel().model_dump())
# ValidationError: 1 validation error for MyModel
# foo
#   Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
#     For further information visit https://errors.pydantic.dev/2.11/v/string_type
```

To fix this, we now check every incoming field and update its annotation to match its default value. In other words, when we override the default field value to `None`, we make its type annotation `<original type> | None`.

This prevents the error during deserialization.
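
The annotation update described above can be sketched with the standard library alone. This is a minimal sketch, not the code from this commit: `widen_for_default` is a hypothetical helper, and it only handles the `None` default case.

```python
from typing import Any, Optional, get_args


def widen_for_default(annotation: Any, default: Any) -> Any:
    """Return an annotation that also accepts the overridden default.

    Hypothetical helper: only the `None` default case is handled.
    """
    if default is None and type(None) not in get_args(annotation):
        return Optional[annotation]
    return annotation


# `str` with a default of None becomes `Optional[str]`.
widened = widen_for_default(str, None)

# An annotation that already accepts None is left alone.
unchanged = widen_for_default(Optional[int], None)
```

With the widened annotation, a dumped `None` value validates on the way back in, so the round-trip no longer fails.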

This slightly alters the schema for all invocations and outputs - the values of all fields without default values are now typed as `<original type> | None`, reflecting the overrides.

This means the autogenerated types for fields have also changed for fields without defaults:

```ts
// Old
image?: components["schemas"]["ImageField"];

// New
image?: components["schemas"]["ImageField"] | null;
```

This does not break anything on the frontend.
2025-05-14 06:55:15 +10:00
psychedelicious
143487a492 chore: bump version to v5.11.0 2025-05-13 14:04:45 +10:00
psychedelicious
203fa04295 feat(nodes): support bottleneck flag for nodes 2025-05-13 11:56:40 +10:00
Mary Hipp Rogers
954fce3c67 feat(ui): custom error toast support (#8001)
* support for custom error toast components, starting with usage limit

* add support for all usage limits

---------

Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
2025-05-08 15:53:10 -04:00
Mary Hipp
821889148a easier way to override Whats New 2025-05-07 15:40:21 -04:00
Mary Hipp
4c248d8c2c refetch queue list on mount 2025-05-07 15:37:55 -04:00
Mary Hipp
deb75805d4 use the max for iterations passed in 2025-05-06 18:26:40 -04:00
Mary Hipp Rogers
93110654da Change feature to disable apiModels to chatGPT4oModels only (#7996)
* display credit column in queue list if shouldShowCredits is true

* change apiModels feature to chatGPT4oModels feature

* empty

---------

Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
2025-05-06 14:37:03 -04:00
psychedelicious
ff0c48d532 chore(ui): prettier 2025-05-06 09:07:52 -04:00
psychedelicious
de18073814 feat(ui): support imagen3/chatgpt-4o models in canvas 2025-05-06 09:07:52 -04:00
psychedelicious
0708af9545 feat(ui): support imagen3/chatgpt-4o models in workflow editor 2025-05-06 09:07:52 -04:00
psychedelicious
1e85184c62 feat(nodes): add imagen3/chatgpt-4o field types 2025-05-06 09:07:52 -04:00
psychedelicious
11d3b8d944 feat(ui): add usage info to model picker 2025-05-06 09:07:52 -04:00
psychedelicious
bffd4afb96 chore(ui): typegen 2025-05-06 09:07:52 -04:00
psychedelicious
518a896521 feat(mm): add usage_info to model config 2025-05-06 09:07:52 -04:00
psychedelicious
2647ff141a feat(ui): add basic metadata to imagen3/chatgpt-4o graphs 2025-05-06 09:07:52 -04:00
Mary Hipp Rogers
ba0bac2aa5 add credits to queue item status changed (#7993)
* display credit column in queue list if shouldShowCredits is true

* add credits when queue item status changes

* chore(ui): typegen

---------

Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2025-05-06 08:54:44 -04:00
psychedelicious
862e2a3e49 chore(ui): typegen 2025-05-05 16:09:13 -04:00
Mary Hipp
d22fd32b05 typegen 2025-05-05 16:09:13 -04:00
Mary Hipp
391e5b7f8c update schema 2025-05-05 16:09:13 -04:00
Mary Hipp
c9d2a5f59a display credit column in queue list if shouldShowCredits is true 2025-05-05 16:09:13 -04:00
Kent Keirsey
1f63b60021 Implementing support for Non-Standard LoRA Format (#7985)
* integrate loRA

* idk anymore tbh

* enable fused matrix for quantized models

* integrate loRA

* idk anymore tbh

* enable fused matrix for quantized models

* ruff fix

---------

Co-authored-by: Sam <bhaskarmdutt@gmail.com>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2025-05-05 09:40:38 -04:00
psychedelicious
a499b9f54e chore: bump version to v5.11.0rc2 2025-05-05 23:32:27 +10:00
psychedelicious
104505ea02 chore(ui): lint 2025-05-05 23:25:29 +10:00
psychedelicious
ee4002607c feat(ui): add UI to reset hf token 2025-05-05 23:25:29 +10:00
psychedelicious
fd20582cdd chore(ui): typegen 2025-05-05 23:25:29 +10:00
psychedelicious
43b0d07517 feat(api): add route to reset hf token 2025-05-05 23:25:29 +10:00
blessedcoolant
f83592a052 fix: deprecation warning in get_iso_timestamp 2025-05-05 11:45:30 +10:00
Mary Hipp
b3ee906749 add prompt validation to imagen3 graph 2025-05-01 13:02:13 -04:00
psychedelicious
5d69e9068a feat(ui): add ability to globally disable hotkeys
This will both hide the hotkey from the hotkey modal and override any other enabled status it has.
2025-05-01 10:50:34 -04:00
psychedelicious
a79136b058 fix(ui): always add selectModelsTab hotkey data to prevent unhandled exception while registering the hotkey handler 2025-05-01 10:50:34 -04:00
psychedelicious
944af4d4a9 feat(ui): show unsupported gen mode toasts as warnings instead of errors 2025-05-01 23:25:01 +10:00
psychedelicious
5e001be73a tidy(ui): remove excessive nav to mm buttons 2025-05-01 23:22:19 +10:00
psychedelicious
576a644b3a tidy(ui): modelpicker component 2025-05-01 23:22:19 +10:00
psychedelicious
703557c8a6 feat(ui): cleanup 2025-05-01 23:22:19 +10:00
psychedelicious
d59a53b3f9 feat(ui): simplify picker types 2025-05-01 23:22:19 +10:00
psychedelicious
7b8f78c2d9 fix(ui): focus bug w/ popover 2025-05-01 23:22:19 +10:00
psychedelicious
31ab9be79a feat(ui): iterate on picker 2025-05-01 23:22:19 +10:00
psychedelicious
5011fab85d fix(ui): restore FLUX Dev info popover to main model picker 2025-05-01 10:59:51 +10:00
psychedelicious
92bdb9fdcc chore(ui): remove unused exports 2025-05-01 10:59:51 +10:00
Mary Hipp
548e766c0b feat(ui): ability to disable generating with API models 2025-05-01 10:59:51 +10:00
Mary Hipp
ff897f74a1 send the list of reference images reversed to chatGPT so it matches displayed order 2025-04-30 15:56:38 -04:00
psychedelicious
3d29c996ed feat(ui): support img2img for chatgpt 4o w/ ref images 2025-04-30 13:39:05 +10:00
psychedelicious
42d57d1225 fix(ui): ref image layout 2025-04-30 13:39:05 +10:00
psychedelicious
193fa9395a fix(ui): match ref image model to main model when creating global ref image 2025-04-30 13:39:05 +10:00
psychedelicious
56cd839d5b feat(ui): support for ref images for chatgpt on canvas 2025-04-30 13:39:05 +10:00
ubansi
7b446ee40d docs: fix Contribute node import error
When I followed the Contribute Node documentation, I encountered an import error.
This commit fixes the error, which will help reduce debugging time for all future contributors.
2025-04-29 21:03:00 -04:00
Mary Hipp Rogers
17027c4070 Maryhipp/chatgpt UI (#7969)
* add GPTimage1 as allowed base model

* fix for non-disabled inpaint layers

* lots of boilerplate for adding gpt-image base model and disabling things along with imagen

* handle gpt-image dimensions

* build graph for gpt-image

* lint

* feat(ui): make chatgpt model naming consistent

* feat(ui): graph builder naming

* feat(ui): disable img2img for imagen3

* feat(ui): more naming

* feat(ui): support presigned url prefetch

* feat(ui): disable neg prompt for chatgpt

* docs(ui): update docstring

* feat(ui): fix graph building issues for chatgpt

* fix(ui): node ids for chatgpt/imagen

* chore(ui): typegen

---------

Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2025-04-29 09:38:03 -04:00
psychedelicious
13d44f47ce chore(ui): prettier 2025-04-29 09:12:49 +10:00
psychedelicious
550fbdeb1c fix(ui): more types fixes 2025-04-29 09:12:49 +10:00
psychedelicious
a01cd7c497 fix(ui): add chatgpt-4o to zod schemas that need to match autogenerated types 2025-04-29 09:12:49 +10:00
Mary Hipp
c54afd600c typegen 2025-04-29 09:12:49 +10:00
Mary Hipp
4f911a0ea8 typegen 2025-04-29 09:12:49 +10:00
Mary Hipp
fb91f48722 change base model for chatGPT 4o 2025-04-29 09:12:49 +10:00
psychedelicious
69db60a614 fix(ui): toast typo 2025-04-29 06:56:36 +10:00
Mary Hipp
c6d7f951aa typegen 2025-04-28 15:39:11 -04:00
Mary Hipp
04c005284c add gpt-image to possible base model types 2025-04-28 15:39:11 -04:00
psychedelicious
2d7f9697bf chore(ui): lint 2025-04-28 13:31:26 -04:00
psychedelicious
ae530492a2 chore(ui): typegen 2025-04-28 13:31:26 -04:00
psychedelicious
87ed1e3b6d feat(ui): do not allow imagen3 nodes in published workflows 2025-04-28 13:31:26 -04:00
psychedelicious
cc54466db9 fix(nodes): default value for UIConfigBase.tags 2025-04-28 13:31:26 -04:00
psychedelicious
cbdafe7e38 feat(nodes): allow node clobbering 2025-04-28 13:31:26 -04:00
psychedelicious
112cb76174 fix: random seed for edit mode imagen 2025-04-28 13:31:26 -04:00
psychedelicious
e56d41ab99 feat: rip out enhance prompt as toggleable option, imagen always randomizes seed 2025-04-28 13:31:26 -04:00
psychedelicious
273dfd86ab fix(ui): upscale builder 2025-04-28 13:31:26 -04:00
psychedelicious
871271fde5 feat(ui): rough out imagen3 support for canvas 2025-04-28 13:31:26 -04:00
psychedelicious
14944872c4 feat(mm): add model taxonomy for API models & Imagen3 as base model type 2025-04-28 13:31:26 -04:00
psychedelicious
07bcf3c446 feat(ui): port bbox select to native select 2025-04-28 13:31:26 -04:00
psychedelicious
8ed5585285 feat(nodes): move output metadata to BaseInvocationOutput 2025-04-28 09:19:43 -04:00
psychedelicious
5ce226a467 chore(ui): typegen 2025-04-28 09:19:43 -04:00
Mary Hipp
c64f20a72b remove output_metadata from schema 2025-04-28 09:19:43 -04:00
Mary Hipp
0c9c10a03a update schema 2025-04-28 09:19:43 -04:00
Mary Hipp
4a0df6b865 add optional output_metadata to baseinvocation 2025-04-28 09:19:43 -04:00
psychedelicious
ba165572bf chore: bump version to v5.11.0rc1 2025-04-28 10:10:50 +10:00
psychedelicious
c3d6a10603 fix(ui): handle minor breaking typing change from serialize-error 2025-04-28 09:53:08 +10:00
psychedelicious
4efc86299d fix(ui): type error in SettingsUpsellMenuItem 2025-04-28 09:53:08 +10:00
psychedelicious
e8c7cf63fd fix(ui): type error in canvas worker 2025-04-28 09:53:08 +10:00
psychedelicious
698b034190 chore(ui): bump deps 2025-04-28 09:53:08 +10:00
psychedelicious
3988128c40 feat(ui): add _all_ image outputs to gallery (including collections) 2025-04-28 09:49:04 +10:00
psychedelicious
c768f47365 fix(ui): dnd autoscroll in scrollable containers 2025-04-28 09:46:38 +10:00
psychedelicious
19a63abc54 fix(ui): hide file size on model picker when it is zero 2025-04-23 17:45:09 +10:00
psychedelicious
75ec36bf9a chore(ui): lint 2025-04-23 17:45:09 +10:00
psychedelicious
d802f8e7fb feat(ui): disable search when no options 2025-04-23 17:45:09 +10:00
psychedelicious
6873e0308d feat(ui): custom fallback for model picker when no models installed 2025-04-23 17:45:09 +10:00
psychedelicious
66eb73088e feat(ui): rename user-provided extra ctx for picker from ctx to extra to be less confusing 2025-04-23 17:45:09 +10:00
psychedelicious
ed81a13eb4 docs(ui): add some comments for picker 2025-04-23 17:45:09 +10:00
psychedelicious
fbc1aae52d feat(ui): more flexible fallbacks for model picker 2025-04-23 17:45:09 +10:00
psychedelicious
ba42c3e63f feat(ui): tooltip for compact/full model picker view 2025-04-23 17:45:09 +10:00
psychedelicious
b24e820aa0 fix(ui): flash of "select a model" when changing model 2025-04-23 17:45:09 +10:00
psychedelicious
e8f6b3b77a feat(ui): split out mainmodelpicker component 2025-04-23 17:45:09 +10:00
psychedelicious
8f13518c97 feat(ui): add clear search button to model combobox 2025-04-23 17:45:09 +10:00
psychedelicious
6afbc12074 feat(ui): when no model bases selected, show all models 2025-04-23 17:45:09 +10:00
psychedelicious
6b0a56ceb9 chore(ui): lint 2025-04-23 17:45:09 +10:00
psychedelicious
ca92497e52 feat(ui): remove description from model picker for now 2025-04-23 17:45:09 +10:00
psychedelicious
97d45ceaf2 feat(ui): model picker filter buttons 2025-04-23 17:45:09 +10:00
psychedelicious
aeb3841a6f feat(ui): wip model picker 2025-04-23 17:45:09 +10:00
psychedelicious
c14d33d3c1 tweak(ui): remove bg on ModelImage fallback 2025-04-23 17:45:09 +10:00
psychedelicious
676e59e072 chore(ui): bump react-resizable-panels to latest
This resolves a bug where SVG elements were ignored when checking when cursor is over a resize handle
2025-04-23 17:45:09 +10:00
psychedelicious
e7dcb6a03f feat(ui): wip model picker 2025-04-23 17:45:09 +10:00
psychedelicious
fb95b7cc2b feat(ui): wip model picker 2025-04-23 17:45:09 +10:00
psychedelicious
015dc3ac0d feat(ui): wip model picker 2025-04-23 17:45:09 +10:00
psychedelicious
9d8a71b362 feat(ui): genericizing picker 2025-04-23 17:45:09 +10:00
psychedelicious
2eb212f393 feat(ui): onSelectId -> onSelectById 2025-04-23 17:45:09 +10:00
psychedelicious
34b268c15c feat(ui): use context for stable picker state 2025-04-23 17:45:09 +10:00
psychedelicious
9a203a64dc feat(ui): render picker in portal 2025-04-23 17:45:09 +10:00
psychedelicious
d80004e056 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
de32ed23a7 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
5aed2b315d feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
48db6cfc4f feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
aa7c5c281a feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
87aeb7f889 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
3b3d6e413a feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
b6432f2de3 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
9d0a28ccae feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
c3bf0a3277 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
b516610c1e feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
677e717cd7 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
c52584e057 feat(ui): simplify ScrollableContent 2025-04-23 17:45:09 +10:00
psychedelicious
b6767441db feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
8745dbe67d feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
a565d9473e feat(ui): add useStateImperative 2025-04-23 17:45:09 +10:00
psychedelicious
4dbf07c3e0 feat(ui): iterate on model combobox (wip) 2025-04-23 17:45:09 +10:00
psychedelicious
f6eb4d9a6b feat(ui): toast on select for demo purposes 2025-04-23 17:45:09 +10:00
psychedelicious
5037967b82 feat(ui): just make the damn thing myself 2025-04-23 17:45:09 +10:00
psychedelicious
4930ba48ce feat(ui): just make the damn thing myself 2025-04-23 17:45:09 +10:00
psychedelicious
40d2092256 feat(ui): reworked model selection ui (WIP) 2025-04-23 17:45:09 +10:00
psychedelicious
d2e9237740 feat(ui): reworked model selection ui (WIP) 2025-04-23 17:45:09 +10:00
psychedelicious
b191b706c1 feat(ui): reworked model selection ui (WIP) 2025-04-23 17:45:09 +10:00
psychedelicious
4d0f760ec8 chore(ui): bump cmdk to latest 2025-04-23 17:45:09 +10:00
psychedelicious
65cda5365a feat(ui): remove go to mm button from node fields 2025-04-23 17:45:09 +10:00
psychedelicious
1f2d1d086f feat(ui): add <NavigateToModelManagerButton /> to model comboboxes everywhere 2025-04-23 17:45:09 +10:00
psychedelicious
418f3c3f19 feat(ui): abstract out workflow editor model combobox, ensure consistent ui for all model fields 2025-04-23 17:45:09 +10:00
psychedelicious
72173e284c fix(ui): useModelCombobox should use null for no value instead of undefined
This fixes an issue where the refiner combobox doesn't clear itself visually when clicking the little X icon to clear the selection.
2025-04-23 17:45:09 +10:00
psychedelicious
9cc13556aa feat(ui): accept callback to override navigate to model manager functionality
If provided, `<NavigateToModelManagerButton />` will render, even if `disabledTabs` includes "models". If provided, `<NavigateToModelManagerButton />` will run the callback instead of switching tabs within the studio.

The button's tooltip is now just "Manage Models" and its icon is the same as the model manager tab's icon ([CUBE!](https://www.youtube.com/watch?v=4aGDCE6Nrz0)).
2025-04-23 17:45:09 +10:00
psychedelicious
298444f2bc chore: bump version to v5.10.1 2025-04-19 00:05:02 +10:00
psychedelicious
deb1984289 fix(mm): disable new model probe API
There is a subtle change in behaviour with the new model probe API.

Previously, checks for model types were done in a specific order. For example, we did all main model checks before LoRA checks.

With the new API, the order of checks has changed. Check ordering is as follows:
- New API checks are run first, then legacy API checks.
- New API checks are categorized by their speed. When we run new API checks, we sort them from fastest to slowest, and run them in that order. This is a performance optimization.

Currently, LoRA and LLaVA models are the only model types with the new API. Checks for them are thus run first.

LoRA checks involve checking the state dict for presence of keys with specific prefixes. We expect these keys to only exist in LoRAs.

It turns out that main models may have some of these keys.

For example, this model has keys that match the LoRA prefix `lora_te_`: https://civitai.com/models/134442/helloyoung25d

Under the old probe, we'd do the main model checks first and correctly identify this as a main model. But with the new setup, we do the LoRA check first, and those pass. So we import this model as a LoRA.
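
The ordering problem can be reproduced with a small first-match classifier. This is a simplified sketch, not the real probe: the `lora_te_` prefix comes from the description above, while the `model.diffusion_model.` prefix, the key names, and every function name are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

StateDict = Dict[str, object]
Check = Tuple[str, Callable[[StateDict], bool]]


def looks_like_lora(state_dict: StateDict) -> bool:
    # Prefix taken from the commit message above.
    return any(key.startswith("lora_te_") for key in state_dict)


def looks_like_main(state_dict: StateDict) -> bool:
    # Assumed prefix, for illustration only.
    return any(key.startswith("model.diffusion_model.") for key in state_dict)


def classify(state_dict: StateDict, checks: List[Check]) -> Optional[str]:
    """Return the name of the first check that passes."""
    for model_type, check in checks:
        if check(state_dict):
            return model_type
    return None


# A main model that happens to carry a LoRA-prefixed key.
mixed_model: StateDict = {
    "model.diffusion_model.input_blocks.0.0.weight": None,
    "lora_te_text_model_encoder_layers_0_mlp_fc1.alpha": None,
}

old_order: List[Check] = [("main", looks_like_main), ("lora", looks_like_lora)]
new_order: List[Check] = [("lora", looks_like_lora), ("main", looks_like_main)]

old_result = classify(mixed_model, old_order)  # main model check wins
new_result = classify(mixed_model, new_order)  # LoRA check wins
```

Because the first passing check decides the type, any check that is not exclusive to its model type becomes sensitive to ordering.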

Thankfully, the old probe still exists. For now, the new probe is fully disabled. It was only called in one spot.

I've also added the example affected model as a test case for the model probe. Right now, this causes the test to fail, and I've marked the test as xfail. CI will pass.

Once we enable the new API again, the xfail will pass, and CI will fail, and we'll be reminded to update the test.
2025-04-18 22:44:10 +10:00
psychedelicious
814406d98a feat(mm): siglip model loading supports partial loading
In the previous commit, the LLaVA model was updated to support partial loading.

In this commit, the SigLIP model is updated in the same way.

This model is used for FLUX Redux. It's <4GB and only ever run in isolation, so it won't benefit from partial loading for the vast majority of users. Regardless, I think it is best if we make _all_ models work with partial loading.

PS: I also fixed the initial load dtype issue, described in the prev commit. It's probably a non-issue for this model, but we may as well fix it.
2025-04-18 10:12:03 +10:00
psychedelicious
c054501103 feat(mm): llava model loading supports partial loading; fix OOM crash on initial load
The model manager has two types of model cache entries:
- `CachedModelOnlyFullLoad`: The model may only ever be loaded and unloaded as a single object.
- `CachedModelWithPartialLoad`: The model may be partially loaded and unloaded.

Partial loading is enabled by overwriting certain torch layer classes, adding the ability to autocast the layer to a device on-the-fly. See `CustomLinear` for an example.

So, to take advantage of partial loading and be cached as a `CachedModelWithPartialLoad`, the model must inherit from `torch.nn.Module`.

The LLaVA classes provided by `transformers` do inherit from `torch.nn.Module`, but we wrap those classes in a separate class called `LlavaOnevisionModel`. The wrapper encapsulates both the LLaVA model and its "processor" - a lightweight class that prepares model inputs like text and images.

While it is more elegant to encapsulate both model and processor classes in a single entity, this prevents the model cache from enabling partial loading for the chunky vLLM model.
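
A minimal sketch of why the wrapper blocks partial loading, using stand-in classes so it runs without torch. The selection rule in `cache_entry_kind` is an assumption based on the description above. `Module`, `VisionLanguageModel`, and `ModelAndProcessorWrapper` are hypothetical names, not the real classes.

```python
class Module:
    """Stand-in for `torch.nn.Module` so the sketch runs without torch."""


class VisionLanguageModel(Module):
    """Stand-in for the LLaVA model class from `transformers`."""


class ModelAndProcessorWrapper:
    """Stand-in for a wrapper holding a model plus its processor.

    It does not inherit from `Module`, so it cannot be partially loaded.
    """

    def __init__(self, model: Module, processor: object) -> None:
        self.model = model
        self.processor = processor


def cache_entry_kind(obj: object) -> str:
    # Assumed selection rule: only module subclasses get partial loading.
    if isinstance(obj, Module):
        return "CachedModelWithPartialLoad"
    return "CachedModelOnlyFullLoad"


model = VisionLanguageModel()
wrapped = ModelAndProcessorWrapper(model, processor=object())

kind_for_wrapper = cache_entry_kind(wrapped)  # full load only
kind_for_model = cache_entry_kind(model)      # partial load possible
```

Handing the cache the bare model, and building the processor elsewhere, is what lets the large model qualify for partial loading.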

Fixing this involved a few changes.
- Update the `LlavaOnevisionModelLoader` class to operate on the vLLM model directly, instead of the `LlavaOnevisionModel` wrapper class.
- Instantiate the processor directly in the node. The processor is lightweight and does its business on the CPU. We don't need to worry about caching in the model manager.
- Remove caching support code from the `LlavaOnevisionModel` wrapper class. It's not needed, because we do not cache this class. The class now only handles running the models provided to it.
- Rename `LlavaOnevisionModel` to `LlavaOnevisionPipeline` to better represent its purpose.

These changes have a bonus effect of fixing an OOM crash when initially loading the models. This was most apparent when loading LLaVA 7B, which is pretty chunky.

The initial load is onto CPU RAM. In the old version of the loaders, we ignored the loader's target dtype for the initial load. Instead, we loaded the model at `transformers`'s "default" dtype of fp32.

LLaVA 7B is fp16 and weighs ~17GB. Loading as fp32 means we need double that amount (~34GB) of CPU RAM. Many users only have 32GB RAM, so this causes a _CPU_ OOM - which is a hard crash of the whole process.

With the updated loaders, the initial load logic now uses the target dtype for the initial load. LLaVA now needs the expected ~17GB RAM for its initial load.
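
The RAM figures above follow from bytes per parameter. A quick sketch of the arithmetic, where `load_size_gb` is a hypothetical helper and the estimate ignores everything except the weights:

```python
BYTES_PER_PARAM = {"fp16": 2, "fp32": 4}


def load_size_gb(size_gb: float, stored_dtype: str, load_dtype: str) -> float:
    """Estimate RAM needed to load weights stored in one dtype as another."""
    return size_gb * BYTES_PER_PARAM[load_dtype] / BYTES_PER_PARAM[stored_dtype]


# ~17GB of fp16 weights, loaded at fp32 versus at the target dtype.
as_fp32 = load_size_gb(17, "fp16", "fp32")
as_fp16 = load_size_gb(17, "fp16", "fp16")
```

Here `as_fp32` comes to 34.0 and `as_fp16` to 17.0, matching the ~34GB and ~17GB figures in the message.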

PS: If we didn't make the accompanying partial loading changes, we still could have solved this OOM. We'd just need to pass the initial load dtype to the wrapper class and have it load on that dtype. But we may as well fix both issues.

PPS: There are other models whose model classes are wrappers around a torch module class, and thus cannot be partially loaded. However, these models are typically fairly small and/or are run only on their own, so they don't benefit as much from partial loading. It's the really big models (like LLaVA 7B) that benefit most from the partial loading.
2025-04-18 10:12:03 +10:00
psychedelicious
c1d819c7e5 feat(nodes): add get_absolute_path method to context.models API
Given a model config or path (presumably to a model), returns the absolute path to the model.

Check the next few commits for use-case.
2025-04-18 10:12:03 +10:00
psychedelicious
2a8e91f94d feat(ui): wrap JSON in dataviewer 2025-04-17 22:55:04 +10:00
psychedelicious
64f3e56039 chore: bump version to v5.10.0 2025-04-17 15:08:26 +10:00
Hosted Weblate
819afab230 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

translationBot(ui): update translation files

Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2025-04-17 11:28:02 +10:00
Linos
9fff064c55 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1887 of 1887 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1887 of 1887 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-04-17 11:28:02 +10:00
Riccardo Giovanetti
1aa8d94378 translationBot(ui): update translation (Italian)
Currently translated at 98.0% (1851 of 1887 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-04-17 11:28:02 +10:00
RyoKoba
d78bdde2c3 translationBot(ui): update translation (Japanese)
Currently translated at 56.6% (1069 of 1887 strings)

translationBot(ui): update translation (Japanese)

Currently translated at 50.8% (960 of 1887 strings)

translationBot(ui): update translation (Japanese)

Currently translated at 48.4% (912 of 1882 strings)

Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
2025-04-17 11:28:02 +10:00
psychedelicious
7b663b3432 fix(ui): scrolling in builder
I am at a loss as to the cause of this bug. The styles that I needed to change to fix it haven't been changed in a couple months. But these do seem to fix it.

Closes #7910
2025-04-17 11:24:54 +10:00
psychedelicious
9c4159915a feat(ui): add guardrails to prevent entity types being missed in useIsEntityTypeEnabled 2025-04-17 11:21:16 +10:00
psychedelicious
dbb5830027 fix(ui): useIsEntityTypeEnabled should use useMemo not useCallback
Typo/bug introduced in #7770
2025-04-17 11:21:16 +10:00
psychedelicious
4fc4dbb656 fix(ui): ensure query subs are reset in case of error 2025-04-17 11:13:41 +10:00
psychedelicious
d4f6d09cc9 fix(ui): never subscribe to dynamic prompts queries
If the request errors, we would never get to unsubscribe. The request would forever be marked as having a subscriber and never be cleared from memory.
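
The failure mode can be sketched with a subscriber counter. This is an illustration only: `Query`, `leaky_fetch`, and `one_shot_fetch` are hypothetical stand-ins, not the RTK Query API.

```python
from typing import Callable


class Query:
    """Hypothetical stand-in for a cached query with a subscriber count."""

    def __init__(self) -> None:
        self.subscribers = 0

    def subscribe(self) -> None:
        self.subscribers += 1

    def unsubscribe(self) -> None:
        self.subscribers -= 1


def failing_request() -> None:
    raise RuntimeError("request failed")


def leaky_fetch(query: Query) -> None:
    query.subscribe()
    failing_request()    # raises, so the next line never runs
    query.unsubscribe()


def one_shot_fetch(query: Query) -> None:
    # No subscription is taken, so an error leaves nothing to clean up.
    failing_request()


def subscribers_after(fetch: Callable[[Query], None]) -> int:
    """Run a fetch that fails and report how many subscribers remain."""
    query = Query()
    try:
        fetch(query)
    except RuntimeError:
        pass
    return query.subscribers


leaked = subscribers_after(leaky_fetch)    # subscription left behind
clean = subscribers_after(one_shot_fetch)  # nothing left behind
```

A lingering subscriber is what keeps a cache entry from ever being cleared, which is why not subscribing at all removes the problem.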
2025-04-17 10:36:09 +10:00
psychedelicious
44e44602d3 feat(ui): remove keepUnusedDataFor for dynamic prompts
This query can have potentially large responses. Keeping them around for 24 hours is essentially a hardcoded memory leak. Use the default for RTKQ of 60 seconds.
2025-04-17 10:36:09 +10:00
psychedelicious
36066c5f26 fix(ui): ensure dynamic prompts updates on any change to any dependent state
When users generate on the canvas or upscaling tabs, we parse prompts through dynamic prompts before invoking. Whenever the prompt or other settings change, we run dynamic prompts.

Previously, we used a redux listener to react to changes to dynamic prompts' dependent state, keeping the processed dynamic prompts synced. For example, when the user changed the prompt field, we re-processed the dynamic prompts.

This requires that all redux actions that change the dependent state be added to the listener matcher. It's easy to forget actions, though, which can result in the dynamic prompts state being stale.

For example, when resetting canvas state, we dispatch an action that resets the whole params slice, but this wasn't in the matcher. As a result, when resetting canvas, the dynamic prompts aren't updated. If the user then clicks Invoke (with an empty prompt), the last dynamic prompts state will be used.

For example:
- Generate w/ prompt "frog", get frog
- Click new canvas session
- Generate without any prompt, still get frog

To resolve this, the logic that keeps the dynamic prompts synced is moved from the listener to a hook. The way the logic is triggered is improved - it's now triggered in a useEffect, which is run when the dependent state changes. This way, it doesn't matter _how_ the dependent state changes - the changes will always be "seen", and the dynamic prompts will update.
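
The difference between the two approaches can be sketched outside of React and redux. In this simplified model, the action names (`promptChanged`, `paramsReset`), the `process` function, and both classes are hypothetical.

```python
from typing import Dict, Optional, Set

State = Dict[str, str]


def process(prompt: str) -> str:
    """Hypothetical stand-in for the dynamic prompts request."""
    return f"processed:{prompt}"


class ListenerSync:
    """Re-process only when a known action type is dispatched."""

    def __init__(self, matched_actions: Set[str]) -> None:
        self.matched_actions = matched_actions
        self.result: Optional[str] = None

    def on_action(self, action_type: str, state: State) -> None:
        if action_type in self.matched_actions:
            self.result = process(state["prompt"])


class EffectSync:
    """Re-process whenever the dependent state itself changes."""

    def __init__(self) -> None:
        self.last_prompt: Optional[str] = None
        self.result: Optional[str] = None

    def on_render(self, state: State) -> None:
        if state["prompt"] != self.last_prompt:
            self.last_prompt = state["prompt"]
            self.result = process(state["prompt"])


# The reset action was never added to the listener's matcher.
listener = ListenerSync(matched_actions={"promptChanged"})
effect = EffectSync()

history = [
    ("promptChanged", {"prompt": "frog"}),
    ("paramsReset", {"prompt": ""}),
]
for action_type, state in history:
    listener.on_action(action_type, state)
    effect.on_render(state)

stale = listener.result  # still the frog prompt
fresh = effect.result    # reflects the empty prompt
```

The listener depends on an exhaustive list of actions, while the effect depends only on the state it reads, so a forgotten action cannot leave it stale.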
2025-04-17 10:36:09 +10:00
psychedelicious
361c6eed4b docs: update manual install docs w/ correct pytorch indices for v5.10.0 and later 2025-04-17 10:32:41 +10:00
psychedelicious
bb154fd40f docs: update dev env docs with correct pytorch pypi index 2025-04-17 10:32:41 +10:00
psychedelicious
cbee6e6faf fix(app): remove accidentally committed tensor cache size
I had set this to zero for testing during the python 2.6.0 upgrade and neglected to remove it.
2025-04-17 10:12:47 +10:00
psychedelicious
6a822a52b8 chore(ui): update whats new copy 2025-04-16 07:17:52 +10:00
psychedelicious
d10dc28fc2 chore: bump version to v5.10.0rc1 2025-04-16 07:17:52 +10:00
psychedelicious
20eea18c41 chore(ui): typegen 2025-04-16 06:28:22 +10:00
skunkworxdark
566282bff0 Update metadata_linked.py
added metadata_to_string_collection, metadata_to_integer_collection, metadata_to_float_collection, metadata_to_bool_collection
2025-04-16 06:28:22 +10:00
psychedelicious
e7e874f7c3 fix(ui): increase padding when fitting layers to stage 2025-04-15 07:47:39 +10:00
Eugene Brodsky
95445c1163 chore: update pre-commit syntax; add check for uv.lock needing an update 2025-04-15 07:41:32 +10:00
psychedelicious
557e0cb3e6 chore(ui): knip 2025-04-15 07:13:25 +10:00
psychedelicious
a12bf07fb3 feat(ui): add node publish denylist 2025-04-15 07:13:25 +10:00
psychedelicious
a5bc21cf50 feat(nodes): extract LaMa model url to constant 2025-04-15 07:13:25 +10:00
psychedelicious
03ca23bec2 chore: update lockfile 2025-04-15 07:06:23 +10:00
psychedelicious
e15194a45d Revert "ci: change pyproject.toml to trigger uv lock check (it should fail)"
This reverts commit b802933190.
2025-04-15 07:06:23 +10:00
psychedelicious
e71ea309e7 ci: change pyproject.toml to trigger uv lock check (it should fail) 2025-04-15 07:06:23 +10:00
psychedelicious
2513756c25 ci: fix name of uv lock checks job 2025-04-15 07:06:23 +10:00
psychedelicious
875670f713 ci: add comment to uv-lock-checks.yml 2025-04-15 07:06:23 +10:00
psychedelicious
153b148362 ci: add check for uv lockfile consistency with pyproject.toml 2025-04-15 07:06:23 +10:00
psychedelicious
7b84f8c5e8 fix(ui): do not disable image context canvas actions based on selected base model
These actions should be accessible at any time.
2025-04-10 10:50:13 +10:00
psychedelicious
0280c9b4b9 fix(ui): generation_mode metadata not set correctly 2025-04-10 10:50:13 +10:00
psychedelicious
ae8d1f26d6 fix(app): import CogView4Transformer2DModel from the module that exports it 2025-04-10 10:50:13 +10:00
psychedelicious
170ea4fb75 fix(app): add CogView4ConditioningInfo to ObjectSerializerDisk's safe_globals
needed for torch w/ weights_only=True
2025-04-10 10:50:13 +10:00
psychedelicious
e5b0f8b985 feat(app): remove cogview4 inpaint workflow
This doesn't make sense to have as a default workflow given the trickiness of producing alpha masks.
2025-04-10 10:50:13 +10:00
psychedelicious
3f656072cf feat(app): update cogview4 t2i workflow w/ form 2025-04-10 10:50:13 +10:00
psychedelicious
1d4aa93f5e chore(ui): typegen 2025-04-10 10:50:13 +10:00
psychedelicious
b182060201 chore(ui): lint 2025-04-10 10:50:13 +10:00
psychedelicious
2b2f64b232 refactor(ui): simplify useIsEntityTypeEnabled 2025-04-10 10:50:13 +10:00
psychedelicious
df32974378 fix(ui): add checks for cogview4's dimension restrictions 2025-04-10 10:50:13 +10:00
psychedelicious
ad582c8cc5 feat(nodes): rename CogView4 nodes to match naming format 2025-04-10 10:50:13 +10:00
psychedelicious
47273135ca feat(ui): add cogview4 and inpainting tags to library 2025-04-10 10:50:13 +10:00
psychedelicious
c99e65bdab feat(app): add cogview4 default workflows 2025-04-10 10:50:13 +10:00
Mary Hipp
92b726d731 update available params for cogview4 2025-04-10 10:50:13 +10:00
Mary Hipp
8837932bad create hook for managing entity type enabledness for given base model and update usage 2025-04-10 10:50:13 +10:00
Mary Hipp
9846229e52 build graph for cogview4 2025-04-10 10:50:13 +10:00
maryhipp
305c5761d0 add generation modes for cogview linear 2025-04-10 10:50:13 +10:00
Ryan Dick
3ba399779f Fix lint error. 2025-04-10 10:50:13 +10:00
Ryan Dick
46316e43f0 typegen 2025-04-10 10:50:13 +10:00
Ryan Dick
d86cd66994 Add CogView4 VAE approximation for progress images. 2025-04-10 10:50:13 +10:00
Ryan Dick
13850271ab Add inpainting to CogView4DenoiseInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
7e894ffe83 Consolidate InpaintExtension implementations for SD3 and FLUX. 2025-04-10 10:50:13 +10:00
Ryan Dick
0939030324 Support cfg_scale list in CogView4Denoise. 2025-04-10 10:50:13 +10:00
Ryan Dick
30f19dc37a Update CogView4Denoise to support image-to-image. 2025-04-10 10:50:13 +10:00
Ryan Dick
ace5e748f4 Simplify CogView4 timesteps schedule generation in preparation for timestep schedule clipping. 2025-04-10 10:50:13 +10:00
Ryan Dick
4fae8ad163 Add CogView4ImageToLatentsInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
5e75bc570a Fix bug in CogView4 noise schedule handling that was resulting in low-quality images. 2025-04-10 10:50:13 +10:00
Ryan Dick
3166b5d2ea Switch to sequential CFG for CogView4 (for now, until I sort out the padding). 2025-04-10 10:50:13 +10:00
Ryan Dick
321c2d358c Add CogView4 model loader. And various other fixes to get a CogView4 workflow running (though quality is still below expectations). 2025-04-10 10:50:13 +10:00
Ryan Dick
0338983895 Update CogView4 starter model entry with approximate bundle size. 2025-04-10 10:50:13 +10:00
Ryan Dick
f4e00ab261 Add CogView4 to frontend. 2025-04-10 10:50:13 +10:00
Ryan Dick
e1133bc53f Fix typo in BaseModelType.CogView4. 2025-04-10 10:50:13 +10:00
Ryan Dick
e1ccbd5c29 typegen 2025-04-10 10:50:13 +10:00
Ryan Dick
cf76a0b575 Add CogView4ModelLoaderInvocation. (Not wired up with frontend yet.) 2025-04-10 10:50:13 +10:00
Ryan Dick
67bfd63c73 Require the cogview4 height/width are multiples of 32. This requirement is documented here: https://huggingface.co/THUDM/CogView4-6B. I haven't tracked down the underlying source of this requirement. 2025-04-10 10:50:13 +10:00
Ryan Dick
cdad8a4fd1 Add CogView4LatentsToImageInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
5d9797945b Completed first pass of CogView4Denoise. 2025-04-10 10:50:13 +10:00
Ryan Dick
78159c3200 Simplify CogView4 timestep schedule initialization. 2025-04-10 10:50:13 +10:00
Ryan Dick
1320c4fa13 WIP - CogView4DenoiseInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
883297c809 Bump diffusers to dev version with CogView4 support. 2025-04-10 10:50:13 +10:00
Ryan Dick
bac05a7885 Add CogView4TextEncoderInvocation 2025-04-10 10:50:13 +10:00
Ryan Dick
e2c4ea8e89 Add CogView4 model probing. 2025-04-10 10:50:13 +10:00
psychedelicious
851e23d6b4 feat(ui): move size to be next to model name 2025-04-10 09:53:03 +10:00
psychedelicious
7c8c9694ce feat(ui): use filesize package to format model file size 2025-04-10 09:53:03 +10:00
Kevin Turner
52a8ad1c18 chore: rename model.size to model.file_size
to disambiguate from RAM size or pixel size
2025-04-10 09:53:03 +10:00
Kevin Turner
e537020c11 chore: cursed whitespace fight 2025-04-10 09:53:03 +10:00
Kevin Turner
c50d1d6127 test: add size field to model metadata 2025-04-10 09:53:03 +10:00
Kevin Turner
53292b3592 fix: localization for file size units 2025-04-10 09:53:03 +10:00
Kevin Turner
bcfc61b2d7 feat: show model size in model list 2025-04-10 09:53:03 +10:00
Kevin Turner
9d869fc9ce chore: typegen 2025-04-10 09:53:03 +10:00
Kevin Turner
f09aacf992 fix: ModelProbe.probe needs to return a size field 2025-04-10 09:53:03 +10:00
Kevin Turner
98260a8efc test: add size field to test model configs 2025-04-10 09:53:03 +10:00
Kevin Turner
9590e8ff39 feat: expose model storage size 2025-04-10 09:53:03 +10:00
psychedelicious
a23d90187b feat(ui): allow send-image-to-canvas to work when canvas is uninitialized
Add `useCanvasIsBusySafe()` hook. This is like `useCanvasIsBusy()`, but when the canvas is not initialized, it gracefully falls back to false instead of raising.

Because app tabs are lazy-loaded, the canvas is not initialized until the user visits that tab. If the page loads up on the workflows tab, the canvas will be uninitialized until the user clicks on it.

This graceful fallback behaviour allows actions like sending an image to canvas to work even when the canvas is not yet initialized. These actions are exposed in the image context menu, and previously were hidden when the canvas was not initialized. We can now show these actions and use them even when the canvas is uninitialized.

- Add `useCanvasIsBusySafe()` hook
- Use the new hook in the image context menu for send to canvas actions
- Do not use `<CanvasManagerProviderGate />` in the image context menu (this was hiding the actions when canvas was uninitialized)
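
The fallback behaviour can be sketched without React as two plain functions. The `CanvasManager` shape and both function names are hypothetical stand-ins for the real hooks; the only point illustrated is that the safe variant returns `false` where the strict variant raises.

```typescript
// Hypothetical shape; the real canvas manager is far richer.
type CanvasManager = { isBusy: boolean };

// Strict variant: raises when the canvas has not been initialized yet.
const getCanvasIsBusy = (manager: CanvasManager | null): boolean => {
  if (manager === null) {
    throw new Error("Canvas is not initialized");
  }
  return manager.isBusy;
};

// Safe variant: an uninitialized canvas is treated as "not busy".
const getCanvasIsBusySafe = (manager: CanvasManager | null): boolean =>
  manager === null ? false : manager.isBusy;
```

A caller such as a context menu can then decide whether to enable an action without first checking that the canvas exists.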
2025-04-10 06:44:44 +10:00
psychedelicious
f655a85154 fix(ui): canvas dnd drop indicator color 2025-04-10 06:42:01 +10:00
psychedelicious
f45b494805 tidy(ui): remove extraneous calls to HTMLElement.remove()
these will be auto-gc'd when there are no more references
2025-04-09 14:00:20 +10:00
psychedelicious
d1776e0b63 feat(ui): safer use of drawImage
When calling `ctx.drawImage()`, if the image to be drawn has a width or height of 0, the call will raise.

In this change, I have carefully reviewed the call hierarchy for all of our own code that calls this method and ensured that each call has error handling.

Well, with one exception - I'm not sure how to handle errors in `invokeai/frontend/web/src/common/hooks/useClientSideUpload.ts`. But this should never be an issue in that hook - it's a Canvas problem.
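
One possible shape for such error handling is a small wrapper that skips zero-sized images and catches anything the draw call raises. This is only a sketch: `safeDrawImage` is a hypothetical helper, and the structural types below replace the browser's canvas types so the example runs without a DOM.

```typescript
// Minimal structural types; in a browser these would be the 2D context and an image source.
type Sized = { width: number; height: number };
type DrawContext = { drawImage: (image: Sized, dx: number, dy: number) => void };

// Returns true if the image was drawn, false if the draw was skipped or failed.
const safeDrawImage = (ctx: DrawContext, image: Sized, dx: number, dy: number): boolean => {
  if (image.width === 0 || image.height === 0) {
    // Nothing to draw, and the real call would raise.
    return false;
  }
  try {
    ctx.drawImage(image, dx, dy);
    return true;
  } catch {
    // Any other failure is reported to the caller instead of propagating.
    return false;
  }
};
```

Returning a boolean leaves the decision about logging or retrying to the caller.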
2025-04-09 14:00:20 +10:00
psychedelicious
646887e3c9 feat(ui): save canvas/bbox to gallery saves basic metadata
- Positive prompt
- Negative prompt
- Seed
- Model (if set)

The rest is a bit complicated to derive as it comes from the graph building process.
2025-04-09 08:52:38 +10:00
Riccardo Giovanetti
e7e25a0c37 translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1849 of 1873 strings)

translationBot(ui): update translation (Italian)

Currently translated at 97.8% (1833 of 1873 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-04-08 11:01:37 +10:00
Linos
589b849e64 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1873 of 1873 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1871 of 1871 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 99.2% (1857 of 1871 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1840 of 1840 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-04-08 11:01:37 +10:00
psychedelicious
aedbc9f778 chore: prep for v5.10.0a1 2025-04-08 10:59:08 +10:00
psychedelicious
a0cf9e2e80 tweak(ui): ip adapter settings layout 2025-04-08 10:33:45 +10:00
psychedelicious
5c8f1c5666 fix(ui): use flux redux influence on regional guidance 2025-04-08 10:33:45 +10:00
psychedelicious
fd37117221 chore(ui): lint 2025-04-08 10:33:45 +10:00
psychedelicious
5956f96e57 feat(ui): add flux redux image influence to canvas 2025-04-08 10:33:45 +10:00
psychedelicious
49622c37ed fix(nodes): logic bug in flux redux node 2025-04-08 10:33:45 +10:00
psychedelicious
50387c8f64 chore(ui): typegen 2025-04-08 10:33:45 +10:00
skunkworxdark
e1538af219 Update flux_redux.py
Add down sampling and weight to redux node
2025-04-08 10:33:45 +10:00
psychedelicious
e5a0010a72 fix(ui): normalize alpha value to 0-1 when picking color on canvas 2025-04-08 08:20:49 +10:00
psychedelicious
b75d1b2473 refactor(ui): move update node logic from listener to hook 2025-04-08 08:18:17 +10:00
psychedelicious
b91bb9ba9f fix(ui): remove debug logger middleware 2025-04-08 08:18:17 +10:00
psychedelicious
a7c818bcae fix(ui): rebase import issue 2025-04-08 08:18:17 +10:00
psychedelicious
a54b255718 chore(ui): lint 2025-04-08 08:18:17 +10:00
psychedelicious
3e04baa684 feat(ui): improved undo/redo history grouping for selections and position changes 2025-04-08 08:18:17 +10:00
psychedelicious
d23db705dd feat(ui): improved undo/redo history grouping 2025-04-08 08:18:17 +10:00
psychedelicious
96a481530d refactor(ui): merge the workflow and nodes slices
This allows undo/redo history to apply to node editor and workflow details/form.
2025-04-08 08:18:17 +10:00
psychedelicious
a0b515979a Revert "correctly set is_published when loading a workflow"
This reverts commit e4b07894fd55b3a24fc006882585b6d55fe329c3.
2025-04-08 07:05:12 +10:00
Mary Hipp
2da8ac216b add mutation for unpublishing 2025-04-08 07:05:12 +10:00
Mary Hipp
1558fe9a37 correctly set is_published when loading a workflow 2025-04-08 07:05:12 +10:00
Mary Hipp
ded080ae04 show cancel icon and not retry icon on validation run queue items 2025-04-08 07:05:12 +10:00
psychedelicious
982603e051 fix(ui): use getDefaultForm when resetting form 2025-04-08 06:54:43 +10:00
psychedelicious
a23b5c3408 refactor(ui): make workflow published status server-side state
Whether a workflow is published or not shouldn't be something stored on the client. It's properly server-side state.

This change removes the `is_published` flag from redux and updates all references to the flag to use the getWorkflow query.

It also updates the socket event listener that handles session complete events. When a validation run completes, we invalidate the tags for the getWorkflow query. We need to do a bit of juggling to avoid a race condition (documented in the code). Works well though.
2025-04-08 06:54:43 +10:00
psychedelicious
c9f93b3746 refactor(ui): workflow unsaved changes tracking
Previously, we maintained an `isTouched` flag in redux state to indicate if a workflow had unsaved changes. We manually updated this whenever we changed something on the workflow.

This was tedious and error-prone. It also didn't handle undo/redo, so if you made a change to a node and undid it, we'd still think the workflow had unsaved changes.

Moving forward, we use a simpler and more robust strategy by hashing the server's version of the workflow and comparing it to the client's version of the workflow.

The hashing uses `stable-hash`, which is both fast and, well, stable. Most importantly, the ordering of keys in hashed objects does not change the resultant hash.

- Remove `isTouched` state entirely.
- Extract the logic that builds the "preview" workflow object from redux state into its own hook. This "preview" workflow is what we send to the server when saving a workflow. This "preview" workflow is effectively the client version of the workflow.
- Add `useDoesWorkflowHaveUnsavedChanges()` hook, which compares the hash of the client workflow and server workflow (if it exists).
- Add `useIsWorkflowUntouched()` hook, which compares the hash of the client workflow and the initial workflow that you get when you click new workflow.
- Remove `reactflow` workaround in the nodes slice undo/redo filter. When we set the nodes state while loading a workflow, `reactflow` emits a nodes size/placement change event. This tripped up our `isTouched` flag logic and marked the workflow as unsaved right from the get-go. With the new strategy to track touched status, this workaround can be removed.
- Update all logic that tracked the old `isTouched` flag to use the new hooks.
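
The hash comparison can be sketched as follows. To keep the example self-contained, `stableStringify` is a simplified stand-in for `stable-hash` (it is not that package's implementation), and the `Workflow` shape and `doWorkflowsDiffer` are hypothetical.

```typescript
// Stand-in for `stable-hash`: serializes with object keys sorted, so key order
// does not affect the result.
const stableStringify = (value: unknown): string => {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const entries = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`);
    return `{${entries.join(",")}}`;
  }
  return String(JSON.stringify(value));
};

// Hypothetical, heavily simplified workflow shape.
type Workflow = { name: string; nodes: string[] };

// Two workflows differ only if their order-insensitive serializations differ.
const doWorkflowsDiffer = (client: Workflow, server: Workflow): boolean =>
  stableStringify(client) !== stableStringify(server);

const server: Workflow = { name: "t2i", nodes: ["a", "b"] };
const sameButReordered: Workflow = { nodes: ["a", "b"], name: "t2i" };
const edited: Workflow = { name: "t2i", nodes: ["a", "b", "c"] };
const editUndone: Workflow = { name: "t2i", nodes: ["a", "b"] };
```

Because the comparison looks only at current content, making an edit and then undoing it leaves the workflow marked as saved, which a manually maintained flag could not do.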
2025-04-08 06:54:43 +10:00
psychedelicious
e381024cc0 fix(ui): remove debug logger middleware from store setup
Accidentally left in from prev change
2025-04-08 06:54:43 +10:00
psychedelicious
bb65884040 refactor(ui): workflow form root element is a constant
Previously, the workflow form's root element id was random. Every time we reset the workflow editor, the root id changed. This makes it difficult to check if the workflow editor is untouched (in its default state).

Now that root element's id is simply "root". I can't imagine any way that this would break anything.
2025-04-08 06:54:43 +10:00
psychedelicious
920339dbeb refactor(ui): split out the modal isolator component 2025-04-08 06:54:43 +10:00
psychedelicious
0f618bdbcb refactor(ui): split out the hook isolator component 2025-04-08 06:54:43 +10:00
psychedelicious
8294e2cdea feat(mm): support size calculation for onnx models 2025-04-07 11:37:55 +10:00
psychedelicious
7da43be4b7 docs: fix incorrect filename 2025-04-07 10:57:32 +10:00
psychedelicious
8561e9e540 docs: remove legacy scripts documentation 2025-04-07 10:57:32 +10:00
psychedelicious
b0d5e7e3d8 feat(app): restore "Using torch device" message on startup 2025-04-07 10:56:26 +10:00
Eugene Brodsky
ab2d203d5e fix(build): re-add sentencepiece which is apparently needed by gguf, but is not defined as its dependency 2025-04-04 16:26:20 -04:00
Eugene Brodsky
eae5c54091 fix(docker): another pip install is needed in docker build after copying sources 2025-04-04 16:26:20 -04:00
Mary Hipp
ee2b486e8b fix badge for validation run 2025-04-04 11:38:40 -04:00
psychedelicious
a2c7050832 docs: update README.md 2025-04-04 18:42:13 +11:00
psychedelicious
cd090eb76f build: fix path in build script 2025-04-04 18:42:13 +11:00
psychedelicious
3348755e6e ci: fix name of build wheel workflow 2025-04-04 18:42:13 +11:00
psychedelicious
d6dbdaacd1 chore: bump version to v5.10.0dev4 2025-04-04 18:42:13 +11:00
psychedelicious
1c6fa1ad18 ci: update workflows to use revised build scripts 2025-04-04 18:42:13 +11:00
psychedelicious
39bed90eda build: remove installer & convert installer build script to only build the wheel 2025-04-04 18:42:13 +11:00
psychedelicious
c0e48193a7 chore: bump version to v5.10.0dev3 2025-04-04 18:42:13 +11:00
psychedelicious
41677394c0 chore: update uv.lock 2025-04-04 18:42:13 +11:00
psychedelicious
405cfd46e7 build: remove pin on spandrel dependency 2025-04-04 18:42:13 +11:00
psychedelicious
9cc9a5c8b0 build: add comment about torchsde to pyproject 2025-04-04 18:42:13 +11:00
psychedelicious
ddc0461882 build: remove pin on gguf dependency
This allows it to pull in sentencepiece on its own. In 0.10.0, it didn't have this package listed as a dependency, but in recent releases it does. So we are able to remove sentencepiece as an explicit dep.
2025-04-04 18:42:13 +11:00
psychedelicious
0f09091a26 build: remove unused clip_anytorch dependency 2025-04-04 18:42:13 +11:00
psychedelicious
dedb77b6f2 build: remove unused pytorch-lightning dependency 2025-04-04 18:42:13 +11:00
psychedelicious
89f8dbee6c build: remove unused pyreadline3 dependency 2025-04-04 18:42:13 +11:00
psychedelicious
8b0dc8ce84 build: remove unused pyperclip dependency 2025-04-04 18:42:13 +11:00
psychedelicious
018121e407 build: remove unused pympler dependency 2025-04-04 18:42:13 +11:00
psychedelicious
095025b637 build: remove unused scikit-image dependency 2025-04-04 18:42:13 +11:00
psychedelicious
ed8487659e build: remove unused npyscreen dependency 2025-04-04 18:42:13 +11:00
psychedelicious
3745d2be0c build: remove unused torchmetrics dependency 2025-04-04 18:42:13 +11:00
psychedelicious
b5206e204f build: remove unused datasets dependency 2025-04-04 18:42:13 +11:00
psychedelicious
b237ccbdd8 build: remove unused click dependency 2025-04-04 18:42:13 +11:00
psychedelicious
224ebc72ae build: remove unused omegaconf dependency 2025-04-04 18:42:13 +11:00
psychedelicious
05c3d47be9 build: remove unused facexlib dependency 2025-04-04 18:42:13 +11:00
psychedelicious
a4d709c169 build: remove unused timm dependency 2025-04-04 18:42:13 +11:00
psychedelicious
5a8e95c700 chore(ui): typegen 2025-04-04 18:42:13 +11:00
psychedelicious
e630f364df chore: update uv.lock 2025-04-04 18:42:13 +11:00
psychedelicious
9c287038e4 build: remove unused matplotlib dep 2025-04-04 18:42:13 +11:00
psychedelicious
8d32ede082 tidy(nodes): remove matplotlib dependency
It was only used for a single color conversion function. Replaced with cv2 code, tested functionality to confirm it works the same.
2025-04-04 18:42:13 +11:00
psychedelicious
bab0b6d069 build: move humanize to test deps 2025-04-04 18:42:13 +11:00
psychedelicious
8e013ef3be build: remove unused albumentations dependency
This is not used
2025-04-04 18:42:13 +11:00
psychedelicious
8188484a40 tidy: delete unused file 2025-04-04 18:42:13 +11:00
psychedelicious
5d8fe9fb56 build: remove controlnet_aux dependency, remove pin for timm 2025-04-04 18:42:13 +11:00
psychedelicious
8d3743c6f2 tidy(nodes): rename controlnet_image_processors.py -> controlnet.py 2025-04-04 18:42:13 +11:00
psychedelicious
986b7426d2 tidy(nodes): remove unused old dw openpose detector class 2025-04-04 18:42:13 +11:00
psychedelicious
8d8150b47e tidy(nodes): remove deprecated controlnet "processor" nodes 2025-04-04 18:42:13 +11:00
psychedelicious
ae3944b4e0 build: upgrade python to 3.12 in pins 2025-04-04 18:42:13 +11:00
psychedelicious
6f0c5c9c05 build: update uv.lock 2025-04-04 18:42:13 +11:00
psychedelicious
89c999ca58 fix(backend): remove mps_fixes
The fixes in this module monkeypatched `torch` to resolve some issues with FP16 on macOS. These issues have long since been resolved.

Included in the now-removed fixes is `CustomSlicedAttentionProcessor`, which is intended to reduce memory requirements for MPS. This overrides `diffusers`' own `SlicedAttentionProcessor`.

Unfortunately, `attention_type: sliced` produces hot garbage with the fixes and black images without the fixes. So this class appears to now be a moot point.

Regardless, SDPA is supported on MPS and very efficient, so sliced attention is largely obsolete.
2025-04-04 18:42:13 +11:00
psychedelicious
89cefc6a88 chore: bump version to v5.10.0dev2
Doing a dev build so I can test the launcher.
2025-04-04 18:42:13 +11:00
psychedelicious
79e384e71c build: downgrade python to 3.11 in pins 2025-04-04 18:42:13 +11:00
psychedelicious
3ebe96765a build: restore prev setuptools config to fix wheel build 2025-04-04 18:42:13 +11:00
psychedelicious
97e158f13a ci: use py3.12 to build installer 2025-04-04 18:42:13 +11:00
psychedelicious
2b1a36ef4a experiment: add pins.json to repo
The launcher will query this file to get the pins needed for installation
2025-04-04 18:42:13 +11:00
psychedelicious
6824b4b036 chore: bump version to v5.10.0dev1
Doing a dev build so I can test the launcher.
2025-04-04 18:42:13 +11:00
psychedelicious
e8a09a5ed8 chore: update uv.lock for latest pydantic
Ran `uv lock --upgrade-package pydantic`
2025-04-04 18:42:13 +11:00
psychedelicious
c4df7d3cb9 fix(ui): handle updated schema structure during invocation parsing
In https://github.com/pydantic/pydantic/pull/10029, pydantic made an improvement to its generated JSON schemas (OpenAPI schemas). The previous and new generated schemas both meet the schema spec.

When we parse the OpenAPI schema to generate node templates, we use a typeguard to narrow schema components from generic OpenAPI schema objects to node field schema objects. The narrower node field schema objects contain extra data.

For example, they contain a `field_kind` attribute that indicates whether the field is an input field or output field. These extra attributes are not part of the OpenAPI spec (but the spec does allow for this extra data).

This typeguard relied on a pydantic implementation detail. This was changed in the linked pydantic PR, which released with v2.9.0. With the change, our typeguard rejects input field schema objects, causing parsing to fail with errors/warnings like `Unhandled input property` in the JS console.

In the UI, this causes many fields - mostly model fields - to not show up in the workflow editor.

The fix for this is very simple - instead of relying on an implementation detail for the typeguard, we can check if the incoming schema object has any of our invoke-specific extra attributes. Specifically, we now look for the presence of the `field_kind` attribute on the incoming schema object. If it is present, we know we are dealing with an invocation input field and can parse it appropriately.
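
A minimal sketch of a typeguard along these lines is shown below. The type names and the simplified schema shape are hypothetical; only the `field_kind` attribute comes from the description above.

```typescript
// Hypothetical, simplified shapes; the real schema objects carry many more attributes.
type OpenAPISchemaObject = { type?: string; title?: string; [key: string]: unknown };
type InvocationFieldSchema = OpenAPISchemaObject & { field_kind: unknown };

// Narrow by the presence of the invoke-specific attribute rather than by the
// exact structure that a particular pydantic version happens to generate.
const isInvocationFieldSchema = (
  schema: OpenAPISchemaObject
): schema is InvocationFieldSchema => "field_kind" in schema;

const plainSchema: OpenAPISchemaObject = { type: "string", title: "Plain" };
const fieldSchema: OpenAPISchemaObject = { title: "Model", field_kind: "input" };
```

Because the guard depends only on an attribute the application itself adds, changes to how the schema generator lays out the rest of the object should not affect it.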
2025-04-04 18:42:13 +11:00
psychedelicious
b9e76afbf5 chore: typegen 2025-04-04 18:42:13 +11:00
psychedelicious
dfd8b8f220 chore: remove pydantic pin 2025-04-04 18:42:13 +11:00
psychedelicious
a089e1bf5c chore(ui): typegen 2025-04-04 18:42:13 +11:00
psychedelicious
875f3fe779 tests: update tests/test_object_serializer_disk.py 2025-04-04 18:42:13 +11:00
psychedelicious
5fa2cf59e2 fix(app): add trusted classes to torch safe globals to prevent errors when loading them
In `ObjectSerializerDisk`, we use `torch.load` to load serialized objects from disk. With torch 2.6.0, torch defaults to `weights_only=True`. As a result, torch will raise when attempting to deserialize anything with an unrecognized class.

For example, our `ConditioningFieldData` class is untrusted. When we load conditioning from disk, we will get a runtime error.

Torch provides a method to add trusted classes to an allowlist. This change adds an arg to `ObjectSerializerDisk` to add a list of safe globals to the allowlist and uses it for both `ObjectSerializerDisk` instances.

Note: My first attempt inferred the class from the generic type arg that `ObjectSerializerDisk` accepts, and added that to the allowlist. Unfortunately, this doesn't work.

For example, `ConditioningFieldData` has a `conditionings` attribute that may be one of several other untrusted classes representing model-specific conditioning data. So, even if we allowlist `ConditioningFieldData`, loading will fail when torch deserializes the `conditionings` attribute.
2025-04-04 18:42:13 +11:00
Eugene Brodsky
4d58c222f3 resolve conflict between timm version needed by LLaVA and controlnet-aux 2025-04-04 18:42:13 +11:00
Eugene Brodsky
c27142bb02 reintroduce GPU_DRIVER build arg in CI container build, as it has apparently been removed 2025-04-04 18:42:13 +11:00
Eugene Brodsky
e3c441fda4 remove obsoleted dependencies that were used by the CLI 2025-04-04 18:42:13 +11:00
Eugene Brodsky
6bb102f860 modify docs for python 3.12 2025-04-04 18:42:13 +11:00
Eugene Brodsky
5c45ef1a8c update nodes schema / typegen 2025-04-04 18:42:13 +11:00
Eugene Brodsky
7a218a8040 update uv.lock 2025-04-04 18:42:13 +11:00
Eugene Brodsky
929d86768f refactor Dockerfile; get rid of multi-stage build; upgrade to python 3.12 2025-04-04 18:42:13 +11:00
Eugene Brodsky
3676160496 use uv.lock to pin dependencies 2025-04-04 18:42:13 +11:00
Eugene Brodsky
8e6ebb537b upgrade pytorch and unpin some of the strict dependency pins to facilitate upgrading co-dependencies.
we will use uv.lock to ensure reproducibility
2025-04-04 18:42:13 +11:00
Chantell
2b5da91beb Update manual.md
Removed a redundancy of package specifier on step 6.
2025-04-04 16:52:04 +11:00
psychedelicious
74bede14be feat(ui): put all validation run data into single object 2025-04-04 11:38:04 +11:00
psychedelicious
04ea3c491a chore(ui): typegen 2025-04-04 11:38:04 +11:00
psychedelicious
38e7b23d18 feat(api): put all validation run data into single object 2025-04-04 11:38:04 +11:00
psychedelicious
c052846e05 feat(ui): ensure workflow id is passed when doing validation run 2025-04-04 11:38:04 +11:00
psychedelicious
af3a31dfec chore(ui): typegen 2025-04-04 11:38:04 +11:00
psychedelicious
571710fab6 feat(app): add optional published_workflow_id to enqueue payloads and queue item 2025-04-04 11:38:04 +11:00
psychedelicious
a175a5c252 feat(ui): add safeguard against accidentally loading non-library workflow as library workflow 2025-04-04 11:38:04 +11:00
psychedelicious
8b3c36c6fa refactor(ui): better UX for choosing output nodes 2025-04-04 11:38:04 +11:00
psychedelicious
b9ffacd4bf fix(ui): disable publish button when not ready to enqueue (i.e. invalid graph) 2025-04-04 11:38:04 +11:00
psychedelicious
ae45fc8a74 gh: update codeowners
- Add @psychedelicious as codeowner for docs
- Remove inactive contributors
2025-04-03 18:34:39 -04:00
psychedelicious
85db9c65e5 fix(ui): add missing tkey 2025-04-03 12:42:28 +11:00
psychedelicious
ddddaef7ca refactor(ui): use dedicated allowPublishWorkflows instead of disabledFeatures 2025-04-03 12:42:28 +11:00
psychedelicious
e4678201cb feat(ui): add conditionally-enabled workflow publishing ui
This is a squash of a lot of scattered commits that became very difficult to clean up and make individually. Sorry.

Besides the new UI, there are a number of notable changes:
- Publishing logic is disabled in OSS by default. To enable it, provide a `disabledFeatures` prop _without_ "publishWorkflow".
- Enqueuing a workflow is no longer handled in a redux listener. It was hard to track the state of the enqueue logic in the listener. It is now in a hook. I did not migrate the canvas and upscaling tabs - their enqueue logic is still in the listener.
- When queueing a validation run, the new `useEnqueueWorkflows()` hook will update the payload with the required data for the run.
- Some logic is added to the socket event listeners to handle workflow publish runs completing.
- The workflow library side nav has a new "published" view. It is hidden when the "publishWorkflow" feature is disabled.
- I've added `Safe` and `OrThrow` versions of some workflows hooks. These hooks typically retrieve some data from redux. For example, a node. The `Safe` hooks return the node or null if it cannot be found, while the `OrThrow` hooks return the node or raise if it cannot be found. The `OrThrow` hooks should be used within one of the gate components. These components use the `Safe` hooks and render a fallback if e.g. the node isn't found. This change is required for some of the publish flow UI.
- Add support for locking the workflow editor. When locked, you can pan and zoom but that's it. Currently, it is only locked during publish flow and if a published workflow is opened.
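
The `Safe` / `OrThrow` pairing described in the list above can be sketched with plain functions instead of hooks. All names here (`WorkflowNode`, `findNodeSafe`, `findNodeOrThrow`, `renderWithGate`) are hypothetical; the gate function only imitates what a gate component does.

```typescript
// Hypothetical, simplified node shape.
type WorkflowNode = { id: string; label: string };

// Safe variant: returns the node, or null if it cannot be found.
const findNodeSafe = (nodes: WorkflowNode[], id: string): WorkflowNode | null =>
  nodes.find((node) => node.id === id) ?? null;

// OrThrow variant: returns the node, or raises if it cannot be found.
const findNodeOrThrow = (nodes: WorkflowNode[], id: string): WorkflowNode => {
  const node = findNodeSafe(nodes, id);
  if (node === null) {
    throw new Error(`Node ${id} not found`);
  }
  return node;
};

// Gate: uses the Safe variant and falls back if the node is missing. Code that
// runs inside the gate can then use the OrThrow variant without a null check.
const renderWithGate = (nodes: WorkflowNode[], id: string, fallback: string): string =>
  findNodeSafe(nodes, id) === null ? fallback : `Node: ${findNodeOrThrow(nodes, id).label}`;
```

The split keeps null handling in one place (the gate) so that code below it can assume the data exists.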
2025-04-03 12:42:28 +11:00
psychedelicious
d66fdfde71 chore(ui): typegen 2025-04-03 12:42:28 +11:00
psychedelicious
08ee08557b feat(app): add noop api validation run stuff to routes and methods 2025-04-03 12:42:28 +11:00
psychedelicious
496f1262c6 feat(app): truncate warnings for invalid model config in db
This message is logged _every_ time we retrieve a list of models if there is an invalid model. Previously it logged the _whole_ row which can be a lot of data. Truncate the row to 64 characters to reduce log pollution.
2025-04-03 12:42:28 +11:00
psychedelicious
188d52e4a5 chore(ui): bump tsafe to latest 2025-04-03 12:42:28 +11:00
Riku
db03c196a1 translationBot(ui): update translation (German)
Currently translated at 66.8% (1230 of 1840 strings)

Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2025-04-03 07:42:43 +11:00
Riccardo Giovanetti
6bc36b697d translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1818 of 1840 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.6% (1816 of 1840 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.7% (1816 of 1839 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2025-04-03 07:42:43 +11:00
Linos
b7d71d3028 translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1840 of 1840 strings)

translationBot(ui): update translation (Vietnamese)

Currently translated at 100.0% (1838 of 1838 strings)

Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
2025-04-03 07:42:43 +11:00
psychedelicious
fa1ebd9d2f fix(ui): do not switch between images when focused on a tab element
Arrow keys should only navigate between tabs, not gallery images.
2025-04-03 07:40:10 +11:00
psychedelicious
eed5d02069 fix(ui): handling for invalid edges when loading workflows
Previously, reactflow appears to have handled an edge case when using its `applyChanges` utility. If a change was provided without an item, it would skip that change. For example, an "add edge" change that somehow passed `null` as the edge, instead of a valid edge.

In our workflow loading and validation logic, invalid edges were removed from the array using `delete edges[i]`. This left "holes" in the array of edges. We then asked `reactflow` to add these edges to state. When it encountered one of the "holes", it skipped over it.

In a recent release (unsure which, somewhere between the latest v11 and ~v12.4) this seems to have changed. It no longer skips over the "holes" and instead trusts the data. This can cause a couple of issues:
- Error when loading the workflow if `reactflow` attempts to do anything with the nonexistent edge.
- If somehow the workflow makes it into state with "holes" in the array of edges, all sorts of other stuff breaks when our code does anything with the nonexistent edge.

Two-part fix:
- Update the invalid edge handling to not use `delete edges[i]`. Instead, as we check each edge, we add invalid ones to a set. Then, after all the checks are finished, filter out the invalid edges. The resultant edges array has no holes.
- Simplify the logic around setting nodes and edges in redux. Previously we were using `reactflow`'s `applyChanges` utils, but this does literally nothing except take extra CPU cycles. We can simply set the loaded nodes and edges directly in redux. Perhaps we were using `applyChanges` because it addressed the "holes" issue? Not sure. But we don't need it now.
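
The first part of the fix can be illustrated with a minimal sketch. It is written in Python rather than the UI's TypeScript, so it is only an analogue of the approach described above: the `Edge` shape, the `is_valid_edge` check and the sample node IDs are hypothetical stand-ins, not the actual workflow validation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    # Hypothetical edge shape, for illustration only.
    id: str
    source: str
    target: str


def is_valid_edge(edge: Edge, node_ids: set[str]) -> bool:
    # Stand-in validity check: both endpoints must refer to existing nodes.
    return edge.source in node_ids and edge.target in node_ids


def remove_invalid_edges(edges: list[Edge], node_ids: set[str]) -> list[Edge]:
    # Pass 1: record the invalid edges in a set without mutating the list.
    invalid_ids = {e.id for e in edges if not is_valid_edge(e, node_ids)}
    # Pass 2: build a new, dense list, so no placeholder entries are left behind.
    return [e for e in edges if e.id not in invalid_ids]


node_ids = {"a", "b"}
edges = [Edge("e1", "a", "b"), Edge("e2", "a", "missing"), Edge("e3", "b", "a")]
clean_edges = remove_invalid_edges(edges, node_ids)
```

Collecting first and filtering afterwards keeps the input untouched during validation and guarantees the result contains only real edges, which is the property the direct redux update relies on.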

Closes #7868
2025-04-03 07:37:49 +11:00
psychedelicious
3650d91045 chore(ui): bump @xyflow/react to latest 2025-04-03 07:37:49 +11:00
Eugene Brodsky
6c7d08cacb Change timm and controlnet-aux pins to fix LLaVA model support (#7846)
## Summary

`timm` below 1.0.0 prevents llava models from working (broken in
transformers). but `controlnet-aux` pins `timm` to an earlier version
because otherwise it was breaking the ZoeDepth controlnet.

we don't use ZoeDepth (replaced by depthAnything), and downgrading
controlnet-aux seems to be acceptable.

more context here:

https://github.com/huggingface/controlnet_aux/issues/106
https://github.com/huggingface/controlnet_aux/pull/101


Note that this results in some warnings on startup, stemming from
controlnet-aux:

![image](https://github.com/user-attachments/assets/fa908837-6154-42a2-a93b-eb5e363f5783)

we can probably silence the warnings as a separate enhancement

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-04-01 21:16:40 -04:00
Eugene Brodsky
bb1c40f222 Merge branch 'main' into pin-timm-for-llava 2025-04-01 21:10:30 -04:00
jazzhaiku
bfb117d0e0 Port LoRA to new classification API (#7849)
## Summary

- Port LoRA to new classification API
- Add 2 additional test cases (ControlLora and Flux Diffusers LoRA)
- Moved `ModelOnDisk` to its own module

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2025-04-01 08:05:48 +11:00
jazzhaiku
b31c1022c3 Merge branch 'main' into lora-classification 2025-04-01 07:58:36 +11:00
Mary Hipp
a5851ca31c fix from leftover testing 2025-03-31 12:45:53 -04:00
Mary Hipp
77bf5c15bb GET presigned URLs directly instead of trying to use redirects 2025-03-31 12:45:53 -04:00
Eugene Brodsky
d26b7a1a12 Merge branch 'main' into pin-timm-for-llava 2025-03-31 11:37:29 -04:00
psychedelicious
595133463e feat(nodes): add methods to invalidate invocation typeadapters 2025-03-31 19:15:59 +11:00
psychedelicious
6155f9ff9e feat(nodes): move invocation/output registration to separate class 2025-03-31 19:15:59 +11:00
psychedelicious
7be87c8048 refactor(nodes): simpler logic for baseinvocation typeadapter handling 2025-03-31 19:15:59 +11:00
jazzhaiku
9868c3bfe3 Merge branch 'main' into lora-classification 2025-03-31 16:43:26 +11:00
jazzhaiku
f6c2ee5040 Merge branch 'main' into lora-classification 2025-03-31 09:01:16 +11:00
Billy
965753bf8b Ruff formatting 2025-03-31 08:18:00 +11:00
Billy
40c53ab95c Guard 2025-03-29 09:58:02 +11:00
Eugene Brodsky
c9992914d6 Merge branch 'main' into pin-timm-for-llava 2025-03-28 09:20:30 -04:00
jazzhaiku
c25f6d1f84 Merge branch 'main' into lora-classification 2025-03-28 12:32:22 +11:00
Billy
c276c1cbee Comment 2025-03-28 10:57:46 +11:00
Billy
c619348f29 Extract ModelOnDisk to its own module 2025-03-28 10:35:13 +11:00
Billy
0d75c99476 Caching 2025-03-27 17:55:09 +11:00
Billy
323d409fb6 Make ruff happy 2025-03-27 17:47:57 +11:00
Billy
f251722f56 LoRA classification API 2025-03-27 17:47:01 +11:00
Eugene Brodsky
3f12a43e75 remove pin for controlnet-aux and pin timm to a version that works with llava
timm < 1.0.0 prevents llava models from working (broken in transformers). but controlnet-aux pinned it to an earlier version because otherwise it was breaking the ZoeDepth controlnet.

we don't use ZoeDepth (replaced by depthAnything), and downgrading controlnet-aux seems to be acceptable.

more context here:

https://github.com/huggingface/controlnet_aux/issues/106
https://github.com/huggingface/controlnet_aux/pull/101
2025-03-26 16:58:18 -04:00
454 changed files with 18402 additions and 9924 deletions

View File

@@ -1,9 +1,11 @@
*
!invokeai
!pyproject.toml
!uv.lock
!docker/docker-entrypoint.sh
!LICENSE
**/dist
**/node_modules
**/__pycache__
**/*.egg-info
**/*.egg-info

8
.github/CODEOWNERS vendored

@@ -2,11 +2,11 @@
/.github/workflows/ @lstein @blessedcoolant @hipsterusername @ebr @jazzhaiku
# documentation
/docs/ @lstein @blessedcoolant @hipsterusername @Millu
/mkdocs.yml @lstein @blessedcoolant @hipsterusername @Millu
/docs/ @lstein @blessedcoolant @hipsterusername @psychedelicious
/mkdocs.yml @lstein @blessedcoolant @hipsterusername @psychedelicious
# nodes
/invokeai/app/ @Kyle0654 @blessedcoolant @psychedelicious @brandonrising @hipsterusername @jazzhaiku
/invokeai/app/ @blessedcoolant @psychedelicious @brandonrising @hipsterusername @jazzhaiku
# installation and configuration
/pyproject.toml @lstein @blessedcoolant @hipsterusername
@@ -22,7 +22,7 @@
/invokeai/backend @blessedcoolant @psychedelicious @lstein @maryhipp @hipsterusername
# generation, model management, postprocessing
/invokeai/backend @damian0815 @lstein @blessedcoolant @gregghelt2 @StAlKeR7779 @brandonrising @ryanjdick @hipsterusername @jazzhaiku
/invokeai/backend @lstein @blessedcoolant @brandonrising @hipsterusername @jazzhaiku
# front ends
/invokeai/frontend/CLI @lstein @hipsterusername

View File

@@ -97,6 +97,8 @@ jobs:
context: .
file: docker/Dockerfile
platforms: ${{ env.PLATFORMS }}
build-args: |
GPU_DRIVER=${{ matrix.gpu-driver }}
push: ${{ github.ref == 'refs/heads/main' || github.ref_type == 'tag' || github.event.inputs.push-to-registry }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}

View File

@@ -1,6 +1,6 @@
# Builds and uploads the installer and python build artifacts.
# Builds and uploads python build artifacts.
name: build installer
name: build wheel
on:
workflow_dispatch:
@@ -17,7 +17,7 @@ jobs:
- name: setup python
uses: actions/setup-python@v5
with:
python-version: '3.10'
python-version: '3.12'
cache: pip
cache-dependency-path: pyproject.toml
@@ -27,19 +27,12 @@ jobs:
- name: setup frontend
uses: ./.github/actions/install-frontend-deps
- name: create installer
id: create_installer
run: ./create_installer.sh
working-directory: installer
- name: build wheel
id: build_wheel
run: ./scripts/build_wheel.sh
- name: upload python distribution artifact
uses: actions/upload-artifact@v4
with:
name: dist
path: ${{ steps.create_installer.outputs.DIST_PATH }}
- name: upload installer artifact
uses: actions/upload-artifact@v4
with:
name: installer
path: ${{ steps.create_installer.outputs.INSTALLER_PATH }}
path: ${{ steps.build_wheel.outputs.DIST_PATH }}

View File

@@ -49,7 +49,7 @@ jobs:
always_run: true
build:
uses: ./.github/workflows/build-installer.yml
uses: ./.github/workflows/build-wheel.yml
publish-testpypi:
runs-on: ubuntu-latest

68
.github/workflows/uv-lock-checks.yml vendored Normal file

@@ -0,0 +1,68 @@
# Check the `uv` lockfile for consistency with `pyproject.toml`.
#
# If this check fails, you should run `uv lock` to update the lockfile.
name: 'uv lock checks'
on:
push:
branches:
- 'main'
pull_request:
types:
- 'ready_for_review'
- 'opened'
- 'synchronize'
merge_group:
workflow_dispatch:
inputs:
always_run:
description: 'Always run the checks'
required: true
type: boolean
default: true
workflow_call:
inputs:
always_run:
description: 'Always run the checks'
required: true
type: boolean
default: true
jobs:
uv-lock-checks:
env:
# uv requires a venv by default - but for this, we can simply use the system python
UV_SYSTEM_PYTHON: 1
runs-on: ubuntu-latest
timeout-minutes: 5 # expected run time: <1 min
steps:
- name: checkout
uses: actions/checkout@v4
- name: check for changed python files
if: ${{ inputs.always_run != true }}
id: changed-files
# Pinned to the _hash_ for v45.0.9 to prevent supply-chain attacks.
# See:
# - CVE-2025-30066
# - https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised
# - https://github.com/tj-actions/changed-files/issues/2463
uses: tj-actions/changed-files@a284dc1814e3fd07f2e34267fc8f81227ed29fb8
with:
files_yaml: |
uvlock-pyprojecttoml:
- 'pyproject.toml'
- 'uv.lock'
- name: setup uv
if: ${{ steps.changed-files.outputs.uvlock-pyprojecttoml_any_changed == 'true' || inputs.always_run == true }}
uses: astral-sh/setup-uv@v5
with:
version: '0.6.10'
enable-cache: true
- name: check lockfile
if: ${{ steps.changed-files.outputs.uvlock-pyprojecttoml_any_changed == 'true' || inputs.always_run == true }}
run: uv lock --locked # this will exit with 1 if the lockfile is not consistent with pyproject.toml
shell: bash
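
The gating logic of this workflow can be sketched in Python for local experimentation. This is an illustrative sketch, not part of the repository: the function names are hypothetical, and only the watched file names and the `uv lock --locked` command are taken from the workflow above.

```python
import subprocess

# Files whose changes trigger the check (taken from the workflow's files_yaml filter).
WATCHED_FILES = {"pyproject.toml", "uv.lock"}


def should_check_lockfile(changed_files: list[str], always_run: bool) -> bool:
    # Mirrors the step conditions: run if forced, or if a watched file changed.
    return always_run or any(path in WATCHED_FILES for path in changed_files)


def lockfile_is_consistent() -> bool:
    # `uv lock --locked` exits non-zero when uv.lock is out of date
    # with respect to pyproject.toml. Requires `uv` on PATH.
    result = subprocess.run(["uv", "lock", "--locked"], check=False)
    return result.returncode == 0
```

A caller would evaluate `should_check_lockfile(...)` first and only then call `lockfile_is_consistent()`, which matches how the workflow skips the `uv` setup when neither file changed.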

2
.nvmrc

@@ -1 +1 @@
v22.12.0
v22.14.0

View File

@@ -4,21 +4,29 @@ repos:
hooks:
- id: black
name: black
stages: [commit]
stages: [pre-commit]
language: system
entry: black
types: [python]
- id: flake8
name: flake8
stages: [commit]
stages: [pre-commit]
language: system
entry: flake8
types: [python]
- id: isort
name: isort
stages: [commit]
stages: [pre-commit]
language: system
entry: isort
types: [python]
types: [python]
- id: uvlock
name: uv lock
stages: [pre-commit]
language: system
entry: uv lock
files: ^pyproject\.toml$
pass_filenames: false

View File

@@ -16,7 +16,7 @@ help:
@echo "frontend-build Build the frontend in order to run on localhost:9090"
@echo "frontend-dev Run the frontend in developer mode on localhost:5173"
@echo "frontend-typegen Generate types for the frontend from the OpenAPI schema"
@echo "installer-zip Build the installer .zip file for the current version"
@echo "wheel Build the wheel for the current version"
@echo "tag-release Tag the GitHub repository with the current version (use at release time only!)"
@echo "openapi Generate the OpenAPI schema for the app, outputting to stdout"
@echo "docs Serve the mkdocs site with live reload"
@@ -64,13 +64,13 @@ frontend-dev:
frontend-typegen:
cd invokeai/frontend/web && python ../../../scripts/generate_openapi_schema.py | pnpm typegen
# Installer zip file
installer-zip:
cd installer && ./create_installer.sh
# Tag the release
wheel:
cd scripts && ./build_wheel.sh
# Tag the release
tag-release:
cd installer && ./tag_release.sh
cd scripts && ./tag_release.sh
# Generate the OpenAPI Schema for the app
openapi:

View File

@@ -1,77 +1,6 @@
# syntax=docker/dockerfile:1.4
## Builder stage
FROM library/ubuntu:24.04 AS builder
ARG DEBIAN_FRONTEND=noninteractive
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
apt update && apt-get install -y \
build-essential \
git
# Install `uv` for package management
COPY --from=ghcr.io/astral-sh/uv:0.6.0 /uv /uvx /bin/
ENV VIRTUAL_ENV=/opt/venv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
ENV INVOKEAI_SRC=/opt/invokeai
ENV PYTHON_VERSION=3.11
ENV UV_PYTHON=3.11
ENV UV_COMPILE_BYTECODE=1
ENV UV_LINK_MODE=copy
ENV UV_PROJECT_ENVIRONMENT="$VIRTUAL_ENV"
ENV UV_INDEX="https://download.pytorch.org/whl/cu124"
ARG GPU_DRIVER=cuda
# unused but available
ARG BUILDPLATFORM
# Switch to the `ubuntu` user to work around dependency issues with uv-installed python
RUN mkdir -p ${VIRTUAL_ENV} && \
mkdir -p ${INVOKEAI_SRC} && \
chmod -R a+w /opt && \
mkdir ~ubuntu/.cache && chown ubuntu: ~ubuntu/.cache
USER ubuntu
# Install python
RUN --mount=type=cache,target=/home/ubuntu/.cache/uv,uid=1000,gid=1000 \
uv python install ${PYTHON_VERSION}
WORKDIR ${INVOKEAI_SRC}
# Install project's dependencies as a separate layer so they aren't rebuilt every commit.
# bind-mount instead of copy to defer adding sources to the image until next layer.
#
# NOTE: there are no pytorch builds for arm64 + cuda, only cpu
# x86_64/CUDA is the default
RUN --mount=type=cache,target=/home/ubuntu/.cache/uv,uid=1000,gid=1000 \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
--mount=type=bind,source=invokeai/version,target=invokeai/version \
if [ "$TARGETPLATFORM" = "linux/arm64" ] || [ "$GPU_DRIVER" = "cpu" ]; then \
UV_INDEX="https://download.pytorch.org/whl/cpu"; \
elif [ "$GPU_DRIVER" = "rocm" ]; then \
UV_INDEX="https://download.pytorch.org/whl/rocm6.1"; \
fi && \
uv sync --no-install-project
# Now that the bulk of the dependencies have been installed, copy in the project files that change more frequently.
COPY invokeai invokeai
COPY pyproject.toml .
RUN --mount=type=cache,target=/home/ubuntu/.cache/uv,uid=1000,gid=1000 \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
if [ "$TARGETPLATFORM" = "linux/arm64" ] || [ "$GPU_DRIVER" = "cpu" ]; then \
UV_INDEX="https://download.pytorch.org/whl/cpu"; \
elif [ "$GPU_DRIVER" = "rocm" ]; then \
UV_INDEX="https://download.pytorch.org/whl/rocm6.1"; \
fi && \
uv sync
#### Build the Web UI ------------------------------------
#### Web UI ------------------------------------
FROM docker.io/node:22-slim AS web-builder
ENV PNPM_HOME="/pnpm"
@@ -85,69 +14,100 @@ RUN --mount=type=cache,target=/pnpm/store \
pnpm install --frozen-lockfile
RUN npx vite build
#### Runtime stage ---------------------------------------
## Backend ---------------------------------------
FROM library/ubuntu:24.04 AS runtime
FROM library/ubuntu:24.04
ARG DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt \
--mount=type=cache,target=/var/lib/apt \
apt update && apt install -y --no-install-recommends \
ca-certificates \
git \
gosu \
libglib2.0-0 \
libgl1 \
libglx-mesa0 \
build-essential \
libopencv-dev \
libstdc++-10-dev
RUN apt update && apt install -y --no-install-recommends \
git \
curl \
vim \
tmux \
ncdu \
iotop \
bzip2 \
gosu \
magic-wormhole \
libglib2.0-0 \
libgl1 \
libglx-mesa0 \
build-essential \
libopencv-dev \
libstdc++-10-dev &&\
apt-get clean && apt-get autoclean
ENV \
PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
VIRTUAL_ENV=/opt/venv \
INVOKEAI_SRC=/opt/invokeai \
PYTHON_VERSION=3.12 \
UV_PYTHON=3.12 \
UV_COMPILE_BYTECODE=1 \
UV_MANAGED_PYTHON=1 \
UV_LINK_MODE=copy \
UV_PROJECT_ENVIRONMENT=/opt/venv \
UV_INDEX="https://download.pytorch.org/whl/cu124" \
INVOKEAI_ROOT=/invokeai \
INVOKEAI_HOST=0.0.0.0 \
INVOKEAI_PORT=9090 \
PATH="/opt/venv/bin:$PATH" \
CONTAINER_UID=${CONTAINER_UID:-1000} \
CONTAINER_GID=${CONTAINER_GID:-1000}
ENV INVOKEAI_SRC=/opt/invokeai
ENV VIRTUAL_ENV=/opt/venv
ENV UV_PROJECT_ENVIRONMENT="$VIRTUAL_ENV"
ENV PYTHON_VERSION=3.11
ENV INVOKEAI_ROOT=/invokeai
ENV INVOKEAI_HOST=0.0.0.0
ENV INVOKEAI_PORT=9090
ENV PATH="$VIRTUAL_ENV/bin:$INVOKEAI_SRC:$PATH"
ENV CONTAINER_UID=${CONTAINER_UID:-1000}
ENV CONTAINER_GID=${CONTAINER_GID:-1000}
ARG GPU_DRIVER=cuda
# Install `uv` for package management
# and install python for the ubuntu user (expected to exist on ubuntu >=24.x)
# this is too tiny to optimize with multi-stage builds, but maybe we'll come back to it
COPY --from=ghcr.io/astral-sh/uv:0.6.0 /uv /uvx /bin/
USER ubuntu
RUN uv python install ${PYTHON_VERSION}
USER root
COPY --from=ghcr.io/astral-sh/uv:0.6.9 /uv /uvx /bin/
# --link requires buldkit w/ dockerfile syntax 1.4
COPY --link --from=builder ${INVOKEAI_SRC} ${INVOKEAI_SRC}
COPY --link --from=builder ${VIRTUAL_ENV} ${VIRTUAL_ENV}
COPY --link --from=web-builder /build/dist ${INVOKEAI_SRC}/invokeai/frontend/web/dist
# Link amdgpu.ids for ROCm builds
# contributed by https://github.com/Rubonnek
RUN mkdir -p "/opt/amdgpu/share/libdrm" &&\
ln -s "/usr/share/libdrm/amdgpu.ids" "/opt/amdgpu/share/libdrm/amdgpu.ids"
# Install python & allow non-root user to use it by traversing the /root dir without read permissions
RUN --mount=type=cache,target=/root/.cache/uv \
uv python install ${PYTHON_VERSION} && \
# chmod --recursive a+rX /root/.local/share/uv/python
chmod 711 /root
WORKDIR ${INVOKEAI_SRC}
# Install project's dependencies as a separate layer so they aren't rebuilt every commit.
# bind-mount instead of copy to defer adding sources to the image until next layer.
#
# NOTE: there are no pytorch builds for arm64 + cuda, only cpu
# x86_64/CUDA is the default
RUN --mount=type=cache,target=/root/.cache/uv \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
--mount=type=bind,source=uv.lock,target=uv.lock \
# this is just to get the package manager to recognize that the project exists, without making changes to the docker layer
--mount=type=bind,source=invokeai/version,target=invokeai/version \
if [ "$TARGETPLATFORM" = "linux/arm64" ] || [ "$GPU_DRIVER" = "cpu" ]; then UV_INDEX="https://download.pytorch.org/whl/cpu"; \
elif [ "$GPU_DRIVER" = "rocm" ]; then UV_INDEX="https://download.pytorch.org/whl/rocm6.2"; \
fi && \
uv sync --frozen
# build patchmatch
RUN cd /usr/lib/$(uname -p)-linux-gnu/pkgconfig/ && ln -sf opencv4.pc opencv.pc
RUN python -c "from patchmatch import patch_match"
# Link amdgpu.ids for ROCm builds
# contributed by https://github.com/Rubonnek
RUN mkdir -p "/opt/amdgpu/share/libdrm" &&\
ln -s "/usr/share/libdrm/amdgpu.ids" "/opt/amdgpu/share/libdrm/amdgpu.ids"
RUN mkdir -p ${INVOKEAI_ROOT} && chown -R ${CONTAINER_UID}:${CONTAINER_GID} ${INVOKEAI_ROOT}
COPY docker/docker-entrypoint.sh ./
ENTRYPOINT ["/opt/invokeai/docker-entrypoint.sh"]
CMD ["invokeai-web"]
# --link requires buldkit w/ dockerfile syntax 1.4, does not work with podman
COPY --link --from=web-builder /build/dist ${INVOKEAI_SRC}/invokeai/frontend/web/dist
# add sources last to minimize image changes on code changes
COPY invokeai ${INVOKEAI_SRC}/invokeai
# this should not increase image size because we've already installed dependencies
# in a previous layer
RUN --mount=type=cache,target=/root/.cache/uv \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
--mount=type=bind,source=uv.lock,target=uv.lock \
if [ "$TARGETPLATFORM" = "linux/arm64" ] || [ "$GPU_DRIVER" = "cpu" ]; then UV_INDEX="https://download.pytorch.org/whl/cpu"; \
elif [ "$GPU_DRIVER" = "rocm" ]; then UV_INDEX="https://download.pytorch.org/whl/rocm6.2"; \
fi && \
uv pip install -e .
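
The shell conditional that picks `UV_INDEX` in the updated Dockerfile can be restated as a small Python function, which may be easier to reason about or test. This is a sketch only: `select_uv_index` is a hypothetical name, and the URLs are copied from the new Dockerfile lines above.

```python
# Default set via ENV in the Dockerfile (x86_64/CUDA).
DEFAULT_INDEX = "https://download.pytorch.org/whl/cu124"


def select_uv_index(target_platform: str, gpu_driver: str = "cuda") -> str:
    # There are no pytorch builds for arm64 + CUDA, so arm64 always gets the CPU index.
    if target_platform == "linux/arm64" or gpu_driver == "cpu":
        return "https://download.pytorch.org/whl/cpu"
    if gpu_driver == "rocm":
        return "https://download.pytorch.org/whl/rocm6.2"
    # Any other combination falls through to the CUDA default.
    return DEFAULT_INDEX
```

Note that the platform check comes first, so `linux/arm64` with `GPU_DRIVER=rocm` still resolves to the CPU index, exactly as in the `if`/`elif` chain.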

View File

@@ -60,16 +60,11 @@ Next, these jobs run and must pass. They are the same jobs that are run for ever
- **`frontend-checks`**: runs `prettier` (format), `eslint` (lint), `dpdm` (circular refs), `tsc` (static type check) and `knip` (unused imports)
- **`typegen-checks`**: ensures the frontend and backend types are synced
#### `build-installer` Job
#### `build-wheel` Job
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `installer/create_installer.sh` and uploads two artifacts:
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `./scripts/build_wheel.sh` and uploads `dist.zip`, which contains the wheel and unarchived build.
- **`dist`**: the python distribution, to be published on PyPI
- **`InvokeAI-installer-${VERSION}.zip`**: the legacy install scripts
You don't need to download either of these files.
> The legacy install scripts are no longer used, but we haven't updated the workflow to skip building them.
You don't need to download or test these artifacts.
#### Sanity Check & Smoke Test
@@ -79,7 +74,7 @@ It's possible to test the python package before it gets published to PyPI. We've
But, if you want to be extra-super careful, here's how to test it:
- Download the `dist.zip` build artifact from the `build-installer` job
- Download the `dist.zip` build artifact from the `build-wheel` job
- Unzip it and find the wheel file
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/) - but instead of installing from PyPI, install from the wheel
- Test the app

View File

@@ -39,7 +39,7 @@ nodes imported in the `__init__.py` file are loaded. See the README in the nodes
folder for more examples:
```py
from .cool_node import CoolInvocation
from .cool_node import ResizeInvocation
```
## Creating A New Invocation
@@ -69,7 +69,10 @@ The first set of things we need to do when creating a new Invocation are -
So let us do that.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.invocation_api import (
BaseInvocation,
invocation,
)
@invocation('resize')
class ResizeInvocation(BaseInvocation):
@@ -103,8 +106,12 @@ create your own custom field types later in this guide. For now, let's go ahead
and use it.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation
from invokeai.app.invocations.primitives import ImageField
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
invocation,
)
@invocation('resize')
class ResizeInvocation(BaseInvocation):
@@ -128,8 +135,12 @@ image: ImageField = InputField(description="The input image")
Great. Now let us create our other inputs for `width` and `height`
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation
from invokeai.app.invocations.primitives import ImageField
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
invocation,
)
@invocation('resize')
class ResizeInvocation(BaseInvocation):
@@ -163,8 +174,13 @@ that are provided by it by InvokeAI.
Let us create this function first.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation, InvocationContext
from invokeai.app.invocations.primitives import ImageField
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
InvocationContext,
invocation,
)
@invocation('resize')
class ResizeInvocation(BaseInvocation):
@@ -191,8 +207,14 @@ all the necessary info related to image outputs. So let us use that.
We will cover how to create your own output types later in this guide.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation, InvocationContext
from invokeai.app.invocations.primitives import ImageField
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
InvocationContext,
invocation,
)
from invokeai.app.invocations.image import ImageOutput
@invocation('resize')
@@ -217,9 +239,15 @@ Perfect. Now that we have our Invocation setup, let us do what we want to do.
So let's do that.
```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InputField, invocation, InvocationContext
from invokeai.app.invocations.primitives import ImageField
from invokeai.app.invocations.image import ImageOutput, ResourceOrigin, ImageCategory
from invokeai.invocation_api import (
BaseInvocation,
ImageField,
InputField,
InvocationContext,
invocation,
)
from invokeai.app.invocations.image import ImageOutput
@invocation("resize")
class ResizeInvocation(BaseInvocation):

View File

@@ -41,7 +41,7 @@ If you just want to use Invoke, you should use the [launcher][launcher link].
With the modifications made, the install command should look something like this:
```sh
uv pip install -e ".[dev,test,docs,xformers]" --python 3.11 --python-preference only-managed --index=https://download.pytorch.org/whl/cu124 --reinstall
uv pip install -e ".[dev,test,docs,xformers]" --python 3.12 --python-preference only-managed --index=https://download.pytorch.org/whl/cu126 --reinstall
```
6. At this point, you should have Invoke installed, a venv set up and activated, and the server running. But you will see a warning in the terminal that no UI was found. If you go to the URL for the server, you won't get a UI.

View File

@@ -1,121 +0,0 @@
# Legacy Scripts
!!! warning "Legacy Scripts"
We recommend using the Invoke Launcher to install and update Invoke. It's a desktop application for Windows, macOS and Linux. It takes care of a lot of nitty gritty details for you.
Follow the [quick start guide](./quick_start.md) to get started.
!!! tip "Use the installer to update"
Using the installer for updates will not erase any of your data (images, models, boards, etc). It only updates the core libraries used to run Invoke.
Simply use the same path you installed to originally to update your existing installation.
Both release and pre-release versions can be installed using the installer. It also supports install through a wheel if needed.
Be sure to review the [installation requirements] and ensure your system has everything it needs to install Invoke.
## Getting the Latest Installer
Download the `InvokeAI-installer-vX.Y.Z.zip` file from the [latest release] page. It is at the bottom of the page, under **Assets**.
After unzipping the installer, you should have a `InvokeAI-Installer` folder with some files inside, including `install.bat` and `install.sh`.
## Running the Installer
!!! tip
Windows users should first double-click the `WinLongPathsEnabled.reg` file to prevent a failed installation due to long file paths.
Double-click the install script:
=== "Windows"
```sh
install.bat
```
=== "Linux/macOS"
```sh
install.sh
```
!!! info "Running the Installer from the commandline"
You can also run the install script from cmd/powershell (Windows) or terminal (Linux/macOS).
!!! warning "Untrusted Publisher (Windows)"
You may get a popup saying the file comes from an `Untrusted Publisher`. Click `More Info` and `Run Anyway` to get past this.
The installation process is simple, with a few prompts:
- Select the version to install. Unless you have a specific reason to install a specific version, select the default (the latest version).
- Select location for the install. Be sure you have enough space in this folder for the base application, as described in the [installation requirements].
- Select a GPU device.
!!! info "Slow Installation"
The installer needs to download several GB of data and install it all. It may appear to get stuck at 99.9% when installing `pytorch` or during a step labeled "Installing collected packages".
If it is stuck for over 10 minutes, something has probably gone wrong and you should close the window and restart.
## Running the Application
Find the install location you selected earlier. Double-click the launcher script to run the app:
=== "Windows"
```sh
invoke.bat
```
=== "Linux/macOS"
```sh
invoke.sh
```
Choose the first option to run the UI. After a series of startup messages, you'll see something like this:
```sh
Uvicorn running on http://127.0.0.1:9090 (Press CTRL+C to quit)
```
Copy the URL into your browser and you should see the UI.
## Improved Outpainting with PatchMatch
PatchMatch is an extra add-on that can improve outpainting. Windows users are in luck - it works out of the box.
On macOS and Linux, a few extra steps are needed to set it up. See the [PatchMatch installation guide](./patchmatch.md).
## First-time Setup
You will need to [install some models] before you can generate.
Check the [configuration docs] for details on configuring the application.
## Updating
Updating is exactly the same as installing - download the latest installer, choose the latest version, enter your existing installation path, and the app will update. None of your data (images, models, boards, etc) will be erased.
!!! info "Dependency Resolution Issues"
We've found that pip's dependency resolution can cause issues when upgrading packages. One very common problem was pip "downgrading" torch from CUDA to CPU, but things broke in other novel ways.
The installer doesn't have this kind of problem, so we use it for updating as well.
## Installation Issues
If you have installation issues, please review the [FAQ]. You can also [create an issue] or ask for help on [discord].
[installation requirements]: ./requirements.md
[FAQ]: ../faq.md
[install some models]: ./models.md
[configuration docs]: ../configuration.md
[latest release]: https://github.com/invoke-ai/InvokeAI/releases/latest
[create an issue]: https://github.com/invoke-ai/InvokeAI/issues
[discord]: https://discord.gg/ZmtBAhwWhy

View File

@@ -43,10 +43,10 @@ The following commands vary depending on the version of Invoke being installed a
3. Create a virtual environment in that directory:
```sh
uv venv --relocatable --prompt invoke --python 3.11 --python-preference only-managed .venv
uv venv --relocatable --prompt invoke --python 3.12 --python-preference only-managed .venv
```
This command creates a portable virtual environment at `.venv` complete with a portable python 3.11. It doesn't matter if your system has no python installed, or has a different version - `uv` will handle everything.
This command creates a portable virtual environment at `.venv` complete with a portable python 3.12. It doesn't matter if your system has no python installed, or has a different version - `uv` will handle everything.
4. Activate the virtual environment:
@@ -64,14 +64,21 @@ The following commands vary depending on the version of Invoke being installed a
5. Choose a version to install. Review the [GitHub releases page](https://github.com/invoke-ai/InvokeAI/releases).
6. Determine the package package specifier to use when installing. This is a performance optimization.
6. Determine the package specifier to use when installing. This is a performance optimization.
- If you have an Nvidia 20xx series GPU or older, use `invokeai[xformers]`.
- If you have an Nvidia 30xx series GPU or newer, or do not have an Nvidia GPU, use `invokeai`.
7. Determine the `PyPI` index URL to use for installation, if any. This is necessary to get the right version of torch installed.
=== "Invoke v5 or later"
=== "Invoke v5.10.0 and later"
- If you are on Windows or Linux with an Nvidia GPU, use `https://download.pytorch.org/whl/cu126`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
- If you are on Linux with an AMD GPU, use `https://download.pytorch.org/whl/rocm6.2.4`.
- **In all other cases, do not use an index.**
=== "Invoke v5.0.0 to v5.9.1"
- If you are on Windows with an Nvidia GPU, use `https://download.pytorch.org/whl/cu124`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
@@ -88,13 +95,13 @@ The following commands vary depending on the version of Invoke being installed a
8. Install the `invokeai` package. Substitute the package specifier and version.
```sh
uv pip install <PACKAGE_SPECIFIER>==<VERSION> --python 3.11 --python-preference only-managed --force-reinstall
uv pip install <PACKAGE_SPECIFIER>==<VERSION> --python 3.12 --python-preference only-managed --force-reinstall
```
If you determined you needed to use a `PyPI` index URL in the previous step, you'll need to add `--index=<INDEX_URL>` like this:
```sh
uv pip install <PACKAGE_SPECIFIER>==<VERSION> --python 3.11 --python-preference only-managed --index=<INDEX_URL> --force-reinstall
uv pip install <PACKAGE_SPECIFIER>==<VERSION> --python 3.12 --python-preference only-managed --index=<INDEX_URL> --force-reinstall
```
9. Deactivate and reactivate your venv so that the invokeai-specific commands become available in the environment:
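Steps 6 to 8 can be sketched as a small helper. This is only an illustration under stated assumptions: it covers the "Invoke v5.10.0 and later" rules, and the function names and the `gpu` labels (`"nvidia"`, `"amd"`, `None`) are hypothetical, not part of Invoke or `uv`.

```python
from typing import List, Optional


def pick_package_specifier(nvidia_20xx_or_older: bool) -> str:
    """Step 6: Nvidia 20xx series or older GPUs get the xformers extra."""
    return "invokeai[xformers]" if nvidia_20xx_or_older else "invokeai"


def pick_index_url(os_name: str, gpu: Optional[str]) -> Optional[str]:
    """Step 7 for Invoke v5.10.0 and later.

    gpu is a hypothetical label: "nvidia", "amd", or None for no GPU.
    """
    if os_name in ("Windows", "Linux") and gpu == "nvidia":
        return "https://download.pytorch.org/whl/cu126"
    if os_name == "Linux" and gpu is None:
        return "https://download.pytorch.org/whl/cpu"
    if os_name == "Linux" and gpu == "amd":
        return "https://download.pytorch.org/whl/rocm6.2.4"
    # In all other cases, do not use an index.
    return None


def build_install_command(specifier: str, version: str, index_url: Optional[str]) -> List[str]:
    """Step 8: assemble the `uv pip install` command shown in the docs."""
    cmd = [
        "uv", "pip", "install", f"{specifier}=={version}",
        "--python", "3.12", "--python-preference", "only-managed",
    ]
    if index_url is not None:
        cmd.append(f"--index={index_url}")
    cmd.append("--force-reinstall")
    return cmd
```

For example, `build_install_command(pick_package_specifier(False), "5.10.0", pick_index_url("Linux", "amd"))` yields the second command form from step 8 with the ROCm index filled in.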


@@ -49,9 +49,9 @@ If you have an existing Invoke installation, you can select it and let the launc
!!! warning "Problem running the launcher on macOS"
macOS may not allow you to run the launcher. We are working to resolve this by signing the launcher executable. Until that is done, you can either use the [legacy scripts](./legacy_scripts.md) to install, or manually flag the launcher as safe:
macOS may not allow you to run the launcher. We are working to resolve this by signing the launcher executable. Until that is done, you can manually flag the launcher as safe:
- Open the **Invoke-Installer-mac-arm64.dmg** file.
- Open the **Invoke Community Edition.dmg** file.
- Drag the launcher to **Applications**.
- Open a terminal.
- Run `xattr -d 'com.apple.quarantine' /Applications/Invoke\ Community\ Edition.app`.
@@ -117,7 +117,6 @@ If you still have problems, ask for help on the Invoke [discord](https://discord
- You can install the Invoke application as a python package. See our [manual install](./manual.md) docs.
- You can run Invoke with docker. See our [docker install](./docker.md) docs.
- You can still use our legacy scripts to install and run Invoke. See the [legacy scripts](./legacy_scripts.md) docs.
## Need Help?


@@ -41,7 +41,7 @@ The requirements below are rough guidelines for best performance. GPUs with less
You don't need to do this if you are installing with the [Invoke Launcher](./quick_start.md).
Invoke requires python 3.10 or 3.11. If you don't already have one of these versions installed, we suggest installing 3.11, as it will be supported for longer.
Invoke requires python 3.10 through 3.12. If you don't already have one of these versions installed, we suggest installing 3.12, as it will be supported for longer.
Check that your system has an up-to-date Python installed by running `python3 --version` in the terminal (Linux, macOS) or cmd/powershell (Windows).
@@ -49,19 +49,19 @@ Check that your system has an up-to-date Python installed by running `python3 --
=== "Windows"
- Install python 3.11 with [an official installer].
- Install python with [an official installer].
- The installer includes an option to add python to your PATH. Be sure to enable this. If you missed it, re-run the installer, choose to modify an existing installation, and tick that checkbox.
- You may need to install [Microsoft Visual C++ Redistributable].
=== "macOS"
- Install python 3.11 with [an official installer].
- Install python with [an official installer].
- If model installs fail with a certificate error, you may need to run this command (changing the python version to match what you have installed): `/Applications/Python\ 3.10/Install\ Certificates.command`
- If you haven't already, you will need to install the XCode CLI Tools by running `xcode-select --install` in a terminal.
=== "Linux"
- Installing python varies depending on your system. On Ubuntu, you can use the [deadsnakes PPA](https://launchpad.net/~deadsnakes/+archive/ubuntu/ppa).
- Installing python varies depending on your system. We recommend [using `uv` to manage your python installation](https://docs.astral.sh/uv/concepts/python-versions/#installing-a-python-version).
- You'll need to install `libglib2.0-0` and `libgl1-mesa-glx` for OpenCV to work. For example, on a Debian system: `sudo apt update && sudo apt install -y libglib2.0-0 libgl1-mesa-glx`
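The supported range stated in this hunk (python 3.10 through 3.12 after the change) can also be checked from Python itself. A minimal sketch; `is_supported_python` and the two constants are hypothetical names, not part of Invoke.

```python
import sys
from typing import Optional, Sequence

# Supported (major, minor) range after this change: 3.10 through 3.12.
SUPPORTED_MIN = (3, 10)
SUPPORTED_MAX = (3, 12)


def is_supported_python(version: Optional[Sequence[int]] = None) -> bool:
    """Return True if the given (or the running) interpreter is in the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX
```

Called without arguments, it checks the interpreter that is running the script.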
## Drivers

Binary file not shown.


@@ -1,128 +0,0 @@
@echo off
setlocal EnableExtensions EnableDelayedExpansion
@rem This script requires the user to install Python 3.10 or higher. All other
@rem requirements are downloaded as needed.
@rem change to the script's directory
PUSHD "%~dp0"
set "no_cache_dir=--no-cache-dir"
if "%1" == "use-cache" (
set "no_cache_dir="
)
@rem Config
@rem The version in the next line is replaced by an up to date release number
@rem when create_installer.sh is run. Change the release number there.
set INSTRUCTIONS=https://invoke-ai.github.io/InvokeAI/installation/INSTALL_AUTOMATED/
set TROUBLESHOOTING=https://invoke-ai.github.io/InvokeAI/help/FAQ/
set PYTHON_URL=https://www.python.org/downloads/windows/
set MINIMUM_PYTHON_VERSION=3.10.0
set PYTHON_URL=https://www.python.org/downloads/release/python-3109/
set err_msg=An error has occurred and the script could not continue.
@rem --------------------------- Intro -------------------------------
echo This script will install InvokeAI and its dependencies.
echo.
echo BEFORE YOU START PLEASE MAKE SURE TO DO THE FOLLOWING
echo 1. Install python 3.10 or 3.11. Python version 3.9 is no longer supported.
echo 2. Double-click on the file WinLongPathsEnabled.reg in order to
echo enable long path support on your system.
echo 3. Install the Visual C++ core libraries.
echo Please download and install the libraries from:
echo https://learn.microsoft.com/en-US/cpp/windows/latest-supported-vc-redist?view=msvc-170
echo.
echo See %INSTRUCTIONS% for more details.
echo.
echo FOR THE BEST USER EXPERIENCE WE SUGGEST MAXIMIZING THIS WINDOW NOW.
pause
@rem ---------------------------- check Python version ---------------
echo ***** Checking and Updating Python *****
call python --version >.tmp1 2>.tmp2
if %errorlevel% == 1 (
set err_msg=Please install Python 3.10 or 3.11. See %INSTRUCTIONS% for details.
goto err_exit
)
for /f "tokens=2" %%i in (.tmp1) do set python_version=%%i
if "%python_version%" == "" (
set err_msg=No python was detected on your system. Please install Python version %MINIMUM_PYTHON_VERSION% or higher. We recommend Python 3.10.12 from %PYTHON_URL%
goto err_exit
)
call :compareVersions %MINIMUM_PYTHON_VERSION% %python_version%
if %errorlevel% == 1 (
set err_msg=Your version of Python is too low. You need at least %MINIMUM_PYTHON_VERSION% but you have %python_version%. We recommend Python 3.10.12 from %PYTHON_URL%
goto err_exit
)
@rem Cleanup
del /q .tmp1 .tmp2
@rem -------------- Install and Configure ---------------
call python .\lib\main.py
pause
exit /b
@rem ------------------------ Subroutines ---------------
@rem routine to do comparison of semantic version numbers
@rem found at https://stackoverflow.com/questions/15807762/compare-version-numbers-in-batch-file
:compareVersions
::
:: Compares two version numbers and returns the result in the ERRORLEVEL
::
:: Returns 1 if version1 > version2
:: 0 if version1 = version2
:: -1 if version1 < version2
::
:: The nodes must be delimited by . or , or -
::
:: Nodes are normally strictly numeric, without a 0 prefix. A letter suffix
:: is treated as a separate node
::
setlocal enableDelayedExpansion
set "v1=%~1"
set "v2=%~2"
call :divideLetters v1
call :divideLetters v2
:loop
call :parseNode "%v1%" n1 v1
call :parseNode "%v2%" n2 v2
if %n1% gtr %n2% exit /b 1
if %n1% lss %n2% exit /b -1
if not defined v1 if not defined v2 exit /b 0
if not defined v1 exit /b -1
if not defined v2 exit /b 1
goto :loop
:parseNode version nodeVar remainderVar
for /f "tokens=1* delims=.,-" %%A in ("%~1") do (
set "%~2=%%A"
set "%~3=%%B"
)
exit /b
:divideLetters versionVar
for %%C in (a b c d e f g h i j k l m n o p q r s t u v w x y z) do set "%~1=!%~1:%%C=.%%C!"
exit /b
:err_exit
echo %err_msg%
echo The installer will exit now.
pause
exit /b
pause
:Trim
SetLocal EnableDelayedExpansion
set Params=%*
for /f "tokens=1*" %%a in ("!Params!") do EndLocal & set %1=%%b
exit /b
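The `:compareVersions` subroutine in the batch script above walks both version strings node by node and reports the result as 1, 0, or -1. A simplified Python sketch of the same idea, assuming purely numeric nodes (the letter-suffix handling of `:divideLetters` is left out); `compare_versions` is a hypothetical name.

```python
import re


def compare_versions(v1: str, v2: str) -> int:
    """Return 1 if v1 > v2, 0 if they are equal, -1 if v1 < v2.

    Nodes are split on ".", "," or "-", as in the batch subroutine.
    Assumes every node is a plain integer.
    """
    nodes1 = [int(node) for node in re.split(r"[.,-]", v1)]
    nodes2 = [int(node) for node in re.split(r"[.,-]", v2)]
    # List comparison is node by node; a shorter list that matches the
    # start of a longer one compares as smaller, like the batch loop.
    return (nodes1 > nodes2) - (nodes1 < nodes2)
```

The script calls the subroutine with the minimum version first, so a result of 1 means the detected Python is too old.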


@@ -1,40 +0,0 @@
#!/bin/bash
# make sure we are not already in a venv
# (don't need to check status)
deactivate >/dev/null 2>&1
scriptdir=$(dirname "$0")
cd "$scriptdir"
function version { echo "$@" | awk -F. '{ printf("%d%03d%03d%03d\n", $1,$2,$3,$4); }'; }
MINIMUM_PYTHON_VERSION=3.10.0
MAXIMUM_PYTHON_VERSION=3.11.100
PYTHON=""
for candidate in python3.11 python3.10 python3 python ; do
if ppath=`which $candidate 2>/dev/null`; then
# when using `pyenv`, the executable for an inactive Python version will exist but will not be operational
# we check that this found executable can actually run
if [ $($candidate --version &>/dev/null; echo ${PIPESTATUS}) -gt 0 ]; then continue; fi
python_version=$($ppath -V | awk '{ print $2 }')
if [ $(version $python_version) -ge $(version "$MINIMUM_PYTHON_VERSION") ]; then
if [ $(version $python_version) -le $(version "$MAXIMUM_PYTHON_VERSION") ]; then
PYTHON=$ppath
break
fi
fi
fi
done
if [ -z "$PYTHON" ]; then
echo "A suitable Python interpreter could not be found"
echo "Please install Python $MINIMUM_PYTHON_VERSION or higher (maximum $MAXIMUM_PYTHON_VERSION) before running this script. See instructions at $INSTRUCTIONS for help."
read -p "Press any key to exit"
exit -1
fi
echo "For the best user experience we suggest enlarging or maximizing this window now."
exec "$PYTHON" ./lib/main.py "$@"
read -p "Press any key to exit"
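The `version` helper in the shell script above packs a dotted version into one zero-padded integer so that two versions can be compared with `-ge` and `-le`. A Python sketch of the same encoding, assuming purely numeric components; `version_key` and `in_supported_range` are hypothetical names.

```python
def version_key(version: str) -> int:
    """Pack up to four dotted components into one integer, like the awk printf."""
    parts = [int(part) for part in version.split(".")][:4]
    parts += [0] * (4 - len(parts))  # missing components count as 0
    major, minor, patch, extra = parts
    return int(f"{major}{minor:03d}{patch:03d}{extra:03d}")


MINIMUM_PYTHON_VERSION = "3.10.0"
MAXIMUM_PYTHON_VERSION = "3.11.100"


def in_supported_range(version: str) -> bool:
    """Mirror the two nested range checks in the candidate loop."""
    low = version_key(MINIMUM_PYTHON_VERSION)
    high = version_key(MAXIMUM_PYTHON_VERSION)
    return low <= version_key(version) <= high
```

The padding gives each component three digits, so the encoding only orders versions correctly while every component after the first stays below 1000.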


@@ -1,438 +0,0 @@
# Copyright (c) 2023 Eugene Brodsky (https://github.com/ebr)
"""
InvokeAI installer script
"""
import locale
import os
import platform
import re
import shutil
import subprocess
import sys
import venv
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Optional, Tuple
SUPPORTED_PYTHON = ">=3.10.0,<=3.11.100"
INSTALLER_REQS = ["rich", "semver", "requests", "plumbum", "prompt-toolkit"]
BOOTSTRAP_VENV_PREFIX = "invokeai-installer-tmp"
DOCS_URL = "https://invoke-ai.github.io/InvokeAI/"
DISCORD_URL = "https://discord.gg/ZmtBAhwWhy"
OS = platform.uname().system
ARCH = platform.uname().machine
VERSION = "latest"
def get_version_from_wheel_filename(wheel_filename: str) -> str:
match = re.search(r"-(\d+\.\d+\.\d+)", wheel_filename)
if match:
version = match.group(1)
return version
else:
raise ValueError(f"Could not extract version from wheel filename: {wheel_filename}")
class Installer:
"""
Deploys an InvokeAI installation into a given path
"""
reqs: list[str] = INSTALLER_REQS
def __init__(self) -> None:
if os.getenv("VIRTUAL_ENV") is not None:
print("A virtual environment is already activated. Please 'deactivate' before installation.")
sys.exit(-1)
self.bootstrap()
self.available_releases = get_github_releases()
def mktemp_venv(self) -> TemporaryDirectory[str]:
"""
Creates a temporary virtual environment for the installer itself
:return: path to the created virtual environment directory
:rtype: TemporaryDirectory
"""
# Cleaning up temporary directories on Windows results in a race condition
# and a stack trace.
# `ignore_cleanup_errors` was only added in Python 3.10
if OS == "Windows" and int(platform.python_version_tuple()[1]) >= 10:
venv_dir = TemporaryDirectory(prefix=BOOTSTRAP_VENV_PREFIX, ignore_cleanup_errors=True)
else:
venv_dir = TemporaryDirectory(prefix=BOOTSTRAP_VENV_PREFIX)
venv.create(venv_dir.name, with_pip=True)
self.venv_dir = venv_dir
set_sys_path(Path(venv_dir.name))
return venv_dir
def bootstrap(self, verbose: bool = False) -> TemporaryDirectory[str] | None:
"""
Bootstrap the installer venv with packages required at install time
"""
print("Initializing the installer. This may take a minute - please wait...")
venv_dir = self.mktemp_venv()
pip = get_pip_from_venv(Path(venv_dir.name))
cmd = [pip, "install", "--require-virtualenv", "--use-pep517"]
cmd.extend(self.reqs)
try:
# upgrade pip to the latest version to avoid a confusing message
res = upgrade_pip(Path(venv_dir.name))
if verbose:
print(res)
# run the install prerequisites installation
res = subprocess.check_output(cmd).decode()
if verbose:
print(res)
return venv_dir
except subprocess.CalledProcessError as e:
print(e)
def app_venv(self, venv_parent: Path) -> Path:
"""
Create a virtualenv for the InvokeAI installation
"""
venv_dir = venv_parent / ".venv"
# Prefer to copy python executables
# so that updates to system python don't break InvokeAI
try:
venv.create(venv_dir, with_pip=True)
# If installing over an existing environment previously created with symlinks,
# the executables will fail to copy. Keep symlinks in that case
except shutil.SameFileError:
venv.create(venv_dir, with_pip=True, symlinks=True)
return venv_dir
def install(
self,
root: str = "~/invokeai",
yes_to_all: bool = False,
find_links: Optional[str] = None,
wheel: Optional[Path] = None,
) -> None:
"""Install the InvokeAI application into the given runtime path
Args:
root: Destination path for the installation
yes_to_all: Accept defaults to all questions
find_links: A local directory to search for requirement wheels before going to remote indexes
wheel: A wheel file to install
"""
import messages
if wheel:
messages.installing_from_wheel(wheel.name)
version = get_version_from_wheel_filename(wheel.name)
else:
messages.welcome(self.available_releases)
version = messages.choose_version(self.available_releases)
auto_dest = Path(os.environ.get("INVOKEAI_ROOT", root)).expanduser().resolve()
destination = auto_dest if yes_to_all else messages.dest_path(root)
if destination is None:
print("Could not find or create the destination directory. Installation cancelled.")
sys.exit(0)
# create the venv for the app
self.venv = self.app_venv(venv_parent=destination)
self.instance = InvokeAiInstance(runtime=destination, venv=self.venv, version=version)
# install dependencies and the InvokeAI application
(extra_index_url, optional_modules) = get_torch_source() if not yes_to_all else (None, None)
self.instance.install(extra_index_url, optional_modules, find_links, wheel)
# install the launch/update scripts into the runtime directory
self.instance.install_user_scripts()
message = f"""
*** Installation Successful ***
To start the application, run:
{destination}/invoke.{"bat" if sys.platform == "win32" else "sh"}
For more information, troubleshooting and support, visit our docs at:
{DOCS_URL}
Join the community on Discord:
{DISCORD_URL}
"""
print(message)
class InvokeAiInstance:
"""
Manages an installed instance of InvokeAI, comprising a virtual environment and a runtime directory.
The virtual environment *may* reside within the runtime directory.
A single runtime directory *may* be shared by multiple virtual environments, though this isn't currently tested or supported.
"""
def __init__(self, runtime: Path, venv: Path, version: str = "stable") -> None:
self.runtime = runtime
self.venv = venv
self.pip = get_pip_from_venv(venv)
self.version = version
set_sys_path(venv)
os.environ["INVOKEAI_ROOT"] = str(self.runtime.expanduser().resolve())
os.environ["VIRTUAL_ENV"] = str(self.venv.expanduser().resolve())
upgrade_pip(venv)
def get(self) -> tuple[Path, Path]:
"""
Get the location of the virtualenv directory for this installation
:return: Paths of the runtime and the venv directory
:rtype: tuple[Path, Path]
"""
return (self.runtime, self.venv)
def install(
self,
extra_index_url: Optional[str] = None,
optional_modules: Optional[str] = None,
find_links: Optional[str] = None,
wheel: Optional[Path] = None,
):
"""Install the package from PyPi or a wheel, if provided.
Args:
extra_index_url: the "--extra-index-url ..." line for pip to look in extra indexes.
optional_modules: optional modules to install using "[module1,module2]" format.
find_links: path to a directory containing wheels to be searched prior to going to the internet
wheel: a wheel file to install
"""
import messages
# not currently used, but may be useful for "install most recent version" option
if self.version == "prerelease":
version = None
pre_flag = "--pre"
elif self.version == "stable":
version = None
pre_flag = None
else:
version = self.version
pre_flag = None
src = "invokeai"
if optional_modules:
src += optional_modules
if version:
src += f"=={version}"
messages.simple_banner("Installing the InvokeAI Application :art:")
from plumbum import FG, ProcessExecutionError, local
pip = local[self.pip]
# Uninstall xformers if it is present; the correct version of it will be reinstalled if needed
_ = pip["uninstall", "-yqq", "xformers"] & FG
pipeline = pip[
"install",
"--require-virtualenv",
"--force-reinstall",
"--use-pep517",
str(src) if not wheel else str(wheel),
"--find-links" if find_links is not None else None,
find_links,
"--extra-index-url" if extra_index_url is not None else None,
extra_index_url,
pre_flag if not wheel else None, # Ignore the flag if we are installing a wheel
]
try:
_ = pipeline & FG
except ProcessExecutionError as e:
print(f"Error: {e}")
print(
"Could not install InvokeAI. Please try downloading the latest version of the installer and install again."
)
sys.exit(1)
def install_user_scripts(self):
"""
Copy the launch and update scripts to the runtime dir
"""
ext = "bat" if OS == "Windows" else "sh"
scripts = ["invoke"]
for script in scripts:
src = Path(__file__).parent / ".." / "templates" / f"{script}.{ext}.in"
dest = self.runtime / f"{script}.{ext}"
shutil.copy(src, dest)
os.chmod(dest, 0o0755)
### Utility functions ###
def get_pip_from_venv(venv_path: Path) -> str:
"""
Given a path to a virtual environment, get the absolute path to the `pip` executable
in a cross-platform fashion. Does not validate that the pip executable
actually exists in the virtualenv.
:param venv_path: Path to the virtual environment
:type venv_path: Path
:return: Absolute path to the pip executable
:rtype: str
"""
pip = "Scripts\\pip.exe" if OS == "Windows" else "bin/pip"
return str(venv_path.expanduser().resolve() / pip)
def upgrade_pip(venv_path: Path) -> str | None:
"""
Upgrade the pip executable in the given virtual environment
"""
python = "Scripts\\python.exe" if OS == "Windows" else "bin/python"
python = str(venv_path.expanduser().resolve() / python)
try:
result = subprocess.check_output([python, "-m", "pip", "install", "--upgrade", "pip"]).decode(
encoding=locale.getpreferredencoding()
)
except subprocess.CalledProcessError as e:
print(e)
result = None
return result
def set_sys_path(venv_path: Path) -> None:
"""
Given a path to a virtual environment, set the sys.path, in a cross-platform fashion,
such that packages from the given venv may be imported in the current process.
Ensure that the packages from system environment are not visible (emulate
the virtual env 'activate' script) - this doesn't work on Windows yet.
:param venv_path: Path to the virtual environment
:type venv_path: Path
"""
# filter out any paths in sys.path that may be system- or user-wide
# but leave the temporary bootstrap virtualenv as it contains packages we
# temporarily need at install time
sys.path = list(filter(lambda p: not p.endswith("-packages") or p.find(BOOTSTRAP_VENV_PREFIX) != -1, sys.path))
# determine site-packages/lib directory location for the venv
lib = "Lib" if OS == "Windows" else f"lib/python{sys.version_info.major}.{sys.version_info.minor}"
# add the site-packages location to the venv
sys.path.append(str(Path(venv_path, lib, "site-packages").expanduser().resolve()))
def get_github_releases() -> tuple[list[str], list[str]] | None:
"""
Query Github for published (pre-)release versions.
Return a tuple where the first element is a list of stable releases and the second element is a list of pre-releases.
Return None if the query fails for any reason.
"""
import requests
## get latest releases using github api
url = "https://api.github.com/repos/invoke-ai/InvokeAI/releases"
releases: list[str] = []
pre_releases: list[str] = []
try:
res = requests.get(url)
res.raise_for_status()
tag_info = res.json()
for tag in tag_info:
if not tag["prerelease"]:
releases.append(tag["tag_name"].lstrip("v"))
else:
pre_releases.append(tag["tag_name"].lstrip("v"))
except requests.HTTPError as e:
print(f"Error: {e}")
print("Could not fetch version information from GitHub. Please check your network connection and try again.")
return
except Exception as e:
print(f"Error: {e}")
print("An unexpected error occurred while trying to fetch version information from GitHub. Please try again.")
return
releases.sort(reverse=True)
pre_releases.sort(reverse=True)
return releases, pre_releases
def get_torch_source() -> Tuple[str | None, str | None]:
"""
Determine the extra index URL for pip to use for torch installation.
This depends on the OS and the graphics accelerator in use.
This is only applicable to Windows and Linux, since PyTorch does not
offer accelerated builds for macOS.
Prefer CUDA-enabled wheels if the user wasn't sure of their GPU, as it will fall back to CPU if possible.
A return value of None for the URL means use the default PyPI index.
:return: tuple consisting of (extra index url or None, optional modules to load or None)
:rtype: tuple
"""
from messages import GpuType, select_gpu
# device is one of the GpuType members: CUDA_WITH_XFORMERS, CUDA, ROCM, CPU
device = select_gpu()
# The correct extra index URLs for torch are inconsistent, see https://pytorch.org/get-started/locally/#start-locally
url = None
optional_modules: str | None = None
if OS == "Linux":
if device == GpuType.ROCM:
url = "https://download.pytorch.org/whl/rocm6.1"
elif device == GpuType.CPU:
url = "https://download.pytorch.org/whl/cpu"
elif device == GpuType.CUDA:
url = "https://download.pytorch.org/whl/cu124"
optional_modules = "[onnx-cuda]"
elif device == GpuType.CUDA_WITH_XFORMERS:
url = "https://download.pytorch.org/whl/cu124"
optional_modules = "[xformers,onnx-cuda]"
elif OS == "Windows":
if device == GpuType.CUDA:
url = "https://download.pytorch.org/whl/cu124"
optional_modules = "[onnx-cuda]"
elif device == GpuType.CUDA_WITH_XFORMERS:
url = "https://download.pytorch.org/whl/cu124"
optional_modules = "[xformers,onnx-cuda]"
elif device.value == "cpu":
# CPU uses the default PyPi index, no optional modules
pass
elif OS == "Darwin":
# macOS uses the default PyPi index, no optional modules
pass
# Fall back to defaults
return (url, optional_modules)
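`get_github_releases` in the file above orders the tag lists with `sort(reverse=True)`, which compares them as plain strings, so `5.9.1` lands ahead of `5.10.0`. A sketch of a numeric sort key, assuming tags start with three dotted integers; `release_key` is a hypothetical helper, not part of the installer.

```python
import re
from typing import Tuple


def release_key(tag: str) -> Tuple[int, int, int, str]:
    """Sort key: numeric major/minor/patch, then any trailing suffix as text."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(.*)", tag)
    if match is None:
        # Unparseable tags sort last in a newest-first listing.
        return (0, 0, 0, tag)
    major, minor, patch, suffix = match.groups()
    return (int(major), int(minor), int(patch), suffix)


releases = ["5.9.1", "5.10.0", "5.0.0"]
by_string = sorted(releases, reverse=True)
by_version = sorted(releases, key=release_key, reverse=True)
```

Here `by_string` keeps `5.9.1` first, while `by_version` puts `5.10.0` first.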


@@ -1,57 +0,0 @@
"""
InvokeAI Installer
"""
import argparse
import os
from pathlib import Path
from installer import Installer
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"-r",
"--root",
dest="root",
type=str,
help="Destination path for installation",
default=os.environ.get("INVOKEAI_ROOT") or "~/invokeai",
)
parser.add_argument(
"-y",
"--yes",
"--yes-to-all",
dest="yes_to_all",
action="store_true",
help="Assume default answers to all questions",
default=False,
)
parser.add_argument(
"--find-links",
dest="find_links",
help="Specifies a directory of local wheel files to be searched prior to searching the online repositories.",
type=Path,
default=None,
)
parser.add_argument(
"--wheel",
dest="wheel",
help="Specifies a wheel for the InvokeAI package. Used for troubleshooting or testing prereleases.",
type=Path,
default=None,
)
args = parser.parse_args()
inst = Installer()
try:
inst.install(**args.__dict__)
except KeyboardInterrupt:
print("\n")
print("Ctrl-C pressed. Aborting.")
print("Come back soon!")


@@ -1,342 +0,0 @@
# Copyright (c) 2023 Eugene Brodsky (https://github.com/ebr)
"""
Installer user interaction
"""
import os
import platform
from enum import Enum
from pathlib import Path
from typing import Optional
from prompt_toolkit import prompt
from prompt_toolkit.completion import FuzzyWordCompleter, PathCompleter
from prompt_toolkit.validation import Validator
from rich import box, print
from rich.console import Console, Group, group
from rich.panel import Panel
from rich.prompt import Confirm
from rich.style import Style
from rich.syntax import Syntax
from rich.text import Text
OS = platform.uname().system
ARCH = platform.uname().machine
if OS == "Windows":
# Windows terminals look better without a background colour
console = Console(style=Style(color="grey74"))
else:
console = Console(style=Style(color="grey74", bgcolor="grey19"))
def welcome(available_releases: tuple[list[str], list[str]] | None = None) -> None:
@group()
def text():
if (platform_specific := _platform_specific_help()) is not None:
yield platform_specific
yield ""
yield Text.from_markup(
"Some of the installation steps take a long time to run. Please be patient. If the script appears to hang for more than 10 minutes, please interrupt with [i]Control-C[/] and retry.",
justify="center",
)
if available_releases is not None:
latest_stable = available_releases[0][0]
last_pre = available_releases[1][0]
yield ""
yield Text.from_markup(
f"[red3]🠶[/] Latest stable release (recommended): [b bright_white]{latest_stable}", justify="center"
)
yield Text.from_markup(
f"[red3]🠶[/] Last published pre-release version: [b bright_white]{last_pre}", justify="center"
)
console.rule()
print(
Panel(
title="[bold wheat1]Welcome to the InvokeAI Installer",
renderable=text(),
box=box.DOUBLE,
expand=True,
padding=(1, 2),
style=Style(bgcolor="grey23", color="orange1"),
subtitle=f"[bold grey39]{OS}-{ARCH}",
)
)
console.line()
def installing_from_wheel(wheel_filename: str) -> None:
"""Display a message about installing from a wheel"""
@group()
def text():
yield Text.from_markup(f"You are installing from a wheel file: [bold]{wheel_filename}\n")
yield Text.from_markup(
"[bold orange3]If you are not sure why you are doing this, you should cancel and install InvokeAI normally."
)
console.print(
Panel(
title="Installing from Wheel",
renderable=text(),
box=box.DOUBLE,
expand=True,
padding=(1, 2),
)
)
should_proceed = Confirm.ask("Do you want to proceed?")
if not should_proceed:
console.print("Installation cancelled.")
exit()
def choose_version(available_releases: tuple[list[str], list[str]] | None = None) -> str:
"""
Prompt the user to choose an Invoke version to install
"""
# short circuit if we couldn't get a version list
# still try to install the latest stable version
if available_releases is None:
return "stable"
console.print(":grey_question: [orange3]Please choose an Invoke version to install.")
choices = available_releases[0] + available_releases[1]
response = prompt(
message=f" <Enter> to install the recommended release ({choices[0]}). <Tab> or type to pick a version: ",
complete_while_typing=True,
completer=FuzzyWordCompleter(choices),
)
console.print(f" Version {choices[0] if response == '' else response} will be installed.")
console.line()
return "stable" if response == "" else response
def confirm_install(dest: Path) -> bool:
if dest.exists():
print(f":stop_sign: Directory {dest} already exists!")
print(" Is this location correct?")
default = False
else:
print(f":file_folder: InvokeAI will be installed in {dest}")
default = True
dest_confirmed = Confirm.ask(" Please confirm:", default=default)
console.line()
return dest_confirmed
def dest_path(dest: Optional[str | Path] = None) -> Path | None:
"""
Prompt the user for the destination path and create the path
:param dest: a filesystem path, defaults to None
:type dest: str, optional
:return: absolute path to the created installation directory
:rtype: Path
"""
if dest is not None:
dest = Path(dest).expanduser().resolve()
else:
dest = Path.cwd().expanduser().resolve()
prev_dest = init_path = dest
dest_confirmed = False
while not dest_confirmed:
browse_start = (dest or Path.cwd()).expanduser().resolve()
path_completer = PathCompleter(
only_directories=True,
expanduser=True,
get_paths=lambda: [str(browse_start)], # noqa: B023
# get_paths=lambda: [".."].extend(list(browse_start.iterdir()))
)
console.line()
console.print(f":grey_question: [orange3]Please select the install destination:[/] \\[{browse_start}]: ")
selected = prompt(
">>> ",
complete_in_thread=True,
completer=path_completer,
default=str(browse_start) + os.sep,
vi_mode=True,
complete_while_typing=True,
# Test that this is not needed on Windows
# complete_style=CompleteStyle.READLINE_LIKE,
)
prev_dest = dest
dest = Path(selected)
console.line()
dest_confirmed = confirm_install(dest.expanduser().resolve())
if not dest_confirmed:
dest = prev_dest
dest = dest.expanduser().resolve()
try:
dest.mkdir(exist_ok=True, parents=True)
return dest
except PermissionError:
console.print(
f"Failed to create directory {dest} due to insufficient permissions",
style=Style(color="red"),
highlight=True,
)
except OSError:
console.print_exception()
if Confirm.ask("Would you like to try again?"):
return dest_path(init_path)
else:
console.rule("Goodbye!")
class GpuType(Enum):
CUDA_WITH_XFORMERS = "xformers"
CUDA = "cuda"
ROCM = "rocm"
CPU = "cpu"
def select_gpu() -> GpuType:
"""
Prompt the user to select the GPU driver
"""
if ARCH == "arm64" and OS != "Darwin":
print(f"Only CPU acceleration is available on {ARCH} architecture. Proceeding with that.")
return GpuType.CPU
nvidia = (
"an [gold1 b]NVIDIA[/] RTX 3060 or newer GPU using CUDA",
GpuType.CUDA,
)
vintage_nvidia = (
"an [gold1 b]NVIDIA[/] RTX 20xx or older GPU using CUDA+xFormers",
GpuType.CUDA_WITH_XFORMERS,
)
amd = (
"an [gold1 b]AMD[/] GPU using ROCm",
GpuType.ROCM,
)
cpu = (
"Do not install any GPU support, use CPU for generation (slow)",
GpuType.CPU,
)
options = []
if OS == "Windows":
options = [nvidia, vintage_nvidia, cpu]
if OS == "Linux":
options = [nvidia, vintage_nvidia, amd, cpu]
elif OS == "Darwin":
options = [cpu]
if len(options) == 1:
return options[0][1]
options = {str(i): opt for i, opt in enumerate(options, 1)}
console.rule(":space_invader: GPU (Graphics Card) selection :space_invader:")
console.print(
Panel(
Group(
"\n".join(
[
f"Detected the [gold1]{OS}-{ARCH}[/] platform",
"",
"See [deep_sky_blue1]https://invoke-ai.github.io/InvokeAI/installation/requirements/[/] to ensure your system meets the minimum requirements.",
"",
"[red3]🠶[/] [b]Your GPU drivers must be correctly installed before using InvokeAI![/] [red3]🠴[/]",
]
),
"",
"Please select the type of GPU installed in your computer.",
Panel(
"\n".join([f"[dark_goldenrod b i]{i}[/] [dark_red]🢒[/]{opt[0]}" for (i, opt) in options.items()]),
box=box.MINIMAL,
),
),
box=box.MINIMAL,
padding=(1, 1),
)
)
choice = prompt(
"Please make your selection: ",
validator=Validator.from_callable(
lambda n: n in options.keys(), error_message="Please select one of the above options"
),
)
return options[choice][1]
def simple_banner(message: str) -> None:
"""
A simple banner with a message, defined here for styling consistency
:param message: The message to display
:type message: str
"""
console.rule(message)
# TODO this does not yet work correctly
def windows_long_paths_registry() -> None:
"""
Display a message about applying the Windows long paths registry fix
"""
with open(str(Path(__file__).parent / "WinLongPathsEnabled.reg"), "r", encoding="utf-16le") as code:
syntax = Syntax(code.read(), line_numbers=True, lexer="regedit")
console.print(
Panel(
Group(
"\n".join(
[
"We will now apply a registry fix to enable long paths on Windows. InvokeAI needs this to function correctly. We are asking your permission to modify the Windows Registry on your behalf.",
"",
"This is the change that will be applied:",
str(syntax),
]
)
),
title="Windows Long Paths registry fix",
box=box.HORIZONTALS,
padding=(1, 1),
)
)
def _platform_specific_help() -> Text | None:
if OS == "Darwin":
text = Text.from_markup(
"""[b wheat1]macOS Users![/]\n\nPlease be sure you have the [b wheat1]Xcode command-line tools[/] installed before continuing.\nIf not, cancel with [i]Control-C[/] and follow the Xcode install instructions at [deep_sky_blue1]https://www.freecodecamp.org/news/install-xcode-command-line-tools/[/]."""
)
elif OS == "Windows":
text = Text.from_markup(
"""[b wheat1]Windows Users![/]\n\nBefore you start, please do the following:
1. Double-click on the file [b wheat1]WinLongPathsEnabled.reg[/] in order to
enable long path support on your system.
2. Make sure you have the [b wheat1]Visual C++ core libraries[/] installed. If not, install from
[deep_sky_blue1]https://learn.microsoft.com/en-US/cpp/windows/latest-supported-vc-redist?view=msvc-170[/]"""
)
else:
return
return text


@@ -1,52 +0,0 @@
InvokeAI
Project homepage: https://github.com/invoke-ai/InvokeAI
Preparations:
You will need to install Python 3.10 or higher for this installer
to work. Instructions are given here:
https://invoke-ai.github.io/InvokeAI/installation/INSTALL_AUTOMATED/
Before you start the installer, please open up your system's command
line window (Terminal or Command) and type the commands:
python --version
If all is well, it will print "Python 3.X.X", where the version number
is at least 3.10.*, and not higher than 3.11.*.
If this works, check the version of the Python package manager, pip:
pip --version
You should get a message that indicates that the pip package
installer was derived from Python 3.10 or 3.11. For example:
"pip 22.0.1 from /usr/bin/pip (python 3.10)"
Long Paths on Windows:
If you are on Windows, you will need to enable Windows Long Paths to
run InvokeAI successfully. If you're not sure what this is, you
almost certainly need to do this.
Simply double-click the "WinLongPathsEnabled.reg" file located in
this directory, and approve the Windows warnings. Note that you will
need to have admin privileges in order to do this.
Launching the installer:
Windows: double-click the 'install.bat' file (while keeping it inside
the InvokeAI-Installer folder).
Linux and Mac: Please open the terminal application and run
'./install.sh' (while keeping it inside the InvokeAI-Installer
folder).
The installer will create a directory of your choice and install the
InvokeAI application within it. This directory contains everything you need to run
invokeai. Once InvokeAI is up and running, you may delete the
InvokeAI-Installer folder at your convenience.
For more information, please see
https://invoke-ai.github.io/InvokeAI/installation/INSTALL_AUTOMATED/


@@ -1,54 +0,0 @@
@echo off
PUSHD "%~dp0"
setlocal
call .venv\Scripts\activate.bat
set INVOKEAI_ROOT=.
:start
echo Desired action:
echo 1. Generate images with the browser-based interface
echo 2. Open the developer console
echo 3. Command-line help
echo Q - Quit
echo.
echo To update, download and run the installer from https://github.com/invoke-ai/InvokeAI/releases/latest
echo.
set /P choice="Please enter 1-3, Q: [1] "
if not defined choice set choice=1
IF /I "%choice%" == "1" (
echo Starting the InvokeAI browser-based UI..
python .venv\Scripts\invokeai-web.exe %*
) ELSE IF /I "%choice%" == "2" (
echo Developer Console
echo Python command is:
where python
echo Python version is:
python --version
echo *************************
echo You are now in the system shell, with the local InvokeAI Python virtual environment activated,
echo so that you can troubleshoot this InvokeAI installation as necessary.
echo *************************
echo *** Type `exit` to quit this shell and deactivate the Python virtual environment ***
call cmd /k
) ELSE IF /I "%choice%" == "3" (
echo Displaying command line help...
python .venv\Scripts\invokeai-web.exe --help %*
pause
exit /b
) ELSE IF /I "%choice%" == "q" (
echo Goodbye!
goto ending
) ELSE (
echo Invalid selection
pause
exit /b
)
goto start
endlocal
pause
:ending
exit /b


@@ -1,87 +0,0 @@
#!/bin/bash
# MIT License
# Coauthored by Lincoln Stein, Eugene Brodsky and Joshua Kimsey
# Copyright 2023, The InvokeAI Development Team
####
# This launch script assumes that:
# 1. it is located in the runtime directory,
# 2. the .venv is also located in the runtime directory and is named exactly that
#
# If either assumption does not hold, this script will likely not work as intended.
# In that case, activate the virtual environment and run `invoke.py` directly.
####
set -eu
# Ensure we're in the correct folder in case user's CWD is somewhere else
scriptdir=$(dirname "$(readlink -f "$0")")
cd "$scriptdir"
. .venv/bin/activate
export INVOKEAI_ROOT="$scriptdir"
# Stash the CLI args - when we prompt for user input, `$@` is overwritten
PARAMS=$@
# This setting allows torch to fall back to CPU for operations that are not supported by MPS on macOS.
if [ "$(uname -s)" == "Darwin" ]; then
export PYTORCH_ENABLE_MPS_FALLBACK=1
fi
# Primary function for the case statement to determine user input
do_choice() {
case $1 in
1)
clear
printf "Generate images with a browser-based interface\n"
invokeai-web $PARAMS
;;
2)
clear
printf "Open the developer console\n"
file_name=$(basename "${BASH_SOURCE[0]}")
bash --init-file "$file_name"
;;
3)
clear
printf "Command-line help\n"
invokeai-web --help
;;
*)
clear
printf "Exiting...\n"
exit
;;
esac
clear
}
# Command-line interface for launching Invoke functions
do_line_input() {
clear
printf "What would you like to do?\n"
printf "1: Generate images using the browser-based interface\n"
printf "2: Open the developer console\n"
printf "3: Command-line help\n"
printf "Q: Quit\n\n"
printf "To update, download and run the installer from https://github.com/invoke-ai/InvokeAI/releases/latest\n\n"
read -p "Please enter 1-3, Q: [1] " yn
choice=${yn:='1'}
do_choice $choice
clear
}
# Main IF statement for launching Invoke, and for checking if the user is in the developer console
if [ "$0" != "bash" ]; then
while true; do
do_line_input
done
else # in developer console
python --version
printf "Press ^D to exit\n"
export PS1="(InvokeAI) \u@\h \w> "
fi


@@ -37,7 +37,14 @@ from invokeai.app.services.style_preset_records.style_preset_records_sqlite impo
from invokeai.app.services.urls.urls_default import LocalUrlService
from invokeai.app.services.workflow_records.workflow_records_sqlite import SqliteWorkflowRecordsStorage
from invokeai.app.services.workflow_thumbnails.workflow_thumbnails_disk import WorkflowThumbnailFileStorageDisk
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import ConditioningFieldData
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import (
BasicConditioningInfo,
CogView4ConditioningInfo,
ConditioningFieldData,
FLUXConditioningInfo,
SD3ConditioningInfo,
SDXLConditioningInfo,
)
from invokeai.backend.util.logging import InvokeAILogger
from invokeai.version.invokeai_version import __version__
@@ -101,10 +108,25 @@ class ApiDependencies:
images = ImageService()
invocation_cache = MemoryInvocationCache(max_cache_size=config.node_cache_size)
tensors = ObjectSerializerForwardCache(
ObjectSerializerDisk[torch.Tensor](output_folder / "tensors", ephemeral=True)
ObjectSerializerDisk[torch.Tensor](
output_folder / "tensors",
safe_globals=[torch.Tensor],
ephemeral=True,
),
)
conditioning = ObjectSerializerForwardCache(
ObjectSerializerDisk[ConditioningFieldData](output_folder / "conditioning", ephemeral=True)
ObjectSerializerDisk[ConditioningFieldData](
output_folder / "conditioning",
safe_globals=[
ConditioningFieldData,
BasicConditioningInfo,
SDXLConditioningInfo,
FLUXConditioningInfo,
SD3ConditioningInfo,
CogView4ConditioningInfo,
],
ephemeral=True,
),
)
download_queue_service = DownloadQueueService(app_config=configuration, event_bus=events)
model_images_service = ModelImageFileStorageDisk(model_images_folder / "model_images")
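
Review note: the `safe_globals` arguments above list the classes each serializer expects to read back from disk. How `ObjectSerializerDisk` consumes that list is not shown in this hunk, so the following is only a minimal sketch of allowlisted deserialization in general, using the standard library's `pickle` as a stand-in for the real loader. `AllowlistUnpickler` and `safe_loads` are hypothetical names, and `OrderedDict` and `Decimal` are arbitrary example classes.

```python
import io
import pickle
from collections import OrderedDict
from decimal import Decimal


class AllowlistUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that only resolves explicitly allowed classes."""

    def __init__(self, file, safe_globals):
        super().__init__(file)
        # Key each allowed class by the (module, name) pair that pickle records.
        self._allowed = {(c.__module__, c.__qualname__): c for c in safe_globals}

    def find_class(self, module, name):
        # pickle calls this for every global the payload references.
        try:
            return self._allowed[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(f"{module}.{name} is not allowlisted") from None


def safe_loads(data, safe_globals):
    """Deserialize `data`, refusing any class not listed in `safe_globals`."""
    return AllowlistUnpickler(io.BytesIO(data), safe_globals).load()


# An allowlisted class round-trips normally.
restored = safe_loads(pickle.dumps(OrderedDict(a=1)), [OrderedDict])

# A class outside the allowlist is rejected before it is ever constructed.
try:
    safe_loads(pickle.dumps(Decimal("1.5")), [OrderedDict])
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

The point of passing the list at construction time, as the hunk does, is that each store declares up front what it is willing to load, so an unexpected class in a file on disk fails loudly instead of being instantiated.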


@@ -85,6 +85,7 @@ example_model_config = {
"config_path": "string",
"key": "string",
"hash": "string",
"file_size": 1,
"description": "string",
"source": "string",
"converted_at": 0,
@@ -892,6 +893,12 @@ class HFTokenHelper:
huggingface_hub.login(token=token, add_to_git_credential=False)
return cls.get_status()
@classmethod
def reset_token(cls) -> HFTokenStatus:
with SuppressOutput(), contextlib.suppress(Exception):
huggingface_hub.logout()
return cls.get_status()
@model_manager_router.get("/hf_login", operation_id="get_hf_login_status", response_model=HFTokenStatus)
async def get_hf_login_status() -> HFTokenStatus:
@@ -914,3 +921,8 @@ async def do_hf_login(
ApiDependencies.invoker.services.logger.warning("Unable to verify HF token")
return token_status
@model_manager_router.delete("/hf_login", operation_id="reset_hf_token", response_model=HFTokenStatus)
async def reset_hf_token() -> HFTokenStatus:
return HFTokenHelper.reset_token()
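
Review note: `reset_token` wraps the logout call in `contextlib.suppress(Exception)` and then reports the current status regardless of whether logout succeeded. A minimal sketch of that best-effort pattern follows. `FakeHub`, its methods, and the status strings are hypothetical stand-ins, not the `huggingface_hub` API.

```python
import contextlib


class FakeHub:
    """Hypothetical client whose logout may raise."""

    def __init__(self, token):
        self.token = token
        self.fail_on_logout = False

    def logout(self):
        if self.fail_on_logout:
            raise RuntimeError("no credentials stored")
        self.token = None

    def status(self):
        return "valid" if self.token else "unknown"


def reset_token(hub):
    # Best effort: a failed logout must not turn into a failed request.
    with contextlib.suppress(Exception):
        hub.logout()
    # Always report the state as it actually is after the attempt.
    return hub.status()
```

Because the status is re-read after the attempt, the caller sees whether the token is really gone rather than an assumption that logout worked.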


@@ -2,7 +2,7 @@ from typing import Optional
from fastapi import Body, Path, Query
from fastapi.routing import APIRouter
from pydantic import BaseModel
from pydantic import BaseModel, Field
from invokeai.app.api.dependencies import ApiDependencies
from invokeai.app.services.session_processor.session_processor_common import SessionProcessorStatus
@@ -15,6 +15,7 @@ from invokeai.app.services.session_queue.session_queue_common import (
CancelByDestinationResult,
ClearResult,
EnqueueBatchResult,
FieldIdentifier,
PruneResult,
RetryItemsResult,
SessionQueueCountsByDestination,
@@ -34,6 +35,12 @@ class SessionQueueAndProcessorStatus(BaseModel):
processor: SessionProcessorStatus
class ValidationRunData(BaseModel):
workflow_id: str = Field(description="The id of the workflow being published.")
input_fields: list[FieldIdentifier] = Body(description="The input fields for the published workflow")
output_fields: list[FieldIdentifier] = Body(description="The output fields for the published workflow")
@session_queue_router.post(
"/{queue_id}/enqueue_batch",
operation_id="enqueue_batch",
@@ -45,6 +52,10 @@ async def enqueue_batch(
queue_id: str = Path(description="The queue id to perform this operation on"),
batch: Batch = Body(description="Batch to process"),
prepend: bool = Body(default=False, description="Whether or not to prepend this batch in the queue"),
validation_run_data: Optional[ValidationRunData] = Body(
default=None,
description="The validation run data to use for this batch. This is only used if this is a validation run.",
),
) -> EnqueueBatchResult:
"""Processes a batch and enqueues the output graphs for execution."""


@@ -106,6 +106,7 @@ async def list_workflows(
tags: Optional[list[str]] = Query(default=None, description="The tags of workflow to get"),
query: Optional[str] = Query(default=None, description="The text to query by (matches name and description)"),
has_been_opened: Optional[bool] = Query(default=None, description="Whether to include/exclude recent workflows"),
is_published: Optional[bool] = Query(default=None, description="Whether to include/exclude published workflows"),
) -> PaginatedResults[WorkflowRecordListItemWithThumbnailDTO]:
"""Gets a page of workflows"""
workflows_with_thumbnails: list[WorkflowRecordListItemWithThumbnailDTO] = []
@@ -118,6 +119,7 @@ async def list_workflows(
categories=categories,
tags=tags,
has_been_opened=has_been_opened,
is_published=is_published,
)
for workflow in workflows.items:
workflows_with_thumbnails.append(


@@ -5,9 +5,12 @@ from __future__ import annotations
import inspect
import re
import sys
import types
import typing
import warnings
from abc import ABC, abstractmethod
from enum import Enum
from functools import lru_cache
from inspect import signature
from typing import (
TYPE_CHECKING,
@@ -19,15 +22,16 @@ from typing import (
Literal,
Optional,
Type,
TypedDict,
TypeVar,
Union,
cast,
)
import semver
from pydantic import BaseModel, ConfigDict, Field, TypeAdapter, create_model
from pydantic import BaseModel, ConfigDict, Field, JsonValue, TypeAdapter, create_model
from pydantic.fields import FieldInfo
from pydantic_core import PydanticUndefined
from typing_extensions import TypeAliasType
from invokeai.app.invocations.fields import (
FieldKind,
@@ -72,13 +76,24 @@ class Classification(str, Enum, metaclass=MetaEnum):
Special = "special"
class Bottleneck(str, Enum, metaclass=MetaEnum):
"""
The bottleneck of an invocation.
- `Network`: The invocation's execution is network-bound.
- `GPU`: The invocation's execution is GPU-bound.
"""
Network = "network"
GPU = "gpu"
class UIConfigBase(BaseModel):
"""
Provides additional node configuration to the UI.
This is used internally by the @invocation decorator logic. Do not use this directly.
"""
tags: Optional[list[str]] = Field(default_factory=None, description="The node's tags")
tags: Optional[list[str]] = Field(default=None, description="The node's tags")
title: Optional[str] = Field(default=None, description="The node's display name")
category: Optional[str] = Field(default=None, description="The node's category")
version: str = Field(
@@ -93,6 +108,11 @@ class UIConfigBase(BaseModel):
)
class OriginalModelField(TypedDict):
annotation: Any
field_info: FieldInfo
class BaseInvocationOutput(BaseModel):
"""
Base class for all invocation outputs.
@@ -100,36 +120,11 @@ class BaseInvocationOutput(BaseModel):
All invocation outputs must use the `@invocation_output` decorator to provide their unique type.
"""
_output_classes: ClassVar[set[BaseInvocationOutput]] = set()
_typeadapter: ClassVar[Optional[TypeAdapter[Any]]] = None
_typeadapter_needs_update: ClassVar[bool] = False
@classmethod
def register_output(cls, output: BaseInvocationOutput) -> None:
"""Registers an invocation output."""
cls._output_classes.add(output)
cls._typeadapter_needs_update = True
@classmethod
def get_outputs(cls) -> Iterable[BaseInvocationOutput]:
"""Gets all invocation outputs."""
return cls._output_classes
@classmethod
def get_typeadapter(cls) -> TypeAdapter[Any]:
"""Gets a pydantic TypeAdapter for the union of all invocation output types."""
if not cls._typeadapter or cls._typeadapter_needs_update:
AnyInvocationOutput = TypeAliasType(
"AnyInvocationOutput", Annotated[Union[tuple(cls._output_classes)], Field(discriminator="type")]
)
cls._typeadapter = TypeAdapter(AnyInvocationOutput)
cls._typeadapter_needs_update = False
return cls._typeadapter
@classmethod
def get_output_types(cls) -> Iterable[str]:
"""Gets all invocation output types."""
return (i.get_type() for i in BaseInvocationOutput.get_outputs())
output_meta: Optional[dict[str, JsonValue]] = Field(
default=None,
description="Optional dictionary of metadata for the invocation output, unrelated to the invocation's actual output value. This is not exposed as an output field.",
json_schema_extra={"field_kind": FieldKind.NodeAttribute},
)
@staticmethod
def json_schema_extra(schema: dict[str, Any], model_class: Type[BaseInvocationOutput]) -> None:
@@ -146,6 +141,9 @@ class BaseInvocationOutput(BaseModel):
"""Gets the invocation output's type, as provided by the `@invocation_output` decorator."""
return cls.model_fields["type"].default
_original_model_fields: ClassVar[dict[str, OriginalModelField]] = {}
"""The original model fields, before any modifications were made by the @invocation_output decorator."""
model_config = ConfigDict(
protected_namespaces=(),
validate_assignment=True,
@@ -173,76 +171,16 @@ class BaseInvocation(ABC, BaseModel):
All invocations must use the `@invocation` decorator to provide their unique type.
"""
_invocation_classes: ClassVar[set[BaseInvocation]] = set()
_typeadapter: ClassVar[Optional[TypeAdapter[Any]]] = None
_typeadapter_needs_update: ClassVar[bool] = False
@classmethod
def get_type(cls) -> str:
"""Gets the invocation's type, as provided by the `@invocation` decorator."""
return cls.model_fields["type"].default
@classmethod
def register_invocation(cls, invocation: BaseInvocation) -> None:
"""Registers an invocation."""
cls._invocation_classes.add(invocation)
cls._typeadapter_needs_update = True
@classmethod
def get_typeadapter(cls) -> TypeAdapter[Any]:
"""Gets a pydantic TypeAdapter for the union of all invocation types."""
if not cls._typeadapter or cls._typeadapter_needs_update:
AnyInvocation = TypeAliasType(
"AnyInvocation", Annotated[Union[tuple(cls.get_invocations())], Field(discriminator="type")]
)
cls._typeadapter = TypeAdapter(AnyInvocation)
cls._typeadapter_needs_update = False
return cls._typeadapter
@classmethod
def invalidate_typeadapter(cls) -> None:
"""Invalidates the typeadapter, forcing it to be rebuilt on next access. If the invocation allowlist or
denylist is changed, this should be called to ensure the typeadapter is updated and validation respects
the updated allowlist and denylist."""
cls._typeadapter_needs_update = True
@classmethod
def get_invocations(cls) -> Iterable[BaseInvocation]:
"""Gets all invocations, respecting the allowlist and denylist."""
app_config = get_config()
allowed_invocations: set[BaseInvocation] = set()
for sc in cls._invocation_classes:
invocation_type = sc.get_type()
is_in_allowlist = (
invocation_type in app_config.allow_nodes if isinstance(app_config.allow_nodes, list) else True
)
is_in_denylist = (
invocation_type in app_config.deny_nodes if isinstance(app_config.deny_nodes, list) else False
)
if is_in_allowlist and not is_in_denylist:
allowed_invocations.add(sc)
return allowed_invocations
@classmethod
def get_invocations_map(cls) -> dict[str, BaseInvocation]:
"""Gets a map of all invocation types to their invocation classes."""
return {i.get_type(): i for i in BaseInvocation.get_invocations()}
@classmethod
def get_invocation_types(cls) -> Iterable[str]:
"""Gets all invocation types."""
return (i.get_type() for i in BaseInvocation.get_invocations())
@classmethod
def get_output_annotation(cls) -> BaseInvocationOutput:
"""Gets the invocation's output annotation (i.e. the return annotation of its `invoke()` method)."""
return signature(cls.invoke).return_annotation
@classmethod
def get_invocation_for_type(cls, invocation_type: str) -> BaseInvocation | None:
"""Gets the invocation class for a given invocation type."""
return cls.get_invocations_map().get(invocation_type)
@staticmethod
def json_schema_extra(schema: dict[str, Any], model_class: Type[BaseInvocation]) -> None:
"""Adds various UI-facing attributes to the invocation's OpenAPI schema."""
@@ -326,6 +264,8 @@ class BaseInvocation(ABC, BaseModel):
json_schema_extra={"field_kind": FieldKind.NodeAttribute},
)
bottleneck: ClassVar[Bottleneck]
UIConfig: ClassVar[UIConfigBase]
model_config = ConfigDict(
@@ -336,21 +276,163 @@ class BaseInvocation(ABC, BaseModel):
coerce_numbers_to_str=True,
)
_original_model_fields: ClassVar[dict[str, OriginalModelField]] = {}
"""The original model fields, before any modifications were made by the @invocation decorator."""
TBaseInvocation = TypeVar("TBaseInvocation", bound=BaseInvocation)
class InvocationRegistry:
_invocation_classes: ClassVar[set[type[BaseInvocation]]] = set()
_output_classes: ClassVar[set[type[BaseInvocationOutput]]] = set()
@classmethod
def register_invocation(cls, invocation: type[BaseInvocation]) -> None:
"""Registers an invocation."""
invocation_type = invocation.get_type()
node_pack = invocation.UIConfig.node_pack
# Log a warning when an existing invocation is being clobbered by the one we are registering
clobbered_invocation = InvocationRegistry.get_invocation_for_type(invocation_type)
if clobbered_invocation is not None:
# An invocation with this type is already registered - find out which node pack it came from
clobbered_node_pack = clobbered_invocation.UIConfig.node_pack
if clobbered_node_pack == "invokeai":
# The invocation being clobbered is a core invocation
logger.warning(f'Overriding core node "{invocation_type}" with node from "{node_pack}"')
else:
# The invocation being clobbered is a custom invocation
logger.warning(
f'Overriding node "{invocation_type}" from "{clobbered_node_pack}" with node from "{node_pack}"'
)
cls._invocation_classes.remove(clobbered_invocation)
cls._invocation_classes.add(invocation)
cls.invalidate_invocation_typeadapter()
@classmethod
@lru_cache(maxsize=1)
def get_invocation_typeadapter(cls) -> TypeAdapter[Any]:
"""Gets a pydantic TypeAdapter for the union of all invocation types.
This is used to parse serialized invocations into the correct invocation class.
This method is cached to avoid rebuilding the TypeAdapter on every access. If the invocation allowlist or
denylist is changed, the cache should be cleared to ensure the TypeAdapter is updated and validation respects
the updated allowlist and denylist.
@see https://docs.pydantic.dev/latest/concepts/type_adapter/
"""
return TypeAdapter(Annotated[Union[tuple(cls.get_invocation_classes())], Field(discriminator="type")])
@classmethod
def invalidate_invocation_typeadapter(cls) -> None:
"""Invalidates the cached invocation type adapter."""
cls.get_invocation_typeadapter.cache_clear()
@classmethod
def get_invocation_classes(cls) -> Iterable[type[BaseInvocation]]:
"""Gets all invocations, respecting the allowlist and denylist."""
app_config = get_config()
allowed_invocations: set[type[BaseInvocation]] = set()
for sc in cls._invocation_classes:
invocation_type = sc.get_type()
is_in_allowlist = (
invocation_type in app_config.allow_nodes if isinstance(app_config.allow_nodes, list) else True
)
is_in_denylist = (
invocation_type in app_config.deny_nodes if isinstance(app_config.deny_nodes, list) else False
)
if is_in_allowlist and not is_in_denylist:
allowed_invocations.add(sc)
return allowed_invocations
@classmethod
def get_invocations_map(cls) -> dict[str, type[BaseInvocation]]:
"""Gets a map of all invocation types to their invocation classes."""
return {i.get_type(): i for i in cls.get_invocation_classes()}
@classmethod
def get_invocation_types(cls) -> Iterable[str]:
"""Gets all invocation types."""
return (i.get_type() for i in cls.get_invocation_classes())
@classmethod
def get_invocation_for_type(cls, invocation_type: str) -> type[BaseInvocation] | None:
"""Gets the invocation class for a given invocation type."""
return cls.get_invocations_map().get(invocation_type)
@classmethod
def register_output(cls, output: "type[TBaseInvocationOutput]") -> None:
"""Registers an invocation output."""
output_type = output.get_type()
# Log a warning when an existing invocation output is being clobbered by the one we are registering
clobbered_output = InvocationRegistry.get_output_for_type(output_type)
if clobbered_output is not None:
# TODO(psyche): We do not record the node pack of the output, so we cannot log it here
logger.warning(f'Overriding invocation output "{output_type}"')
cls._output_classes.remove(clobbered_output)
cls._output_classes.add(output)
cls.invalidate_output_typeadapter()
@classmethod
def get_output_classes(cls) -> Iterable[type[BaseInvocationOutput]]:
"""Gets all invocation outputs."""
return cls._output_classes
@classmethod
def get_outputs_map(cls) -> dict[str, type[BaseInvocationOutput]]:
"""Gets a map of all output types to their output classes."""
return {i.get_type(): i for i in cls.get_output_classes()}
@classmethod
@lru_cache(maxsize=1)
def get_output_typeadapter(cls) -> TypeAdapter[Any]:
"""Gets a pydantic TypeAdapter for the union of all invocation output types.
This is used to parse serialized invocation outputs into the correct invocation output class.
This method is cached to avoid rebuilding the TypeAdapter on every access. The cache is cleared whenever an
output is registered, so the TypeAdapter always reflects the current set of output classes.
@see https://docs.pydantic.dev/latest/concepts/type_adapter/
"""
return TypeAdapter(Annotated[Union[tuple(cls._output_classes)], Field(discriminator="type")])
@classmethod
def invalidate_output_typeadapter(cls) -> None:
"""Invalidates the cached invocation output type adapter."""
cls.get_output_typeadapter.cache_clear()
@classmethod
def get_output_types(cls) -> Iterable[str]:
"""Gets all invocation output types."""
return (i.get_type() for i in cls.get_output_classes())
@classmethod
def get_output_for_type(cls, output_type: str) -> type[BaseInvocationOutput] | None:
"""Gets the output class for a given output type."""
return cls.get_outputs_map().get(output_type)
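
Review note: `InvocationRegistry` replaces the old `_typeadapter_needs_update` flag with `@lru_cache(maxsize=1)` on a classmethod plus an explicit `cache_clear()` on registration. The sketch below shows only that caching pattern, with pydantic left out. `TinyRegistry`, `get_lookup`, and `build_count` are hypothetical names introduced for illustration.

```python
from functools import lru_cache
from typing import ClassVar


class TinyRegistry:
    _classes: ClassVar[set[type]] = set()
    build_count: ClassVar[int] = 0  # only here to make rebuilds observable

    @classmethod
    def register(cls, item: type) -> None:
        cls._classes.add(item)
        # The cached lookup no longer reflects the registered set, so drop it.
        cls.get_lookup.cache_clear()

    @classmethod
    @lru_cache(maxsize=1)
    def get_lookup(cls) -> dict[str, type]:
        # Stands in for the expensive TypeAdapter construction.
        cls.build_count += 1
        return {c.__name__: c for c in cls._classes}


TinyRegistry.register(int)
first = TinyRegistry.get_lookup()
second = TinyRegistry.get_lookup()   # served from the cache, no rebuild
TinyRegistry.register(str)           # invalidates the cache
third = TinyRegistry.get_lookup()    # rebuilt with both classes
```

One consequence of this design is that anything that changes the result without going through `register`, such as editing an allowlist or denylist at runtime, has to call the invalidation method itself.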
RESERVED_NODE_ATTRIBUTE_FIELD_NAMES = {
"id",
"is_intermediate",
"use_cache",
"type",
"workflow",
"bottleneck",
}
RESERVED_INPUT_FIELD_NAMES = {"metadata", "board"}
RESERVED_OUTPUT_FIELD_NAMES = {"type"}
RESERVED_OUTPUT_FIELD_NAMES = {"type", "output_meta"}
class _Model(BaseModel):
@@ -422,6 +504,48 @@ def validate_fields(model_fields: dict[str, FieldInfo], model_type: str) -> None
return None
class NoDefaultSentinel:
pass
def validate_field_default(
cls_name: str, field_name: str, invocation_type: str, annotation: Any, field_info: FieldInfo
) -> None:
"""Validates the default value of a field against its pydantic field definition."""
assert isinstance(field_info.json_schema_extra, dict), "json_schema_extra is not a dict"
# By the time we are doing this, we've already done some pydantic magic by overriding the original default value.
# We store the original default value in the json_schema_extra dict, so we can validate it here.
orig_default = field_info.json_schema_extra.get("orig_default", NoDefaultSentinel)
if orig_default is NoDefaultSentinel:
return
# To validate the default value, we can create a temporary pydantic model with the field we are validating as its
# only field. Then validate the default value against this temporary model.
TempDefaultValidator = cast(BaseModel, create_model(cls_name, **{field_name: (annotation, field_info)}))
try:
TempDefaultValidator.model_validate({field_name: orig_default})
except Exception as e:
raise InvalidFieldError(
f'Default value for field "{field_name}" on invocation "{invocation_type}" is invalid, {e}'
) from e
def is_optional(annotation: Any) -> bool:
"""
Checks if the given annotation is optional (i.e. Optional[X], Union[X, None] or X | None).
"""
origin = typing.get_origin(annotation)
# PEP 604 unions (int|None) have origin types.UnionType
is_union = origin is typing.Union or origin is types.UnionType
if not is_union:
return False
return any(arg is type(None) for arg in typing.get_args(annotation))
def invocation(
invocation_type: str,
title: Optional[str] = None,
@@ -430,6 +554,7 @@ def invocation(
version: Optional[str] = None,
use_cache: Optional[bool] = True,
classification: Classification = Classification.Stable,
bottleneck: Bottleneck = Bottleneck.GPU,
) -> Callable[[Type[TBaseInvocation]], Type[TBaseInvocation]]:
"""
Registers an invocation.
@@ -441,6 +566,7 @@ def invocation(
:param Optional[str] version: Adds a version to the invocation. Must be a valid semver string. Defaults to None.
:param Optional[bool] use_cache: Whether or not to use the invocation cache. Defaults to True. The user may override this in the workflow editor.
:param Classification classification: The classification of the invocation. Defaults to Classification.Stable. Use Beta or Prototype if the invocation is unstable.
:param Bottleneck bottleneck: The bottleneck of the invocation. Defaults to Bottleneck.GPU. Use Network if the invocation is network-bound.
"""
def wrapper(cls: Type[TBaseInvocation]) -> Type[TBaseInvocation]:
@@ -452,27 +578,26 @@ def invocation(
# The node pack is the module name - will be "invokeai" for built-in nodes
node_pack = cls.__module__.split(".")[0]
# Handle the case where an existing node is being clobbered by the one we are registering
if invocation_type in BaseInvocation.get_invocation_types():
clobbered_invocation = BaseInvocation.get_invocation_for_type(invocation_type)
# This should always be true - we just checked if the invocation type was in the set
assert clobbered_invocation is not None
clobbered_node_pack = clobbered_invocation.UIConfig.node_pack
if clobbered_node_pack == "invokeai":
# The node being clobbered is a core node
raise ValueError(
f'Cannot load node "{invocation_type}" from node pack "{node_pack}" - a core node with the same type already exists'
)
else:
# The node being clobbered is a custom node
raise ValueError(
f'Cannot load node "{invocation_type}" from node pack "{node_pack}" - a node with the same type already exists in node pack "{clobbered_node_pack}"'
)
validate_fields(cls.model_fields, invocation_type)
fields: dict[str, tuple[Any, FieldInfo]] = {}
for field_name, field_info in cls.model_fields.items():
annotation = field_info.annotation
assert annotation is not None, f"{field_name} on invocation {invocation_type} has no type annotation."
assert isinstance(field_info.json_schema_extra, dict), (
f"{field_name} on invocation {invocation_type} has a non-dict json_schema_extra, did you forget to use InputField?"
)
cls._original_model_fields[field_name] = OriginalModelField(annotation=annotation, field_info=field_info)
validate_field_default(cls.__name__, field_name, invocation_type, annotation, field_info)
if field_info.default is None and not is_optional(annotation):
annotation = annotation | None
fields[field_name] = (annotation, field_info)
# Add OpenAPI schema extras
uiconfig: dict[str, Any] = {}
uiconfig["title"] = title
@@ -496,6 +621,8 @@ def invocation(
if use_cache is not None:
cls.model_fields["use_cache"].default = use_cache
cls.bottleneck = bottleneck
# Add the invocation type to the model.
# You'd be tempted to just add the type field and rebuild the model, like this:
@@ -505,11 +632,17 @@ def invocation(
# Unfortunately, because the `GraphInvocation` uses a forward ref in its `graph` field's annotation, this does
# not work. Instead, we have to create a new class with the type field and patch the original class with it.
invocation_type_annotation = Literal[invocation_type] # type: ignore
invocation_type_field = Field(
title="type", default=invocation_type, json_schema_extra={"field_kind": FieldKind.NodeAttribute}
invocation_type_annotation = Literal[invocation_type]
# Field() returns an instance of FieldInfo, but thanks to a pydantic implementation detail, it is _typed_ as Any.
# This cast makes the type annotation match the class's true type.
invocation_type_field_info = cast(
FieldInfo,
Field(title="type", default=invocation_type, json_schema_extra={"field_kind": FieldKind.NodeAttribute}),
)
fields["type"] = (invocation_type_annotation, invocation_type_field_info)
# Validate the `invoke()` method is implemented
if "invoke" in cls.__abstractmethods__:
raise ValueError(f'Invocation "{invocation_type}" must implement the "invoke" method')
@@ -531,18 +664,12 @@ def invocation(
)
docstring = cls.__doc__
cls = create_model(
cls.__qualname__,
__base__=cls,
__module__=cls.__module__,
type=(invocation_type_annotation, invocation_type_field),
)
cls.__doc__ = docstring
new_class = create_model(cls.__qualname__, __base__=cls, __module__=cls.__module__, **fields) # type: ignore
new_class.__doc__ = docstring
# TODO: how to type this correctly? it's typed as ModelMetaclass, a private class in pydantic
BaseInvocation.register_invocation(cls) # type: ignore
InvocationRegistry.register_invocation(new_class)
return cls
return new_class
return wrapper
@@ -565,29 +692,41 @@ def invocation_output(
if re.compile(r"^\S+$").match(output_type) is None:
raise ValueError(f'"output_type" must consist of non-whitespace characters, got "{output_type}"')
if output_type in BaseInvocationOutput.get_output_types():
raise ValueError(f'Invocation type "{output_type}" already exists')
validate_fields(cls.model_fields, output_type)
# Add the output type to the model.
fields: dict[str, tuple[Any, FieldInfo]] = {}
output_type_annotation = Literal[output_type] # type: ignore
output_type_field = Field(
title="type", default=output_type, json_schema_extra={"field_kind": FieldKind.NodeAttribute}
for field_name, field_info in cls.model_fields.items():
annotation = field_info.annotation
assert annotation is not None, f"{field_name} on invocation output {output_type} has no type annotation."
assert isinstance(field_info.json_schema_extra, dict), (
f"{field_name} on invocation output {output_type} has a non-dict json_schema_extra, did you forget to use OutputField?"
)
cls._original_model_fields[field_name] = OriginalModelField(annotation=annotation, field_info=field_info)
if field_info.default is not PydanticUndefined and is_optional(annotation):
annotation = annotation | None
fields[field_name] = (annotation, field_info)
# Add the output type to the model.
output_type_annotation = Literal[output_type]
# Field() returns an instance of FieldInfo, but thanks to a pydantic implementation detail, it is _typed_ as Any.
# This cast makes the type annotation match the class's true type.
output_type_field_info = cast(
FieldInfo,
Field(title="type", default=output_type, json_schema_extra={"field_kind": FieldKind.NodeAttribute}),
)
fields["type"] = (output_type_annotation, output_type_field_info)
docstring = cls.__doc__
cls = create_model(
cls.__qualname__,
__base__=cls,
__module__=cls.__module__,
type=(output_type_annotation, output_type_field),
)
cls.__doc__ = docstring
new_class = create_model(cls.__qualname__, __base__=cls, __module__=cls.__module__, **fields)
new_class.__doc__ = docstring
BaseInvocationOutput.register_output(cls) # type: ignore # TODO: how to type this correctly?
InvocationRegistry.register_output(new_class)
return cls
return new_class
return wrapper

@@ -64,7 +64,6 @@ class ImageBatchInvocation(BaseBatchInvocation):
"""Create a batched generation, where the workflow is executed once for each image in the batch."""
images: list[ImageField] = InputField(
default=[],
min_length=1,
description="The images to batch over",
)
@@ -120,7 +119,6 @@ class StringBatchInvocation(BaseBatchInvocation):
"""Create a batched generation, where the workflow is executed once for each string in the batch."""
strings: list[str] = InputField(
default=[],
min_length=1,
description="The strings to batch over",
)
@@ -176,7 +174,6 @@ class IntegerBatchInvocation(BaseBatchInvocation):
"""Create a batched generation, where the workflow is executed once for each integer in the batch."""
integers: list[int] = InputField(
default=[],
min_length=1,
description="The integers to batch over",
)
@@ -230,7 +227,6 @@ class FloatBatchInvocation(BaseBatchInvocation):
"""Create a batched generation, where the workflow is executed once for each float in the batch."""
floats: list[float] = InputField(
default=[],
min_length=1,
description="The floats to batch over",
)

@@ -0,0 +1,363 @@
from typing import Callable, Optional
import torch
import torchvision.transforms as tv_transforms
from diffusers.models.transformers.transformer_cogview4 import CogView4Transformer2DModel
from torchvision.transforms.functional import resize as tv_resize
from tqdm import tqdm
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import (
CogView4ConditioningField,
DenoiseMaskField,
FieldDescriptions,
Input,
InputField,
LatentsField,
WithBoard,
WithMetadata,
)
from invokeai.app.invocations.model import TransformerField
from invokeai.app.invocations.primitives import LatentsOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.flux.sampling_utils import clip_timestep_schedule_fractional
from invokeai.backend.model_manager.config import BaseModelType
from invokeai.backend.rectified_flow.rectified_flow_inpaint_extension import RectifiedFlowInpaintExtension
from invokeai.backend.stable_diffusion.diffusers_pipeline import PipelineIntermediateState
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import CogView4ConditioningInfo
from invokeai.backend.util.devices import TorchDevice
@invocation(
"cogview4_denoise",
title="Denoise - CogView4",
tags=["image", "cogview4"],
category="image",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4DenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Run the denoising process with a CogView4 model."""
# If latents is provided, this means we are doing image-to-image.
latents: Optional[LatentsField] = InputField(
default=None, description=FieldDescriptions.latents, input=Input.Connection
)
# denoise_mask is used for image-to-image inpainting. Only the masked region is modified.
denoise_mask: Optional[DenoiseMaskField] = InputField(
default=None, description=FieldDescriptions.denoise_mask, input=Input.Connection
)
denoising_start: float = InputField(default=0.0, ge=0, le=1, description=FieldDescriptions.denoising_start)
denoising_end: float = InputField(default=1.0, ge=0, le=1, description=FieldDescriptions.denoising_end)
transformer: TransformerField = InputField(
description=FieldDescriptions.cogview4_model, input=Input.Connection, title="Transformer"
)
positive_conditioning: CogView4ConditioningField = InputField(
description=FieldDescriptions.positive_cond, input=Input.Connection
)
negative_conditioning: CogView4ConditioningField = InputField(
description=FieldDescriptions.negative_cond, input=Input.Connection
)
cfg_scale: float | list[float] = InputField(default=3.5, description=FieldDescriptions.cfg_scale, title="CFG Scale")
width: int = InputField(default=1024, multiple_of=32, description="Width of the generated image.")
height: int = InputField(default=1024, multiple_of=32, description="Height of the generated image.")
steps: int = InputField(default=25, gt=0, description=FieldDescriptions.steps)
seed: int = InputField(default=0, description="Randomness seed for reproducibility.")
@torch.no_grad()
def invoke(self, context: InvocationContext) -> LatentsOutput:
latents = self._run_diffusion(context)
latents = latents.detach().to("cpu")
name = context.tensors.save(tensor=latents)
return LatentsOutput.build(latents_name=name, latents=latents, seed=None)
def _prep_inpaint_mask(self, context: InvocationContext, latents: torch.Tensor) -> torch.Tensor | None:
"""Prepare the inpaint mask.
- Loads the mask
- Resizes if necessary
- Casts to same device/dtype as latents
Args:
context (InvocationContext): The invocation context, for loading the inpaint mask.
latents (torch.Tensor): A latent image tensor. Used to determine the target shape, device, and dtype for the
inpaint mask.
Returns:
torch.Tensor | None: Inverted inpaint mask. Values of 1.0 represent the regions to be fully denoised, and 0.0
represent the regions to be preserved.
"""
if self.denoise_mask is None:
return None
mask = context.tensors.load(self.denoise_mask.mask_name)
# The input denoise_mask contains values in [0, 1], where 0.0 represents the regions to be fully denoised, and
# 1.0 represents the regions to be preserved.
# We invert the mask so that the regions to be preserved are 0.0 and the regions to be denoised are 1.0.
mask = 1.0 - mask
_, _, latent_height, latent_width = latents.shape
mask = tv_resize(
img=mask,
size=[latent_height, latent_width],
interpolation=tv_transforms.InterpolationMode.BILINEAR,
antialias=False,
)
mask = mask.to(device=latents.device, dtype=latents.dtype)
return mask
def _load_text_conditioning(
self,
context: InvocationContext,
conditioning_name: str,
dtype: torch.dtype,
device: torch.device,
) -> torch.Tensor:
# Load the conditioning data.
cond_data = context.conditioning.load(conditioning_name)
assert len(cond_data.conditionings) == 1
cogview4_conditioning = cond_data.conditionings[0]
assert isinstance(cogview4_conditioning, CogView4ConditioningInfo)
cogview4_conditioning = cogview4_conditioning.to(dtype=dtype, device=device)
return cogview4_conditioning.glm_embeds
def _get_noise(
self,
batch_size: int,
num_channels_latents: int,
height: int,
width: int,
dtype: torch.dtype,
device: torch.device,
seed: int,
) -> torch.Tensor:
# We always generate noise on the same device and dtype then cast to ensure consistency across devices/dtypes.
rand_device = "cpu"
rand_dtype = torch.float16
return torch.randn(
batch_size,
num_channels_latents,
int(height) // LATENT_SCALE_FACTOR,
int(width) // LATENT_SCALE_FACTOR,
device=rand_device,
dtype=rand_dtype,
generator=torch.Generator(device=rand_device).manual_seed(seed),
).to(device=device, dtype=dtype)
def _prepare_cfg_scale(self, num_timesteps: int) -> list[float]:
"""Prepare the CFG scale list.
Args:
num_timesteps (int): The number of timesteps in the scheduler. Could be different from num_steps depending
on the scheduler used (e.g. higher order schedulers).
Returns:
list[float]: The CFG scale to apply at each timestep.
"""
if isinstance(self.cfg_scale, float):
cfg_scale = [self.cfg_scale] * num_timesteps
elif isinstance(self.cfg_scale, list):
assert len(self.cfg_scale) == num_timesteps
cfg_scale = self.cfg_scale
else:
raise ValueError(f"Invalid CFG scale type: {type(self.cfg_scale)}")
return cfg_scale
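As a rough illustration of how the per-step CFG list is built and then consumed in the denoising loop, here is a standalone sketch on plain Python lists instead of tensors. `prepare_cfg_scale` and `apply_cfg` are illustrative names. Unlike the method above, this sketch also accepts an `int` scalar and raises `ValueError` on a length mismatch instead of asserting; both are assumptions made for the example.

```python
from typing import List, Union


def prepare_cfg_scale(cfg_scale: Union[float, List[float]], num_timesteps: int) -> List[float]:
    """Expand a scalar CFG scale to one value per step, or validate a per-step list."""
    if isinstance(cfg_scale, (int, float)):
        return [float(cfg_scale)] * num_timesteps
    if len(cfg_scale) != num_timesteps:
        raise ValueError(f"Expected {num_timesteps} CFG values, got {len(cfg_scale)}")
    return list(cfg_scale)


def apply_cfg(cond: List[float], uncond: List[float], scale: float) -> List[float]:
    """Classifier-free guidance: start at the unconditional prediction and move toward the conditional one."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]
```

With `scale == 1.0` the result equals the conditional prediction; larger values extrapolate past it, away from the unconditional prediction.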
def _convert_timesteps_to_sigmas(self, image_seq_len: int, timesteps: torch.Tensor) -> list[float]:
# The logic to prepare the timestep / sigma schedule is based on:
# https://github.com/huggingface/diffusers/blob/b38450d5d2e5b87d5ff7088ee5798c85587b9635/src/diffusers/pipelines/cogview4/pipeline_cogview4.py#L575-L595
# The default FlowMatchEulerDiscreteScheduler configs are based on:
# https://huggingface.co/THUDM/CogView4-6B/blob/fb6f57289c73ac6d139e8d81bd5a4602d1877847/scheduler/scheduler_config.json
# This implementation differs slightly from the original for the sake of simplicity (differs in terminal value
# handling, not quantizing timesteps to integers, etc.).
def calculate_timestep_shift(
image_seq_len: int, base_seq_len: int = 256, base_shift: float = 0.25, max_shift: float = 0.75
) -> float:
m = (image_seq_len / base_seq_len) ** 0.5
mu = m * max_shift + base_shift
return mu
def time_shift_linear(mu: float, sigma: float, t: torch.Tensor) -> torch.Tensor:
return mu / (mu + (1 / t - 1) ** sigma)
mu = calculate_timestep_shift(image_seq_len)
sigmas = time_shift_linear(mu, 1.0, timesteps)
return sigmas.tolist()
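The shift computation above can be reproduced without torch to check the numbers by hand. The sketch below is a scalar re-implementation under one stated assumption: `t == 0` is mapped to `0.0` explicitly, because plain Python raises `ZeroDivisionError` where tensor arithmetic would yield `inf` and then `0`. The function name `timesteps_to_sigmas` is illustrative.

```python
from typing import List


def calculate_timestep_shift(
    image_seq_len: int, base_seq_len: int = 256, base_shift: float = 0.25, max_shift: float = 0.75
) -> float:
    """Shift parameter mu, growing with the square root of the relative sequence length."""
    m = (image_seq_len / base_seq_len) ** 0.5
    return m * max_shift + base_shift


def time_shift_linear(mu: float, sigma: float, t: float) -> float:
    """Map a timestep t in [0, 1] to a shifted sigma value."""
    if t <= 0.0:
        # Assumption for this scalar sketch: use the limit value as t approaches 0.
        return 0.0
    return mu / (mu + (1 / t - 1) ** sigma)


def timesteps_to_sigmas(image_seq_len: int, timesteps: List[float]) -> List[float]:
    mu = calculate_timestep_shift(image_seq_len)
    return [time_shift_linear(mu, 1.0, t) for t in timesteps]
```

For example, an `image_seq_len` of 4096 gives `m = 4` and `mu = 3.25`, so `t = 0.5` maps to `3.25 / 4.25`, roughly 0.765. The schedule therefore spends more of its range at high noise levels than the unshifted linear schedule does.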
def _run_diffusion(
self,
context: InvocationContext,
):
inference_dtype = torch.bfloat16
device = TorchDevice.choose_torch_device()
transformer_info = context.models.load(self.transformer.transformer)
assert isinstance(transformer_info.model, CogView4Transformer2DModel)
# Load/process the conditioning data.
# TODO(ryand): Make CFG optional.
do_classifier_free_guidance = True
pos_prompt_embeds = self._load_text_conditioning(
context=context,
conditioning_name=self.positive_conditioning.conditioning_name,
dtype=inference_dtype,
device=device,
)
neg_prompt_embeds = self._load_text_conditioning(
context=context,
conditioning_name=self.negative_conditioning.conditioning_name,
dtype=inference_dtype,
device=device,
)
# Prepare misc. conditioning variables.
# TODO(ryand): We could expose these as params (like with SDXL). But, we should experiment to see if they are
# useful first.
original_size = torch.tensor([(self.height, self.width)], dtype=pos_prompt_embeds.dtype, device=device)
target_size = torch.tensor([(self.height, self.width)], dtype=pos_prompt_embeds.dtype, device=device)
crops_coords_top_left = torch.tensor([(0, 0)], dtype=pos_prompt_embeds.dtype, device=device)
# Prepare the timestep / sigma schedule.
patch_size = transformer_info.model.config.patch_size # type: ignore
assert isinstance(patch_size, int)
image_seq_len = ((self.height // LATENT_SCALE_FACTOR) * (self.width // LATENT_SCALE_FACTOR)) // (patch_size**2)
# We add an extra step to the end to account for the final timestep of 0.0.
timesteps: list[float] = torch.linspace(1, 0, self.steps + 1).tolist()
# Clip the timesteps schedule based on denoising_start and denoising_end.
timesteps = clip_timestep_schedule_fractional(timesteps, self.denoising_start, self.denoising_end)
sigmas = self._convert_timesteps_to_sigmas(image_seq_len, torch.tensor(timesteps))
total_steps = len(timesteps) - 1
# Prepare the CFG scale list.
cfg_scale = self._prepare_cfg_scale(total_steps)
# Load the input latents, if provided.
init_latents = context.tensors.load(self.latents.latents_name) if self.latents else None
if init_latents is not None:
init_latents = init_latents.to(device=device, dtype=inference_dtype)
# Generate initial latent noise.
num_channels_latents = transformer_info.model.config.in_channels # type: ignore
assert isinstance(num_channels_latents, int)
noise = self._get_noise(
batch_size=1,
num_channels_latents=num_channels_latents,
height=self.height,
width=self.width,
dtype=inference_dtype,
device=device,
seed=self.seed,
)
# Prepare input latent image.
if init_latents is not None:
# Noise the init_latents by the appropriate amount for the first timestep.
s_0 = sigmas[0]
latents = s_0 * noise + (1.0 - s_0) * init_latents
else:
# init_latents are not provided, so we are not doing image-to-image (i.e. we are starting from pure noise).
if self.denoising_start > 1e-5:
raise ValueError("denoising_start should be 0 when initial latents are not provided.")
latents = noise
# If len(timesteps) == 1, then short-circuit. We are just noising the input latents, but not taking any
# denoising steps.
if len(timesteps) <= 1:
return latents
# Prepare inpaint extension.
inpaint_mask = self._prep_inpaint_mask(context, latents)
inpaint_extension: RectifiedFlowInpaintExtension | None = None
if inpaint_mask is not None:
assert init_latents is not None
inpaint_extension = RectifiedFlowInpaintExtension(
init_latents=init_latents,
inpaint_mask=inpaint_mask,
noise=noise,
)
step_callback = self._build_step_callback(context)
step_callback(
PipelineIntermediateState(
step=0,
order=1,
total_steps=total_steps,
timestep=int(timesteps[0]),
latents=latents,
),
)
with transformer_info.model_on_device() as (_, transformer):
assert isinstance(transformer, CogView4Transformer2DModel)
# Denoising loop
for step_idx in tqdm(range(total_steps)):
t_curr = timesteps[step_idx]
sigma_curr = sigmas[step_idx]
sigma_prev = sigmas[step_idx + 1]
# Expand the timestep to match the latent model input.
# Multiply by 1000 to match the default FlowMatchEulerDiscreteScheduler num_train_timesteps.
timestep = torch.tensor([t_curr * 1000], device=device).expand(latents.shape[0])
# TODO(ryand): Support both sequential and batched CFG inference.
noise_pred_cond = transformer(
hidden_states=latents,
encoder_hidden_states=pos_prompt_embeds,
timestep=timestep,
original_size=original_size,
target_size=target_size,
crop_coords=crops_coords_top_left,
return_dict=False,
)[0]
# Apply CFG.
if do_classifier_free_guidance:
noise_pred_uncond = transformer(
hidden_states=latents,
encoder_hidden_states=neg_prompt_embeds,
timestep=timestep,
original_size=original_size,
target_size=target_size,
crop_coords=crops_coords_top_left,
return_dict=False,
)[0]
noise_pred = noise_pred_uncond + cfg_scale[step_idx] * (noise_pred_cond - noise_pred_uncond)
else:
noise_pred = noise_pred_cond
# Compute the previous noisy sample x_t -> x_t-1.
latents_dtype = latents.dtype
# TODO(ryand): Is casting to float32 necessary for precision/stability? I copied this from SD3.
latents = latents.to(dtype=torch.float32)
latents = latents + (sigma_prev - sigma_curr) * noise_pred
latents = latents.to(dtype=latents_dtype)
if inpaint_extension is not None:
latents = inpaint_extension.merge_intermediate_latents_with_init_latents(latents, sigma_prev)
step_callback(
PipelineIntermediateState(
step=step_idx + 1,
order=1,
total_steps=total_steps,
timestep=int(t_curr),
latents=latents,
),
)
return latents
def _build_step_callback(self, context: InvocationContext) -> Callable[[PipelineIntermediateState], None]:
def step_callback(state: PipelineIntermediateState) -> None:
context.util.sd_step_callback(state, BaseModelType.CogView4)
return step_callback
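The image-to-image noising and the Euler update in `_run_diffusion` can be sanity-checked with a toy example on plain lists. The sketch assumes the usual rectified-flow convention that the model predicts the velocity `noise - clean`; that convention is an assumption of this example, not something stated in the diff. `noise_init_latents`, `euler_denoise`, and `velocity_fn` are hypothetical names.

```python
from typing import Callable, List


def noise_init_latents(init: List[float], noise: List[float], sigma: float) -> List[float]:
    """Blend clean latents with noise for the first sigma: s * noise + (1 - s) * init."""
    return [sigma * n + (1.0 - sigma) * x for x, n in zip(init, noise)]


def euler_denoise(
    latents: List[float],
    sigmas: List[float],
    velocity_fn: Callable[[List[float], float], List[float]],
) -> List[float]:
    """Take one Euler step per consecutive sigma pair: x <- x + (sigma_prev - sigma_curr) * v."""
    for sigma_curr, sigma_prev in zip(sigmas[:-1], sigmas[1:]):
        v = velocity_fn(latents, sigma_curr)
        latents = [x + (sigma_prev - sigma_curr) * vi for x, vi in zip(latents, v)]
    return latents


# Toy check: with an ideal velocity prediction, stepping down to sigma = 0 recovers the clean latents.
clean = [0.2, -0.4]
noise = [1.0, 0.5]
start = noise_init_latents(clean, noise, sigma=0.6)
result = euler_denoise(start, [0.6, 0.3, 0.0], lambda x, s: [n - c for n, c in zip(noise, clean)])
```

Because `sigma_prev` is smaller than `sigma_curr`, each step moves the latents against the predicted velocity, that is, away from the noise and toward the clean image.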

@@ -0,0 +1,69 @@
import einops
import torch
from diffusers.models.autoencoders.autoencoder_kl import AutoencoderKL
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.fields import (
FieldDescriptions,
ImageField,
Input,
InputField,
WithBoard,
WithMetadata,
)
from invokeai.app.invocations.model import VAEField
from invokeai.app.invocations.primitives import LatentsOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager.load.load_base import LoadedModel
from invokeai.backend.stable_diffusion.diffusers_pipeline import image_resized_to_grid_as_tensor
from invokeai.backend.util.devices import TorchDevice
# TODO(ryand): This is effectively a copy of SD3ImageToLatentsInvocation and a subset of ImageToLatentsInvocation. We
# should refactor to avoid this duplication.
@invocation(
"cogview4_i2l",
title="Image to Latents - CogView4",
tags=["image", "latents", "vae", "i2l", "cogview4"],
category="image",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4ImageToLatentsInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Generates latents from an image."""
image: ImageField = InputField(description="The image to encode.")
vae: VAEField = InputField(description=FieldDescriptions.vae, input=Input.Connection)
@staticmethod
def vae_encode(vae_info: LoadedModel, image_tensor: torch.Tensor) -> torch.Tensor:
with vae_info as vae:
assert isinstance(vae, AutoencoderKL)
vae.disable_tiling()
image_tensor = image_tensor.to(device=TorchDevice.choose_torch_device(), dtype=vae.dtype)
with torch.inference_mode():
image_tensor_dist = vae.encode(image_tensor).latent_dist
# TODO: Use seed to make sampling reproducible.
latents: torch.Tensor = image_tensor_dist.sample().to(dtype=vae.dtype)
latents = vae.config.scaling_factor * latents
return latents
@torch.no_grad()
def invoke(self, context: InvocationContext) -> LatentsOutput:
image = context.images.get_pil(self.image.image_name)
image_tensor = image_resized_to_grid_as_tensor(image.convert("RGB"))
if image_tensor.dim() == 3:
image_tensor = einops.rearrange(image_tensor, "c h w -> 1 c h w")
vae_info = context.models.load(self.vae.vae)
latents = self.vae_encode(vae_info=vae_info, image_tensor=image_tensor)
latents = latents.to("cpu")
name = context.tensors.save(tensor=latents)
return LatentsOutput.build(latents_name=name, latents=latents, seed=None)

@@ -0,0 +1,86 @@
from contextlib import nullcontext
import torch
from diffusers.models.autoencoders.autoencoder_kl import AutoencoderKL
from einops import rearrange
from PIL import Image
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import (
FieldDescriptions,
Input,
InputField,
LatentsField,
WithBoard,
WithMetadata,
)
from invokeai.app.invocations.model import VAEField
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.stable_diffusion.extensions.seamless import SeamlessExt
from invokeai.backend.util.devices import TorchDevice
# TODO(ryand): This is effectively a copy of SD3LatentsToImageInvocation and a subset of LatentsToImageInvocation. We
# should refactor to avoid this duplication.
@invocation(
"cogview4_l2i",
title="Latents to Image - CogView4",
tags=["latents", "image", "vae", "l2i", "cogview4"],
category="latents",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4LatentsToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Generates an image from latents."""
latents: LatentsField = InputField(description=FieldDescriptions.latents, input=Input.Connection)
vae: VAEField = InputField(description=FieldDescriptions.vae, input=Input.Connection)
def _estimate_working_memory(self, latents: torch.Tensor, vae: AutoencoderKL) -> int:
"""Estimate the working memory required by the invocation in bytes."""
out_h = LATENT_SCALE_FACTOR * latents.shape[-2]
out_w = LATENT_SCALE_FACTOR * latents.shape[-1]
element_size = next(vae.parameters()).element_size()
scaling_constant = 2200 # Determined experimentally.
working_memory = out_h * out_w * element_size * scaling_constant
return int(working_memory)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> ImageOutput:
latents = context.tensors.load(self.latents.latents_name)
vae_info = context.models.load(self.vae.vae)
assert isinstance(vae_info.model, (AutoencoderKL))
estimated_working_memory = self._estimate_working_memory(latents, vae_info.model)
with (
SeamlessExt.static_patch_model(vae_info.model, self.vae.seamless_axes),
vae_info.model_on_device(working_mem_bytes=estimated_working_memory) as (_, vae),
):
context.util.signal_progress("Running VAE")
assert isinstance(vae, (AutoencoderKL))
latents = latents.to(TorchDevice.choose_torch_device())
vae.disable_tiling()
tiling_context = nullcontext()
# clear memory as vae decode can request a lot
TorchDevice.empty_cache()
with torch.inference_mode(), tiling_context:
# copied from diffusers pipeline
latents = latents / vae.config.scaling_factor
img = vae.decode(latents, return_dict=False)[0]
img = img.clamp(-1, 1)
img = rearrange(img[0], "c h w -> h w c") # noqa: F821
img_pil = Image.fromarray((127.5 * (img + 1.0)).byte().cpu().numpy())
TorchDevice.empty_cache()
image_dto = context.images.save(image=img_pil)
return ImageOutput.build(image_dto)
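Two pieces of arithmetic in this node are easy to verify by hand: the working-memory estimate and the mapping from decoder output in [-1, 1] to 8-bit pixel values. The sketch below redoes both with plain Python numbers. `LATENT_SCALE_FACTOR = 8` is an assumption, since its value is not shown in this diff, and `to_uint8` uses `int()` truncation on the assumption that it mirrors the float-to-byte cast.

```python
LATENT_SCALE_FACTOR = 8  # Assumption: the constant's value is not shown in this diff.


def estimate_working_memory(latent_h: int, latent_w: int, element_size: int, scaling_constant: int = 2200) -> int:
    """Estimate decode working memory in bytes from the output image size and dtype width."""
    out_h = LATENT_SCALE_FACTOR * latent_h
    out_w = LATENT_SCALE_FACTOR * latent_w
    return int(out_h * out_w * element_size * scaling_constant)


def to_uint8(value: float) -> int:
    """Clamp a decoder output value to [-1, 1] and map it to the range 0..255."""
    clamped = max(-1.0, min(1.0, value))
    return int(127.5 * (clamped + 1.0))
```

Under these assumptions, a 128x128 latent decoded with a 2-byte dtype is estimated at 1024 * 1024 * 2 * 2200 = 4,613,734,400 bytes, about 4.3 GiB.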

@@ -0,0 +1,55 @@
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, OutputField, UIType
from invokeai.app.invocations.model import (
GlmEncoderField,
ModelIdentifierField,
TransformerField,
VAEField,
)
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.model_manager.config import SubModelType
@invocation_output("cogview4_model_loader_output")
class CogView4ModelLoaderOutput(BaseInvocationOutput):
"""CogView4 base model loader output."""
transformer: TransformerField = OutputField(description=FieldDescriptions.transformer, title="Transformer")
glm_encoder: GlmEncoderField = OutputField(description=FieldDescriptions.glm_encoder, title="GLM Encoder")
vae: VAEField = OutputField(description=FieldDescriptions.vae, title="VAE")
@invocation(
"cogview4_model_loader",
title="Main Model - CogView4",
tags=["model", "cogview4"],
category="model",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4ModelLoaderInvocation(BaseInvocation):
"""Loads a CogView4 base model, outputting its submodels."""
model: ModelIdentifierField = InputField(
description=FieldDescriptions.cogview4_model,
ui_type=UIType.CogView4MainModel,
input=Input.Direct,
)
def invoke(self, context: InvocationContext) -> CogView4ModelLoaderOutput:
transformer = self.model.model_copy(update={"submodel_type": SubModelType.Transformer})
vae = self.model.model_copy(update={"submodel_type": SubModelType.VAE})
glm_tokenizer = self.model.model_copy(update={"submodel_type": SubModelType.Tokenizer})
glm_encoder = self.model.model_copy(update={"submodel_type": SubModelType.TextEncoder})
return CogView4ModelLoaderOutput(
transformer=TransformerField(transformer=transformer, loras=[]),
glm_encoder=GlmEncoderField(tokenizer=glm_tokenizer, text_encoder=glm_encoder),
vae=VAEField(vae=vae),
)

@@ -0,0 +1,92 @@
import torch
from transformers import GlmModel, PreTrainedTokenizerFast
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.fields import FieldDescriptions, Input, InputField, UIComponent
from invokeai.app.invocations.model import GlmEncoderField
from invokeai.app.invocations.primitives import CogView4ConditioningOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import (
CogView4ConditioningInfo,
ConditioningFieldData,
)
from invokeai.backend.util.devices import TorchDevice
# The CogView4 GLM text encoder max sequence length, set to match the default in diffusers.
COGVIEW4_GLM_MAX_SEQ_LEN = 1024
@invocation(
"cogview4_text_encoder",
title="Prompt - CogView4",
tags=["prompt", "conditioning", "cogview4"],
category="conditioning",
version="1.0.0",
classification=Classification.Prototype,
)
class CogView4TextEncoderInvocation(BaseInvocation):
"""Encodes and prepares a prompt for CogView4 image generation."""
prompt: str = InputField(description="Text prompt to encode.", ui_component=UIComponent.Textarea)
glm_encoder: GlmEncoderField = InputField(
title="GLM Encoder",
description=FieldDescriptions.glm_encoder,
input=Input.Connection,
)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> CogView4ConditioningOutput:
glm_embeds = self._glm_encode(context, max_seq_len=COGVIEW4_GLM_MAX_SEQ_LEN)
conditioning_data = ConditioningFieldData(conditionings=[CogView4ConditioningInfo(glm_embeds=glm_embeds)])
conditioning_name = context.conditioning.save(conditioning_data)
return CogView4ConditioningOutput.build(conditioning_name)
def _glm_encode(self, context: InvocationContext, max_seq_len: int) -> torch.Tensor:
prompt = [self.prompt]
# TODO(ryand): Add model inputs to the invocation rather than hard-coding.
with (
context.models.load(self.glm_encoder.text_encoder).model_on_device() as (_, glm_text_encoder),
context.models.load(self.glm_encoder.tokenizer).model_on_device() as (_, glm_tokenizer),
):
context.util.signal_progress("Running GLM text encoder")
assert isinstance(glm_text_encoder, GlmModel)
assert isinstance(glm_tokenizer, PreTrainedTokenizerFast)
text_inputs = glm_tokenizer(
prompt,
padding="longest",
max_length=max_seq_len,
truncation=True,
add_special_tokens=True,
return_tensors="pt",
)
text_input_ids = text_inputs.input_ids
untruncated_ids = glm_tokenizer(prompt, padding="longest", return_tensors="pt").input_ids
assert isinstance(text_input_ids, torch.Tensor)
assert isinstance(untruncated_ids, torch.Tensor)
if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not torch.equal(
text_input_ids, untruncated_ids
):
removed_text = glm_tokenizer.batch_decode(untruncated_ids[:, max_seq_len - 1 : -1])
context.logger.warning(
"The following part of your input was truncated because `max_seq_len` is set to "
f"{max_seq_len} tokens: {removed_text}"
)
current_length = text_input_ids.shape[1]
pad_length = (16 - (current_length % 16)) % 16
if pad_length > 0:
pad_ids = torch.full(
(text_input_ids.shape[0], pad_length),
fill_value=glm_tokenizer.pad_token_id,
dtype=text_input_ids.dtype,
device=text_input_ids.device,
)
text_input_ids = torch.cat([pad_ids, text_input_ids], dim=1)
prompt_embeds = glm_text_encoder(
text_input_ids.to(TorchDevice.choose_torch_device()), output_hidden_states=True
).hidden_states[-2]
assert isinstance(prompt_embeds, torch.Tensor)
return prompt_embeds
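The padding step in `_glm_encode` rounds the token sequence up to the next multiple of 16 and places the pad tokens on the left. A small list-based sketch of the same arithmetic follows; `left_pad_to_multiple` is an illustrative name, and plain lists stand in for the token tensor.

```python
from typing import List


def left_pad_to_multiple(token_ids: List[int], pad_token_id: int, multiple: int = 16) -> List[int]:
    """Left-pad token_ids so that its length is a multiple of `multiple`."""
    # The outer modulo makes pad_length 0 when the length is already aligned.
    pad_length = (multiple - (len(token_ids) % multiple)) % multiple
    return [pad_token_id] * pad_length + token_ids
```

A 5-token prompt gains 11 pad tokens in front, a 16-token prompt is returned unchanged, and a 17-token prompt grows to 32.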

@@ -0,0 +1,128 @@
# Invocations for ControlNet image preprocessors
# initial implementation by Gregg Helt, 2023
from typing import List, Union
from pydantic import BaseModel, Field, field_validator, model_validator
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import (
FieldDescriptions,
ImageField,
InputField,
OutputField,
UIType,
)
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.invocations.util import validate_begin_end_step, validate_weights
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.controlnet_utils import CONTROLNET_MODE_VALUES, CONTROLNET_RESIZE_VALUES, heuristic_resize
from invokeai.backend.image_util.util import np_to_pil, pil_to_np
class ControlField(BaseModel):
image: ImageField = Field(description="The control image")
control_model: ModelIdentifierField = Field(description="The ControlNet model to use")
control_weight: Union[float, List[float]] = Field(default=1, description="The weight given to the ControlNet")
begin_step_percent: float = Field(
default=0, ge=0, le=1, description="When the ControlNet is first applied (% of total steps)"
)
end_step_percent: float = Field(
default=1, ge=0, le=1, description="When the ControlNet is last applied (% of total steps)"
)
control_mode: CONTROLNET_MODE_VALUES = Field(default="balanced", description="The control mode to use")
resize_mode: CONTROLNET_RESIZE_VALUES = Field(default="just_resize", description="The resize mode to use")
@field_validator("control_weight")
@classmethod
def validate_control_weight(cls, v):
validate_weights(v)
return v
@model_validator(mode="after")
def validate_begin_end_step_percent(self):
validate_begin_end_step(self.begin_step_percent, self.end_step_percent)
return self
@invocation_output("control_output")
class ControlOutput(BaseInvocationOutput):
"""node output for ControlNet info"""
# Outputs
control: ControlField = OutputField(description=FieldDescriptions.control)
@invocation("controlnet", title="ControlNet - SD1.5, SDXL", tags=["controlnet"], category="controlnet", version="1.1.3")
class ControlNetInvocation(BaseInvocation):
"""Collects ControlNet info to pass to other nodes"""
image: ImageField = InputField(description="The control image")
control_model: ModelIdentifierField = InputField(
description=FieldDescriptions.controlnet_model, ui_type=UIType.ControlNetModel
)
control_weight: Union[float, List[float]] = InputField(
default=1.0, ge=-1, le=2, description="The weight given to the ControlNet"
)
begin_step_percent: float = InputField(
default=0, ge=0, le=1, description="When the ControlNet is first applied (% of total steps)"
)
end_step_percent: float = InputField(
default=1, ge=0, le=1, description="When the ControlNet is last applied (% of total steps)"
)
control_mode: CONTROLNET_MODE_VALUES = InputField(default="balanced", description="The control mode used")
resize_mode: CONTROLNET_RESIZE_VALUES = InputField(default="just_resize", description="The resize mode used")
@field_validator("control_weight")
@classmethod
def validate_control_weight(cls, v):
validate_weights(v)
return v
@model_validator(mode="after")
def validate_begin_end_step_percent(self) -> "ControlNetInvocation":
validate_begin_end_step(self.begin_step_percent, self.end_step_percent)
return self
def invoke(self, context: InvocationContext) -> ControlOutput:
return ControlOutput(
control=ControlField(
image=self.image,
control_model=self.control_model,
control_weight=self.control_weight,
begin_step_percent=self.begin_step_percent,
end_step_percent=self.end_step_percent,
control_mode=self.control_mode,
resize_mode=self.resize_mode,
),
)
@invocation(
"heuristic_resize",
title="Heuristic Resize",
tags=["image", "controlnet"],
category="image",
version="1.0.1",
classification=Classification.Prototype,
)
class HeuristicResizeInvocation(BaseInvocation):
"""Resize an image using a heuristic method. Preserves edge maps."""
image: ImageField = InputField(description="The image to resize")
width: int = InputField(default=512, ge=1, description="The width to resize to (px)")
height: int = InputField(default=512, ge=1, description="The height to resize to (px)")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name, "RGB")
np_img = pil_to_np(image)
np_resized = heuristic_resize(np_img, (self.width, self.height))
resized = np_to_pil(np_resized)
image_dto = context.images.save(image=resized)
return ImageOutput.build(image_dto)

@@ -1,716 +0,0 @@
# Invocations for ControlNet image preprocessors
# initial implementation by Gregg Helt, 2023
# heavily leverages controlnet_aux package: https://github.com/patrickvonplaten/controlnet_aux
from builtins import bool, float
from pathlib import Path
from typing import Dict, List, Literal, Union
import cv2
import numpy as np
from controlnet_aux import (
ContentShuffleDetector,
LeresDetector,
MediapipeFaceDetector,
MidasDetector,
MLSDdetector,
NormalBaeDetector,
PidiNetDetector,
SamDetector,
ZoeDetector,
)
from controlnet_aux.util import HWC3, ade_palette
from PIL import Image
from pydantic import BaseModel, Field, field_validator, model_validator
from transformers import pipeline
from transformers.pipelines import DepthEstimationPipeline
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
Classification,
invocation,
invocation_output,
)
from invokeai.app.invocations.fields import (
FieldDescriptions,
ImageField,
InputField,
OutputField,
UIType,
WithBoard,
WithMetadata,
)
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.invocations.util import validate_begin_end_step, validate_weights
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.controlnet_utils import CONTROLNET_MODE_VALUES, CONTROLNET_RESIZE_VALUES, heuristic_resize
from invokeai.backend.image_util.canny import get_canny_edges
from invokeai.backend.image_util.depth_anything.depth_anything_pipeline import DepthAnythingPipeline
from invokeai.backend.image_util.dw_openpose import DWPOSE_MODELS, DWOpenposeDetector
from invokeai.backend.image_util.hed import HEDProcessor
from invokeai.backend.image_util.lineart import LineartProcessor
from invokeai.backend.image_util.lineart_anime import LineartAnimeProcessor
from invokeai.backend.image_util.util import np_to_pil, pil_to_np
class ControlField(BaseModel):
image: ImageField = Field(description="The control image")
control_model: ModelIdentifierField = Field(description="The ControlNet model to use")
control_weight: Union[float, List[float]] = Field(default=1, description="The weight given to the ControlNet")
begin_step_percent: float = Field(
default=0, ge=0, le=1, description="When the ControlNet is first applied (% of total steps)"
)
end_step_percent: float = Field(
default=1, ge=0, le=1, description="When the ControlNet is last applied (% of total steps)"
)
control_mode: CONTROLNET_MODE_VALUES = Field(default="balanced", description="The control mode to use")
resize_mode: CONTROLNET_RESIZE_VALUES = Field(default="just_resize", description="The resize mode to use")
@field_validator("control_weight")
@classmethod
def validate_control_weight(cls, v):
validate_weights(v)
return v
@model_validator(mode="after")
def validate_begin_end_step_percent(self):
validate_begin_end_step(self.begin_step_percent, self.end_step_percent)
return self
@invocation_output("control_output")
class ControlOutput(BaseInvocationOutput):
"""node output for ControlNet info"""
# Outputs
control: ControlField = OutputField(description=FieldDescriptions.control)
@invocation("controlnet", title="ControlNet - SD1.5, SDXL", tags=["controlnet"], category="controlnet", version="1.1.3")
class ControlNetInvocation(BaseInvocation):
"""Collects ControlNet info to pass to other nodes"""
image: ImageField = InputField(description="The control image")
control_model: ModelIdentifierField = InputField(
description=FieldDescriptions.controlnet_model, ui_type=UIType.ControlNetModel
)
control_weight: Union[float, List[float]] = InputField(
default=1.0, ge=-1, le=2, description="The weight given to the ControlNet"
)
begin_step_percent: float = InputField(
default=0, ge=0, le=1, description="When the ControlNet is first applied (% of total steps)"
)
end_step_percent: float = InputField(
default=1, ge=0, le=1, description="When the ControlNet is last applied (% of total steps)"
)
control_mode: CONTROLNET_MODE_VALUES = InputField(default="balanced", description="The control mode used")
resize_mode: CONTROLNET_RESIZE_VALUES = InputField(default="just_resize", description="The resize mode used")
@field_validator("control_weight")
@classmethod
def validate_control_weight(cls, v):
validate_weights(v)
return v
@model_validator(mode="after")
def validate_begin_end_step_percent(self) -> "ControlNetInvocation":
validate_begin_end_step(self.begin_step_percent, self.end_step_percent)
return self
def invoke(self, context: InvocationContext) -> ControlOutput:
return ControlOutput(
control=ControlField(
image=self.image,
control_model=self.control_model,
control_weight=self.control_weight,
begin_step_percent=self.begin_step_percent,
end_step_percent=self.end_step_percent,
control_mode=self.control_mode,
resize_mode=self.resize_mode,
),
)
# This invocation exists for other invocations to subclass it - do not register with @invocation!
class ImageProcessorInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Base class for invocations that preprocess images for ControlNet"""
image: ImageField = InputField(description="The image to process")
def run_processor(self, image: Image.Image) -> Image.Image:
# superclass just passes through image without processing
return image
def load_image(self, context: InvocationContext) -> Image.Image:
# allows override for any special formatting specific to the preprocessor
return context.images.get_pil(self.image.image_name, "RGB")
def invoke(self, context: InvocationContext) -> ImageOutput:
self._context = context
raw_image = self.load_image(context)
# image type should be PIL.PngImagePlugin.PngImageFile ?
processed_image = self.run_processor(raw_image)
# currently can't see processed image in node UI without a showImage node,
# so for now setting image_type to RESULT instead of INTERMEDIATE so will get saved in gallery
image_dto = context.images.save(image=processed_image)
# Build an ImageOutput and its ImageField
processed_image_field = ImageField(image_name=image_dto.image_name)
return ImageOutput(
image=processed_image_field,
# width=processed_image.width,
width=image_dto.width,
# height=processed_image.height,
height=image_dto.height,
# mode=processed_image.mode,
)
@invocation(
"canny_image_processor",
title="Canny Processor",
tags=["controlnet", "canny"],
category="controlnet",
version="1.3.3",
classification=Classification.Deprecated,
)
class CannyImageProcessorInvocation(ImageProcessorInvocation):
"""Canny edge detection for ControlNet"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
low_threshold: int = InputField(
default=100, ge=0, le=255, description="The low threshold of the Canny pixel gradient (0-255)"
)
high_threshold: int = InputField(
default=200, ge=0, le=255, description="The high threshold of the Canny pixel gradient (0-255)"
)
def load_image(self, context: InvocationContext) -> Image.Image:
# Keep alpha channel for Canny processing to detect edges of transparent areas
return context.images.get_pil(self.image.image_name, "RGBA")
def run_processor(self, image: Image.Image) -> Image.Image:
processed_image = get_canny_edges(
image,
self.low_threshold,
self.high_threshold,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
)
return processed_image
@invocation(
"hed_image_processor",
title="HED (softedge) Processor",
tags=["controlnet", "hed", "softedge"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class HedImageProcessorInvocation(ImageProcessorInvocation):
"""Applies HED edge detection to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
# safe not supported in controlnet_aux v0.0.3
# safe: bool = InputField(default=False, description=FieldDescriptions.safe_mode)
scribble: bool = InputField(default=False, description=FieldDescriptions.scribble_mode)
def run_processor(self, image: Image.Image) -> Image.Image:
hed_processor = HEDProcessor()
processed_image = hed_processor.run(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
# safe not supported in controlnet_aux v0.0.3
# safe=self.safe,
scribble=self.scribble,
)
return processed_image
@invocation(
"lineart_image_processor",
title="Lineart Processor",
tags=["controlnet", "lineart"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class LineartImageProcessorInvocation(ImageProcessorInvocation):
"""Applies line art processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
coarse: bool = InputField(default=False, description="Whether to use coarse mode")
def run_processor(self, image: Image.Image) -> Image.Image:
lineart_processor = LineartProcessor()
processed_image = lineart_processor.run(
image, detect_resolution=self.detect_resolution, image_resolution=self.image_resolution, coarse=self.coarse
)
return processed_image
@invocation(
"lineart_anime_image_processor",
title="Lineart Anime Processor",
tags=["controlnet", "lineart", "anime"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class LineartAnimeImageProcessorInvocation(ImageProcessorInvocation):
"""Applies line art anime processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
processor = LineartAnimeProcessor()
processed_image = processor.run(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
)
return processed_image
@invocation(
"midas_depth_image_processor",
title="Midas Depth Processor",
tags=["controlnet", "midas"],
category="controlnet",
version="1.2.4",
classification=Classification.Deprecated,
)
class MidasDepthImageProcessorInvocation(ImageProcessorInvocation):
"""Applies Midas depth processing to image"""
a_mult: float = InputField(default=2.0, ge=0, description="Midas parameter `a_mult` (a = a_mult * PI)")
bg_th: float = InputField(default=0.1, ge=0, description="Midas parameter `bg_th`")
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
# depth_and_normal not supported in controlnet_aux v0.0.3
# depth_and_normal: bool = InputField(default=False, description="whether to use depth and normal mode")
def run_processor(self, image: Image.Image) -> Image.Image:
# TODO: replace from_pretrained() calls with context.models.download_and_cache() (or similar)
midas_processor = MidasDetector.from_pretrained("lllyasviel/Annotators")
processed_image = midas_processor(
image,
a=np.pi * self.a_mult,
bg_th=self.bg_th,
image_resolution=self.image_resolution,
detect_resolution=self.detect_resolution,
# depth_and_normal not supported in controlnet_aux v0.0.3
# depth_and_normal=self.depth_and_normal,
)
return processed_image
@invocation(
"normalbae_image_processor",
title="Normal BAE Processor",
tags=["controlnet"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class NormalbaeImageProcessorInvocation(ImageProcessorInvocation):
"""Applies NormalBae processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
normalbae_processor = NormalBaeDetector.from_pretrained("lllyasviel/Annotators")
processed_image = normalbae_processor(
image, detect_resolution=self.detect_resolution, image_resolution=self.image_resolution
)
return processed_image
@invocation(
"mlsd_image_processor",
title="MLSD Processor",
tags=["controlnet", "mlsd"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class MlsdImageProcessorInvocation(ImageProcessorInvocation):
"""Applies MLSD processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
thr_v: float = InputField(default=0.1, ge=0, description="MLSD parameter `thr_v`")
thr_d: float = InputField(default=0.1, ge=0, description="MLSD parameter `thr_d`")
def run_processor(self, image: Image.Image) -> Image.Image:
mlsd_processor = MLSDdetector.from_pretrained("lllyasviel/Annotators")
processed_image = mlsd_processor(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
thr_v=self.thr_v,
thr_d=self.thr_d,
)
return processed_image
@invocation(
"pidi_image_processor",
title="PIDI Processor",
tags=["controlnet", "pidi"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class PidiImageProcessorInvocation(ImageProcessorInvocation):
"""Applies PIDI processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
safe: bool = InputField(default=False, description=FieldDescriptions.safe_mode)
scribble: bool = InputField(default=False, description=FieldDescriptions.scribble_mode)
def run_processor(self, image: Image.Image) -> Image.Image:
pidi_processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
processed_image = pidi_processor(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
safe=self.safe,
scribble=self.scribble,
)
return processed_image
@invocation(
"content_shuffle_image_processor",
title="Content Shuffle Processor",
tags=["controlnet", "contentshuffle"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class ContentShuffleImageProcessorInvocation(ImageProcessorInvocation):
"""Applies content shuffle processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
h: int = InputField(default=512, ge=0, description="Content shuffle `h` parameter")
w: int = InputField(default=512, ge=0, description="Content shuffle `w` parameter")
f: int = InputField(default=256, ge=0, description="Content shuffle `f` parameter")
def run_processor(self, image: Image.Image) -> Image.Image:
content_shuffle_processor = ContentShuffleDetector()
processed_image = content_shuffle_processor(
image,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
h=self.h,
w=self.w,
f=self.f,
)
return processed_image
# should work with controlnet_aux >= 0.0.4 and timm <= 0.6.13
@invocation(
"zoe_depth_image_processor",
title="Zoe (Depth) Processor",
tags=["controlnet", "zoe", "depth"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class ZoeDepthImageProcessorInvocation(ImageProcessorInvocation):
"""Applies Zoe depth processing to image"""
def run_processor(self, image: Image.Image) -> Image.Image:
zoe_depth_processor = ZoeDetector.from_pretrained("lllyasviel/Annotators")
processed_image = zoe_depth_processor(image)
return processed_image
@invocation(
"mediapipe_face_processor",
title="Mediapipe Face Processor",
tags=["controlnet", "mediapipe", "face"],
category="controlnet",
version="1.2.4",
classification=Classification.Deprecated,
)
class MediapipeFaceProcessorInvocation(ImageProcessorInvocation):
"""Applies mediapipe face processing to image"""
max_faces: int = InputField(default=1, ge=1, description="Maximum number of faces to detect")
min_confidence: float = InputField(default=0.5, ge=0, le=1, description="Minimum confidence for face detection")
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
mediapipe_face_processor = MediapipeFaceDetector()
processed_image = mediapipe_face_processor(
image,
max_faces=self.max_faces,
min_confidence=self.min_confidence,
image_resolution=self.image_resolution,
detect_resolution=self.detect_resolution,
)
return processed_image
@invocation(
"leres_image_processor",
title="Leres (Depth) Processor",
tags=["controlnet", "leres", "depth"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class LeresImageProcessorInvocation(ImageProcessorInvocation):
"""Applies leres processing to image"""
thr_a: float = InputField(default=0, description="Leres parameter `thr_a`")
thr_b: float = InputField(default=0, description="Leres parameter `thr_b`")
boost: bool = InputField(default=False, description="Whether to use boost mode")
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
leres_processor = LeresDetector.from_pretrained("lllyasviel/Annotators")
processed_image = leres_processor(
image,
thr_a=self.thr_a,
thr_b=self.thr_b,
boost=self.boost,
detect_resolution=self.detect_resolution,
image_resolution=self.image_resolution,
)
return processed_image
@invocation(
"tile_image_processor",
title="Tile Resample Processor",
tags=["controlnet", "tile"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class TileResamplerProcessorInvocation(ImageProcessorInvocation):
"""Tile resampler processor"""
# res: int = InputField(default=512, ge=0, le=1024, description="The pixel resolution for each tile")
down_sampling_rate: float = InputField(default=1.0, ge=1.0, le=8.0, description="Down sampling rate")
# tile_resample copied from sd-webui-controlnet/scripts/processor.py
def tile_resample(
self,
np_img: np.ndarray,
res=512, # never used?
down_sampling_rate=1.0,
):
np_img = HWC3(np_img)
if down_sampling_rate < 1.1:
return np_img
H, W, C = np_img.shape
H = int(float(H) / float(down_sampling_rate))
W = int(float(W) / float(down_sampling_rate))
np_img = cv2.resize(np_img, (W, H), interpolation=cv2.INTER_AREA)
return np_img
def run_processor(self, image: Image.Image) -> Image.Image:
np_img = np.array(image, dtype=np.uint8)
processed_np_image = self.tile_resample(
np_img,
# res=self.tile_size,
down_sampling_rate=self.down_sampling_rate,
)
processed_image = Image.fromarray(processed_np_image)
return processed_image
@invocation(
"segment_anything_processor",
title="Segment Anything Processor",
tags=["controlnet", "segmentanything"],
category="controlnet",
version="1.2.4",
classification=Classification.Deprecated,
)
class SegmentAnythingProcessorInvocation(ImageProcessorInvocation):
"""Applies segment anything processing to image"""
detect_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.detect_res)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
# segment_anything_processor = SamDetector.from_pretrained("ybelkada/segment-anything", subfolder="checkpoints")
segment_anything_processor = SamDetectorReproducibleColors.from_pretrained(
"ybelkada/segment-anything", subfolder="checkpoints"
)
np_img = np.array(image, dtype=np.uint8)
processed_image = segment_anything_processor(
np_img, image_resolution=self.image_resolution, detect_resolution=self.detect_resolution
)
return processed_image
class SamDetectorReproducibleColors(SamDetector):
# overriding SamDetector.show_anns() method to use reproducible colors for segmentation image
# base class show_anns() method randomizes colors,
# which seems to also lead to non-reproducible image generation
# so using ADE20k color palette instead
def show_anns(self, anns: List[Dict]):
if len(anns) == 0:
return
sorted_anns = sorted(anns, key=(lambda x: x["area"]), reverse=True)
h, w = anns[0]["segmentation"].shape
final_img = Image.fromarray(np.zeros((h, w, 3), dtype=np.uint8), mode="RGB")
palette = ade_palette()
for i, ann in enumerate(sorted_anns):
m = ann["segmentation"]
img = np.empty((m.shape[0], m.shape[1], 3), dtype=np.uint8)
# doing modulo just in case number of annotated regions exceeds number of colors in palette
ann_color = palette[i % len(palette)]
img[:, :] = ann_color
final_img.paste(Image.fromarray(img, mode="RGB"), (0, 0), Image.fromarray(np.uint8(m * 255)))
return np.array(final_img, dtype=np.uint8)
@invocation(
"color_map_image_processor",
title="Color Map Processor",
tags=["controlnet"],
category="controlnet",
version="1.2.3",
classification=Classification.Deprecated,
)
class ColorMapImageProcessorInvocation(ImageProcessorInvocation):
"""Generates a color map from the provided image"""
color_map_tile_size: int = InputField(default=64, ge=1, description=FieldDescriptions.tile_size)
def run_processor(self, image: Image.Image) -> Image.Image:
np_image = np.array(image, dtype=np.uint8)
height, width = np_image.shape[:2]
width_tile_size = min(self.color_map_tile_size, width)
height_tile_size = min(self.color_map_tile_size, height)
color_map = cv2.resize(
np_image,
(width // width_tile_size, height // height_tile_size),
interpolation=cv2.INTER_CUBIC,
)
color_map = cv2.resize(color_map, (width, height), interpolation=cv2.INTER_NEAREST)
color_map = Image.fromarray(color_map)
return color_map
DEPTH_ANYTHING_MODEL_SIZES = Literal["large", "base", "small", "small_v2"]
# DepthAnything V2 Small model is licensed under Apache 2.0 but not the base and large models.
DEPTH_ANYTHING_MODELS = {
"large": "LiheYoung/depth-anything-large-hf",
"base": "LiheYoung/depth-anything-base-hf",
"small": "LiheYoung/depth-anything-small-hf",
"small_v2": "depth-anything/Depth-Anything-V2-Small-hf",
}
@invocation(
"depth_anything_image_processor",
title="Depth Anything Processor",
tags=["controlnet", "depth", "depth anything"],
category="controlnet",
version="1.1.3",
classification=Classification.Deprecated,
)
class DepthAnythingImageProcessorInvocation(ImageProcessorInvocation):
"""Generates a depth map based on the Depth Anything algorithm"""
model_size: DEPTH_ANYTHING_MODEL_SIZES = InputField(
default="small_v2", description="The size of the depth model to use"
)
resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
def load_depth_anything(model_path: Path):
depth_anything_pipeline = pipeline(model=str(model_path), task="depth-estimation", local_files_only=True)
assert isinstance(depth_anything_pipeline, DepthEstimationPipeline)
return DepthAnythingPipeline(depth_anything_pipeline)
with self._context.models.load_remote_model(
source=DEPTH_ANYTHING_MODELS[self.model_size], loader=load_depth_anything
) as depth_anything_detector:
assert isinstance(depth_anything_detector, DepthAnythingPipeline)
depth_map = depth_anything_detector.generate_depth(image)
# Resizing to user target specified size
new_height = int(image.size[1] * (self.resolution / image.size[0]))
depth_map = depth_map.resize((self.resolution, new_height))
return depth_map
@invocation(
"dw_openpose_image_processor",
title="DW Openpose Image Processor",
tags=["controlnet", "dwpose", "openpose"],
category="controlnet",
version="1.1.1",
classification=Classification.Deprecated,
)
class DWOpenposeImageProcessorInvocation(ImageProcessorInvocation):
"""Generates an openpose pose from an image using DWPose"""
draw_body: bool = InputField(default=True)
draw_face: bool = InputField(default=False)
draw_hands: bool = InputField(default=False)
image_resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
onnx_det = self._context.models.download_and_cache_model(DWPOSE_MODELS["yolox_l.onnx"])
onnx_pose = self._context.models.download_and_cache_model(DWPOSE_MODELS["dw-ll_ucoco_384.onnx"])
dw_openpose = DWOpenposeDetector(onnx_det=onnx_det, onnx_pose=onnx_pose)
processed_image = dw_openpose(
image,
draw_face=self.draw_face,
draw_hands=self.draw_hands,
draw_body=self.draw_body,
resolution=self.image_resolution,
)
return processed_image
@invocation(
"heuristic_resize",
title="Heuristic Resize",
tags=["image", "controlnet"],
category="image",
version="1.0.1",
classification=Classification.Prototype,
)
class HeuristicResizeInvocation(BaseInvocation):
"""Resize an image using a heuristic method. Preserves edge maps."""
image: ImageField = InputField(description="The image to resize")
width: int = InputField(default=512, ge=1, description="The width to resize to (px)")
height: int = InputField(default=512, ge=1, description="The height to resize to (px)")
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name, "RGB")
np_img = pil_to_np(image)
np_resized = heuristic_resize(np_img, (self.width, self.height))
resized = np_to_pil(np_resized)
image_dto = context.images.save(image=resized)
return ImageOutput.build(image_dto)
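
Several of the processors removed above derive their output size from a line or two of integer arithmetic (`tile_resample`, the Depth Anything resize, and the color map grid). The sketch below restates only that arithmetic in plain Python so the rounding behaviour is easy to check. The function names `tile_resample_size`, `depth_map_size`, and `color_map_grid` are hypothetical and not part of InvokeAI; the formulas are copied from the removed code, and no image data is touched.

```python
def tile_resample_size(height: int, width: int, down_sampling_rate: float) -> tuple[int, int]:
    """Output (height, width) of the tile resampler (hypothetical helper)."""
    # The removed code returns the image unchanged for rates below 1.1.
    if down_sampling_rate < 1.1:
        return height, width
    # Otherwise both sides are divided by the rate and truncated toward zero.
    return int(float(height) / float(down_sampling_rate)), int(float(width) / float(down_sampling_rate))


def depth_map_size(width: int, height: int, resolution: int) -> tuple[int, int]:
    """Output (width, height) of the Depth Anything resize (hypothetical helper)."""
    # The width becomes `resolution`; the height is scaled by the same ratio.
    return resolution, int(height * (resolution / width))


def color_map_grid(width: int, height: int, tile_size: int) -> tuple[int, int]:
    """Size (columns, rows) of the intermediate color map (hypothetical helper)."""
    # The tile size is clamped to the image size, so each axis has at least one tile.
    width_tile_size = min(tile_size, width)
    height_tile_size = min(tile_size, height)
    return width // width_tile_size, height // height_tile_size


if __name__ == "__main__":
    print(tile_resample_size(512, 768, 2.0))
    print(depth_map_size(1024, 768, 512))
    print(color_map_grid(512, 384, 64))
```

Note that a `down_sampling_rate` between 1.0 and 1.1 is accepted by the field constraints (`ge=1.0`) but has no effect, because of the early return.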


@@ -22,7 +22,7 @@ from transformers import CLIPVisionModelWithProjection
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.controlnet_image_processors import ControlField
from invokeai.app.invocations.controlnet import ControlField
from invokeai.app.invocations.fields import (
ConditioningField,
DenoiseMaskField,


@@ -4,7 +4,7 @@ from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import ImageField, InputField, WithBoard, WithMetadata
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.image_util.dw_openpose import DWOpenposeDetector2
from invokeai.backend.image_util.dw_openpose import DWOpenposeDetector
@invocation(
@@ -25,20 +25,20 @@ class DWOpenposeDetectionInvocation(BaseInvocation, WithMetadata, WithBoard):
def invoke(self, context: InvocationContext) -> ImageOutput:
image = context.images.get_pil(self.image.image_name, "RGB")
onnx_det_path = context.models.download_and_cache_model(DWOpenposeDetector2.get_model_url_det())
onnx_pose_path = context.models.download_and_cache_model(DWOpenposeDetector2.get_model_url_pose())
onnx_det_path = context.models.download_and_cache_model(DWOpenposeDetector.get_model_url_det())
onnx_pose_path = context.models.download_and_cache_model(DWOpenposeDetector.get_model_url_pose())
loaded_session_det = context.models.load_local_model(
onnx_det_path, DWOpenposeDetector2.create_onnx_inference_session
onnx_det_path, DWOpenposeDetector.create_onnx_inference_session
)
loaded_session_pose = context.models.load_local_model(
onnx_pose_path, DWOpenposeDetector2.create_onnx_inference_session
onnx_pose_path, DWOpenposeDetector.create_onnx_inference_session
)
with loaded_session_det as session_det, loaded_session_pose as session_pose:
assert isinstance(session_det, ort.InferenceSession)
assert isinstance(session_pose, ort.InferenceSession)
detector = DWOpenposeDetector2(session_det=session_det, session_pose=session_pose)
detector = DWOpenposeDetector(session_det=session_det, session_pose=session_pose)
detected_image = detector.run(
image,
draw_face=self.draw_face,


@@ -40,6 +40,7 @@ class UIType(str, Enum, metaclass=MetaEnum):
# region Model Field Types
MainModel = "MainModelField"
CogView4MainModel = "CogView4MainModelField"
FluxMainModel = "FluxMainModelField"
SD3MainModel = "SD3MainModelField"
SDXLMainModel = "SDXLMainModelField"
@@ -60,6 +61,8 @@ class UIType(str, Enum, metaclass=MetaEnum):
SigLipModel = "SigLipModelField"
FluxReduxModel = "FluxReduxModelField"
LlavaOnevisionModel = "LLaVAModelField"
Imagen3Model = "Imagen3ModelField"
ChatGPT4oModel = "ChatGPT4oModelField"
# endregion
# region Misc Field Types
@@ -137,6 +140,7 @@ class FieldDescriptions:
noise = "Noise tensor"
clip = "CLIP (tokenizer, text encoder, LoRAs) and skipped layer count"
t5_encoder = "T5 tokenizer and text encoder"
glm_encoder = "GLM (THUDM) tokenizer and text encoder"
clip_embed_model = "CLIP Embed loader"
clip_g_model = "CLIP-G Embed loader"
unet = "UNet (scheduler, LoRAs)"
@@ -151,6 +155,7 @@ class FieldDescriptions:
main_model = "Main model (UNet, VAE, CLIP) to load"
flux_model = "Flux model (Transformer) to load"
sd3_model = "SD3 model (MMDiTX) to load"
cogview4_model = "CogView4 model (Transformer) to load"
sdxl_main_model = "SDXL Main model (UNet, VAE, CLIP1, CLIP2) to load"
sdxl_refiner_model = "SDXL Refiner Main Model (UNet, VAE, CLIP2) to load"
onnx_main_model = "ONNX Main model (UNet, VAE, CLIP) to load"
@@ -290,6 +295,12 @@ class SD3ConditioningField(BaseModel):
conditioning_name: str = Field(description="The name of conditioning tensor")
class CogView4ConditioningField(BaseModel):
"""A conditioning tensor primitive value"""
conditioning_name: str = Field(description="The name of conditioning tensor")
class ConditioningField(BaseModel):
"""A conditioning tensor primitive value"""


@@ -33,7 +33,6 @@ from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.flux.controlnet.instantx_controlnet_flux import InstantXControlNetFlux
from invokeai.backend.flux.controlnet.xlabs_controlnet_flux import XLabsControlNetFlux
from invokeai.backend.flux.denoise import denoise
from invokeai.backend.flux.extensions.inpaint_extension import InpaintExtension
from invokeai.backend.flux.extensions.instantx_controlnet_extension import InstantXControlNetExtension
from invokeai.backend.flux.extensions.regional_prompting_extension import RegionalPromptingExtension
from invokeai.backend.flux.extensions.xlabs_controlnet_extension import XLabsControlNetExtension
@@ -53,6 +52,7 @@ from invokeai.backend.model_manager.taxonomy import ModelFormat, ModelVariantTyp
from invokeai.backend.patches.layer_patcher import LayerPatcher
from invokeai.backend.patches.lora_conversions.flux_lora_constants import FLUX_LORA_TRANSFORMER_PREFIX
from invokeai.backend.patches.model_patch_raw import ModelPatchRaw
from invokeai.backend.rectified_flow.rectified_flow_inpaint_extension import RectifiedFlowInpaintExtension
from invokeai.backend.stable_diffusion.diffusers_pipeline import PipelineIntermediateState
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import FLUXConditioningInfo
from invokeai.backend.util.devices import TorchDevice
@@ -295,10 +295,10 @@ class FluxDenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
assert packed_h * packed_w == x.shape[1]
# Prepare inpaint extension.
inpaint_extension: InpaintExtension | None = None
inpaint_extension: RectifiedFlowInpaintExtension | None = None
if inpaint_mask is not None:
assert init_latents is not None
inpaint_extension = InpaintExtension(
inpaint_extension = RectifiedFlowInpaintExtension(
init_latents=init_latents,
inpaint_mask=inpaint_mask,
noise=noise,


@@ -1,7 +1,9 @@
from typing import Optional
import math
from typing import Literal, Optional
import torch
from PIL import Image
from transformers import SiglipImageProcessor, SiglipVisionModel
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
@@ -39,12 +41,15 @@ class FluxReduxOutput(BaseInvocationOutput):
)
DOWNSAMPLING_FUNCTIONS = Literal["nearest", "bilinear", "bicubic", "area", "nearest-exact"]
@invocation(
"flux_redux",
title="FLUX Redux",
tags=["ip_adapter", "control"],
category="ip_adapter",
version="2.0.0",
version="2.1.0",
classification=Classification.Beta,
)
class FluxReduxInvocation(BaseInvocation):
@@ -61,23 +66,64 @@ class FluxReduxInvocation(BaseInvocation):
title="FLUX Redux Model",
ui_type=UIType.FluxReduxModel,
)
downsampling_factor: int = InputField(
ge=1,
le=9,
default=1,
description="Redux Downsampling Factor (1-9)",
)
downsampling_function: DOWNSAMPLING_FUNCTIONS = InputField(
default="area",
description="Redux Downsampling Function",
)
weight: float = InputField(
ge=0,
le=1,
default=1.0,
description="Redux weight (0.0-1.0)",
)
def invoke(self, context: InvocationContext) -> FluxReduxOutput:
image = context.images.get_pil(self.image.image_name, "RGB")
encoded_x = self._siglip_encode(context, image)
redux_conditioning = self._flux_redux_encode(context, encoded_x)
if self.downsampling_factor > 1 or self.weight != 1.0:
redux_conditioning = self._downsample_weight(context, redux_conditioning)
tensor_name = context.tensors.save(redux_conditioning)
return FluxReduxOutput(
redux_cond=FluxReduxConditioningField(conditioning=TensorField(tensor_name=tensor_name), mask=self.mask)
)
@torch.no_grad()
def _downsample_weight(self, context: InvocationContext, redux_conditioning: torch.Tensor) -> torch.Tensor:
# Downsampling derived from https://github.com/kaibioinfo/ComfyUI_AdvancedRefluxControl
(b, t, h) = redux_conditioning.shape
m = int(math.sqrt(t))
if self.downsampling_factor > 1:
redux_conditioning = redux_conditioning.view(b, m, m, h)
redux_conditioning = torch.nn.functional.interpolate(
redux_conditioning.transpose(1, -1),
size=(m // self.downsampling_factor, m // self.downsampling_factor),
mode=self.downsampling_function,
)
redux_conditioning = redux_conditioning.transpose(1, -1).reshape(b, -1, h)
if self.weight != 1.0:
redux_conditioning = redux_conditioning * self.weight * self.weight
return redux_conditioning
@torch.no_grad()
def _siglip_encode(self, context: InvocationContext, image: Image.Image) -> torch.Tensor:
siglip_model_config = self._get_siglip_model(context)
with context.models.load(siglip_model_config.key).model_on_device() as (_, siglip_pipeline):
assert isinstance(siglip_pipeline, SigLipPipeline)
with context.models.load(siglip_model_config.key).model_on_device() as (_, model):
assert isinstance(model, SiglipVisionModel)
model_abs_path = context.models.get_absolute_path(siglip_model_config)
processor = SiglipImageProcessor.from_pretrained(model_abs_path, local_files_only=True)
assert isinstance(processor, SiglipImageProcessor)
siglip_pipeline = SigLipPipeline(processor, model)
return siglip_pipeline.encode_image(
x=image, device=TorchDevice.choose_torch_device(), dtype=TorchDevice.choose_torch_dtype()
)
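
The effect of `downsampling_factor` and `weight` in `_downsample_weight` can be sketched with plain arithmetic, without torch. This is an illustration only: the helper names `downsampled_token_count` and `effective_scale`, and the 729-token example input, are assumptions and not part of this change.

```python
import math


def downsampled_token_count(tokens: int, factor: int) -> int:
    """Return how many tokens remain after downsampling a square token grid."""
    if factor <= 1:
        # Mirrors the `if self.downsampling_factor > 1` guard: nothing changes.
        return tokens
    side = math.isqrt(tokens)  # mirrors m = int(math.sqrt(t))
    new_side = side // factor  # mirrors size=(m // factor, m // factor)
    return new_side * new_side


def effective_scale(weight: float) -> float:
    """The diff multiplies by the weight twice, so the scale is weight squared."""
    return weight * weight


if __name__ == "__main__":
    # Hypothetical 27x27 grid (729 tokens) reduced by a factor of 3.
    print(downsampled_token_count(729, 3))
    print(effective_scale(0.5))
```

Because the conditioning is multiplied by `self.weight` twice, a weight of 0.5 scales the tensor by 0.25 rather than 0.5.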

View File

@@ -127,13 +127,16 @@ class InfillPatchMatchInvocation(InfillImageProcessorInvocation):
return infilled
LAMA_MODEL_URL = "https://github.com/Sanster/models/releases/download/add_big_lama/big-lama.pt"
@invocation("infill_lama", title="LaMa Infill", tags=["image", "inpaint"], category="inpaint", version="1.2.2")
class LaMaInfillInvocation(InfillImageProcessorInvocation):
"""Infills transparent areas of an image using the LaMa model"""
def infill(self, image: Image.Image):
with self._context.models.load_remote_model(
source="https://github.com/Sanster/models/releases/download/add_big_lama/big-lama.pt",
source=LAMA_MODEL_URL,
loader=LaMA.load_jit_model,
) as model:
lama = LaMA(model)

View File

@@ -3,13 +3,14 @@ from typing import Any
import torch
from PIL.Image import Image
from pydantic import field_validator
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration, LlavaOnevisionProcessor
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, invocation
from invokeai.app.invocations.fields import FieldDescriptions, ImageField, InputField, UIComponent, UIType
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.invocations.primitives import StringOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.llava_onevision_model import LlavaOnevisionModel
from invokeai.backend.llava_onevision_pipeline import LlavaOnevisionPipeline
from invokeai.backend.util.devices import TorchDevice
@@ -54,10 +55,17 @@ class LlavaOnevisionVllmInvocation(BaseInvocation):
@torch.no_grad()
def invoke(self, context: InvocationContext) -> StringOutput:
images = self._get_images(context)
model_config = context.models.get_config(self.vllm_model)
with context.models.load(self.vllm_model) as vllm_model:
assert isinstance(vllm_model, LlavaOnevisionModel)
output = vllm_model.run(
with context.models.load(self.vllm_model).model_on_device() as (_, model):
assert isinstance(model, LlavaOnevisionForConditionalGeneration)
model_abs_path = context.models.get_absolute_path(model_config)
processor = AutoProcessor.from_pretrained(model_abs_path, local_files_only=True)
assert isinstance(processor, LlavaOnevisionProcessor)
model = LlavaOnevisionPipeline(model, processor)
output = model.run(
prompt=self.prompt,
images=images,
device=TorchDevice.choose_torch_device(),

View File

@@ -152,6 +152,10 @@ GENERATION_MODES = Literal[
"sd3_img2img",
"sd3_inpaint",
"sd3_outpaint",
"cogview4_txt2img",
"cogview4_img2img",
"cogview4_inpaint",
"cogview4_outpaint",
]

View File

@@ -14,7 +14,7 @@ from invokeai.app.invocations.baseinvocation import (
invocation,
invocation_output,
)
from invokeai.app.invocations.controlnet_image_processors import ControlField, ControlNetInvocation
from invokeai.app.invocations.controlnet import ControlField, ControlNetInvocation
from invokeai.app.invocations.denoise_latents import DenoiseLatentsInvocation
from invokeai.app.invocations.fields import (
FieldDescriptions,
@@ -39,7 +39,17 @@ from invokeai.app.invocations.model import (
VAEField,
VAEOutput,
)
from invokeai.app.invocations.primitives import BooleanOutput, FloatOutput, IntegerOutput, LatentsOutput, StringOutput
from invokeai.app.invocations.primitives import (
BooleanCollectionOutput,
BooleanOutput,
FloatCollectionOutput,
FloatOutput,
IntegerCollectionOutput,
IntegerOutput,
LatentsOutput,
StringCollectionOutput,
StringOutput,
)
from invokeai.app.invocations.scheduler import SchedulerOutput
from invokeai.app.invocations.t2i_adapter import T2IAdapterField, T2IAdapterInvocation
from invokeai.app.services.shared.invocation_context import InvocationContext
@@ -1162,3 +1172,133 @@ class MetadataToT2IAdaptersInvocation(BaseInvocation, WithMetadata):
adapters = append_list(T2IAdapterField, i.t2i_adapter, adapters)
return MDT2IAdapterListOutput(t2i_adapter_list=adapters)
@invocation(
"metadata_to_string_collection",
title="Metadata To String Collection",
tags=["metadata"],
category="metadata",
version="1.0.0",
classification=Classification.Beta,
)
class MetadataToStringCollectionInvocation(BaseInvocation, WithMetadata):
"""Extracts a string collection value of a label from metadata"""
label: CORE_LABELS_STRING = InputField(
default=CUSTOM_LABEL,
description=FieldDescriptions.metadata_item_label,
input=Input.Direct,
)
custom_label: Optional[str] = InputField(
default=None,
description=FieldDescriptions.metadata_item_label,
input=Input.Direct,
)
default_value: list[str] = InputField(
description="The default string collection to use if not found in the metadata"
)
_validate_custom_label = model_validator(mode="after")(validate_custom_label)
def invoke(self, context: InvocationContext) -> StringCollectionOutput:
data: Dict[str, Any] = {} if self.metadata is None else self.metadata.root
output = data.get(str(self.custom_label if self.label == CUSTOM_LABEL else self.label), self.default_value)
return StringCollectionOutput(collection=output)
@invocation(
"metadata_to_integer_collection",
title="Metadata To Integer Collection",
tags=["metadata"],
category="metadata",
version="1.0.0",
classification=Classification.Beta,
)
class MetadataToIntegerCollectionInvocation(BaseInvocation, WithMetadata):
"""Extracts an integer value Collection of a label from metadata"""
label: CORE_LABELS_INTEGER = InputField(
default=CUSTOM_LABEL,
description=FieldDescriptions.metadata_item_label,
input=Input.Direct,
)
custom_label: Optional[str] = InputField(
default=None,
description=FieldDescriptions.metadata_item_label,
input=Input.Direct,
)
default_value: list[int] = InputField(
description="The default integer collection to use if not found in the metadata"
)
_validate_custom_label = model_validator(mode="after")(validate_custom_label)
def invoke(self, context: InvocationContext) -> IntegerCollectionOutput:
data: Dict[str, Any] = {} if self.metadata is None else self.metadata.root
output = data.get(str(self.custom_label if self.label == CUSTOM_LABEL else self.label), self.default_value)
return IntegerCollectionOutput(collection=output)
@invocation(
"metadata_to_float_collection",
title="Metadata To Float Collection",
tags=["metadata"],
category="metadata",
version="1.0.0",
classification=Classification.Beta,
)
class MetadataToFloatCollectionInvocation(BaseInvocation, WithMetadata):
"""Extracts a Float value Collection of a label from metadata"""
label: CORE_LABELS_FLOAT = InputField(
default=CUSTOM_LABEL,
description=FieldDescriptions.metadata_item_label,
input=Input.Direct,
)
custom_label: Optional[str] = InputField(
default=None,
description=FieldDescriptions.metadata_item_label,
input=Input.Direct,
)
default_value: list[float] = InputField(
description="The default float collection to use if not found in the metadata"
)
_validate_custom_label = model_validator(mode="after")(validate_custom_label)
def invoke(self, context: InvocationContext) -> FloatCollectionOutput:
data: Dict[str, Any] = {} if self.metadata is None else self.metadata.root
output = data.get(str(self.custom_label if self.label == CUSTOM_LABEL else self.label), self.default_value)
return FloatCollectionOutput(collection=output)
@invocation(
"metadata_to_bool_collection",
title="Metadata To Bool Collection",
tags=["metadata"],
category="metadata",
version="1.0.0",
classification=Classification.Beta,
)
class MetadataToBoolCollectionInvocation(BaseInvocation, WithMetadata):
"""Extracts a Boolean value Collection of a label from metadata"""
label: CORE_LABELS_BOOL = InputField(
default=CUSTOM_LABEL,
description=FieldDescriptions.metadata_item_label,
input=Input.Direct,
)
custom_label: Optional[str] = InputField(
default=None,
description=FieldDescriptions.metadata_item_label,
input=Input.Direct,
)
default_value: list[bool] = InputField(
description="The default bool collection to use if not found in the metadata"
)
_validate_custom_label = model_validator(mode="after")(validate_custom_label)
def invoke(self, context: InvocationContext) -> BooleanCollectionOutput:
data: Dict[str, Any] = {} if self.metadata is None else self.metadata.root
output = data.get(str(self.custom_label if self.label == CUSTOM_LABEL else self.label), self.default_value)
return BooleanCollectionOutput(collection=output)
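
All four collection nodes above share the same lookup: pick the custom label when the label selector is set to the custom sentinel, otherwise use the selected label, and fall back to `default_value` when the key is missing. A minimal sketch with plain dictionaries follows. The function name `lookup_collection` and the sentinel string used for `CUSTOM_LABEL` here are placeholders; the real constant is defined elsewhere in the module and is not shown in this diff.

```python
from typing import Any, Optional

# Placeholder sentinel for illustration; not the project's actual value.
CUSTOM_LABEL = "<custom>"


def lookup_collection(
    metadata: Optional[dict[str, Any]],
    label: str,
    custom_label: Optional[str],
    default_value: list[Any],
) -> list[Any]:
    """Return metadata[key] if present, else the default collection."""
    data: dict[str, Any] = {} if metadata is None else metadata
    # Same key selection as in the invoke() methods above.
    key = str(custom_label if label == CUSTOM_LABEL else label)
    return data.get(key, default_value)


if __name__ == "__main__":
    meta = {"seeds": [1, 2, 3]}
    print(lookup_collection(meta, CUSTOM_LABEL, "seeds", [0]))
    print(lookup_collection(None, "steps", None, [30]))
```

Note that `dict.get` returns whatever is stored under the key, so a metadata value of the wrong type is passed through unchecked until the output model validates it.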

View File

@@ -68,6 +68,11 @@ class T5EncoderField(BaseModel):
loras: List[LoRAField] = Field(description="LoRAs to apply on model loading")
class GlmEncoderField(BaseModel):
tokenizer: ModelIdentifierField = Field(description="Info to load tokenizer submodel")
text_encoder: ModelIdentifierField = Field(description="Info to load text_encoder submodel")
class VAEField(BaseModel):
vae: ModelIdentifierField = Field(description="Info to load vae submodel")
seamless_axes: List[str] = Field(default_factory=list, description='Axes("x" and "y") to which apply seamless')

View File

@@ -13,6 +13,7 @@ from invokeai.app.invocations.baseinvocation import (
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import (
BoundingBoxField,
CogView4ConditioningField,
ColorField,
ConditioningField,
DenoiseMaskField,
@@ -440,6 +441,17 @@ class SD3ConditioningOutput(BaseInvocationOutput):
return cls(conditioning=SD3ConditioningField(conditioning_name=conditioning_name))
@invocation_output("cogview4_conditioning_output")
class CogView4ConditioningOutput(BaseInvocationOutput):
"""Base class for nodes that output a CogView text conditioning tensor."""
conditioning: CogView4ConditioningField = OutputField(description=FieldDescriptions.cond)
@classmethod
def build(cls, conditioning_name: str) -> "CogView4ConditioningOutput":
return cls(conditioning=CogView4ConditioningField(conditioning_name=conditioning_name))
@invocation_output("conditioning_output")
class ConditioningOutput(BaseInvocationOutput):
"""Base class for nodes that output a single conditioning tensor"""

View File

@@ -24,7 +24,7 @@ from invokeai.app.invocations.sd3_text_encoder import SD3_T5_MAX_SEQ_LEN
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.flux.sampling_utils import clip_timestep_schedule_fractional
from invokeai.backend.model_manager import BaseModelType
from invokeai.backend.sd3.extensions.inpaint_extension import InpaintExtension
from invokeai.backend.rectified_flow.rectified_flow_inpaint_extension import RectifiedFlowInpaintExtension
from invokeai.backend.stable_diffusion.diffusers_pipeline import PipelineIntermediateState
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import SD3ConditioningInfo
from invokeai.backend.util.devices import TorchDevice
@@ -263,10 +263,10 @@ class SD3DenoiseInvocation(BaseInvocation, WithMetadata, WithBoard):
# Prepare inpaint extension.
inpaint_mask = self._prep_inpaint_mask(context, latents)
inpaint_extension: InpaintExtension | None = None
inpaint_extension: RectifiedFlowInpaintExtension | None = None
if inpaint_mask is not None:
assert init_latents is not None
inpaint_extension = InpaintExtension(
inpaint_extension = RectifiedFlowInpaintExtension(
init_latents=init_latents,
inpaint_mask=inpaint_mask,
noise=noise,

View File

@@ -9,7 +9,7 @@ from pydantic import field_validator
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.controlnet_image_processors import ControlField
from invokeai.app.invocations.controlnet import ControlField
from invokeai.app.invocations.denoise_latents import DenoiseLatentsInvocation, get_scheduler
from invokeai.app.invocations.fields import (
ConditioningField,

View File

@@ -31,6 +31,12 @@ def run_app() -> None:
if app_config.pytorch_cuda_alloc_conf:
configure_torch_cuda_allocator(app_config.pytorch_cuda_alloc_conf, logger)
# This import must happen after configure_torch_cuda_allocator() is called, because the module imports torch.
from invokeai.backend.util.devices import TorchDevice
torch_device_name = TorchDevice.get_torch_device_name()
logger.info(f"Using torch device: {torch_device_name}")
# Import from startup_utils here to avoid importing torch before configure_torch_cuda_allocator() is called.
from invokeai.app.util.startup_utils import (
apply_monkeypatches,

View File

@@ -241,6 +241,7 @@ class QueueItemStatusChangedEvent(QueueItemEventBase):
batch_status: BatchStatus = Field(description="The status of the batch")
queue_status: SessionQueueStatus = Field(description="The status of the queue")
session_id: str = Field(description="The ID of the session (aka graph execution state)")
credits: Optional[float] = Field(default=None, description="The total credits used for this queue item")
@classmethod
def build(
@@ -263,6 +264,7 @@ class QueueItemStatusChangedEvent(QueueItemEventBase):
completed_at=str(queue_item.completed_at) if queue_item.completed_at else None,
batch_status=batch_status,
queue_status=queue_status,
credits=queue_item.credits,
)

View File

@@ -38,7 +38,6 @@ from invokeai.backend.model_manager.config import (
AnyModelConfig,
CheckpointConfigBase,
InvalidModelConfigException,
ModelConfigBase,
)
from invokeai.backend.model_manager.legacy_probe import ModelProbe
from invokeai.backend.model_manager.metadata import (
@@ -647,10 +646,14 @@ class ModelInstallService(ModelInstallServiceBase):
hash_algo = self._app_config.hashing_algorithm
fields = config.model_dump()
try:
return ModelConfigBase.classify(model_path=model_path, hash_algo=hash_algo, **fields)
except InvalidModelConfigException:
return ModelProbe.probe(model_path=model_path, fields=fields, hash_algo=hash_algo) # type: ignore
return ModelProbe.probe(model_path=model_path, fields=fields, hash_algo=hash_algo)
# New model probe API is disabled pending resolution of issue caused by a change of the ordering of checks.
# See commit message for details.
# try:
# return ModelConfigBase.classify(model_path=model_path, hash_algo=hash_algo, **fields)
# except InvalidModelConfigException:
# return ModelProbe.probe(model_path=model_path, fields=fields, hash_algo=hash_algo) # type: ignore
def _register(
self, model_path: Path, config: Optional[ModelRecordChanges] = None, info: Optional[AnyModelConfig] = None

View File

@@ -80,6 +80,7 @@ class ModelRecordChanges(BaseModelExcludeNull):
type: Optional[ModelType] = Field(description="Type of model", default=None)
key: Optional[str] = Field(description="Database ID for this model", default=None)
hash: Optional[str] = Field(description="hash of model file", default=None)
file_size: Optional[int] = Field(description="Size of model file", default=None)
format: Optional[str] = Field(description="format of model file", default=None)
trigger_phrases: Optional[set[str]] = Field(description="Set of trigger phrases for this model", default=None)
default_settings: Optional[MainModelDefaultSettings | ControlAdapterDefaultSettings] = Field(

View File

@@ -302,7 +302,10 @@ class ModelRecordServiceSQL(ModelRecordServiceBase):
# We catch this error so that the app can still run if there are invalid model configs in the database.
# One reason that an invalid model config might be in the database is if someone had to rollback from a
# newer version of the app that added a new model type.
self._logger.warning(f"Found an invalid model config in the database. Ignoring this model. ({row[0]})")
row_data = f"{row[0][:64]}..." if len(row[0]) > 64 else row[0]
self._logger.warning(
f"Found an invalid model config in the database. Ignoring this model. ({row_data})"
)
else:
results.append(model_config)
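
The log-message change above truncates the raw row data to 64 characters before logging. The same rule as a standalone helper, for illustration only (the name `truncate_for_log` is hypothetical):

```python
def truncate_for_log(text: str, limit: int = 64) -> str:
    """Keep at most `limit` characters and mark truncation with '...'."""
    return f"{text[:limit]}..." if len(text) > limit else text


if __name__ == "__main__":
    print(truncate_for_log("x" * 100))
    print(truncate_for_log("short config"))
```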

View File

@@ -21,10 +21,16 @@ class ObjectSerializerDisk(ObjectSerializerBase[T]):
"""Disk-backed storage for arbitrary python objects. Serialization is handled by `torch.save` and `torch.load`.
:param output_dir: The folder where the serialized objects will be stored
:param safe_globals: A list of types to be added to the safe globals for torch serialization
:param ephemeral: If True, objects will be stored in a temporary directory inside the given output_dir and cleaned up on exit
"""
def __init__(self, output_dir: Path, ephemeral: bool = False):
def __init__(
self,
output_dir: Path,
safe_globals: list[type],
ephemeral: bool = False,
) -> None:
super().__init__()
self._ephemeral = ephemeral
self._base_output_dir = output_dir
@@ -42,6 +48,8 @@ class ObjectSerializerDisk(ObjectSerializerBase[T]):
self._output_dir = Path(self._tempdir.name) if self._tempdir else self._base_output_dir
self.__obj_class_name: Optional[str] = None
if safe_globals:
torch.serialization.add_safe_globals(safe_globals)
def load(self, name: str) -> T:
file_path = self._get_path(name)
try:

View File

@@ -201,6 +201,12 @@ def get_workflow(queue_item_dict: dict) -> Optional[WorkflowWithoutID]:
return None
class FieldIdentifier(BaseModel):
kind: Literal["input", "output"] = Field(description="The kind of field")
node_id: str = Field(description="The ID of the node")
field_name: str = Field(description="The name of the field")
class SessionQueueItemWithoutGraph(BaseModel):
"""Session queue item without the full graph. Used for serialization."""
@@ -237,6 +243,21 @@ class SessionQueueItemWithoutGraph(BaseModel):
retried_from_item_id: Optional[int] = Field(
default=None, description="The item_id of the queue item that this item was retried from"
)
is_api_validation_run: bool = Field(
default=False,
description="Whether this queue item is an API validation run.",
)
published_workflow_id: Optional[str] = Field(
default=None,
description="The ID of the published workflow associated with this queue item",
)
api_input_fields: Optional[list[FieldIdentifier]] = Field(
default=None, description="The fields that were used as input to the API"
)
api_output_fields: Optional[list[FieldIdentifier]] = Field(
default=None, description="The fields that were used as output from the API"
)
credits: Optional[float] = Field(default=None, description="The total credits used for this queue item")
@classmethod
def queue_item_dto_from_dict(cls, queue_item_dict: dict) -> "SessionQueueItemDTO":

View File

@@ -21,6 +21,7 @@ from invokeai.app.invocations import * # noqa: F401 F403
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
BaseInvocationOutput,
InvocationRegistry,
invocation,
invocation_output,
)
@@ -283,7 +284,7 @@ class AnyInvocation(BaseInvocation):
@classmethod
def __get_pydantic_core_schema__(cls, source_type: Any, handler: GetCoreSchemaHandler) -> core_schema.CoreSchema:
def validate_invocation(v: Any) -> "AnyInvocation":
return BaseInvocation.get_typeadapter().validate_python(v)
return InvocationRegistry.get_invocation_typeadapter().validate_python(v)
return core_schema.no_info_plain_validator_function(validate_invocation)
@@ -294,7 +295,7 @@ class AnyInvocation(BaseInvocation):
# Nodes are too powerful, we have to make our own OpenAPI schema manually
# No but really, because the schema is dynamic depending on loaded nodes, we need to generate it manually
oneOf: list[dict[str, str]] = []
names = [i.__name__ for i in BaseInvocation.get_invocations()]
names = [i.__name__ for i in InvocationRegistry.get_invocation_classes()]
for name in sorted(names):
oneOf.append({"$ref": f"#/components/schemas/{name}"})
return {"oneOf": oneOf}
@@ -304,7 +305,7 @@ class AnyInvocationOutput(BaseInvocationOutput):
@classmethod
def __get_pydantic_core_schema__(cls, source_type: Any, handler: GetCoreSchemaHandler):
def validate_invocation_output(v: Any) -> "AnyInvocationOutput":
return BaseInvocationOutput.get_typeadapter().validate_python(v)
return InvocationRegistry.get_output_typeadapter().validate_python(v)
return core_schema.no_info_plain_validator_function(validate_invocation_output)
@@ -316,7 +317,7 @@ class AnyInvocationOutput(BaseInvocationOutput):
# No but really, because the schema is dynamic depending on loaded nodes, we need to generate it manually
oneOf: list[dict[str, str]] = []
names = [i.__name__ for i in BaseInvocationOutput.get_outputs()]
names = [i.__name__ for i in InvocationRegistry.get_output_classes()]
for name in sorted(names):
oneOf.append({"$ref": f"#/components/schemas/{name}"})
return {"oneOf": oneOf}
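
Both schema hooks above build the same `oneOf` structure from a set of class names; only the source of the classes changes with this diff. A self-contained sketch of that construction follows. The function name `build_one_of` and the example class names are illustrative, not part of the codebase.

```python
def build_one_of(class_names: list[str]) -> dict[str, list[dict[str, str]]]:
    """Build an OpenAPI oneOf list of $ref entries, sorted by class name."""
    one_of: list[dict[str, str]] = []
    for name in sorted(class_names):
        one_of.append({"$ref": f"#/components/schemas/{name}"})
    return {"oneOf": one_of}


if __name__ == "__main__":
    print(build_one_of(["StringOutput", "ImageOutput"]))
```

Sorting the names keeps the generated schema stable regardless of the order in which the classes were registered.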

View File

@@ -18,9 +18,10 @@ from invokeai.app.services.invocation_services import InvocationServices
from invokeai.app.services.model_records.model_records_base import UnknownModelException
from invokeai.app.services.session_processor.session_processor_common import ProgressImage
from invokeai.app.services.shared.sqlite.sqlite_common import SQLiteDirection
from invokeai.app.util.step_callback import flux_step_callback, stable_diffusion_step_callback
from invokeai.app.util.step_callback import diffusion_step_callback
from invokeai.backend.model_manager.config import (
AnyModelConfig,
ModelConfigBase,
)
from invokeai.backend.model_manager.load.load_base import LoadedModel, LoadedModelWithoutConfig
from invokeai.backend.model_manager.taxonomy import AnyModel, BaseModelType, ModelFormat, ModelType, SubModelType
@@ -543,6 +544,30 @@ class ModelsInterface(InvocationContextInterface):
self._util.signal_progress(f"Loading model {source}")
return self._services.model_manager.load.load_model_from_path(model_path=model_path, loader=loader)
def get_absolute_path(self, config_or_path: AnyModelConfig | Path | str) -> Path:
"""Gets the absolute path for a given model config or path.
For example, if the model's path is `flux/main/FLUX Dev.safetensors`, and the models path is
`/home/username/InvokeAI/models`, this method will return
`/home/username/InvokeAI/models/flux/main/FLUX Dev.safetensors`.
Args:
config_or_path: The model config or path.
Returns:
The absolute path to the model.
"""
model_path = Path(config_or_path.path) if isinstance(config_or_path, ModelConfigBase) else Path(config_or_path)
if model_path.is_absolute():
return model_path.resolve()
base_models_path = self._services.configuration.models_path
joined_path = base_models_path / model_path
resolved_path = joined_path.resolve()
return resolved_path
class ConfigInterface(InvocationContextInterface):
def get(self) -> InvokeAIAppConfig:
@@ -582,7 +607,7 @@ class UtilInterface(InvocationContextInterface):
base_model: The base model for the current denoising step.
"""
stable_diffusion_step_callback(
diffusion_step_callback(
signal_progress=self.signal_progress,
intermediate_state=intermediate_state,
base_model=base_model,
@@ -600,9 +625,10 @@ class UtilInterface(InvocationContextInterface):
intermediate_state: The intermediate state of the diffusion pipeline.
"""
flux_step_callback(
diffusion_step_callback(
signal_progress=self.signal_progress,
intermediate_state=intermediate_state,
base_model=BaseModelType.Flux,
is_canceled=self.is_canceled,
)
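
The path resolution performed by the new `get_absolute_path` method in this file can be reproduced with `pathlib` alone. The sketch below takes the models root as an explicit argument instead of reading it from the app configuration, and it accepts only paths, not model configs; the function name `absolute_model_path` is hypothetical.

```python
from pathlib import Path
from typing import Union


def absolute_model_path(models_root: Path, model_path: Union[Path, str]) -> Path:
    """Resolve a model path, joining relative paths onto the models root."""
    path = Path(model_path)
    if path.is_absolute():
        # Absolute paths are only normalised.
        return path.resolve()
    # Relative paths are interpreted relative to the models directory.
    return (models_root / path).resolve()


if __name__ == "__main__":
    root = Path("/home/username/InvokeAI/models")
    print(absolute_model_path(root, "flux/main/FLUX Dev.safetensors"))
```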

View File

@@ -21,6 +21,7 @@ from invokeai.app.services.shared.sqlite_migrator.migrations.migration_15 import
from invokeai.app.services.shared.sqlite_migrator.migrations.migration_16 import build_migration_16
from invokeai.app.services.shared.sqlite_migrator.migrations.migration_17 import build_migration_17
from invokeai.app.services.shared.sqlite_migrator.migrations.migration_18 import build_migration_18
from invokeai.app.services.shared.sqlite_migrator.migrations.migration_19 import build_migration_19
from invokeai.app.services.shared.sqlite_migrator.sqlite_migrator_impl import SqliteMigrator
@@ -59,6 +60,7 @@ def init_db(config: InvokeAIAppConfig, logger: Logger, image_files: ImageFileSto
migrator.register_migration(build_migration_16())
migrator.register_migration(build_migration_17())
migrator.register_migration(build_migration_18())
migrator.register_migration(build_migration_19(app_config=config))
migrator.run_migrations()
return db

View File

@@ -0,0 +1,37 @@
import sqlite3
from invokeai.app.services.config import InvokeAIAppConfig
from invokeai.app.services.shared.sqlite_migrator.sqlite_migrator_common import Migration
from invokeai.backend.model_manager.model_on_disk import ModelOnDisk
class Migration19Callback:
def __init__(self, app_config: InvokeAIAppConfig):
self.models_path = app_config.models_path
def __call__(self, cursor: sqlite3.Cursor) -> None:
self._populate_size(cursor)
self._add_size_column(cursor)
def _add_size_column(self, cursor: sqlite3.Cursor) -> None:
cursor.execute(
"ALTER TABLE models ADD COLUMN file_size INTEGER "
"GENERATED ALWAYS as (json_extract(config, '$.file_size')) VIRTUAL NOT NULL"
)
def _populate_size(self, cursor: sqlite3.Cursor) -> None:
all_models = cursor.execute("SELECT id, path FROM models;").fetchall()
for model_id, model_path in all_models:
mod = ModelOnDisk(self.models_path / model_path)
cursor.execute(
"UPDATE models SET config = json_set(config, '$.file_size', ?) WHERE id = ?", (mod.size(), model_id)
)
def build_migration_19(app_config: InvokeAIAppConfig) -> Migration:
return Migration(
from_version=18,
to_version=19,
callback=Migration19Callback(app_config),
)
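
The migration above first writes `file_size` into each row's JSON `config` and only then adds a virtual generated column that extracts it, so the `NOT NULL` constraint already holds for existing rows. The sketch below reproduces that order against an in-memory database. It assumes an SQLite build with JSON functions and generated-column support; the simplified two-column table, the function name `demo_file_size`, and the size value are made up for illustration.

```python
import json
import sqlite3


def demo_file_size(size: int = 1234) -> int:
    """Populate config.file_size, add the generated column, and read it back."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Simplified stand-in for the real models table.
    cur.execute("CREATE TABLE models (id TEXT PRIMARY KEY, config TEXT NOT NULL)")
    cur.execute(
        "INSERT INTO models (id, config) VALUES (?, ?)",
        ("model-a", json.dumps({"path": "flux/main/example.safetensors"})),
    )
    # Step 1: store the size inside the JSON config (as in _populate_size).
    cur.execute(
        "UPDATE models SET config = json_set(config, '$.file_size', ?) WHERE id = ?",
        (size, "model-a"),
    )
    # Step 2: expose it as a virtual generated column (as in _add_size_column).
    cur.execute(
        "ALTER TABLE models ADD COLUMN file_size INTEGER "
        "GENERATED ALWAYS AS (json_extract(config, '$.file_size')) VIRTUAL NOT NULL"
    )
    (value,) = cur.execute("SELECT file_size FROM models WHERE id = ?", ("model-a",)).fetchone()
    conn.close()
    return value


if __name__ == "__main__":
    print(demo_file_size())
```

Because the column is virtual, nothing is stored twice: the value is computed from `config` on read, so later updates to the JSON are reflected automatically.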

View File

@@ -0,0 +1,343 @@
{
"name": "Text to Image - CogView4",
"author": "",
"description": "Generate an image from a prompt with CogView4.",
"version": "",
"contact": "",
"tags": "CogView4, Text to Image",
"notes": "",
"exposedFields": [],
"meta": { "category": "default", "version": "3.0.0" },
"id": "default_0e405a8e-ab5e-4e6c-bd99-b59deabd5591",
"form": {
"elements": {
"container-XSINSu999B": {
"id": "container-XSINSu999B",
"data": {
"layout": "column",
"children": [
"heading-N0TXlsboP5",
"text-PVw8AvXCTz",
"divider-5wmCOm9mqG",
"node-field-gPil4XSw8L",
"node-field-T2oYYNrAzH",
"node-field-SRj6Dn28lm"
]
},
"type": "container"
},
"node-field-gPil4XSw8L": {
"id": "node-field-gPil4XSw8L",
"type": "node-field",
"parentId": "container-XSINSu999B",
"data": {
"fieldIdentifier": {
"nodeId": "a4569d8b-6a43-44b9-8919-4ceec6682904",
"fieldName": "prompt"
},
"settings": {
"type": "string-field-config",
"component": "textarea"
},
"showDescription": false
}
},
"node-field-T2oYYNrAzH": {
"id": "node-field-T2oYYNrAzH",
"type": "node-field",
"parentId": "container-XSINSu999B",
"data": {
"fieldIdentifier": {
"nodeId": "acb26944-1208-4016-9929-ab8dd0860573",
"fieldName": "prompt"
},
"settings": {
"type": "string-field-config",
"component": "textarea"
},
"showDescription": false
}
},
"node-field-SRj6Dn28lm": {
"id": "node-field-SRj6Dn28lm",
"type": "node-field",
"parentId": "container-XSINSu999B",
"data": {
"fieldIdentifier": {
"nodeId": "7890507c-d346-4d13-bcb4-bc6d4850b2e3",
"fieldName": "model"
},
"showDescription": false
}
},
"heading-N0TXlsboP5": {
"id": "heading-N0TXlsboP5",
"parentId": "container-XSINSu999B",
"type": "heading",
"data": { "content": "Text to Image - CogView4" }
},
"text-PVw8AvXCTz": {
"id": "text-PVw8AvXCTz",
"parentId": "container-XSINSu999B",
"type": "text",
"data": { "content": "Generate an image from a prompt with CogView4." }
},
"divider-5wmCOm9mqG": {
"id": "divider-5wmCOm9mqG",
"parentId": "container-XSINSu999B",
"type": "divider"
}
},
"rootElementId": "container-XSINSu999B"
},
"nodes": [
{
"id": "7890507c-d346-4d13-bcb4-bc6d4850b2e3",
"type": "invocation",
"data": {
"id": "7890507c-d346-4d13-bcb4-bc6d4850b2e3",
"version": "1.0.0",
"nodePack": "invokeai",
"label": "",
"notes": "",
"type": "cogview4_model_loader",
"inputs": {
"model": {
"name": "model",
"label": ""
}
},
"isOpen": true,
"isIntermediate": true,
"useCache": true
},
"position": { "x": -52.193850056888095, "y": 282.4721422789611 }
},
{
"id": "a4569d8b-6a43-44b9-8919-4ceec6682904",
"type": "invocation",
"data": {
"id": "a4569d8b-6a43-44b9-8919-4ceec6682904",
"version": "1.0.0",
"nodePack": "invokeai",
"label": "",
"notes": "",
"type": "cogview4_text_encoder",
"inputs": {
"prompt": {
"name": "prompt",
"label": "Positive Prompt",
"description": "",
"value": "A whimsical stuffed gnome sits on a golden sandy beach, its plush fabric slightly textured and well-worn. The gnome has a round, cheerful face with a fluffy white beard, a bulbous nose, and a tall, slightly floppy red hat with a few decorative stitching details. It wears a tiny blue vest over a soft, earthy-toned tunic, and its stubby arms grasp a ripe yellow banana with a few brown speckles. The ocean waves gently roll onto the shore in the background, with turquoise water reflecting the warm glow of the late afternoon sun. A few scattered seashells and driftwood pieces are near the gnome, while a colorful beach umbrella and footprints in the sand hint at a lively beach scene. The sky is a soft pastel blend of pink, orange, and light blue, with wispy clouds stretching across the horizon.\n"
},
"glm_encoder": {
"name": "glm_encoder",
"label": "",
"description": ""
}
},
"isOpen": true,
"isIntermediate": true,
"useCache": true
},
"position": { "x": 328.9380683664592, "y": 305.11768986950995 }
},
{
"id": "acb26944-1208-4016-9929-ab8dd0860573",
"type": "invocation",
"data": {
"id": "acb26944-1208-4016-9929-ab8dd0860573",
"version": "1.0.0",
"nodePack": "invokeai",
"label": "",
"notes": "",
"type": "cogview4_text_encoder",
"inputs": {
"prompt": {
"name": "prompt",
"label": "Negative Prompt",
"description": "",
"value": ""
},
"glm_encoder": {
"name": "glm_encoder",
"label": "",
"description": ""
}
},
"isOpen": true,
"isIntermediate": true,
"useCache": true
},
"position": { "x": 334.6799782744916, "y": 496.5882067536601 }
},
{
"id": "cdd72700-463d-4e10-8d76-3e842e4c0b49",
"type": "invocation",
"data": {
"id": "cdd72700-463d-4e10-8d76-3e842e4c0b49",
"version": "1.0.0",
"nodePack": "invokeai",
"label": "",
"notes": "",
"type": "cogview4_l2i",
"inputs": {
"board": {
"name": "board",
"label": "",
"description": "",
"value": "auto"
},
"metadata": { "name": "metadata", "label": "", "description": "" },
"latents": { "name": "latents", "label": "", "description": "" },
"vae": { "name": "vae", "label": "", "description": "" }
},
"isOpen": true,
"isIntermediate": false,
"useCache": true
},
"position": { "x": 1112.027247217991, "y": 294.1351498145327 }
},
{
"id": "e75e2ced-284e-4135-81dc-cdf06c7a409d",
"type": "invocation",
"data": {
"id": "e75e2ced-284e-4135-81dc-cdf06c7a409d",
"version": "1.0.0",
"nodePack": "invokeai",
"label": "",
"notes": "",
"type": "cogview4_denoise",
"inputs": {
"board": {
"name": "board",
"label": "",
"description": "",
"value": "auto"
},
"metadata": { "name": "metadata", "label": "", "description": "" },
"latents": { "name": "latents", "label": "", "description": "" },
"denoise_mask": {
"name": "denoise_mask",
"label": "",
"description": ""
},
"denoising_start": {
"name": "denoising_start",
"label": "",
"description": "",
"value": 0
},
"denoising_end": {
"name": "denoising_end",
"label": "",
"description": "",
"value": 1
},
"transformer": {
"name": "transformer",
"label": "",
"description": ""
},
"positive_conditioning": {
"name": "positive_conditioning",
"label": "",
"description": ""
},
"negative_conditioning": {
"name": "negative_conditioning",
"label": "",
"description": ""
},
"cfg_scale": {
"name": "cfg_scale",
"label": "",
"description": "",
"value": 3.5
},
"width": {
"name": "width",
"label": "",
"description": "",
"value": 1024
},
"height": {
"name": "height",
"label": "",
"description": "",
"value": 1024
},
"steps": {
"name": "steps",
"label": "",
"description": "",
"value": 30
},
"seed": { "name": "seed", "label": "", "description": "", "value": 0 }
},
"isOpen": true,
"isIntermediate": true,
"useCache": false
},
"position": { "x": 720.8830004638692, "y": 332.66609681908415 }
}
],
"edges": [
{
"id": "reactflow__edge-7890507c-d346-4d13-bcb4-bc6d4850b2e3vae-cdd72700-463d-4e10-8d76-3e842e4c0b49vae",
"type": "default",
"source": "7890507c-d346-4d13-bcb4-bc6d4850b2e3",
"target": "cdd72700-463d-4e10-8d76-3e842e4c0b49",
"sourceHandle": "vae",
"targetHandle": "vae"
},
{
"id": "reactflow__edge-7890507c-d346-4d13-bcb4-bc6d4850b2e3glm_encoder-a4569d8b-6a43-44b9-8919-4ceec6682904glm_encoder",
"type": "default",
"source": "7890507c-d346-4d13-bcb4-bc6d4850b2e3",
"target": "a4569d8b-6a43-44b9-8919-4ceec6682904",
"sourceHandle": "glm_encoder",
"targetHandle": "glm_encoder"
},
{
"id": "reactflow__edge-7890507c-d346-4d13-bcb4-bc6d4850b2e3glm_encoder-acb26944-1208-4016-9929-ab8dd0860573glm_encoder",
"type": "default",
"source": "7890507c-d346-4d13-bcb4-bc6d4850b2e3",
"target": "acb26944-1208-4016-9929-ab8dd0860573",
"sourceHandle": "glm_encoder",
"targetHandle": "glm_encoder"
},
{
"id": "reactflow__edge-a4569d8b-6a43-44b9-8919-4ceec6682904conditioning-e75e2ced-284e-4135-81dc-cdf06c7a409dpositive_conditioning",
"type": "default",
"source": "a4569d8b-6a43-44b9-8919-4ceec6682904",
"target": "e75e2ced-284e-4135-81dc-cdf06c7a409d",
"sourceHandle": "conditioning",
"targetHandle": "positive_conditioning"
},
{
"id": "reactflow__edge-acb26944-1208-4016-9929-ab8dd0860573conditioning-e75e2ced-284e-4135-81dc-cdf06c7a409dnegative_conditioning",
"type": "default",
"source": "acb26944-1208-4016-9929-ab8dd0860573",
"target": "e75e2ced-284e-4135-81dc-cdf06c7a409d",
"sourceHandle": "conditioning",
"targetHandle": "negative_conditioning"
},
{
"id": "reactflow__edge-e75e2ced-284e-4135-81dc-cdf06c7a409dlatents-cdd72700-463d-4e10-8d76-3e842e4c0b49latents",
"type": "default",
"source": "e75e2ced-284e-4135-81dc-cdf06c7a409d",
"target": "cdd72700-463d-4e10-8d76-3e842e4c0b49",
"sourceHandle": "latents",
"targetHandle": "latents"
},
{
"id": "reactflow__edge-7890507c-d346-4d13-bcb4-bc6d4850b2e3transformer-e75e2ced-284e-4135-81dc-cdf06c7a409dtransformer",
"type": "default",
"source": "7890507c-d346-4d13-bcb4-bc6d4850b2e3",
"target": "e75e2ced-284e-4135-81dc-cdf06c7a409d",
"sourceHandle": "transformer",
"targetHandle": "transformer"
}
]
}
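
The edge `id` values in the workflow above appear to follow a fixed pattern: the `reactflow__edge-` prefix, then the source node ID and source handle, a hyphen, then the target node ID and target handle. The following is a minimal sketch of that convention, inferred only from the IDs in this file. The helper names `make_edge_id` and `make_edge` are hypothetical and are not part of Invoke or React Flow.

```python
def make_edge_id(source: str, source_handle: str, target: str, target_handle: str) -> str:
    """Build an edge ID in the pattern seen in the workflow JSON above.

    Assumption: the pattern is inferred from the sample edges only.
    """
    return f"reactflow__edge-{source}{source_handle}-{target}{target_handle}"


def make_edge(source: str, source_handle: str, target: str, target_handle: str) -> dict:
    """Build a complete edge record with the same keys as the sample edges."""
    return {
        "id": make_edge_id(source, source_handle, target, target_handle),
        "type": "default",
        "source": source,
        "target": target,
        "sourceHandle": source_handle,
        "targetHandle": target_handle,
    }


# Rebuild the first edge of the workflow: model loader "vae" -> l2i node "vae".
edge = make_edge(
    "7890507c-d346-4d13-bcb4-bc6d4850b2e3",
    "vae",
    "cdd72700-463d-4e10-8d76-3e842e4c0b49",
    "vae",
)
```

Because the ID is derived entirely from the four connection fields, a tool that rewrites node IDs would probably need to regenerate edge IDs as well.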

@@ -47,6 +47,7 @@ class WorkflowRecordsStorageBase(ABC):
query: Optional[str],
tags: Optional[list[str]],
has_been_opened: Optional[bool],
is_published: Optional[bool],
) -> PaginatedResults[WorkflowRecordListItemDTO]:
"""Gets many workflows."""
pass
@@ -56,6 +57,7 @@ class WorkflowRecordsStorageBase(ABC):
self,
categories: list[WorkflowCategory],
has_been_opened: Optional[bool] = None,
is_published: Optional[bool] = None,
) -> dict[str, int]:
"""Gets a dictionary of counts for each of the provided categories."""
pass
@@ -66,6 +68,7 @@ class WorkflowRecordsStorageBase(ABC):
tags: list[str],
categories: Optional[list[WorkflowCategory]] = None,
has_been_opened: Optional[bool] = None,
is_published: Optional[bool] = None,
) -> dict[str, int]:
"""Gets a dictionary of counts for each of the provided tags."""
pass

@@ -67,6 +67,7 @@ class WorkflowWithoutID(BaseModel):
# This is typed as optional to prevent errors when pulling workflows from the DB. The frontend adds a default form if
# it is None.
form: dict[str, JsonValue] | None = Field(default=None, description="The form of the workflow.")
is_published: bool | None = Field(default=None, description="Whether the workflow is published or not.")
model_config = ConfigDict(extra="ignore")
@@ -101,6 +102,7 @@ class WorkflowRecordDTOBase(BaseModel):
opened_at: Optional[Union[datetime.datetime, str]] = Field(
default=None, description="The opened timestamp of the workflow."
)
is_published: bool | None = Field(default=None, description="Whether the workflow is published or not.")
class WorkflowRecordDTO(WorkflowRecordDTOBase):

@@ -119,6 +119,7 @@ class SqliteWorkflowRecordsStorage(WorkflowRecordsStorageBase):
query: Optional[str] = None,
tags: Optional[list[str]] = None,
has_been_opened: Optional[bool] = None,
is_published: Optional[bool] = None,
) -> PaginatedResults[WorkflowRecordListItemDTO]:
# sanitize!
assert order_by in WorkflowRecordOrderBy
@@ -241,6 +242,7 @@ class SqliteWorkflowRecordsStorage(WorkflowRecordsStorageBase):
tags: list[str],
categories: Optional[list[WorkflowCategory]] = None,
has_been_opened: Optional[bool] = None,
is_published: Optional[bool] = None,
) -> dict[str, int]:
if not tags:
return {}
@@ -292,6 +294,7 @@ class SqliteWorkflowRecordsStorage(WorkflowRecordsStorageBase):
self,
categories: list[WorkflowCategory],
has_been_opened: Optional[bool] = None,
is_published: Optional[bool] = None,
) -> dict[str, int]:
cursor = self._conn.cursor()
result: dict[str, int] = {}
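
The hunks above only add the `is_published: Optional[bool]` parameter; the query changes are not shown in this diff. As a rough sketch of how such a tri-state filter is commonly applied, the example below treats `None` as "no filter" and `True`/`False` as an equality condition. The table name, column names and helper functions are assumptions made for illustration and may not match the real schema.

```python
import sqlite3
from typing import Optional


def build_published_filter(is_published: Optional[bool]) -> tuple[str, list[int]]:
    """Return a SQL condition and its parameters for a tri-state filter.

    None means "do not filter"; True/False filter on the flag.
    """
    if is_published is None:
        return "", []
    return "is_published = ?", [int(is_published)]


def count_workflows(conn: sqlite3.Connection, is_published: Optional[bool] = None) -> int:
    """Count rows in a hypothetical `workflows` table, optionally filtered."""
    condition, params = build_published_filter(is_published)
    sql = "SELECT COUNT(*) FROM workflows"  # hypothetical table name
    if condition:
        sql += f" WHERE {condition}"
    return conn.execute(sql, params).fetchone()[0]


# In-memory database with a made-up schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workflows (id TEXT, is_published INTEGER)")
conn.executemany("INSERT INTO workflows VALUES (?, ?)", [("a", 1), ("b", 0), ("c", 0)])
```

Passing the value as a bound parameter rather than formatting it into the SQL string keeps the query safe from injection.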

@@ -4,7 +4,10 @@ from fastapi import FastAPI
from fastapi.openapi.utils import get_openapi
from pydantic.json_schema import models_json_schema
from invokeai.app.invocations.baseinvocation import BaseInvocation, BaseInvocationOutput, UIConfigBase
from invokeai.app.invocations.baseinvocation import (
InvocationRegistry,
UIConfigBase,
)
from invokeai.app.invocations.fields import InputFieldJSONSchemaExtra, OutputFieldJSONSchemaExtra
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.services.events.events_common import EventBase
@@ -56,14 +59,18 @@ def get_openapi_func(
invocation_output_map_required: list[str] = []
# We need to manually add all outputs to the schema - pydantic doesn't add them because they aren't used directly.
for output in BaseInvocationOutput.get_outputs():
for output in InvocationRegistry.get_output_classes():
json_schema = output.model_json_schema(mode="serialization", ref_template="#/components/schemas/{model}")
# Remove output_metadata that is only used on back-end from the schema
if "output_meta" in json_schema["properties"]:
json_schema["properties"].pop("output_meta")
move_defs_to_top_level(openapi_schema, json_schema)
openapi_schema["components"]["schemas"][output.__name__] = json_schema
# Technically, invocations are added to the schema by pydantic, but we still need to manually set their output
# property, so we'll just do it all manually.
for invocation in BaseInvocation.get_invocations():
for invocation in InvocationRegistry.get_invocation_classes():
json_schema = invocation.model_json_schema(
mode="serialization", ref_template="#/components/schemas/{model}"
)

@@ -10,7 +10,7 @@ def get_timestamp() -> int:
def get_iso_timestamp() -> str:
return datetime.datetime.utcnow().isoformat()
return datetime.datetime.now(datetime.timezone.utc).isoformat()
def get_datetime_from_iso_timestamp(iso_timestamp: str) -> datetime.datetime:
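
The change above swaps `datetime.datetime.utcnow()`, which returns a naive datetime and is deprecated as of Python 3.12, for a timezone-aware call. A small sketch of the practical difference: the aware version carries an explicit UTC offset in its ISO string, so it parses back to an aware datetime.

```python
import datetime


def get_iso_timestamp() -> str:
    """Return the current UTC time as a timezone-aware ISO 8601 string."""
    return datetime.datetime.now(datetime.timezone.utc).isoformat()


stamp = get_iso_timestamp()
# The aware timestamp ends with an explicit "+00:00" offset.
parsed = datetime.datetime.fromisoformat(stamp)
```

Note that consumers comparing these strings with older, naive timestamps may need to handle both forms.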

@@ -65,9 +65,6 @@ def apply_monkeypatches() -> None:
import invokeai.backend.util.hotfixes # noqa: F401 (monkeypatching on import)
if torch.backends.mps.is_available():
import invokeai.backend.util.mps_fixes # noqa: F401 (monkeypatching on import)
def register_mime_types() -> None:
"""Register additional mime types for windows."""

@@ -8,6 +8,8 @@ from invokeai.app.services.session_processor.session_processor_common import Can
from invokeai.backend.model_manager.taxonomy import BaseModelType
from invokeai.backend.stable_diffusion.diffusers_pipeline import PipelineIntermediateState
# See scripts/generate_vae_linear_approximation.py for generating these factors.
# fast latents preview matrix for sdxl
# generated by @StAlKeR7779
SDXL_LATENT_RGB_FACTORS = [
@@ -72,11 +74,32 @@ FLUX_LATENT_RGB_FACTORS = [
[-0.1146, -0.0827, -0.0598],
]
COGVIEW4_LATENT_RGB_FACTORS = [
[0.00408832, -0.00082485, -0.00214816],
[0.00084172, 0.00132241, 0.00842067],
[-0.00466737, -0.00983181, -0.00699561],
[0.03698397, -0.04797235, 0.03585809],
[0.00234701, -0.00124326, 0.00080869],
[-0.00723903, -0.00388422, -0.00656606],
[-0.00970917, -0.00467356, -0.00971113],
[0.17292486, -0.03452463, -0.1457515],
[0.02330308, 0.02942557, 0.02704329],
[-0.00903131, -0.01499841, -0.01432564],
[0.01250298, 0.0019407, -0.02168986],
[0.01371188, 0.00498283, -0.01302135],
[0.42396525, 0.4280575, 0.42148206],
[0.00983825, 0.00613302, 0.00610316],
[0.00473307, -0.00889551, -0.00915924],
[-0.00955853, -0.00980067, -0.00977842],
]
def sample_to_lowres_estimated_image(
samples: torch.Tensor, latent_rgb_factors: torch.Tensor, smooth_matrix: Optional[torch.Tensor] = None
):
latent_image = samples[0].permute(1, 2, 0) @ latent_rgb_factors
if samples.dim() == 4:
samples = samples[0]
latent_image = samples.permute(1, 2, 0) @ latent_rgb_factors
if smooth_matrix is not None:
latent_image = latent_image.unsqueeze(0).permute(3, 0, 1, 2)
@@ -108,7 +131,7 @@ def calc_percentage(intermediate_state: PipelineIntermediateState) -> float:
SignalProgressFunc: TypeAlias = Callable[[str, float | None, Image.Image | None, tuple[int, int] | None], None]
def stable_diffusion_step_callback(
def diffusion_step_callback(
signal_progress: SignalProgressFunc,
intermediate_state: PipelineIntermediateState,
base_model: BaseModelType,
@@ -125,39 +148,28 @@ def stable_diffusion_step_callback(
else:
sample = intermediate_state.latents
if base_model in [BaseModelType.StableDiffusionXL, BaseModelType.StableDiffusionXLRefiner]:
sdxl_latent_rgb_factors = torch.tensor(SDXL_LATENT_RGB_FACTORS, dtype=sample.dtype, device=sample.device)
sdxl_smooth_matrix = torch.tensor(SDXL_SMOOTH_MATRIX, dtype=sample.dtype, device=sample.device)
image = sample_to_lowres_estimated_image(sample, sdxl_latent_rgb_factors, sdxl_smooth_matrix)
smooth_matrix: list[list[float]] | None = None
if base_model in [BaseModelType.StableDiffusion1, BaseModelType.StableDiffusion2]:
latent_rgb_factors = SD1_5_LATENT_RGB_FACTORS
elif base_model in [BaseModelType.StableDiffusionXL, BaseModelType.StableDiffusionXLRefiner]:
latent_rgb_factors = SDXL_LATENT_RGB_FACTORS
smooth_matrix = SDXL_SMOOTH_MATRIX
elif base_model == BaseModelType.StableDiffusion3:
sd3_latent_rgb_factors = torch.tensor(SD3_5_LATENT_RGB_FACTORS, dtype=sample.dtype, device=sample.device)
image = sample_to_lowres_estimated_image(sample, sd3_latent_rgb_factors)
latent_rgb_factors = SD3_5_LATENT_RGB_FACTORS
elif base_model == BaseModelType.CogView4:
latent_rgb_factors = COGVIEW4_LATENT_RGB_FACTORS
elif base_model == BaseModelType.Flux:
latent_rgb_factors = FLUX_LATENT_RGB_FACTORS
else:
v1_5_latent_rgb_factors = torch.tensor(SD1_5_LATENT_RGB_FACTORS, dtype=sample.dtype, device=sample.device)
image = sample_to_lowres_estimated_image(sample, v1_5_latent_rgb_factors)
width = image.width * 8
height = image.height * 8
percentage = calc_percentage(intermediate_state)
signal_progress("Denoising", percentage, image, (width, height))
def flux_step_callback(
signal_progress: SignalProgressFunc,
intermediate_state: PipelineIntermediateState,
is_canceled: Callable[[], bool],
) -> None:
if is_canceled():
raise CanceledException
sample = intermediate_state.latents
latent_rgb_factors = torch.tensor(FLUX_LATENT_RGB_FACTORS, dtype=sample.dtype, device=sample.device)
latent_image_perm = sample.permute(1, 2, 0).to(dtype=sample.dtype, device=sample.device)
latent_image = latent_image_perm @ latent_rgb_factors
latents_ubyte = (
((latent_image + 1) / 2).clamp(0, 1).mul(0xFF) # change scale from -1..1 to 0..1 # to 0..255
).to(device="cpu", dtype=torch.uint8)
image = Image.fromarray(latents_ubyte.cpu().numpy())
raise ValueError(f"Unsupported base model: {base_model}")
latent_rgb_factors_torch = torch.tensor(latent_rgb_factors, dtype=sample.dtype, device=sample.device)
smooth_matrix_torch = (
torch.tensor(smooth_matrix, dtype=sample.dtype, device=sample.device) if smooth_matrix else None
)
image = sample_to_lowres_estimated_image(
samples=sample, latent_rgb_factors=latent_rgb_factors_torch, smooth_matrix=smooth_matrix_torch
)
width = image.width * 8
height = image.height * 8
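
The preview path above multiplies each latent pixel by a per-model `*_LATENT_RGB_FACTORS` matrix to get an approximate RGB image. The sketch below shows the same idea in plain Python for readability; the real code operates on `torch` tensors. The function name and the nested-list layout are assumptions, and the scaling from -1..1 to 0..255 follows the removed `flux_step_callback` lines.

```python
def latents_to_rgb_preview(latents, factors):
    """Project latents to RGB pixels.

    latents: nested lists shaped [channels][height][width].
    factors: nested lists shaped [channels][3], one RGB row per latent channel.
    Returns nested lists shaped [height][width] of (r, g, b) integer tuples.
    """
    channels = len(latents)
    height = len(latents[0])
    width = len(latents[0][0])
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            pixel = []
            for k in range(3):
                # Dot product of this pixel's channel vector with column k.
                value = sum(latents[c][y][x] * factors[c][k] for c in range(channels))
                # Change scale from -1..1 to 0..1, clamp, then to 0..255.
                scaled = min(max((value + 1) / 2, 0.0), 1.0) * 255
                pixel.append(int(scaled))
            row.append(tuple(pixel))
        image.append(row)
    return image


# One latent channel, a 1x2 "image", and a toy factor row (not a real model's factors).
preview = latents_to_rgb_preview([[[0.0, 1.0]]], [[1.0, 0.0, -1.0]])
```

With a shared projection routine like this, supporting a new model type mostly comes down to supplying its factor matrix, which is what the consolidated `diffusion_step_callback` appears to do.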

@@ -5,12 +5,12 @@ import torch
from tqdm import tqdm
from invokeai.backend.flux.controlnet.controlnet_flux_output import ControlNetFluxOutput, sum_controlnet_flux_outputs
from invokeai.backend.flux.extensions.inpaint_extension import InpaintExtension
from invokeai.backend.flux.extensions.instantx_controlnet_extension import InstantXControlNetExtension
from invokeai.backend.flux.extensions.regional_prompting_extension import RegionalPromptingExtension
from invokeai.backend.flux.extensions.xlabs_controlnet_extension import XLabsControlNetExtension
from invokeai.backend.flux.extensions.xlabs_ip_adapter_extension import XLabsIPAdapterExtension
from invokeai.backend.flux.model import Flux
from invokeai.backend.rectified_flow.rectified_flow_inpaint_extension import RectifiedFlowInpaintExtension
from invokeai.backend.stable_diffusion.diffusers_pipeline import PipelineIntermediateState
@@ -26,7 +26,7 @@ def denoise(
step_callback: Callable[[PipelineIntermediateState], None],
guidance: float,
cfg_scale: list[float],
inpaint_extension: InpaintExtension | None,
inpaint_extension: RectifiedFlowInpaintExtension | None,
controlnet_extensions: list[XLabsControlNetExtension | InstantXControlNetExtension],
pos_ip_adapter_extensions: list[XLabsIPAdapterExtension],
neg_ip_adapter_extensions: list[XLabsIPAdapterExtension],

@@ -5,62 +5,14 @@ import huggingface_hub
import numpy as np
import onnxruntime as ort
import torch
from controlnet_aux.util import resize_image
from PIL import Image
from invokeai.backend.image_util.dw_openpose.onnxdet import inference_detector
from invokeai.backend.image_util.dw_openpose.onnxpose import inference_pose
from invokeai.backend.image_util.dw_openpose.utils import NDArrayInt, draw_bodypose, draw_facepose, draw_handpose
from invokeai.backend.image_util.dw_openpose.wholebody import Wholebody
from invokeai.backend.image_util.util import np_to_pil
from invokeai.backend.util.devices import TorchDevice
DWPOSE_MODELS = {
"yolox_l.onnx": "https://huggingface.co/yzd-v/DWPose/resolve/main/yolox_l.onnx?download=true",
"dw-ll_ucoco_384.onnx": "https://huggingface.co/yzd-v/DWPose/resolve/main/dw-ll_ucoco_384.onnx?download=true",
}
def draw_pose(
pose: Dict[str, NDArrayInt | Dict[str, NDArrayInt]],
H: int,
W: int,
draw_face: bool = True,
draw_body: bool = True,
draw_hands: bool = True,
resolution: int = 512,
) -> Image.Image:
bodies = pose["bodies"]
faces = pose["faces"]
hands = pose["hands"]
assert isinstance(bodies, dict)
candidate = bodies["candidate"]
assert isinstance(bodies, dict)
subset = bodies["subset"]
canvas = np.zeros(shape=(H, W, 3), dtype=np.uint8)
if draw_body:
canvas = draw_bodypose(canvas, candidate, subset)
if draw_hands:
assert isinstance(hands, np.ndarray)
canvas = draw_handpose(canvas, hands)
if draw_face:
assert isinstance(hands, np.ndarray)
canvas = draw_facepose(canvas, faces) # type: ignore
dwpose_image: Image.Image = resize_image(
canvas,
resolution,
)
dwpose_image = Image.fromarray(dwpose_image)
return dwpose_image
class DWOpenposeDetector:
"""
@@ -68,62 +20,6 @@ class DWOpenposeDetector:
Credits: https://github.com/IDEA-Research/DWPose
"""
def __init__(self, onnx_det: Path, onnx_pose: Path) -> None:
self.pose_estimation = Wholebody(onnx_det=onnx_det, onnx_pose=onnx_pose)
def __call__(
self,
image: Image.Image,
draw_face: bool = False,
draw_body: bool = True,
draw_hands: bool = False,
resolution: int = 512,
) -> Image.Image:
np_image = np.array(image)
H, W, C = np_image.shape
with torch.no_grad():
candidate, subset = self.pose_estimation(np_image)
nums, keys, locs = candidate.shape
candidate[..., 0] /= float(W)
candidate[..., 1] /= float(H)
body = candidate[:, :18].copy()
body = body.reshape(nums * 18, locs)
score = subset[:, :18]
for i in range(len(score)):
for j in range(len(score[i])):
if score[i][j] > 0.3:
score[i][j] = int(18 * i + j)
else:
score[i][j] = -1
un_visible = subset < 0.3
candidate[un_visible] = -1
# foot = candidate[:, 18:24]
faces = candidate[:, 24:92]
hands = candidate[:, 92:113]
hands = np.vstack([hands, candidate[:, 113:]])
bodies = {"candidate": body, "subset": score}
pose = {"bodies": bodies, "hands": hands, "faces": faces}
return draw_pose(
pose, H, W, draw_face=draw_face, draw_hands=draw_hands, draw_body=draw_body, resolution=resolution
)
class DWOpenposeDetector2:
"""
Code from the original implementation of the DW Openpose Detector.
Credits: https://github.com/IDEA-Research/DWPose
This implementation is similar to DWOpenposeDetector, with some alterations to allow the onnx models to be loaded
and managed by the model manager.
"""
hf_repo_id = "yzd-v/DWPose"
hf_filename_onnx_det = "yolox_l.onnx"
hf_filename_onnx_pose = "dw-ll_ucoco_384.onnx"
@@ -213,7 +109,7 @@ class DWOpenposeDetector2:
bodies = {"candidate": body, "subset": score}
pose = {"bodies": bodies, "hands": hands, "faces": faces}
return DWOpenposeDetector2.draw_pose(
return DWOpenposeDetector.draw_pose(
pose, H, W, draw_face=draw_face, draw_hands=draw_hands, draw_body=draw_body
)

@@ -3,7 +3,6 @@
import math
import cv2
import matplotlib
import numpy as np
import numpy.typing as npt
@@ -127,11 +126,13 @@ def draw_handpose(canvas: NDArrayInt, all_hand_peaks: NDArrayInt) -> NDArrayInt:
x2 = int(x2 * W)
y2 = int(y2 * H)
if x1 > eps and y1 > eps and x2 > eps and y2 > eps:
hsv_color = np.array([[[ie / float(len(edges)) * 180, 255, 255]]], dtype=np.uint8)
rgb_color = cv2.cvtColor(hsv_color, cv2.COLOR_HSV2RGB)[0, 0]
cv2.line(
canvas,
(x1, y1),
(x2, y2),
matplotlib.colors.hsv_to_rgb([ie / float(len(edges)), 1.0, 1.0]) * 255,
rgb_color.tolist(),
thickness=2,
)
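
The hunk above replaces `matplotlib.colors.hsv_to_rgb` with an OpenCV conversion so that each hand edge still gets a distinct hue. The multiplication by 180 reflects OpenCV's 8-bit HSV convention, where hue is stored in the range 0..179. As a dependency-free illustration of the same hue-per-edge mapping, here is a sketch using the standard library's `colorsys`; the helper name `edge_color` is hypothetical, and the exact values may differ slightly from OpenCV's output because of 8-bit quantization.

```python
import colorsys


def edge_color(edge_index: int, edge_count: int) -> tuple[int, int, int]:
    """Return a fully saturated RGB color whose hue depends on the edge index."""
    hue = edge_index / float(edge_count)  # 0.0 <= hue < 1.0
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


first = edge_color(0, 20)     # hue 0.0 is pure red
halfway = edge_color(10, 20)  # hue 0.5 is cyan
```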

@@ -1,44 +0,0 @@
# Code from the original DWPose Implementation: https://github.com/IDEA-Research/DWPose
# Modified pathing to suit Invoke
from pathlib import Path
import numpy as np
import onnxruntime as ort
from invokeai.app.services.config.config_default import get_config
from invokeai.backend.image_util.dw_openpose.onnxdet import inference_detector
from invokeai.backend.image_util.dw_openpose.onnxpose import inference_pose
from invokeai.backend.util.devices import TorchDevice
config = get_config()
class Wholebody:
def __init__(self, onnx_det: Path, onnx_pose: Path):
device = TorchDevice.choose_torch_device()
providers = ["CUDAExecutionProvider"] if device.type == "cuda" else ["CPUExecutionProvider"]
self.session_det = ort.InferenceSession(path_or_bytes=onnx_det, providers=providers)
self.session_pose = ort.InferenceSession(path_or_bytes=onnx_pose, providers=providers)
def __call__(self, oriImg):
det_result = inference_detector(self.session_det, oriImg)
keypoints, scores = inference_pose(self.session_pose, det_result, oriImg)
keypoints_info = np.concatenate((keypoints, scores[..., None]), axis=-1)
# compute neck joint
neck = np.mean(keypoints_info[:, [5, 6]], axis=1)
# neck score when visualizing pred
neck[:, 2:4] = np.logical_and(keypoints_info[:, 5, 2:4] > 0.3, keypoints_info[:, 6, 2:4] > 0.3).astype(int)
new_keypoints_info = np.insert(keypoints_info, 17, neck, axis=1)
mmpose_idx = [17, 6, 8, 10, 7, 9, 12, 14, 16, 13, 15, 2, 1, 4, 3]
openpose_idx = [1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17]
new_keypoints_info[:, openpose_idx] = new_keypoints_info[:, mmpose_idx]
keypoints_info = new_keypoints_info
keypoints, scores = keypoints_info[..., :2], keypoints_info[..., 2]
return keypoints, scores

@@ -1,26 +1,15 @@
from pathlib import Path
from typing import Optional
import torch
from PIL.Image import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration, LlavaOnevisionProcessor
from invokeai.backend.raw_model import RawModel
from transformers import LlavaOnevisionForConditionalGeneration, LlavaOnevisionProcessor
class LlavaOnevisionModel(RawModel):
class LlavaOnevisionPipeline:
"""A wrapper for a LLaVA Onevision model + processor."""
def __init__(self, vllm_model: LlavaOnevisionForConditionalGeneration, processor: LlavaOnevisionProcessor):
self._vllm_model = vllm_model
self._processor = processor
@classmethod
def load_from_path(cls, path: str | Path):
vllm_model = LlavaOnevisionForConditionalGeneration.from_pretrained(path, local_files_only=True)
assert isinstance(vllm_model, LlavaOnevisionForConditionalGeneration)
processor = AutoProcessor.from_pretrained(path, local_files_only=True)
assert isinstance(processor, LlavaOnevisionProcessor)
return cls(vllm_model, processor)
def run(self, prompt: str, images: list[Image], device: torch.device, dtype: torch.dtype) -> str:
# TODO(ryand): Tune the max number of images that are useful for the model.
if len(images) > 3:
@@ -44,13 +33,3 @@ class LlavaOnevisionModel(RawModel):
# The output_str will include the prompt, so we extract the response.
response = output_str.split("assistant\n", 1)[1].strip()
return response
def to(self, device: Optional[torch.device] = None, dtype: Optional[torch.dtype] = None) -> None:
self._vllm_model.to(device=device, dtype=dtype)
def calc_size(self) -> int:
"""Get size of the model in memory in bytes."""
# HACK(ryand): Fix this issue with circular imports.
from invokeai.backend.model_manager.load.model_util import calc_module_size
return calc_module_size(self._vllm_model)

@@ -30,19 +30,18 @@ from inspect import isabstract
from pathlib import Path
from typing import ClassVar, Literal, Optional, TypeAlias, Union
import safetensors.torch
import torch
from picklescan.scanner import scan_file_path
from pydantic import BaseModel, ConfigDict, Discriminator, Field, Tag, TypeAdapter
from typing_extensions import Annotated, Any, Dict
from invokeai.app.util.misc import uuid_string
from invokeai.backend.model_hash.hash_validator import validate_hash
from invokeai.backend.model_hash.model_hash import HASHING_ALGORITHMS, ModelHash
from invokeai.backend.model_hash.model_hash import HASHING_ALGORITHMS
from invokeai.backend.model_manager.model_on_disk import ModelOnDisk
from invokeai.backend.model_manager.taxonomy import (
AnyVariant,
BaseModelType,
ClipVariantType,
FluxLoRAFormat,
ModelFormat,
ModelRepoVariant,
ModelSourceType,
@@ -51,9 +50,8 @@ from invokeai.backend.model_manager.taxonomy import (
SchedulerPredictionType,
SubModelType,
)
from invokeai.backend.quantization.gguf.loaders import gguf_sd_loader
from invokeai.backend.model_manager.util.model_util import lora_token_vector_length
from invokeai.backend.stable_diffusion.schedulers.schedulers import SCHEDULER_NAME_VALUES
from invokeai.backend.util.silence_warnings import SilenceWarnings
logger = logging.getLogger(__name__)
@@ -67,11 +65,6 @@ class InvalidModelConfigException(Exception):
DEFAULTS_PRECISION = Literal["fp16", "fp32"]
class FSLayout(Enum):
FILE = "file"
DIRECTORY = "directory"
class SubmodelDefinition(BaseModel):
path_or_prefix: str
model_type: ModelType
@@ -102,87 +95,6 @@ class ControlAdapterDefaultSettings(BaseModel):
model_config = ConfigDict(extra="forbid")
class ModelOnDisk:
"""A utility class representing a model stored on disk."""
def __init__(self, path: Path, hash_algo: HASHING_ALGORITHMS = "blake3_single"):
self.path = path
# TODO: Revisit checkpoint vs diffusers terminology
self.layout = FSLayout.DIRECTORY if path.is_dir() else FSLayout.FILE
if self.path.suffix in {".safetensors", ".bin", ".pt", ".ckpt"}:
self.name = path.stem
else:
self.name = path.name
self.hash_algo = hash_algo
self._state_dict_cache = {}
def hash(self) -> str:
return ModelHash(algorithm=self.hash_algo).hash(self.path)
def size(self) -> int:
if self.layout == FSLayout.FILE:
return self.path.stat().st_size
return sum(file.stat().st_size for file in self.path.rglob("*"))
def component_paths(self) -> set[Path]:
if self.layout == FSLayout.FILE:
return {self.path}
extensions = {".safetensors", ".pt", ".pth", ".ckpt", ".bin", ".gguf"}
return {f for f in self.path.rglob("*") if f.suffix in extensions}
def repo_variant(self) -> Optional[ModelRepoVariant]:
if self.layout == FSLayout.FILE:
return None
weight_files = list(self.path.glob("**/*.safetensors"))
weight_files.extend(list(self.path.glob("**/*.bin")))
for x in weight_files:
if ".fp16" in x.suffixes:
return ModelRepoVariant.FP16
if "openvino_model" in x.name:
return ModelRepoVariant.OpenVINO
if "flax_model" in x.name:
return ModelRepoVariant.Flax
if x.suffix == ".onnx":
return ModelRepoVariant.ONNX
return ModelRepoVariant.Default
def load_state_dict(self, path: Optional[Path] = None) -> Dict[str | int, Any]:
if path in self._state_dict_cache:
return self._state_dict_cache[path]
if not path:
components = list(self.component_paths())
match components:
case []:
raise ValueError("No weight files found for this model")
case [p]:
path = p
case ps if len(ps) >= 2:
raise ValueError(
f"Multiple weight files found for this model: {ps}. "
f"Please specify the intended file using the 'path' argument"
)
with SilenceWarnings():
if path.suffix.endswith((".ckpt", ".pt", ".pth", ".bin")):
scan_result = scan_file_path(path)
if scan_result.infected_files != 0 or scan_result.scan_err:
raise RuntimeError(f"The model {path.stem} is potentially infected by malware. Aborting import.")
checkpoint = torch.load(path, map_location="cpu")
assert isinstance(checkpoint, dict)
elif path.suffix.endswith(".gguf"):
checkpoint = gguf_sd_loader(path, compute_dtype=torch.float32)
elif path.suffix.endswith(".safetensors"):
checkpoint = safetensors.torch.load_file(path)
else:
raise ValueError(f"Unrecognized model extension: {path.suffix}")
state_dict = checkpoint.get("state_dict", checkpoint)
self._state_dict_cache[path] = state_dict
return state_dict
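
The `size()` method of the relocated `ModelOnDisk` class feeds the new `file_size` field: a single file reports its own size, and a directory reports the sum over its contents. The sketch below is a standalone variant of that logic, not the project's implementation. Unlike the method shown above, it counts regular files only, on the assumption that directory entries should not contribute to the total. The function name `size_on_disk` is hypothetical.

```python
import tempfile
from pathlib import Path


def size_on_disk(path: Path) -> int:
    """Return the size in bytes of a file, or of all files under a directory."""
    if path.is_file():
        return path.stat().st_size
    # Count regular files only; a directory's own st_size is platform dependent.
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


# Build a tiny fake "diffusers-style" folder to exercise both branches.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "unet").mkdir()
    (root / "unet" / "weights.safetensors").write_bytes(b"\0" * 5)
    (root / "config.json").write_bytes(b"{}")
    total = size_on_disk(root)
    single = size_on_disk(root / "config.json")
```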
class MatchSpeed(int, Enum):
"""Represents the estimated runtime speed of a config's 'matches' method."""
@@ -216,6 +128,7 @@ class ModelConfigBase(ABC, BaseModel):
path: str = Field(
description="Path to the model on the filesystem. Relative paths are relative to the Invoke root directory."
)
file_size: int = Field(description="The size of the model in bytes.")
name: str = Field(description="Name of the model.")
type: ModelType = Field(description="Model type")
format: ModelFormat = Field(description="Model format")
@@ -231,6 +144,7 @@ class ModelConfigBase(ABC, BaseModel):
submodels: Optional[Dict[SubModelType, SubmodelDefinition]] = Field(
description="Loadable submodels in this model", default=None
)
usage_info: Optional[str] = Field(default=None, description="Usage information for this model")
_USING_LEGACY_PROBE: ClassVar[set] = set()
_USING_CLASSIFY_API: ClassVar[set] = set()
@@ -257,7 +171,7 @@ class ModelConfigBase(ABC, BaseModel):
Created to deprecate ModelProbe.probe
"""
candidates = ModelConfigBase._USING_CLASSIFY_API
sorted_by_match_speed = sorted(candidates, key=lambda cls: cls._MATCH_SPEED)
sorted_by_match_speed = sorted(candidates, key=lambda cls: (cls._MATCH_SPEED, cls.__name__))
mod = ModelOnDisk(model_path, hash_algo)
for config_cls in sorted_by_match_speed:
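
The sort key change above adds the class name as a tie-breaker. Since `_USING_CLASSIFY_API` is a set, iteration order is not guaranteed, and classes with the same match speed could otherwise be tried in a different order from run to run. A small sketch of the effect, using made-up config classes and made-up `MatchSpeed` members (the real enum's members are not shown in this diff):

```python
from enum import Enum


class MatchSpeed(int, Enum):
    # Hypothetical members, for illustration only.
    FAST = 0
    MED = 1
    SLOW = 2


class ConfigA:
    _MATCH_SPEED = MatchSpeed.SLOW


class ConfigB:
    _MATCH_SPEED = MatchSpeed.FAST


class ConfigC:
    _MATCH_SPEED = MatchSpeed.FAST


candidates = {ConfigC, ConfigA, ConfigB}  # a set: no stable iteration order

# Sort by speed first, then by class name so that ties resolve deterministically.
ordered = sorted(candidates, key=lambda cls: (cls._MATCH_SPEED, cls.__name__))
ordered_names = [cls.__name__ for cls in ordered]
```

Here `ConfigB` and `ConfigC` share a speed, and the name component decides their relative order.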
@@ -308,6 +222,9 @@ class ModelConfigBase(ABC, BaseModel):
if "source_type" in overrides:
overrides["source_type"] = ModelSourceType(overrides["source_type"])
if "variant" in overrides:
overrides["variant"] = ModelVariantType(overrides["variant"])
@classmethod
def from_model_on_disk(cls, mod: ModelOnDisk, **overrides):
"""Creates an instance of this config or raises InvalidModelConfigException."""
@@ -326,6 +243,7 @@ class ModelConfigBase(ABC, BaseModel):
fields["key"] = fields.get("key") or uuid_string()
fields["description"] = fields.get("description") or f"{base.value} {type.value} model {name}"
fields["repo_variant"] = fields.get("repo_variant") or mod.repo_variant()
fields["file_size"] = fields.get("file_size") or mod.size()
return cls(**fields)
@@ -367,6 +285,38 @@ class LoRAConfigBase(ABC, BaseModel):
type: Literal[ModelType.LoRA] = ModelType.LoRA
trigger_phrases: Optional[set[str]] = Field(description="Set of trigger phrases for this model", default=None)
@classmethod
def flux_lora_format(cls, mod: ModelOnDisk):
key = "FLUX_LORA_FORMAT"
if key in mod.cache:
return mod.cache[key]
from invokeai.backend.patches.lora_conversions.formats import flux_format_from_state_dict
sd = mod.load_state_dict(mod.path)
value = flux_format_from_state_dict(sd)
mod.cache[key] = value
return value
@classmethod
def base_model(cls, mod: ModelOnDisk) -> BaseModelType:
if cls.flux_lora_format(mod):
return BaseModelType.Flux
state_dict = mod.load_state_dict()
# If we've gotten here, we assume that the model is a Stable Diffusion model
token_vector_length = lora_token_vector_length(state_dict)
if token_vector_length == 768:
return BaseModelType.StableDiffusion1
elif token_vector_length == 1024:
return BaseModelType.StableDiffusion2
elif token_vector_length == 1280:
return BaseModelType.StableDiffusionXL # recognizes format at https://civitai.com/models/224641
elif token_vector_length == 2048:
return BaseModelType.StableDiffusionXL
else:
raise InvalidModelConfigException("Unknown LoRA type")
class T5EncoderConfigBase(ABC, BaseModel):
"""Base class for diffusers-style models."""
@@ -382,11 +332,40 @@ class T5EncoderBnbQuantizedLlmInt8bConfig(T5EncoderConfigBase, LegacyProbeMixin,
format: Literal[ModelFormat.BnbQuantizedLlmInt8b] = ModelFormat.BnbQuantizedLlmInt8b
class LoRALyCORISConfig(LoRAConfigBase, LegacyProbeMixin, ModelConfigBase):
class LoRALyCORISConfig(LoRAConfigBase, ModelConfigBase):
"""Model config for LoRA/Lycoris models."""
format: Literal[ModelFormat.LyCORIS] = ModelFormat.LyCORIS
@classmethod
def matches(cls, mod: ModelOnDisk) -> bool:
if mod.path.is_dir():
return False
# Avoid false positive match against ControlLoRA and Diffusers
if cls.flux_lora_format(mod) in [FluxLoRAFormat.Control, FluxLoRAFormat.Diffusers]:
return False
state_dict = mod.load_state_dict()
for key in state_dict.keys():
if type(key) is int:
continue
if key.startswith(("lora_te_", "lora_unet_", "lora_te1_", "lora_te2_", "lora_transformer_")):
return True
# "lora_A.weight" and "lora_B.weight" are associated with models in PEFT format. We don't support all PEFT
# LoRA models, but as of the time of writing, we support Diffusers FLUX PEFT LoRA models.
if key.endswith(("to_k_lora.up.weight", "to_q_lora.down.weight", "lora_A.weight", "lora_B.weight")):
return True
return False
@classmethod
def parse(cls, mod: ModelOnDisk) -> dict[str, Any]:
return {
"base": cls.base_model(mod),
}
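
The key scan in `LoRALyCORISConfig.matches` above relies on `str.startswith` and `str.endswith` accepting a tuple of candidates, and it skips integer keys. The sketch below isolates that check so it can be tried on plain lists of key names. The function name `has_lora_keys` is hypothetical, and the sketch omits the directory check and the FLUX format check that precede the scan in the real method.

```python
from typing import Iterable

LORA_KEY_PREFIXES = ("lora_te_", "lora_unet_", "lora_te1_", "lora_te2_", "lora_transformer_")
LORA_KEY_SUFFIXES = ("to_k_lora.up.weight", "to_q_lora.down.weight", "lora_A.weight", "lora_B.weight")


def has_lora_keys(keys: Iterable[object]) -> bool:
    """Return True if any string key looks like a LoRA weight name."""
    for key in keys:
        if not isinstance(key, str):
            continue  # some state dicts contain integer keys
        if key.startswith(LORA_KEY_PREFIXES) or key.endswith(LORA_KEY_SUFFIXES):
            return True
    return False


# Illustrative key names, not taken from a real checkpoint.
kohya_style = has_lora_keys(["lora_unet_down_blocks_0.alpha"])
peft_style = has_lora_keys(["transformer.blocks.0.attn.to_q.lora_A.weight"])
not_lora = has_lora_keys([0, "model.diffusion_model.input_blocks.0.weight"])
```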
class ControlAdapterConfigBase(ABC, BaseModel):
default_settings: Optional[ControlAdapterDefaultSettings] = Field(
@@ -410,11 +389,26 @@ class ControlLoRADiffusersConfig(ControlAdapterConfigBase, LegacyProbeMixin, Mod
format: Literal[ModelFormat.Diffusers] = ModelFormat.Diffusers
class LoRADiffusersConfig(LoRAConfigBase, LegacyProbeMixin, ModelConfigBase):
class LoRADiffusersConfig(LoRAConfigBase, ModelConfigBase):
"""Model config for LoRA/Diffusers models."""
format: Literal[ModelFormat.Diffusers] = ModelFormat.Diffusers
@classmethod
def matches(cls, mod: ModelOnDisk) -> bool:
if mod.path.is_file():
return cls.flux_lora_format(mod) == FluxLoRAFormat.Diffusers
suffixes = ["bin", "safetensors"]
weight_files = [mod.path / f"pytorch_lora_weights.{sfx}" for sfx in suffixes]
return any(wf.exists() for wf in weight_files)
@classmethod
def parse(cls, mod: ModelOnDisk) -> dict[str, Any]:
return {
"base": cls.base_model(mod),
}
class VAECheckpointConfig(CheckpointConfigBase, LegacyProbeMixin, ModelConfigBase):
"""Model config for standalone VAE models."""
@@ -586,7 +580,7 @@ class LlavaOnevisionConfig(DiffusersConfigBase, ModelConfigBase):
@classmethod
def matches(cls, mod: ModelOnDisk) -> bool:
if mod.layout == FSLayout.FILE:
if mod.path.is_file():
return False
config_path = mod.path / "config.json"
@@ -607,6 +601,21 @@ class LlavaOnevisionConfig(DiffusersConfigBase, ModelConfigBase):
}
class ApiModelConfig(MainConfigBase, ModelConfigBase):
"""Model config for API-based models."""
format: Literal[ModelFormat.Api] = ModelFormat.Api
@classmethod
def matches(cls, mod: ModelOnDisk) -> bool:
# API models are not stored on disk, so we can't match them.
return False
@classmethod
def parse(cls, mod: ModelOnDisk) -> dict[str, Any]:
raise NotImplementedError("API models are not parsed from disk.")
def get_model_discriminator_value(v: Any) -> str:
"""
Computes the discriminator value for a model config.
@@ -674,6 +683,7 @@ AnyModelConfig = Annotated[
Annotated[SigLIPConfig, SigLIPConfig.get_tag()],
Annotated[FluxReduxConfig, FluxReduxConfig.get_tag()],
Annotated[LlavaOnevisionConfig, LlavaOnevisionConfig.get_tag()],
Annotated[ApiModelConfig, ApiModelConfig.get_tag()],
],
Discriminator(get_model_discriminator_value),
]
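
Note on the key-based `matches` check at the top of this file's diff: the sketch below isolates the prefix/suffix test so it can be tried without a `ModelOnDisk`. The helper name `keys_look_like_lora` and the two constants are hypothetical; the marker strings are copied from the diff. Unlike the original, which only skips `int` keys, this sketch skips every non-string key.

```python
from typing import Iterable

# Marker strings copied from the `matches` method in the diff above.
LORA_KEY_PREFIXES = ("lora_te_", "lora_unet_", "lora_te1_", "lora_te2_", "lora_transformer_")
LORA_KEY_SUFFIXES = ("to_k_lora.up.weight", "to_q_lora.down.weight", "lora_A.weight", "lora_B.weight")


def keys_look_like_lora(keys: Iterable[object]) -> bool:
    """Hypothetical helper: True if any string key carries a known LoRA marker."""
    for key in keys:
        if not isinstance(key, str):
            # Assumption: non-string keys (e.g. ints) carry no format information.
            continue
        if key.startswith(LORA_KEY_PREFIXES) or key.endswith(LORA_KEY_SUFFIXES):
            return True
    return False
```

For example, `keys_look_like_lora([0, "transformer.x.lora_A.weight"])` returns `True` because of the PEFT-style suffix, while a state dict with only plain weight names returns `False`.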


@@ -27,6 +27,7 @@ from invokeai.backend.model_manager.config import (
SubmodelDefinition,
)
from invokeai.backend.model_manager.load.model_loaders.generic_diffusers import ConfigLoader
from invokeai.backend.model_manager.model_on_disk import ModelOnDisk
from invokeai.backend.model_manager.taxonomy import (
AnyVariant,
BaseModelType,
@@ -145,6 +146,7 @@ class ModelProbe(object):
"CLIPTextModelWithProjection": ModelType.CLIPEmbed,
"SiglipModel": ModelType.SigLIP,
"LlavaOnevisionForConditionalGeneration": ModelType.LlavaOnevision,
"CogView4Pipeline": ModelType.Main,
}
TYPE2VARIANT: Dict[ModelType, Callable[[str], Optional[AnyVariant]]] = {ModelType.CLIPEmbed: get_clip_variant_type}
@@ -207,6 +209,7 @@ class ModelProbe(object):
)
fields["format"] = ModelFormat(fields.get("format")) if "format" in fields else probe.get_format()
fields["hash"] = fields.get("hash") or ModelHash(algorithm=hash_algo).hash(model_path)
fields["file_size"] = fields.get("file_size") or ModelOnDisk(model_path).size()
fields["default_settings"] = fields.get("default_settings")
@@ -856,6 +859,8 @@ class PipelineFolderProbe(FolderProbeBase):
transformer_conf = json.load(file)
if transformer_conf["_class_name"] == "SD3Transformer2DModel":
return BaseModelType.StableDiffusion3
elif transformer_conf["_class_name"] == "CogView4Transformer2DModel":
return BaseModelType.CogView4
else:
raise InvalidModelConfigException(f"Unknown base model for {self.model_path}")
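
Note on the `_class_name` dispatch above: a minimal sketch of the same lookup written as a table, assuming the transformer config is plain JSON. The function name and the string labels are hypothetical stand-ins for the `BaseModelType` members, and `ValueError` stands in for `InvalidModelConfigException`.

```python
import json

# Class names taken from the two branches in the diff; the values are
# illustrative labels, not the real BaseModelType enum values.
CLASS_NAME_TO_BASE = {
    "SD3Transformer2DModel": "StableDiffusion3",
    "CogView4Transformer2DModel": "CogView4",
}


def base_from_transformer_config(config_text: str) -> str:
    """Hypothetical helper: map a transformer config's `_class_name` to a base label."""
    conf = json.loads(config_text)
    class_name = conf["_class_name"]
    if class_name not in CLASS_NAME_TO_BASE:
        # Stand-in for InvalidModelConfigException in the real probe.
        raise ValueError(f"Unknown base model for transformer class {class_name!r}")
    return CLASS_NAME_TO_BASE[class_name]
```

A table keeps the mapping in one place when more pipeline types are added; the `elif` chain in the diff does the same thing.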


@@ -13,6 +13,12 @@ from invokeai.backend.patches.layers.lora_layer import LoRALayer
def linear_lora_forward(input: torch.Tensor, lora_layer: LoRALayer, lora_weight: float) -> torch.Tensor:
"""An optimized implementation of the residual calculation for a sidecar linear LoRALayer."""
# The up and down matrices have different ranks, so we can't simply multiply them.
if lora_layer.up.shape[1] != lora_layer.down.shape[0]:
x = torch.nn.functional.linear(input, lora_layer.get_weight(lora_weight), bias=lora_layer.bias)
x *= lora_weight * lora_layer.scale()
return x
x = torch.nn.functional.linear(input, lora_layer.down)
if lora_layer.mid is not None:
x = torch.nn.functional.linear(x, lora_layer.mid)
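
Note on the rank check in `linear_lora_forward`: the two-step path (apply `down`, then `up`) only works when the inner dimensions agree. The sketch below uses plain nested lists instead of tensors to show that, when they do agree, the two-step result equals applying the fused `up @ down` weight. All function names are hypothetical. It relies on `linear(x, W)` computing `x @ W.T`.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def ranks_match(up, down):
    # Mirrors `up.shape[1] == down.shape[0]` from the diff.
    return len(up[0]) == len(down)


def residual_two_step(x, up, down):
    # linear(linear(x, down), up) == (x @ down.T) @ up.T
    return matmul(matmul(x, transpose(down)), transpose(up))


def residual_fused(x, up, down):
    # linear(x, up @ down) == x @ (up @ down).T
    return matmul(x, transpose(matmul(up, down)))
```

With `x` of shape 1x3, `down` of shape 2x3 and `up` of shape 4x2, both functions return the same 1x4 row.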


@@ -0,0 +1,60 @@
from pathlib import Path
from typing import Optional
import torch
from invokeai.backend.model_manager.config import (
AnyModelConfig,
CheckpointConfigBase,
DiffusersConfigBase,
)
from invokeai.backend.model_manager.load.model_loader_registry import ModelLoaderRegistry
from invokeai.backend.model_manager.load.model_loaders.generic_diffusers import GenericDiffusersLoader
from invokeai.backend.model_manager.taxonomy import (
AnyModel,
BaseModelType,
ModelFormat,
ModelType,
SubModelType,
)
@ModelLoaderRegistry.register(base=BaseModelType.CogView4, type=ModelType.Main, format=ModelFormat.Diffusers)
class CogView4DiffusersModel(GenericDiffusersLoader):
"""Class to load CogView4 main models."""
def _load_model(
self,
config: AnyModelConfig,
submodel_type: Optional[SubModelType] = None,
) -> AnyModel:
if isinstance(config, CheckpointConfigBase):
raise NotImplementedError("CheckpointConfigBase is not implemented for CogView4 models.")
if submodel_type is None:
raise Exception("A submodel type must be provided when loading main pipelines.")
model_path = Path(config.path)
load_class = self.get_hf_load_class(model_path, submodel_type)
repo_variant = config.repo_variant if isinstance(config, DiffusersConfigBase) else None
variant = repo_variant.value if repo_variant else None
model_path = model_path / submodel_type.value
# We force bfloat16 for CogView4 models because float16 produces black images. I haven't tracked down
# specifically which model(s) are responsible.
dtype = torch.bfloat16
try:
result: AnyModel = load_class.from_pretrained(
model_path,
torch_dtype=dtype,
variant=variant,
)
except OSError as e:
if variant and "no file named" in str(
e
): # try without the variant, just in case user's preferences changed
result = load_class.from_pretrained(model_path, torch_dtype=dtype)
else:
raise e
return result


@@ -1,7 +1,8 @@
from pathlib import Path
from typing import Optional
from invokeai.backend.llava_onevision_model import LlavaOnevisionModel
from transformers import LlavaOnevisionForConditionalGeneration
from invokeai.backend.model_manager.config import (
AnyModelConfig,
)
@@ -23,6 +24,8 @@ class LlavaOnevisionModelLoader(ModelLoader):
raise ValueError("Unexpected submodel requested for LLaVA OneVision model.")
model_path = Path(config.path)
model = LlavaOnevisionModel.load_from_path(model_path)
model.to(dtype=self._torch_dtype)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
model_path, local_files_only=True, torch_dtype=self._torch_dtype
)
assert isinstance(model, LlavaOnevisionForConditionalGeneration)
return model


@@ -1,13 +1,14 @@
from pathlib import Path
from typing import Optional
from transformers import SiglipVisionModel
from invokeai.backend.model_manager.config import (
AnyModelConfig,
)
from invokeai.backend.model_manager.load.load_default import ModelLoader
from invokeai.backend.model_manager.load.model_loader_registry import ModelLoaderRegistry
from invokeai.backend.model_manager.taxonomy import AnyModel, BaseModelType, ModelFormat, ModelType, SubModelType
from invokeai.backend.sig_lip.sig_lip_pipeline import SigLipPipeline
@ModelLoaderRegistry.register(base=BaseModelType.Any, type=ModelType.SigLIP, format=ModelFormat.Diffusers)
@@ -23,6 +24,5 @@ class SigLIPModelLoader(ModelLoader):
raise ValueError("Unexpected submodel requested for SigLIP model.")
model_path = Path(config.path)
model = SigLipPipeline.load_from_path(model_path)
model.to(dtype=self._torch_dtype)
model = SiglipVisionModel.from_pretrained(model_path, local_files_only=True, torch_dtype=self._torch_dtype)
return model


@@ -6,6 +6,7 @@ import logging
from pathlib import Path
from typing import Optional
import onnxruntime as ort
import torch
from diffusers.pipelines.pipeline_utils import DiffusionPipeline
from diffusers.schedulers.scheduling_utils import SchedulerMixin
@@ -15,11 +16,9 @@ from invokeai.backend.image_util.depth_anything.depth_anything_pipeline import D
from invokeai.backend.image_util.grounding_dino.grounding_dino_pipeline import GroundingDinoPipeline
from invokeai.backend.image_util.segment_anything.segment_anything_pipeline import SegmentAnythingPipeline
from invokeai.backend.ip_adapter.ip_adapter import IPAdapter
from invokeai.backend.llava_onevision_model import LlavaOnevisionModel
from invokeai.backend.model_manager.taxonomy import AnyModel
from invokeai.backend.onnx.onnx_runtime import IAIOnnxRuntimeModel
from invokeai.backend.patches.model_patch_raw import ModelPatchRaw
from invokeai.backend.sig_lip.sig_lip_pipeline import SigLipPipeline
from invokeai.backend.spandrel_image_to_image_model import SpandrelImageToImageModel
from invokeai.backend.textual_inversion import TextualInversionModelRaw
from invokeai.backend.util.calc_tensor_size import calc_tensor_size
@@ -50,11 +49,19 @@ def calc_model_size_by_data(logger: logging.Logger, model: AnyModel) -> int:
GroundingDinoPipeline,
SegmentAnythingPipeline,
DepthAnythingPipeline,
SigLipPipeline,
LlavaOnevisionModel,
),
):
return model.calc_size()
elif isinstance(model, ort.InferenceSession):
if model._model_bytes is not None:
# If the model is already loaded, return the size of the model bytes
return len(model._model_bytes)
elif model._model_path is not None:
# If the model is not loaded, return the on-disk size of the file at the model path
return calc_model_size_by_fs(Path(model._model_path))
else:
# If neither is available, return 0
return 0
elif isinstance(
model,
(


@@ -0,0 +1,96 @@
from pathlib import Path
from typing import Any, Optional, TypeAlias
import safetensors.torch
import torch
from picklescan.scanner import scan_file_path
from invokeai.backend.model_hash.model_hash import HASHING_ALGORITHMS, ModelHash
from invokeai.backend.model_manager.taxonomy import ModelRepoVariant
from invokeai.backend.quantization.gguf.loaders import gguf_sd_loader
from invokeai.backend.util.silence_warnings import SilenceWarnings
StateDict: TypeAlias = dict[str | int, Any] # When are the keys int?
class ModelOnDisk:
"""A utility class representing a model stored on disk."""
def __init__(self, path: Path, hash_algo: HASHING_ALGORITHMS = "blake3_single"):
self.path = path
if self.path.suffix in {".safetensors", ".bin", ".pt", ".ckpt"}:
self.name = path.stem
else:
self.name = path.name
self.hash_algo = hash_algo
# Having a cache helps users of ModelOnDisk (i.e. configs) to save state
# This prevents redundant computations during matching and parsing
self.cache = {"_CACHED_STATE_DICTS": {}}
def hash(self) -> str:
return ModelHash(algorithm=self.hash_algo).hash(self.path)
def size(self) -> int:
if self.path.is_file():
return self.path.stat().st_size
return sum(file.stat().st_size for file in self.path.rglob("*"))
def component_paths(self) -> set[Path]:
if self.path.is_file():
return {self.path}
extensions = {".safetensors", ".pt", ".pth", ".ckpt", ".bin", ".gguf"}
return {f for f in self.path.rglob("*") if f.suffix in extensions}
def repo_variant(self) -> Optional[ModelRepoVariant]:
if self.path.is_file():
return None
weight_files = list(self.path.glob("**/*.safetensors"))
weight_files.extend(list(self.path.glob("**/*.bin")))
for x in weight_files:
if ".fp16" in x.suffixes:
return ModelRepoVariant.FP16
if "openvino_model" in x.name:
return ModelRepoVariant.OpenVINO
if "flax_model" in x.name:
return ModelRepoVariant.Flax
if x.suffix == ".onnx":
return ModelRepoVariant.ONNX
return ModelRepoVariant.Default
def load_state_dict(self, path: Optional[Path] = None) -> StateDict:
sd_cache = self.cache["_CACHED_STATE_DICTS"]
if path in sd_cache:
return sd_cache[path]
if not path:
components = list(self.component_paths())
match components:
case []:
raise ValueError("No weight files found for this model")
case [p]:
path = p
case ps if len(ps) >= 2:
raise ValueError(
f"Multiple weight files found for this model: {ps}. "
f"Please specify the intended file using the 'path' argument"
)
with SilenceWarnings():
if path.suffix.endswith((".ckpt", ".pt", ".pth", ".bin")):
scan_result = scan_file_path(path)
if scan_result.infected_files != 0 or scan_result.scan_err:
raise RuntimeError(f"The model {path.stem} is potentially infected by malware. Aborting import.")
checkpoint = torch.load(path, map_location="cpu")
assert isinstance(checkpoint, dict)
elif path.suffix.endswith(".gguf"):
checkpoint = gguf_sd_loader(path, compute_dtype=torch.float32)
elif path.suffix.endswith(".safetensors"):
checkpoint = safetensors.torch.load_file(path)
else:
raise ValueError(f"Unrecognized model extension: {path.suffix}")
state_dict = checkpoint.get("state_dict", checkpoint)
sd_cache[path] = state_dict
return state_dict
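
Note on `ModelOnDisk.repo_variant`: the detection depends on how `Path.suffixes` splits a multi-dot filename. The sketch below shows that behavior on bare filename strings. The function name and the returned labels are hypothetical; the real method returns `ModelRepoVariant` members.

```python
from pathlib import PurePath
from typing import Iterable


def variant_from_filenames(names: Iterable[str]) -> str:
    """Hypothetical helper: classify a model folder by its weight filenames."""
    for name in names:
        p = PurePath(name)
        # "model.fp16.safetensors" has suffixes [".fp16", ".safetensors"].
        if ".fp16" in p.suffixes:
            return "fp16"
        if "openvino_model" in p.name:
            return "openvino"
        if "flax_model" in p.name:
            return "flax"
    return "default"
```

The `.onnx` branch is left out of this sketch: the method in the diff only globs `*.safetensors` and `*.bin`, so that comparison appears unreachable as written.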


@@ -593,6 +593,16 @@ swinir = StarterModel(
# endregion
# region CogView4
cogview4 = StarterModel(
name="CogView4",
base=BaseModelType.CogView4,
source="THUDM/CogView4-6B",
description="The base CogView4 model (~29GB).",
type=ModelType.Main,
)
# endregion
# region SigLIP
siglip = StarterModel(
name="SigLIP - google/siglip-so400m-patch14-384",
@@ -705,6 +715,7 @@ STARTER_MODELS: list[StarterModel] = [
flux_redux,
llava_onevision,
flux_fill,
cogview4,
]
sd1_bundle: list[StarterModel] = [


@@ -25,7 +25,9 @@ class BaseModelType(str, Enum):
StableDiffusionXL = "sdxl"
StableDiffusionXLRefiner = "sdxl-refiner"
Flux = "flux"
# Kandinsky2_1 = "kandinsky-2.1"
CogView4 = "cogview4"
Imagen3 = "imagen3"
ChatGPT4o = "chatgpt-4o"
class ModelType(str, Enum):
@@ -97,6 +99,7 @@ class ModelFormat(str, Enum):
BnbQuantizedLlmInt8b = "bnb_quantized_int8b"
BnbQuantizednf4b = "bnb_quantized_nf4b"
GGUFQuantized = "gguf_quantized"
Api = "api"
class SchedulerPredictionType(str, Enum):
@@ -126,4 +129,13 @@ class ModelSourceType(str, Enum):
HFRepoID = "hf_repo_id"
class FluxLoRAFormat(str, Enum):
"""Flux LoRA formats."""
Diffusers = "flux.diffusers"
Kohya = "flux.kohya"
OneTrainer = "flux.onetrainer"
Control = "flux.control"
AnyVariant: TypeAlias = Union[ModelVariantType, ClipVariantType, None]
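
Note on the new `FluxLoRAFormat` enum: because it mixes in `str`, members compare equal to their raw values and can be rebuilt from stored strings. The sketch below repeats the enum from the diff and adds a hypothetical `parse_format` helper that returns `None` for unknown values.

```python
from enum import Enum
from typing import Optional


class FluxLoRAFormat(str, Enum):
    """Flux LoRA formats (values copied from the diff)."""

    Diffusers = "flux.diffusers"
    Kohya = "flux.kohya"
    OneTrainer = "flux.onetrainer"
    Control = "flux.control"


def parse_format(raw: str) -> Optional[FluxLoRAFormat]:
    """Hypothetical helper: look a format up by value, or None if unrecognized."""
    try:
        return FluxLoRAFormat(raw)
    except ValueError:
        return None
```

This is why `flux_lora_format(mod) == FluxLoRAFormat.Diffusers` style comparisons also work against plain strings read from a config.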


@@ -19,6 +19,7 @@ class LoRALayer(LoRALayerBase):
self.up = up
self.mid = mid
self.down = down
self.are_ranks_equal = up.shape[1] == down.shape[0]
@classmethod
def from_state_dict_values(
@@ -58,12 +59,42 @@ class LoRALayer(LoRALayerBase):
def _rank(self) -> int:
return self.down.shape[0]
def fuse_weights(self, up: torch.Tensor, down: torch.Tensor) -> torch.Tensor:
"""
Fuse the weights of the up and down matrices of a LoRA layer with different ranks.
Because the Hugging Face implementation fuses the KQV projections, the LoRA weights end up with
different ranks when we convert to Kohya format. This function handles the fusion of these
differently sized matrices.
"""
fused_lora = torch.zeros((up.shape[0], down.shape[1]), device=down.device, dtype=down.dtype)
rank_diff = down.shape[0] / up.shape[1]
if rank_diff > 1:
w_down = down.chunk(int(rank_diff), dim=0)
for w_down_chunk in w_down:
fused_lora = fused_lora + (torch.mm(up, w_down_chunk))
else:
rank_diff = up.shape[1] / down.shape[0]
w_up = up.chunk(int(rank_diff), dim=0)
for w_up_chunk in w_up:
fused_lora = fused_lora + (torch.mm(w_up_chunk, down))
return fused_lora
def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
if self.mid is not None:
up = self.up.reshape(self.up.shape[0], self.up.shape[1])
down = self.down.reshape(self.down.shape[0], self.down.shape[1])
weight = torch.einsum("m n w h, i m, n j -> i j w h", self.mid, up, down)
else:
# The up and down matrices have different ranks, so we can't simply multiply them.
if not self.are_ranks_equal:
weight = self.fuse_weights(self.up, self.down)
return weight
weight = self.up.reshape(self.up.shape[0], -1) @ self.down.reshape(self.down.shape[0], -1)
return weight
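
Note on `fuse_weights`: the sketch below reproduces only the `rank_diff > 1` branch with nested lists, to make the chunk-and-sum step concrete. It assumes the down rank is an exact multiple of the up rank and raises otherwise, which the original does not do. All names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse_when_down_rank_is_larger(up, down):
    """Split `down` into row chunks of the up rank and sum `up @ chunk` over the chunks."""
    r_up, r_down = len(up[0]), len(down)
    if r_down % r_up != 0:
        # Assumption of this sketch only; the diff truncates with int(...) instead.
        raise ValueError("down rank must be a multiple of up rank")
    fused = [[0] * len(down[0]) for _ in range(len(up))]
    for start in range(0, r_down, r_up):
        chunk = down[start : start + r_up]
        fused = add(fused, matmul(up, chunk))
    return fused
```

With a 2x2 identity `up` and a 4x3 `down`, the result is the element-wise sum of the top and bottom halves of `down`.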


@@ -20,6 +20,14 @@ from invokeai.backend.patches.model_patch_raw import ModelPatchRaw
FLUX_KOHYA_TRANSFORMER_KEY_REGEX = (
r"lora_unet_(\w+_blocks)_(\d+)_(img_attn|img_mlp|img_mod|txt_attn|txt_mlp|txt_mod|linear1|linear2|modulation)_?(.*)"
)
# A regex pattern that matches all of the last layer keys in the Kohya FLUX LoRA format.
# Example keys:
# lora_unet_final_layer_linear.alpha
# lora_unet_final_layer_linear.lora_down.weight
# lora_unet_final_layer_linear.lora_up.weight
FLUX_KOHYA_LAST_LAYER_KEY_REGEX = r"lora_unet_final_layer_(linear|linear1|linear2)_?(.*)"
# A regex pattern that matches all of the CLIP keys in the Kohya FLUX LoRA format.
# Example keys:
# lora_te1_text_model_encoder_layers_0_mlp_fc1.alpha
@@ -44,6 +52,7 @@ def is_state_dict_likely_in_flux_kohya_format(state_dict: Dict[str, Any]) -> boo
"""
return all(
re.match(FLUX_KOHYA_TRANSFORMER_KEY_REGEX, k)
or re.match(FLUX_KOHYA_LAST_LAYER_KEY_REGEX, k)
or re.match(FLUX_KOHYA_CLIP_KEY_REGEX, k)
or re.match(FLUX_KOHYA_T5_KEY_REGEX, k)
for k in state_dict.keys()
@@ -65,6 +74,9 @@ def lora_model_from_flux_kohya_state_dict(state_dict: Dict[str, torch.Tensor]) -
t5_grouped_sd: dict[str, dict[str, torch.Tensor]] = {}
for layer_name, layer_state_dict in grouped_state_dict.items():
if layer_name.startswith("lora_unet"):
# Skip the final layer. It is incompatible with the current model definition.
if layer_name.startswith("lora_unet_final_layer"):
continue
transformer_grouped_sd[layer_name] = layer_state_dict
elif layer_name.startswith("lora_te1"):
clip_grouped_sd[layer_name] = layer_state_dict
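
Note on `FLUX_KOHYA_LAST_LAYER_KEY_REGEX`: the sketch below runs the pattern from the diff against its documented example keys and separates final-layer keys from the rest. The helper `split_final_layer_keys` is hypothetical, and it filters raw keys with the regex, whereas the diff filters grouped layer names with `startswith("lora_unet_final_layer")`.

```python
import re
from typing import Iterable, List, Tuple

# Pattern copied verbatim from the diff.
FLUX_KOHYA_LAST_LAYER_KEY_REGEX = r"lora_unet_final_layer_(linear|linear1|linear2)_?(.*)"


def split_final_layer_keys(keys: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (kept, skipped): keys outside the final layer, and final-layer keys."""
    kept: List[str] = []
    skipped: List[str] = []
    for key in keys:
        if re.match(FLUX_KOHYA_LAST_LAYER_KEY_REGEX, key):
            skipped.append(key)
        else:
            kept.append(key)
    return kept, skipped
```

The keys are accepted by the format check but then dropped during conversion, so a LoRA that trains the final layer loads without that layer applied.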


@@ -0,0 +1,24 @@
from invokeai.backend.model_manager.taxonomy import FluxLoRAFormat
from invokeai.backend.patches.lora_conversions.flux_control_lora_utils import is_state_dict_likely_flux_control
from invokeai.backend.patches.lora_conversions.flux_diffusers_lora_conversion_utils import (
is_state_dict_likely_in_flux_diffusers_format,
)
from invokeai.backend.patches.lora_conversions.flux_kohya_lora_conversion_utils import (
is_state_dict_likely_in_flux_kohya_format,
)
from invokeai.backend.patches.lora_conversions.flux_onetrainer_lora_conversion_utils import (
is_state_dict_likely_in_flux_onetrainer_format,
)
def flux_format_from_state_dict(state_dict):
if is_state_dict_likely_in_flux_kohya_format(state_dict):
return FluxLoRAFormat.Kohya
elif is_state_dict_likely_in_flux_onetrainer_format(state_dict):
return FluxLoRAFormat.OneTrainer
elif is_state_dict_likely_in_flux_diffusers_format(state_dict):
return FluxLoRAFormat.Diffusers
elif is_state_dict_likely_flux_control(state_dict):
return FluxLoRAFormat.Control
else:
return None


@@ -1,8 +1,15 @@
import torch
class InpaintExtension:
"""A class for managing inpainting with FLUX."""
def assert_broadcastable(*shapes):
try:
torch.broadcast_shapes(*shapes)
except RuntimeError as e:
raise AssertionError(f"Shapes {shapes} are not broadcastable.") from e
class RectifiedFlowInpaintExtension:
"""A class for managing inpainting with rectified flow models (e.g. FLUX, SD3, CogView4)."""
def __init__(self, init_latents: torch.Tensor, inpaint_mask: torch.Tensor, noise: torch.Tensor):
"""Initialize RectifiedFlowInpaintExtension.
@@ -14,7 +21,8 @@ class InpaintExtension:
inpainted region with the background. In 'packed' format.
noise (torch.Tensor): The noise tensor used to noise the init_latents. In 'packed' format.
"""
assert init_latents.shape == inpaint_mask.shape == noise.shape
assert_broadcastable(init_latents.shape, inpaint_mask.shape, noise.shape)
self._init_latents = init_latents
self._inpaint_mask = inpaint_mask
self._noise = noise
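
Note on `assert_broadcastable`: the change relaxes the old exact-shape assertion to a broadcast check. The sketch below is a pure-Python stand-in for that check, following the usual NumPy-style rule that `torch.broadcast_shapes` applies: shapes are aligned from the trailing dimension, and each aligned set of sizes may contain at most one distinct value other than 1. The function name is hypothetical.

```python
from itertools import zip_longest
from typing import Tuple


def are_broadcastable(*shapes: Tuple[int, ...]) -> bool:
    """Return True if the shapes can be broadcast together."""
    reversed_shapes = [tuple(reversed(s)) for s in shapes]
    # Missing leading dimensions are treated as size 1.
    for dims in zip_longest(*reversed_shapes, fillvalue=1):
        distinct = {d for d in dims if d != 1}
        if len(distinct) > 1:
            return False
    return True
```

Under this rule a mask of shape `(1, 16, 1)` is accepted alongside latents of shape `(1, 16, 4)`, which the previous `==` assertion would have rejected.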


@@ -1,58 +0,0 @@
import torch
class InpaintExtension:
"""A class for managing inpainting with SD3."""
def __init__(self, init_latents: torch.Tensor, inpaint_mask: torch.Tensor, noise: torch.Tensor):
"""Initialize InpaintExtension.
Args:
init_latents (torch.Tensor): The initial latents (i.e. un-noised at timestep 0).
inpaint_mask (torch.Tensor): A mask specifying which elements to inpaint. Range [0, 1]. Values of 1 will be
re-generated. Values of 0 will remain unchanged. Values between 0 and 1 can be used to blend the
inpainted region with the background.
noise (torch.Tensor): The noise tensor used to noise the init_latents.
"""
assert init_latents.dim() == inpaint_mask.dim() == noise.dim() == 4
assert init_latents.shape[-2:] == inpaint_mask.shape[-2:] == noise.shape[-2:]
self._init_latents = init_latents
self._inpaint_mask = inpaint_mask
self._noise = noise
def _apply_mask_gradient_adjustment(self, t_prev: float) -> torch.Tensor:
"""Applies inpaint mask gradient adjustment and returns the inpaint mask to be used at the current timestep."""
# As we progress through the denoising process, we promote gradient regions of the mask to have a full weight of
# 1.0. This helps to produce more coherent seams around the inpainted region. We experimented with a (small)
# number of promotion strategies (e.g. gradual promotion based on timestep), but found that a simple cutoff
# threshold worked well.
# We use a small epsilon to avoid any potential issues with floating point precision.
eps = 1e-4
mask_gradient_t_cutoff = 0.5
if t_prev > mask_gradient_t_cutoff:
# Early in the denoising process, use the inpaint mask as-is.
return self._inpaint_mask
else:
# After the cut-off, promote all non-zero mask values to 1.0.
mask = self._inpaint_mask.where(self._inpaint_mask <= (0.0 + eps), 1.0)
return mask
def merge_intermediate_latents_with_init_latents(
self, intermediate_latents: torch.Tensor, t_prev: float
) -> torch.Tensor:
"""Merge the intermediate latents with the initial latents for the current timestep using the inpaint mask. I.e.
update the intermediate latents to keep the regions that are not being inpainted on the correct noise
trajectory.
This function should be called after each denoising step.
"""
mask = self._apply_mask_gradient_adjustment(t_prev)
# Noise the init latents for the current timestep.
noised_init_latents = self._noise * t_prev + (1.0 - t_prev) * self._init_latents
# Merge the intermediate latents with the noised_init_latents using the inpaint_mask.
return intermediate_latents * mask + noised_init_latents * (1.0 - mask)
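
Note on the merge step in the removed file: the arithmetic is easier to follow on flat lists of floats than on tensors. The sketch below assumes one value per latent element; both function names are hypothetical. The formulas and the `0.5` cutoff and `1e-4` epsilon are taken from the code above.

```python
from typing import List


def promote_mask(mask: List[float], t_prev: float, cutoff: float = 0.5, eps: float = 1e-4) -> List[float]:
    """Before the cutoff keep the mask as-is; after it, promote non-zero values to 1.0."""
    if t_prev > cutoff:
        return list(mask)
    return [m if m <= eps else 1.0 for m in mask]


def merge_with_init(
    intermediate: List[float], init: List[float], noise: List[float], mask: List[float], t_prev: float
) -> List[float]:
    """Blend denoised latents with re-noised initial latents using the inpaint mask."""
    m = promote_mask(mask, t_prev)
    # Noise the init latents to the current timestep.
    noised_init = [n * t_prev + (1.0 - t_prev) * x for n, x in zip(noise, init)]
    return [z * w + y * (1.0 - w) for z, y, w in zip(intermediate, noised_init, m)]
```

With `t_prev = 0.25`, a mask value of `0.5` is promoted to `1.0`, so that element takes the denoised value entirely, while a mask value of `0.0` keeps the re-noised initial latent.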


@@ -1,14 +1,9 @@
from pathlib import Path
from typing import Optional
import torch
from PIL import Image
from transformers import SiglipImageProcessor, SiglipVisionModel
from invokeai.backend.raw_model import RawModel
class SigLipPipeline(RawModel):
class SigLipPipeline:
"""A wrapper for a SigLIP model + processor."""
def __init__(
@@ -19,25 +14,7 @@ class SigLipPipeline(RawModel):
self._siglip_processor = siglip_processor
self._siglip_model = siglip_model
@classmethod
def load_from_path(cls, path: str | Path):
siglip_model = SiglipVisionModel.from_pretrained(path, local_files_only=True)
assert isinstance(siglip_model, SiglipVisionModel)
siglip_processor = SiglipImageProcessor.from_pretrained(path, local_files_only=True)
assert isinstance(siglip_processor, SiglipImageProcessor)
return cls(siglip_processor, siglip_model)
def to(self, device: Optional[torch.device] = None, dtype: Optional[torch.dtype] = None) -> None:
self._siglip_model.to(device=device, dtype=dtype)
def encode_image(self, x: Image.Image, device: torch.device, dtype: torch.dtype) -> torch.Tensor:
imgs = self._siglip_processor.preprocess(images=[x], do_resize=True, return_tensors="pt", do_convert_rgb=True)
encoded_x = self._siglip_model(**imgs.to(device=device, dtype=dtype)).last_hidden_state
return encoded_x
def calc_size(self) -> int:
"""Get size of the model in memory in bytes."""
# HACK(ryand): Fix this issue with circular imports.
from invokeai.backend.model_manager.load.model_util import calc_module_size
return calc_module_size(self._siglip_model)


@@ -67,13 +67,26 @@ class SD3ConditioningInfo:
return self
@dataclass
class CogView4ConditioningInfo:
glm_embeds: torch.Tensor
def to(self, device: torch.device | None = None, dtype: torch.dtype | None = None):
self.glm_embeds = self.glm_embeds.to(device=device, dtype=dtype)
return self
@dataclass
class ConditioningFieldData:
# If you change this class, adding more types, you _must_ update the instantiation of ObjectSerializerDisk in
# invokeai/app/api/dependencies.py, adding the types to the list of safe globals. If you do not, torch will be
# unable to deserialize the object and will raise an error.
conditionings: (
List[BasicConditioningInfo]
| List[SDXLConditioningInfo]
| List[FLUXConditioningInfo]
| List[SD3ConditioningInfo]
| List[CogView4ConditioningInfo]
)


@@ -1,245 +0,0 @@
import math
import diffusers
import torch
if torch.backends.mps.is_available():
torch.empty = torch.zeros
_torch_layer_norm = torch.nn.functional.layer_norm
def new_layer_norm(input, normalized_shape, weight=None, bias=None, eps=1e-05):
if input.device.type == "mps" and input.dtype == torch.float16:
input = input.float()
if weight is not None:
weight = weight.float()
if bias is not None:
bias = bias.float()
return _torch_layer_norm(input, normalized_shape, weight, bias, eps).half()
else:
return _torch_layer_norm(input, normalized_shape, weight, bias, eps)
torch.nn.functional.layer_norm = new_layer_norm
_torch_tensor_permute = torch.Tensor.permute
def new_torch_tensor_permute(input, *dims):
result = _torch_tensor_permute(input, *dims)
if input.device.type == "mps" and input.dtype == torch.float16:
result = result.contiguous()
return result
torch.Tensor.permute = new_torch_tensor_permute
_torch_lerp = torch.lerp
def new_torch_lerp(input, end, weight, *, out=None):
if input.device.type == "mps" and input.dtype == torch.float16:
input = input.float()
end = end.float()
if isinstance(weight, torch.Tensor):
weight = weight.float()
if out is not None:
out_fp32 = torch.zeros_like(out, dtype=torch.float32)
else:
out_fp32 = None
result = _torch_lerp(input, end, weight, out=out_fp32)
if out is not None:
out.copy_(out_fp32.half())
del out_fp32
return result.half()
else:
return _torch_lerp(input, end, weight, out=out)
torch.lerp = new_torch_lerp
_torch_interpolate = torch.nn.functional.interpolate
def new_torch_interpolate(
input,
size=None,
scale_factor=None,
mode="nearest",
align_corners=None,
recompute_scale_factor=None,
antialias=False,
):
if input.device.type == "mps" and input.dtype == torch.float16:
return _torch_interpolate(
input.float(), size, scale_factor, mode, align_corners, recompute_scale_factor, antialias
).half()
else:
return _torch_interpolate(input, size, scale_factor, mode, align_corners, recompute_scale_factor, antialias)
torch.nn.functional.interpolate = new_torch_interpolate
# TODO: refactor it
_SlicedAttnProcessor = diffusers.models.attention_processor.SlicedAttnProcessor
class ChunkedSlicedAttnProcessor:
r"""
Processor for implementing sliced attention.
Args:
slice_size (`int`, *optional*):
The number of steps to compute attention. Uses as many slices as `attention_head_dim // slice_size`, and
`attention_head_dim` must be a multiple of the `slice_size`.
"""
def __init__(self, slice_size):
assert isinstance(slice_size, int)
slice_size = 1 # TODO: maybe implement chunking in batches too when enough memory
self.slice_size = slice_size
self._sliced_attn_processor = _SlicedAttnProcessor(slice_size)
def __call__(self, attn, hidden_states, encoder_hidden_states=None, attention_mask=None):
if self.slice_size != 1 or attn.upcast_attention:
return self._sliced_attn_processor(attn, hidden_states, encoder_hidden_states, attention_mask)
residual = hidden_states
input_ndim = hidden_states.ndim
if input_ndim == 4:
batch_size, channel, height, width = hidden_states.shape
hidden_states = hidden_states.view(batch_size, channel, height * width).transpose(1, 2)
batch_size, sequence_length, _ = (
hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape
)
attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size)
if attn.group_norm is not None:
hidden_states = attn.group_norm(hidden_states.transpose(1, 2)).transpose(1, 2)
query = attn.to_q(hidden_states)
dim = query.shape[-1]
query = attn.head_to_batch_dim(query)
if encoder_hidden_states is None:
encoder_hidden_states = hidden_states
elif attn.norm_cross:
encoder_hidden_states = attn.norm_encoder_hidden_states(encoder_hidden_states)
key = attn.to_k(encoder_hidden_states)
value = attn.to_v(encoder_hidden_states)
key = attn.head_to_batch_dim(key)
value = attn.head_to_batch_dim(value)
batch_size_attention, query_tokens, _ = query.shape
hidden_states = torch.zeros(
(batch_size_attention, query_tokens, dim // attn.heads), device=query.device, dtype=query.dtype
)
chunk_tmp_tensor = torch.empty(
self.slice_size, query.shape[1], key.shape[1], dtype=query.dtype, device=query.device
)
for i in range(batch_size_attention // self.slice_size):
start_idx = i * self.slice_size
end_idx = (i + 1) * self.slice_size
query_slice = query[start_idx:end_idx]
key_slice = key[start_idx:end_idx]
attn_mask_slice = attention_mask[start_idx:end_idx] if attention_mask is not None else None
self.get_attention_scores_chunked(
attn,
query_slice,
key_slice,
attn_mask_slice,
hidden_states[start_idx:end_idx],
value[start_idx:end_idx],
chunk_tmp_tensor,
)
hidden_states = attn.batch_to_head_dim(hidden_states)
# linear proj
hidden_states = attn.to_out[0](hidden_states)
# dropout
hidden_states = attn.to_out[1](hidden_states)
if input_ndim == 4:
hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, channel, height, width)
if attn.residual_connection:
hidden_states = hidden_states + residual
hidden_states = hidden_states / attn.rescale_output_factor
return hidden_states
def get_attention_scores_chunked(self, attn, query, key, attention_mask, hidden_states, value, chunk):
# batch size = 1
assert query.shape[0] == 1
assert key.shape[0] == 1
assert value.shape[0] == 1
assert hidden_states.shape[0] == 1
# dtype = query.dtype
if attn.upcast_attention:
query = query.float()
key = key.float()
# out_item_size = query.dtype.itemsize
# if attn.upcast_attention:
# out_item_size = torch.float32.itemsize
out_item_size = query.element_size()
if attn.upcast_attention:
out_item_size = 4
chunk_size = 2**29
out_size = query.shape[1] * key.shape[1] * out_item_size
chunks_count = min(query.shape[1], math.ceil((out_size - 1) / chunk_size))
chunk_step = max(1, int(query.shape[1] / chunks_count))
key = key.transpose(-1, -2)
def _get_chunk_view(tensor, start, length):
if start + length > tensor.shape[1]:
length = tensor.shape[1] - start
# print(f"view: [{tensor.shape[0]},{tensor.shape[1]},{tensor.shape[2]}] - start: {start}, length: {length}")
return tensor[:, start : start + length]
for chunk_pos in range(0, query.shape[1], chunk_step):
if attention_mask is not None:
torch.baddbmm(
_get_chunk_view(attention_mask, chunk_pos, chunk_step),
_get_chunk_view(query, chunk_pos, chunk_step),
key,
beta=1,
alpha=attn.scale,
out=chunk,
)
else:
torch.baddbmm(
torch.zeros((1, 1, 1), device=query.device, dtype=query.dtype),
_get_chunk_view(query, chunk_pos, chunk_step),
key,
beta=0,
alpha=attn.scale,
out=chunk,
)
chunk = chunk.softmax(dim=-1)
torch.bmm(chunk, value, out=_get_chunk_view(hidden_states, chunk_pos, chunk_step))
# del chunk
diffusers.models.attention_processor.SlicedAttnProcessor = ChunkedSlicedAttnProcessor
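
Note on the chunk sizing in `get_attention_scores_chunked`: the sketch below isolates the arithmetic that decides how many query rows are processed per `baddbmm` call. The function name is hypothetical. Unlike the removed code, it clamps the chunk count to at least 1 so that a degenerate one-byte score matrix cannot cause a division by zero.

```python
import math
from typing import Tuple

CHUNK_BYTES = 2**29  # Same budget as `chunk_size` in the removed file.


def plan_query_chunks(query_tokens: int, key_tokens: int, item_size: int) -> Tuple[int, int]:
    """Return (chunks_count, chunk_step) for one attention head."""
    out_size = query_tokens * key_tokens * item_size  # bytes in the full score matrix
    chunks_count = min(query_tokens, math.ceil((out_size - 1) / CHUNK_BYTES))
    chunks_count = max(1, chunks_count)  # Guard added in this sketch only.
    chunk_step = max(1, int(query_tokens / chunks_count))
    return chunks_count, chunk_step
```

A 4096x4096 float16 score matrix fits in one chunk, while a 32768x32768 one is split into four steps of 8192 query rows each.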


@@ -52,67 +52,68 @@
}
},
"dependencies": {
"@atlaskit/pragmatic-drag-and-drop": "^1.4.0",
"@atlaskit/pragmatic-drag-and-drop-auto-scroll": "^1.4.0",
"@atlaskit/pragmatic-drag-and-drop": "^1.5.3",
"@atlaskit/pragmatic-drag-and-drop-auto-scroll": "^2.1.0",
"@atlaskit/pragmatic-drag-and-drop-hitbox": "^1.0.3",
"@dagrejs/dagre": "^1.1.4",
"@dagrejs/graphlib": "^2.2.4",
"@fontsource-variable/inter": "^5.1.0",
"@fontsource-variable/inter": "^5.2.5",
"@invoke-ai/ui-library": "^0.0.46",
"@nanostores/react": "^0.7.3",
"@reduxjs/toolkit": "2.6.1",
"@nanostores/react": "^1.0.0",
"@reduxjs/toolkit": "2.7.0",
"@roarr/browser-log-writer": "^1.3.0",
"@xyflow/react": "^12.4.2",
"@xyflow/react": "^12.6.0",
"async-mutex": "^0.5.0",
"chakra-react-select": "^4.9.2",
"cmdk": "^1.0.0",
"cmdk": "^1.1.1",
"compare-versions": "^6.1.1",
"filesize": "^10.1.6",
"fracturedjsonjs": "^4.0.2",
"framer-motion": "^11.10.0",
"i18next": "^23.15.1",
"i18next-http-backend": "^2.6.1",
"i18next": "^25.0.1",
"i18next-http-backend": "^3.0.2",
"idb-keyval": "^6.2.1",
"jsondiffpatch": "^0.6.0",
"konva": "^9.3.15",
"jsondiffpatch": "^0.7.3",
"konva": "^9.3.20",
"linkify-react": "^4.2.0",
"linkifyjs": "^4.2.0",
"lodash-es": "^4.17.21",
"lru-cache": "^11.0.1",
"lru-cache": "^11.1.0",
"mtwist": "^1.0.2",
"nanoid": "^5.0.7",
"nanostores": "^0.11.3",
"new-github-issue-url": "^1.0.0",
"overlayscrollbars": "^2.10.0",
"nanoid": "^5.1.5",
"nanostores": "^1.0.1",
"new-github-issue-url": "^1.1.0",
"overlayscrollbars": "^2.11.1",
"overlayscrollbars-react": "^0.5.6",
"perfect-freehand": "^1.2.2",
"query-string": "^9.1.0",
"query-string": "^9.1.1",
"raf-throttle": "^2.0.6",
"react": "^18.3.1",
"react-colorful": "^5.6.1",
"react-dom": "^18.3.1",
"react-dropzone": "^14.2.9",
"react-error-boundary": "^4.0.13",
"react-hook-form": "^7.53.0",
"react-dropzone": "^14.3.8",
"react-error-boundary": "^5.0.0",
"react-hook-form": "^7.56.1",
"react-hotkeys-hook": "4.5.0",
"react-i18next": "^15.0.2",
"react-icons": "^5.3.0",
"react-redux": "9.1.2",
"react-resizable-panels": "^2.1.4",
"react-textarea-autosize": "^8.5.7",
"react-use": "^17.5.1",
"react-virtuoso": "^4.12.5",
"react-i18next": "^15.5.1",
"react-icons": "^5.5.0",
"react-redux": "9.2.0",
"react-resizable-panels": "^2.1.8",
"react-textarea-autosize": "^8.5.9",
"react-use": "^17.6.0",
"react-virtuoso": "^4.12.6",
"redux-dynamic-middlewares": "^2.2.0",
"redux-remember": "^5.1.0",
"redux-remember": "^5.2.0",
"redux-undo": "^1.1.0",
"rfdc": "^1.4.1",
"roarr": "^7.21.1",
"serialize-error": "^11.0.3",
"socket.io-client": "^4.8.0",
"stable-hash": "^0.0.4",
"use-debounce": "^10.0.3",
"serialize-error": "^12.0.0",
"socket.io-client": "^4.8.1",
"stable-hash": "^0.0.5",
"use-debounce": "^10.0.4",
"use-device-pixel-ratio": "^1.1.2",
"uuid": "^10.0.0",
"zod": "^3.23.8",
"uuid": "^11.1.0",
"zod": "^3.24.3",
"zod-validation-error": "^3.4.0"
},
"peerDependencies": {
@@ -122,45 +123,46 @@
"devDependencies": {
"@invoke-ai/eslint-config-react": "^0.0.14",
"@invoke-ai/prettier-config-react": "^0.0.7",
"@storybook/addon-essentials": "^8.3.4",
"@storybook/addon-interactions": "^8.3.4",
"@storybook/addon-links": "^8.3.4",
"@storybook/addon-storysource": "^8.3.4",
"@storybook/manager-api": "^8.3.4",
"@storybook/react": "^8.3.4",
"@storybook/react-vite": "^8.5.5",
"@storybook/theming": "^8.3.4",
"@storybook/addon-essentials": "^8.6.12",
"@storybook/addon-interactions": "^8.6.12",
"@storybook/addon-links": "^8.6.12",
"@storybook/addon-storysource": "^8.6.12",
"@storybook/manager-api": "^8.6.12",
"@storybook/react": "^8.6.12",
"@storybook/react-vite": "^8.6.12",
"@storybook/theming": "^8.6.12",
"@types/lodash-es": "^4.17.12",
"@types/node": "^20.16.10",
"@types/node": "^22.15.1",
"@types/react": "^18.3.11",
"@types/react-dom": "^18.3.0",
"@types/uuid": "^10.0.0",
"@vitejs/plugin-react-swc": "^3.8.0",
"@vitest/coverage-v8": "^3.0.6",
"@vitest/ui": "^3.0.6",
"concurrently": "^8.2.2",
"@vitejs/plugin-react-swc": "^3.9.0",
"@vitest/coverage-v8": "^3.1.2",
"@vitest/ui": "^3.1.2",
"concurrently": "^9.1.2",
"csstype": "^3.1.3",
"dpdm": "^3.14.0",
"eslint": "^8.57.1",
"eslint-plugin-i18next": "^6.1.0",
"eslint-plugin-i18next": "^6.1.1",
"eslint-plugin-path": "^1.3.0",
"knip": "^5.31.0",
"knip": "^5.50.5",
"openapi-types": "^12.1.3",
"openapi-typescript": "^7.4.1",
"prettier": "^3.3.3",
"rollup-plugin-visualizer": "^5.12.0",
"storybook": "^8.3.4",
"tsafe": "^1.7.5",
"type-fest": "^4.26.1",
"typescript": "^5.6.2",
"vite": "^6.1.0",
"openapi-typescript": "^7.6.1",
"prettier": "^3.5.3",
"rollup-plugin-visualizer": "^5.14.0",
"storybook": "^8.6.12",
"tsafe": "^1.8.5",
"type-fest": "^4.40.0",
"typescript": "^5.8.3",
"vite": "^6.3.3",
"vite-plugin-css-injected-by-js": "^3.5.2",
"vite-plugin-dts": "^4.5.0",
"vite-plugin-dts": "^4.5.3",
"vite-plugin-eslint": "^1.8.1",
"vite-tsconfig-paths": "^5.1.4",
"vitest": "^3.0.6"
"vitest": "^3.1.2"
},
"engines": {
"pnpm": "8"
}
},
"packageManager": "pnpm@8.15.9+sha512.499434c9d8fdd1a2794ebf4552b3b25c0a633abcee5bb15e7b5de90f32f47b513aca98cd5cfd001c31f0db454bc3804edccd578501e4ca293a6816166bbd9f81"
}

File diff suppressed because it is too large

@@ -116,7 +116,20 @@
"combinatorial": "Kombinatorisch",
"saveChanges": "Änderungen speichern",
"error_withCount_one": "{{count}} Fehler",
"error_withCount_other": "{{count}} Fehler"
"error_withCount_other": "{{count}} Fehler",
"value": "Wert",
"label": "Label",
"systemInformation": "Systeminformationen",
"search": "Suche",
"clear": "Zurücksetzen",
"fullView": "Vollansicht",
"compactView": "Kompaktansicht",
"options_withCount_one": "{{count}} Option",
"options_withCount_other": "{{count}} Optionen",
"noOptions": "Keine Optionen",
"noMatches": "Keine Treffer",
"model_withCount_one": "{{count}} Modell",
"model_withCount_other": "{{count}} Modelle"
},
"gallery": {
"galleryImageSize": "Bildgröße",
@@ -695,7 +708,10 @@
"guidance": "Führung",
"coherenceMode": "Modus",
"recallMetadata": "Metadaten abrufen",
"gaussianBlur": "Gaußsche Unschärfe"
"gaussianBlur": "Gaußsche Unschärfe",
"sendToUpscale": "An Hochskalieren senden",
"useCpuNoise": "CPU-Rauschen verwenden",
"sendToCanvas": "An Leinwand senden"
},
"settings": {
"displayInProgress": "Zwischenbilder anzeigen",
@@ -1328,7 +1344,8 @@
"loadWorkflowDesc2": "Ihr aktueller Arbeitsablauf enthält nicht gespeicherte Änderungen.",
"loadingTemplates": "Lade {{name}}",
"missingSourceOrTargetHandle": "Fehlender Quell- oder Zielgriff",
"missingSourceOrTargetNode": "Fehlender Quell- oder Zielknoten"
"missingSourceOrTargetNode": "Fehlender Quell- oder Zielknoten",
"showEdgeLabelsHelp": "Beschriftungen an Kanten anzeigen, um die verknüpften Knoten zu kennzeichnen"
},
"hrf": {
"enableHrf": "Korrektur für hohe Auflösungen",

Some files were not shown because too many files have changed in this diff