Compare commits

..

277 Commits

Author SHA1 Message Date
psychedelicious
c451f52ea3 chore(ui): lint 2024-08-22 21:00:09 +10:00
psychedelicious
8a2c78f2e1 fix(ui): dynamic prompts not recalculating when deleting or updating a style preset
The root cause was the active style preset not being reset when it was deleted, or no longer present in the list of style presets.

- Add extra reducer to `stylePresetSlice` to reset the active preset if it is deleted or otherwise unavailable
- Update the dynamic prompts listener to trigger on delete/update/list of style presets
2024-08-22 21:00:09 +10:00
psychedelicious
bcc78bde9b chore: bump version to v4.2.8 2024-08-22 21:00:09 +10:00
Васянатор
054bb6fe0a translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1367 of 1367 strings)

Co-authored-by: Васянатор <ilabulanov339@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
2024-08-22 13:09:56 +10:00
Riccardo Giovanetti
4f4aa6d92e translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1346 of 1367 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.4% (1346 of 1367 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-08-22 13:09:56 +10:00
Hosted Weblate
eac51ac6f5 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2024-08-22 13:09:56 +10:00
psychedelicious
9f349a7c0a fix(ui): do not constrain width of hide/show boards button
lets translations display fully
2024-08-22 11:36:07 +10:00
psychedelicious
918afa5b15 fix(ui): show more of current board name 2024-08-22 11:36:07 +10:00
psychedelicious
eb1113f95c feat(ui): add translation string for "Upscale" 2024-08-22 11:36:07 +10:00
psychedelicious
4f4ba7b462 tidy(ui): clean up ActiveStylePreset markup 2024-08-21 09:06:41 +10:00
Mary Hipp
2298be0e6b fix(ui): error handling if unable to convert image URL to blob 2024-08-21 09:06:41 +10:00
Mary Hipp
63494dfca7 remove extra slash in exports path 2024-08-21 09:06:41 +10:00
Mary Hipp
36a1d39454 fix(ui): handle badge styling when template name is long 2024-08-21 09:06:41 +10:00
Mary Hipp
a6f6d5c400 fix(ui): add loading state to button when creating or updating a style preset 2024-08-21 09:06:41 +10:00
Mary Hipp
e85f221aca fix(ui): clear prompt template when prompts are recalled 2024-08-21 09:04:35 +10:00
Mary Hipp
d4797e37dc fix(ui): properly unwrap delete style preset API request so that error is caught 2024-08-19 16:12:39 -04:00
Mary Hipp
3e7923d072 fix(api): allow updating of type for style preset 2024-08-19 16:12:39 -04:00
psychedelicious
a85d69ce3d tidy(ui): getViewModeChunks.tsx -> .ts 2024-08-19 08:25:39 +10:00
psychedelicious
96db006c99 fix(ui): edge case with getViewModeChunks 2024-08-19 08:25:39 +10:00
psychedelicious
8ca57d03d8 tests(ui): add tests for getViewModeChunks 2024-08-19 08:25:39 +10:00
psychedelicious
6c404ce5f8 fix(ui): prompt template preset preview out of order 2024-08-19 08:25:39 +10:00
psychedelicious
584e07182b fix(ui): use translations for style preset strings 2024-08-17 21:27:53 +10:00
psychedelicious
f787e9acf6 chore: bump version v4.2.8rc2 2024-08-16 21:47:06 +10:00
psychedelicious
5a24b89e54 fix(app): include style preset defaults in build 2024-08-16 21:47:06 +10:00
psychedelicious
9b482e2a4f chore: bump version to v4.2.8rc1 2024-08-16 10:53:19 +10:00
Max
df4dbe2d57 Fix invoke.sh not detecting symlinks
When invoke.sh is executed using a symlink with a working directory outside of InvokeAI's root directory, it will fail.

invoke.sh attempts to cd into the correct directory at the start of the script, but will cd into the directory of the symlink instead. This commit fixes that.
2024-08-16 10:40:59 +10:00
psychedelicious
713bd11177 feat(ui, api): prompt template export (#6745)
## Summary

Adds option to download all prompt templates to a CSV

## Related Issues / Discussions

<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->

## QA Instructions

<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->

## Merge Plan

<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-08-16 10:38:50 +10:00
psychedelicious
182571df4b Merge branch 'main' into maryhipp/export-presets 2024-08-16 10:17:07 +10:00
psychedelicious
29bfe492b6 ui: translations update from weblate (#6746)
Translations update from [Hosted Weblate](https://hosted.weblate.org)
for [InvokeAI/Web
UI](https://hosted.weblate.org/projects/invokeai/web-ui/).



Current translation status:

![Weblate translation
status](https://hosted.weblate.org/widget/invokeai/web-ui/horizontal-auto.svg)
2024-08-16 10:16:51 +10:00
psychedelicious
3fb4e3050c feat(ui): focus in textarea after inserting placeholder 2024-08-16 10:14:25 +10:00
psychedelicious
39c7ec3cd9 feat(ui): per type fallbacks for templates 2024-08-16 10:11:43 +10:00
psychedelicious
26bfbdec7f feat(ui): use buttons instead of menu for preset import/export 2024-08-16 09:58:19 +10:00
psychedelicious
7a3eaa8da9 feat(api): save file as prompt_templates.csv 2024-08-16 09:51:46 +10:00
Mary Hipp
599db7296f export only user style presets 2024-08-15 16:07:32 -04:00
Riccardo Giovanetti
042aab4295 translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1340 of 1359 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-08-15 20:44:02 +02:00
Mary Hipp
24f298283f clean up, add context menu to import/download templates 2024-08-15 12:39:55 -04:00
Mary Hipp
68dac6349d Merge remote-tracking branch 'origin/main' into maryhipp/export-presets 2024-08-15 11:21:56 -04:00
chainchompa
b675fc19e8 feat: add base prop for selectedWorkflow to allow loading a workflow on launch (#6742)
## Summary
added a base prop for selectedWorkflow to allow loading a workflow on
launch

<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->

## Related Issues / Discussions

<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->

## QA Instructions
can test by loading InvokeAIUI with a selectedWorkflow prop of the
workflow ID
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->

## Merge Plan

<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-08-15 10:52:23 -04:00
chainchompa
659019cfd6 Merge branch 'main' into chainchompa/preselect-workflows 2024-08-15 10:40:44 -04:00
Mary Hipp
dcd61e1f82 pin ruff version in python check gha 2024-08-15 09:47:49 -04:00
Mary Hipp
f5c99b1488 exclude jupyter notebooks from ruff 2024-08-15 09:47:49 -04:00
Mary Hipp
810be3e1d4 update import directions to include JSON 2024-08-15 09:47:49 -04:00
psychedelicious
60d754d1df feat(api): tidy style presets import logic
- Extract parsing into utility function
- Log import errors
- Forbid extra properties on the imported data
2024-08-15 09:47:49 -04:00
psychedelicious
bd07c86db9 feat(ui): make style preset menu trigger look like button 2024-08-15 09:47:49 -04:00
psychedelicious
bcbf8b6bd8 feat(ui): revert to using {prompt} for prompt template placeholder 2024-08-15 09:47:49 -04:00
psychedelicious
356661459b feat(api): support JSON for preset imports
This allows us to support Fooocus format presets.
2024-08-15 09:47:49 -04:00
psychedelicious
deb917825e feat(api): use pydantic validation during style preset import
- Enforce name is present and not an empty string
- Provide empty string as default for positive and negative prompt
- Add `positive_prompt` as validation alias for `prompt` field
- Strip whitespace automatically
- Create `TypeAdapter` to validate the whole list in one go
2024-08-15 09:47:49 -04:00
psychedelicious
15415c6d85 feat(ui): use dropzone for style preset upload
Easier to accept multiple file types and supper drag and drop in the future.
2024-08-15 09:47:49 -04:00
Mary Hipp
76b0380b5f feat(ui): create component to upload CSV of style presets to import 2024-08-15 09:47:49 -04:00
Mary Hipp
2d58754789 feat(api): add endpoint to take a CSV, parse it, validate it, and create many style preset entries 2024-08-15 09:47:49 -04:00
chainchompa
9cdf1f599c Merge branch 'main' into chainchompa/preselect-workflows 2024-08-15 09:25:19 -04:00
chainchompa
268be97ba0 remove ref, make options optional for useGetLoadWorkflow 2024-08-15 09:18:41 -04:00
Mary Hipp
a9014673a0 wip export 2024-08-15 09:00:11 -04:00
psychedelicious
d36c43a10f ui: translations update from weblate (#6727)
Translations update from [Hosted Weblate](https://hosted.weblate.org)
for [InvokeAI/Web
UI](https://hosted.weblate.org/projects/invokeai/web-ui/).



Current translation status:

![Weblate translation
status](https://hosted.weblate.org/widget/invokeai/web-ui/horizontal-auto.svg)
2024-08-15 08:48:03 +10:00
Phrixus2023
54a5c4e482 translationBot(ui): update translation (Chinese (Simplified))
Currently translated at 98.1% (1296 of 1320 strings)

Co-authored-by: Phrixus2023 <920414016@qq.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/zh_Hans/
Translation: InvokeAI/Web UI
2024-08-15 00:46:01 +02:00
Riccardo Giovanetti
5e09a244e3 translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1336 of 1355 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.5% (1302 of 1321 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.6% (1302 of 1320 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-08-15 00:46:01 +02:00
chainchompa
88648dca1a change selectedWorkflow to selectedWorkflowId 2024-08-14 11:22:37 -04:00
chainchompa
8840df2b00 Merge branch 'main' into chainchompa/preselect-workflows 2024-08-14 09:02:12 -04:00
chainchompa
af159acbdf cleanup 2024-08-14 08:58:38 -04:00
chainchompa
471719bbbe add base prop for selectedWorkflow to allow loading a workflow on launch 2024-08-14 08:47:02 -04:00
psychedelicious
b126f2ffd5 feat(ui, api): prompt templates (#6729)
## Summary

Adds prompt templates to the UI. Demo video is attached.
* added default prompt templates to seed database on startup (these
cannot be edited or deleted by users via the UI)
* can create fresh prompt template, create from an image in gallery that
has prompt metadata, or copy an existing prompt template and modify
* if a template is active, can view what your prompt will be invoked as
by switching to "view mode"



https://github.com/user-attachments/assets/32d84e0c-b04c-48da-bae5-aa6eb685d209



## Related Issues / Discussions

<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->

## QA Instructions

<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->

## Merge Plan

<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-08-14 12:49:31 +10:00
psychedelicious
9938f12ef0 Merge branch 'main' into maryhipp/style-presets 2024-08-14 12:33:30 +10:00
psychedelicious
982c266073 tidy: remove extra characters in prompt templates 2024-08-14 12:31:57 +10:00
psychedelicious
5c37391883 fix(ui): do not show [prompt] in preset preview 2024-08-14 12:29:05 +10:00
psychedelicious
ddeafc6833 fix(ui): minimize layout shift when overlaying preset prompt preview 2024-08-14 12:24:57 +10:00
psychedelicious
41b2d5d013 fix(ui): prompt preview not working preset starts with [prompt] 2024-08-14 12:21:38 +10:00
psychedelicious
29d6f48901 fix(ui): prompt shows thru prompt label text 2024-08-14 12:01:49 +10:00
psychedelicious
d5c9f4e47f chore(ui): revert framer-motion upgrade
`framer-motion` 11 breaks a lot of stuff in profoundly unintuitive ways, holy crap. UI lib rolled back its dep, pulling in latest version of that
2024-08-14 06:12:00 +10:00
psychedelicious
24d73387d8 build(ui): fix chakra deps
We had multiple versions of @emotion/react, stemming from an extraneous dependency on @chakra-ui/react. Removed the extraneosu dep
2024-08-14 06:12:00 +10:00
Mary Hipp
e0d3927265 feat: add flag for allowPrivateStylePresets that shows a type field when creating a style preset 2024-08-13 14:08:54 -04:00
Mary Hipp
e5f7c2a9b7 add type safety / validation to form data payloads and allow type to be passed through api 2024-08-13 13:00:31 -04:00
Mary Hipp
b0760710d5 add the rest of default style presets, update image service to return default images correctly by name, add tooltip popover to images in UI 2024-08-13 11:33:15 -04:00
Mary Hipp
764accc921 update config docstring 2024-08-12 15:17:40 -04:00
Mary Hipp
6a01fce9c1 fix payloads for stringified data 2024-08-12 15:16:22 -04:00
Mary Hipp
9c732ac3b1 Merge remote-tracking branch 'origin/main' into maryhipp/style-presets 2024-08-12 14:53:45 -04:00
Mary Hipp
b70891c661 update descriptoin of placeholder in modal 2024-08-12 13:37:04 -04:00
Mary Hipp
4dbf851741 ui: add labels to prompt boxes 2024-08-12 13:33:39 -04:00
Mary Hipp
6c927a9fd4 move mdoal state into nanostore 2024-08-12 12:46:02 -04:00
Mary Hipp
096f001634 ui: add ability to copy template 2024-08-12 12:32:31 -04:00
Mary Hipp
4837e578b2 api: update dir path for style preset images, update payload for create/update formdata 2024-08-12 12:00:14 -04:00
Mary Hipp
1e547ef912 UI more pr feedback 2024-08-12 11:59:25 -04:00
psychedelicious
f6b8970bd1 fix(app): create reference to events task to prevent accidental GC
This wasn't a problem, but it's advised in the official docs so I've done it.
2024-08-12 07:49:58 +10:00
psychedelicious
29325a7214 fix(app): use asyncio queue and existing event loop for events
Around the time we (I) implemented pydantic events, I noticed a short pause between progress images every 4 or 5 steps when generating with SDXL. It didn't happen with SD1.5, but I did notice that with SD1.5, we'd get 4 or 5 progress events simultaneously. I'd expect one event every ~25ms, matching my it/s with SD1.5. Mysterious!

Digging in, I found an issue is related to our use of a synchronous queue for events. When the event queue is empty, we must call `asyncio.sleep` before checking again. We were sleeping for 100ms.

Said another way, every time we clear the event queue, we have to wait 100ms before another event can be dispatched, even if it is put on the queue immediately after we start waiting. In practice, this means our events get buffered into batches, dispatched once every 100ms.

This explains why I was getting batches of 4 or 5 SD1.5 progress events at once, but not the intermittent SDXL delay.

But this 100ms wait has another effect when the events are put on the queue in intervals that don't perfectly line up with the 100ms wait. This is most noticeable when the time between events is >100ms, and can add up to 100ms delay before the event is dispatched.

For example, say the queue is empty and we start a 100ms wait. Then, immediately after - like 0.01ms later - we push an event on to the queue. We still need to wait another 99.9ms before that event will be dispatched. That's the SDXL delay.

The easy fix is to reduce the sleep to something like 0.01 seconds, but this feels kinda dirty. Can't we just wait on the queue and dispatch every event immediately? Not with the normal synchronous queue - but we can with `asyncio.Queue`.

I switched the events queue to use `asyncio.Queue` (as seen in this commit), which lets us asynchronous wait on the queue in a loop.

Unfortunately, I ran into another issue - events now felt like their timing was inconsistent, but in a different way than with the 100ms sleep. The time between pushing events on the queue and dispatching them was not consistently ~0ms as I'd expect - it was highly variable from ~0ms up to ~100ms.

This is resolved by passing the asyncio loop directly into the events service and using its methods to create the task and interact with the queue. I don't fully understand why this resolved the issue, because either way we are interacting with the same event loop (as shown by `asyncio.get_running_loop()`). I suppose there's some scheduling magic happening.
2024-08-12 07:49:58 +10:00
psychedelicious
8ecf72838d fix(api): image downloads with correct filename
Closes #6730
2024-08-10 09:53:56 -04:00
psychedelicious
c3ab8a6aa8 chore(ui): bump rest of deps 2024-08-10 07:45:23 -04:00
psychedelicious
1931aa3e70 chore(ui): typegen 2024-08-10 07:45:23 -04:00
psychedelicious
d3d8055055 feat(ui): update typegen script 2024-08-10 07:45:23 -04:00
psychedelicious
476b0a0403 chore(ui): bump openapi-typescript 2024-08-10 07:45:23 -04:00
psychedelicious
f66584713c fix(api): sort OpenAPI schema properties for InvocationOutputMap
This makes the schema output deterministic!
2024-08-10 07:45:23 -04:00
psychedelicious
33624fc2fa fix(api): duplicate operation id for get_image_full
There's a FastAPI bug that results in the OpenAPI spec outputting the same operation id for each operation when specifying multiple HTTP methods.

- Discussion: https://github.com/tiangolo/fastapi/discussions/8449
- Pending PR to fix: https://github.com/tiangolo/fastapi/pull/10694

In our case, we have a `get_image_full` endpoint that handles GET and HEAD.

This results in an invalid OpenAPI schema. A workaround is to use two route decorators for the operation handler. This works as expected - HEAD requests get the header, and GET requests get the resource. And the OpenAPI schema is valid.
2024-08-10 07:45:23 -04:00
Mary Hipp
41c3e73a3c fix tests 2024-08-09 16:31:42 -04:00
Mary Hipp
97553a7de2 API/DB updates per PR feedback 2024-08-09 16:27:37 -04:00
Mary Hipp
12ba15bfa9 UI updates per PR feedback 2024-08-09 16:00:13 -04:00
Mary Hipp
09d1e190e7 show warning for maxUpscaleDimension if model tab is disabled 2024-08-09 14:07:55 -04:00
Mary Hipp
8eb5d08499 missed translation 2024-08-08 16:01:16 -04:00
Mary Hipp
9be6acde7d require name to submit style preset 2024-08-08 15:53:21 -04:00
Mary Hipp
5f83bb0069 update config docstring 2024-08-08 15:20:43 -04:00
Mary Hipp
b138882abc fix tests? 2024-08-08 15:18:32 -04:00
Mary Hipp
0cd7cdb52e remove send2trash 2024-08-08 15:13:36 -04:00
Mary Hipp
1d8b7e2bcf ruff 2024-08-08 15:08:45 -04:00
Mary Hipp
6461f4758d lint fix 2024-08-08 15:07:58 -04:00
Mary Hipp
3189ab6863 get dynamic prompts working 2024-08-08 15:07:23 -04:00
Mary Hipp
3f9a674d4b seed default presets and handle them in UI 2024-08-08 15:02:41 -04:00
Mary Hipp
587f59b25b focus on prompt textarea when exiting view mode by clicking 2024-08-08 14:38:50 -04:00
Mary Hipp
4952eada87 ruff format 2024-08-08 14:22:40 -04:00
Mary Hipp
581029ebaa ruff 2024-08-08 14:21:37 -04:00
Mary Hipp
42d68780de lint 2024-08-08 14:19:33 -04:00
Mary Hipp
28032a2f80 more cleanup 2024-08-08 14:18:05 -04:00
Mary Hipp
e381e021e9 knip lint 2024-08-08 14:00:17 -04:00
Mary Hipp
641af64f93 regnerate schema 2024-08-08 13:58:25 -04:00
Mary Hipp
a7b83c8b5b Merge remote-tracking branch 'origin/main' into maryhipp/style-presets 2024-08-08 13:56:59 -04:00
Mary Hipp
4cc41e0188 translations and lint fix 2024-08-08 13:56:37 -04:00
Mary Hipp
442fc02429 resize images to 100x100 for style preset images 2024-08-08 12:56:55 -04:00
Mary Hipp
9a4d075074 fix path for style_preset_images, fix png type when converting blobs to files, built view mode components 2024-08-08 12:31:20 -04:00
Sergey Borisov
17ff8196cb Remove tmp code 2024-08-07 22:06:05 -04:00
Sergey Borisov
68f993998a Add support for norm layer 2024-08-07 22:06:05 -04:00
Sergey Borisov
7da6120b39 Fix LoKR refactor bug 2024-08-07 22:06:05 -04:00
blessedcoolant
6cd40965c4 Depth Anything V2 (#6674)
- Updated the previous DepthAnything manual implementation to use the
`transformers` implementation instead. So we can get upstream features.
- Plugged in the DepthAnything models to be handled by Invoke's Model
Manager.
- `small_v2` model will use DepthAnythingV2. This has been added as a
new model option and is now also the default in the Linear UI.


![opera_TxRhmbFole](https://github.com/user-attachments/assets/2a25abe3-ba0b-4f97-b75a-2ce5fd6246e6)


# Merge

Review and merge.
2024-08-07 20:26:58 +05:30
Kent Keirsey
408a1d6dbb Merge branch 'main' into depth_anything_v2 2024-08-07 10:45:56 -04:00
Mary Hipp
0b0abfbe8f clean up image implementation 2024-08-07 10:36:38 -04:00
Mary Hipp
cc96dcf0ed style preset images 2024-08-07 09:58:27 -04:00
Mary Hipp
2604fd9fde a whole bunch of stuff 2024-08-06 15:31:13 -04:00
Hosted Weblate
140670d00e translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

translationBot(ui): update translation files

Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2024-08-06 17:54:47 +10:00
Phrixus2023
70233fae5d translationBot(ui): update translation (Chinese (Simplified))
Currently translated at 98.1% (1296 of 1321 strings)

Co-authored-by: Phrixus2023 <920414016@qq.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/zh_Hans/
Translation: InvokeAI/Web UI
2024-08-06 17:54:47 +10:00
Alexander Eichhorn
6f457a6c4c translationBot(ui): update translation (German)
Currently translated at 65.1% (860 of 1321 strings)

Co-authored-by: Alexander Eichhorn <pfannkuchensack@einfach-doof.de>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2024-08-06 17:54:47 +10:00
B N
5c319f5356 translationBot(ui): update translation (German)
Currently translated at 64.8% (857 of 1321 strings)

Co-authored-by: B N <berndnieschalk@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2024-08-06 17:54:47 +10:00
Riccardo Giovanetti
991a04f090 translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1303 of 1321 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.6% (1302 of 1320 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.6% (1294 of 1312 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-08-06 17:54:47 +10:00
psychedelicious
c39fa75113 docs(ui): add comment in useIsTooLargeToUpscale 2024-08-06 11:49:35 +10:00
psychedelicious
f7863e17ce docs(ui): add docstring for maxUpscaleDimension 2024-08-06 11:49:35 +10:00
psychedelicious
7c526390ed fix(ui): compare upscaledPixels vs square of max dimension 2024-08-06 11:49:35 +10:00
Mary Hipp
2cff20f87a update translations, change config value to be dimension instead of total pixels 2024-08-06 11:49:35 +10:00
Mary Hipp
90ec757802 lint 2024-08-06 11:49:35 +10:00
Mary Hipp
4b85dfcefe (ui): restore optioanl limit on upcsale output resolution 2024-08-06 11:49:35 +10:00
Mary Hipp
21deefdc41 (ui): add image resolution badge to initial upscale image 2024-08-06 11:49:35 +10:00
Mary Hipp
857d74bbfe wip apply and calculate prompt with interpolation 2024-08-05 19:11:48 -04:00
Mary Hipp
fd7a635777 (ui) the most basic crud ui: view list of presets, create a new preset, edit/delete existing presets 2024-08-05 15:48:23 -04:00
Mary Hipp
af9110e964 fix prompt concat logic 2024-08-05 13:42:28 -04:00
Mary Hipp
a61209206b remove custom SDXL prompts component 2024-08-05 13:40:46 -04:00
Mary Hipp
e05cc62e5f add style presets API layer to UI 2024-08-05 13:37:07 -04:00
psychedelicious
4d4f921a4e build: exclude matplotlib 3.9.1
There was a problem w/ this release on windows and the builds were pulled from pypi. When installing invoke on windows, pip attempts to build from source, but most (all?) systems won't have the prerequisites for this and installs fail.

This also affects GH actions.

The simple fix is to exclude version 3.9.1 from our deps.

For more information, see https://github.com/matplotlib/matplotlib/issues/28551
2024-08-05 08:38:44 +10:00
psychedelicious
98db8f395b feat(app): clean up DiskImageStorage types 2024-08-04 09:43:20 +10:00
psychedelicious
f465a956a3 feat(ui): remove "images can be restored" messages 2024-08-04 09:43:20 +10:00
psychedelicious
9edb02d7ef build: remove send2trash dependency 2024-08-04 09:43:20 +10:00
psychedelicious
6c4cf58a31 feat(app): delete model_images instead of using send2trash 2024-08-04 09:43:20 +10:00
psychedelicious
08993c0d29 feat(app): delete images instead of using send2trash
Closes #6709
2024-08-04 09:43:20 +10:00
blessedcoolant
4f8a4b0f22 Merge branch 'main' into depth_anything_v2 2024-08-03 00:38:57 +05:30
blessedcoolant
a743f3c9b5 fix: implement model to func for depth anything 2024-08-03 00:37:17 +05:30
Mary Hipp
217fe40d99 feat(api): add style_presets router, make sure all CRUD is working, add is_default 2024-08-02 12:29:54 -04:00
Mary Hipp
b76bf50b93 feat(db,api): create new table for style presets, build out record storage service for style presets 2024-08-01 22:20:11 -04:00
Mary Hipp
571ba87e13 fix(ui): include upscale metadata for SDXL multidiffusion 2024-08-01 21:30:42 -04:00
Ryan Dick
f27b6e2b44 Add Grounded SAM support (text prompt image segmentation) (#6701)
## Summary

This PR enables Grounded SAM workflows
(https://arxiv.org/pdf/2401.14159) via the following:
- `GroundingDinoInvocation` for running a Grounding DINO model.
- `SegmentAnythingModelInvocation` for running a SAM model.
- `MaskTensorToImageInvocation` for convenient visualization.

Other notes:
- Uses the transformers implementation of Grounding DINO and SAM.
- The new models are treated as 'utility models' meaning that they are
not visible in the Models tab, and are downloaded automatically the
first time that they are used.

<img width="874" alt="image"
src="https://github.com/user-attachments/assets/1cbaa97d-0e27-4943-86b1-dc7327ba8675">

## Example

Input image

![be10ec0c-20a8-4ac7-840e-d1a05fffdb6a](https://github.com/user-attachments/assets/bf21572c-635d-4703-b4ab-7aba658a9671)

Prompt: "wheels", all other configs default
Result:

![2221c44e-64e6-4b18-b4cb-610514b7a554](https://github.com/user-attachments/assets/344b91f4-7f4a-4b70-8e2e-3b4a0e55176d)

## Related Issues / Discussions

Thanks to @blessedcoolant for the initial draft here:
https://github.com/invoke-ai/InvokeAI/pull/6678

## QA Instructions

Manual tests:
- [ ] Test that default settings work well.
- [ ] Test with / without apply_polygon_refinement
- [ ] Test mask_filter options
- [ ] Test detection_threshold values
- [ ] Test RGB input image
- [ ] Test RGBA input image
- [ ] Test grayscale input image
- [ ] Smoke test that an empty mask is returned when 0 objects are
detected
- [ ] Test on CPU
- [ ] Test on MPS (Works on Mac OS, but had to force both models to run
on CPU instead of MPS)

Performance:
- Peak GPU memory utilization with both Grounding DINO and SAM models
loaded is ~4.5GB. (The models do not need to be loaded at the same time,
so could be offloaded by the MM if needed.)
- On an RTX4090, with the models already cached, node execution takes
~0.6 secs.
- On my CPU, with the models cached, node execution takes ~10secs.

## Merge Plan

No special instructions.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
2024-08-01 20:40:18 +02:00
Ryan Dick
981475a624 Merge branch 'main' into ryan/grounded-sam 2024-08-01 20:30:35 +02:00
Ryan Dick
27ac61a4fb Expose all model options in the GroundingDinoInvocation and the SegmentAnythingInvocation. 2024-08-01 14:23:32 -04:00
Ryan Dick
675ffc2757 Remove BoundingBoxInvocation field name overrides. 2024-08-01 14:05:44 -04:00
Ryan Dick
44b21f10f1 Add a pydantic model_validator to BoundingBoxField to check the validity of the coords. 2024-08-01 14:00:57 -04:00
Ryan Dick
c6d49e8b1f Shorten SegmentAnythingInvocation and GroundingDinoInvocatino docstrings, since they are used as the invocation descriptions in the UI. 2024-08-01 10:17:42 -04:00
Ryan Dick
e6a512aa86 (minor) Tweak order of mask operations. 2024-08-01 10:12:24 -04:00
Ryan Dick
c3a6a6fb22 Rename SegmentAnythingModelInvocation -> SegmentAnythingInvocation. 2024-08-01 10:00:36 -04:00
Ryan Dick
b9dc3460ba Rename SegmentAnythingModel -> SegmentAnythingPipeline. 2024-08-01 09:57:47 -04:00
Ryan Dick
63581ec980 (minor) Add None check to fix static type checking error. 2024-08-01 09:51:53 -04:00
chainchompa
08b1feeed7 add base prop for destination to direct users to different tabs on initial load (#6706)
## Summary
- we want a way to load the studio while being directed to a specific
tab, introduced a destination prop to achieve that
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->

## Related Issues / Discussions

<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->

## QA Instructions

<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->

## Merge Plan

<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-07-31 19:25:36 -04:00
blessedcoolant
f5cfdcf32d feat: Add BoundingBox Primitive Node 2024-08-01 04:09:08 +05:30
chainchompa
e78fb428f0 simplify destination prop handling 2024-07-31 18:06:22 -04:00
chainchompa
31e270e32c add base prop for destination to direct users to different tabs 2024-07-31 17:20:51 -04:00
Ryan Dick
b5832768dc Return a MaskOutput from SegmentAnythingModelInvocation. And add a MaskTensorToImageInvocation. 2024-07-31 17:16:14 -04:00
Ryan Dick
4ce64b69cb Modular backend - LoRA/LyCORIS (#6667)
## Summary

Code for lora patching from #6577.
Additionally made it the way, that lora can patch not only `weight`, but
also `bias`, because saw some loras which doing it.

## Related Issues / Discussions

#6606 

https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d

## QA Instructions

Run with and without set `USE_MODULAR_DENOISE` environment.

## Merge Plan

Replace old lora patcher with new after review done.
If you think that there should be some kind of tests - feel free to add.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-07-31 21:31:31 +02:00
Ryan Dick
5a9173f766 Merge branch 'main' into stalker-modular_lora 2024-07-31 15:13:22 -04:00
Ryan Dick
0bb7ed44f6 Add some docs to OriginalWeightsStorage and fix type hints. 2024-07-31 15:08:24 -04:00
blessedcoolant
332bc9da5b fix: Update depth anything node default to v2 2024-07-31 23:52:29 +05:30
blessedcoolant
08def3da95 fix: Update canvas depth anything processor default to v2 2024-07-31 23:50:13 +05:30
blessedcoolant
daf899f9c4 fix: Move the manual image resizing out of the depth anything pipeline 2024-07-31 23:38:12 +05:30
blessedcoolant
13fb2d1f49 fix: Add Depth Anything V2 as a new option
It is also now the default in the UI replacing Depth Anything V1 small
2024-07-31 23:29:43 +05:30
blessedcoolant
95dde802ea fix: assert the return depth map to be a PIL image 2024-07-31 23:22:01 +05:30
Ryan Dick
fca119773b Split invokeai/backend/image_util/segment_anything/ dir into grounding_dino/ and segment_anything/ 2024-07-31 12:28:47 -04:00
Ryan Dick
0193267a53 Split GroundedSamInvocation into GroundingDinoInvocation and SegmentAnythingModelInvocation. 2024-07-31 12:20:23 -04:00
blessedcoolant
b4cf78a95d fix: make DA Pipeline a subclass of RawModel 2024-07-31 21:14:49 +05:30
Ryan Dick
73386826d6 Make GroundingDinoPipeline and SegmentAnythingModel subclasses of RawModel for type checking purposes. 2024-07-31 10:25:34 -04:00
Ryan Dick
9f448fecb7 Move invokeai/backend/grounded_sam -> invokeai/backend/image_util/grounded_sam 2024-07-31 10:00:30 -04:00
Ryan Dick
bcd1483a14 Re-order GroundedSAMInvocation._to_numpy_masks(...) to do slightly more work on the GPU. 2024-07-31 09:51:14 -04:00
Ryan Dick
e206890e25 Use staticmethods rather than inner functions for the Grounding DINO and SAM model loaders. 2024-07-31 09:28:52 -04:00
Ryan Dick
0a7048f650 (minor) Simplify GroundedSAMInvocation._merge_masks(...). 2024-07-31 08:58:51 -04:00
Ryan Dick
e8ecf5e155 (minor) Move apply_polygon_refinement condition up a layer. 2024-07-31 08:50:56 -04:00
Ryan Dick
33e8604b57 Make Grounding DINO DetectionResult a Pydantic model. 2024-07-31 08:47:00 -04:00
Ryan Dick
cec7399366 (minor) Use a new variable name to satisfy type checks. 2024-07-31 08:27:01 -04:00
Ryan Dick
bdae81e429 (minor) Simplify GroundedSAMInvocation._filter_detections() 2024-07-31 08:25:19 -04:00
Ryan Dick
67c32f3d6c Fix typo: zip(..., strict=True) 2024-07-31 08:15:28 -04:00
blessedcoolant
94d64b8a78 Fix gradient mask values range (#6688)
## Summary

Gradient mask node outputs mask tensor with values in range [-1, 1],
which unexpected range for mask.
It handled in denoise node the way it translates to [0, 2] mask, which
looks even more wrongly)
From discussion with @dunkeroni I understand him as he thought that
negative values will be treated same as 0, so clamping values not change
intended node logic.

## Related Issues / Discussions

#6643 

## QA Instructions

\-

## Merge Plan

\-

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-07-31 06:37:32 +05:30
blessedcoolant
fa3c0c81b3 Merge branch 'main' into stalker7779/fix_gradient_mask 2024-07-31 06:30:44 +05:30
blessedcoolant
66547b99c1 Add more karras schedulers (#6695)
## Summary

Add karras variants of `deis`, `unipc`, `kdpm2` and `kdpm_2_a`
schedulers.
Also added `dpmpp_3` schedulers, but `dpmpp_3s` currently bugged, so
added only 3m:
https://github.com/huggingface/diffusers/issues/9007

## Related Issues / Discussions

\-

## QA Instructions

\-

## Merge Plan

~@psychedelicious We need to decide what to do with schedulers order, as
it looks a bit broken:~

![image](https://github.com/user-attachments/assets/e41674af-d87c-4432-8014-c90bd86965a6)

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-07-31 06:09:26 +05:30
blessedcoolant
328e58be4c Merge branch 'main' into stalker7779/new_karras_schedulers 2024-07-31 05:56:13 +05:30
blessedcoolant
18f89ed5ed fix: Make DepthAnything work with Invoke's Model Management 2024-07-31 03:57:54 +05:30
Ryan Dick
5701c79fab Prevent Grounding DINO and Segment Anything from being moved to MPS - they don't work on MPS devices. 2024-07-30 23:04:15 +02:00
Ryan Dick
2da9f913f3 Add detection_result.py - was forgotten in a prior commit 2024-07-30 16:04:29 -04:00
Ryan Dick
6b10b59abe Make GroundedSAMInvocation work with any input image mode (RGB, RGBA, grayscale). 2024-07-30 15:55:57 -04:00
Ryan Dick
918f77bce0 Move some logic from GroundedSAMInvocation to the backend classes. 2024-07-30 15:34:33 -04:00
blessedcoolant
f170697ebe Merge branch 'main' into depth_anything_v2 2024-07-31 00:53:32 +05:30
blessedcoolant
556c6a1d84 fix: Update DepthAnything to use the transformers implementation 2024-07-31 00:51:55 +05:30
Ryan Dick
aca2a2fa13 Add mask_filter and detection_threshold options to the GroundedSAMInvocation. 2024-07-30 14:22:40 -04:00
Ryan Dick
ff6398f7d8 Add a GroundedSamInvocation for image segmentation from a text prompt (Grounding DINO + Segment Anything Model). 2024-07-30 11:12:26 -04:00
Sergey Borisov
cf996472b9 Suggested changes
Co-Authored-By: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2024-07-30 04:50:56 +03:00
Sergey Borisov
156d14c349 Run api regen 2024-07-30 04:05:21 +03:00
Sergey Borisov
86f705bf48 Optimize weights handling 2024-07-30 03:39:01 +03:00
Sergey Borisov
1fd9631f2d Comments fix
Co-Authored-By: Ryan Dick <14897797+RyanJDick@users.noreply.github.com>
2024-07-30 00:39:50 +03:00
Sergey Borisov
2227a2357f Suggested changes + simplify weights logic in patching
Co-Authored-By: Ryan Dick <14897797+RyanJDick@users.noreply.github.com>
2024-07-30 00:34:37 +03:00
Sergey Borisov
58e7ab157d Ruff format 2024-07-29 22:59:17 +03:00
Sergey Borisov
8d16fa6a49 Remove dpmpp_3s schedulers as it bugged now 2024-07-29 22:55:45 +03:00
Sergey Borisov
55e810efa3 Add dpmpp_3 schedulers 2024-07-29 22:52:15 +03:00
chainchompa
2755316021 update delete board modal to be more descriptive (#6690)
## Summary

<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->

## Related Issues / Discussions

<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->

## QA Instructions

<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->

## Merge Plan

<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->

## Checklist

- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-07-29 13:43:17 -04:00
chainchompa
6525f18610 Merge branch 'main' into chainchompa/board-delete-info 2024-07-29 12:52:36 -04:00
Ryan Dick
2ad13ac7eb Modular backend - inpaint (#6643)
## Summary

Code for inpainting and inpaint models handling from
https://github.com/invoke-ai/InvokeAI/pull/6577.
Separated in 2 extensions as discussed briefly before, so wait for
discussion about such implementation.

## Related Issues / Discussions

#6606

https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d

## QA Instructions

Run with and without set `USE_MODULAR_DENOISE` environment.
Try and compare outputs between backends in cases:
- Normal generation on inpaint model
- Inpainting on inpaint model
- Inpainting on normal model

## Merge Plan

Nope.
If you think that there should be some kind of tests - feel free to add.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-07-29 10:27:25 -04:00
Ryan Dick
693a3eaff5 Merge branch 'main' into stalker-modular_inpaint-2 2024-07-29 10:14:45 -04:00
chainchompa
ffca792d5b edited copy for deleted boards message 2024-07-29 09:46:08 -04:00
Sergey Borisov
86a92bb6b5 Add more karras schedulers 2024-07-29 15:14:34 +03:00
psychedelicious
171a4e6d80 fix(ui): race condition when deleting a board and resetting selected/auto-add
We were checking the selected and auto-add board ids against the query cache to see if they still exist. If not, we reset.

This only works if the query cache is updated by the time we do the check - race condition!

We already have the board id from the query args, so there's no need to check the query cache - just compare the deleted board ID directly.

Previously this file's several listeners were all in a single one and I had adapted/split its logic up a bit wonkily, introducing these problems.
2024-07-29 11:36:03 +10:00
psychedelicious
e3a75a8adf fix(ui): fix logic to reset selected/auto-add boards when toggling show archived boards
The logic was incorrect in two ways:
1. We only ran the logic if we _enable_ showing archived boards. It should be run we we _disable_ showing archived boards.
2. If we couldn't find the selected board in the query cache, we didn't do the reset. This is wrong - if the board isn't in the query cache, we _should_ do the reset. This inverted logic makes more sense before the fix for issue 1.
2024-07-29 11:36:03 +10:00
Ryan Dick
ee7503ce13 Modular backend - T2I Adapter (#6662)
## Summary

T2I Adapter code from #6577.

## Related Issues / Discussions

#6606 

https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d

## QA Instructions

Run with and without set `USE_MODULAR_DENOISE` environment.

## Merge Plan

Nope.
If you think that there should be some kind of tests - feel free to add.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-07-28 15:52:04 -04:00
Sergey Borisov
8500bac3ca Use logger for warning 2024-07-28 22:51:52 +03:00
Ryan Dick
310719eb4c Merge branch 'main' into stalker-modular_t2i_adapter 2024-07-28 15:30:00 -04:00
Ryan Dick
e8e24822ec Modular backend - Seamless (#6651)
## Summary

Seamless code from #6577.

## Related Issues / Discussions

#6606 

https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d

## QA Instructions

Run with and without set `USE_MODULAR_DENOISE` environment.

## Merge Plan

Nope.
If you think that there should be some kind of tests - feel free to add.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
2024-07-28 13:57:38 -04:00
Ryan Dick
c57a7afb87 Merge branch 'main' into stalker7779/modular_seamless 2024-07-28 13:49:43 -04:00
Sergey Borisov
84d028898c Revert wrong comment copy 2024-07-27 13:20:58 +03:00
Sergey Borisov
ed0174fbc6 Suggested changes
Co-Authored-By: Ryan Dick <14897797+RyanJDick@users.noreply.github.com>
2024-07-27 13:18:28 +03:00
Sergey Borisov
9e582563eb Suggested changes
Co-Authored-By: Ryan Dick <14897797+RyanJDick@users.noreply.github.com>
2024-07-27 04:25:15 +03:00
Sergey Borisov
faa88f72bf Make lora as separate extensions 2024-07-27 02:39:53 +03:00
chainchompa
0d69a31df0 Merge branch 'main' into chainchompa/board-delete-info 2024-07-26 14:03:18 -04:00
brandonrising
daa5a88eb2 Update docker image to use pnpm version 8 2024-07-26 13:57:33 -04:00
Sergey Borisov
5b84e117b2 Suggested changes
Co-Authored-By: Ryan Dick <14897797+RyanJDick@users.noreply.github.com>
2024-07-26 20:51:12 +03:00
chainchompa
eb257d2d28 update delete board modal to be more descriptive 2024-07-26 13:34:25 -04:00
Sergey Borisov
5810cee6c9 Suggested changes
Co-Authored-By: Ryan Dick <14897797+RyanJDick@users.noreply.github.com>
2024-07-26 19:47:28 +03:00
Sergey Borisov
eef88d1f83 Update gradient mask node version 2024-07-26 19:33:41 +03:00
Sergey Borisov
78f6850fc0 Fix gradient mask values range 2024-07-26 19:28:00 +03:00
Sergey Borisov
bd8890be11 Revert "Fix create gradient mask node output"
This reverts commit 9d1fcba415.
2024-07-26 19:24:46 +03:00
Sergey Borisov
adf1a977ea Suggested changes
Co-Authored-By: Ryan Dick <14897797+RyanJDick@users.noreply.github.com>
2024-07-26 19:22:26 +03:00
Mary Hipp
e1509bcb45 bump version to 4.2.7 2024-07-26 09:11:17 -07:00
psychedelicious
edcaf8287d feat(app): remove beta from multidiffusion workflows 2024-07-26 13:47:51 +10:00
psychedelicious
39bd30f2a0 feat(app): update default workflows
- Update `MultiDiffusion SDXL (Beta)`
- Add `MultiDiffusion SD1.5 (Beta)`
2024-07-26 13:47:51 +10:00
psychedelicious
102b47190f feat(ui): update qr code cnet starter model
- For SD1.5, use the new V2 version
- Add the SDXL version
2024-07-26 13:34:32 +10:00
Mary Hipp
269fe2e3bb track accordions in tabs separately so open/close state isnt shared 2024-07-26 08:20:24 +10:00
Mary Hipp
b32aa1c77f fix missing quote in translation 2024-07-26 08:20:24 +10:00
Mary Hipp Rogers
6656544ed5 tooltip copy updates
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2024-07-26 08:20:24 +10:00
Mary Hipp
4c75b93410 feat(ui): add informational popovers for upscale params 2024-07-26 08:20:24 +10:00
Mary Hipp
5be0de967d feat(ui): close generation and advanced accordions when switching to upscale tab 2024-07-26 08:20:24 +10:00
psychedelicious
f8e27b837b fix(ui): memoize model manager components 2024-07-26 07:52:10 +10:00
psychedelicious
47414be1e6 fix(ui): dropped model config cache breaking model edit UI
The model edit UI's composition allows for the model edit form to be instantiated before the model's config has been received. This results in the form having no values - all the fields are blank instead of populated by the model config.

Part of the fix is to pass the model config around directly instead of relying on _all_ components to fetch the model directly.

I also fixed a crapload of performance issues related to improper use of redux selectors.
2024-07-26 07:52:10 +10:00
psychedelicious
74cef38bcf fix(backend): add refiner to single-file load_classes
Fixes single-file refiner loading.
2024-07-26 05:08:01 +10:00
psychedelicious
bb876b8d4e fix(ui): copied edges must have new ids set
Problems this was causing:
- Deleting an edge was a copy of another edge deletes both edges
- Deleting a node that was a copy-with-edges of another node deletes its edges and it's original edges, leaving what I will call "ghost noodles" behind
2024-07-26 04:54:33 +10:00
psychedelicious
ba747373db feat(ui): add button to disable info popovers from info popover 2024-07-25 08:06:41 -04:00
psychedelicious
95661c8b21 feat(ui): enable info popovers by default 2024-07-25 08:06:41 -04:00
blessedcoolant
e5d9ca013e fix: use v1 models for large and base versions 2024-07-25 17:24:12 +05:30
blessedcoolant
4166c756ce wip: depth_anything_v2 init lint fixes 2024-07-25 14:41:22 +05:30
blessedcoolant
4f0dfbd34d wip: depth_anything_v2 initial implementation 2024-07-25 13:53:06 +05:30
psychedelicious
b70ac88684 perf(ui): throttle page changes
Previously you could spam the next/prev buttons and really thrash the server. Throttled to 500ms, which feels like a happy medium between responsive and not-thrash-y.
2024-07-25 11:57:54 +10:00
psychedelicious
24609da6ab feat(ui): tweak pagination styles 2024-07-25 11:57:54 +10:00
psychedelicious
524647b1f1 fix(ui): jumpto interactions
- Autofocus on popover open
- Autoselect number on popover open
- Enter works to change page when input is focused
- Esc works to close popover when input is focused
2024-07-25 11:57:54 +10:00
Mary Hipp
cf1af94f53 feat(ui): make jump to page a popover 2024-07-25 11:57:54 +10:00
Mary Hipp
2a9fdc6314 feat(ui): add jump to option for gallery pagination 2024-07-25 11:57:54 +10:00
Sergey Borisov
46c632e7cc Change layer detection keys according to LyCORIS repository 2024-07-25 02:10:47 +03:00
Sergey Borisov
653f63ae71 Add layer keys check 2024-07-25 02:03:08 +03:00
Sergey Borisov
8a9e2f57a4 Handle bias in full/diff lora layer 2024-07-25 02:02:37 +03:00
Sergey Borisov
31949ed2f2 Refactor code a bit 2024-07-25 02:00:30 +03:00
psychedelicious
3657285b1b chore: bump version v4.2.7rc1 2024-07-25 06:23:50 +10:00
Hosted Weblate
e4b5975305 translationBot(ui): update translation files
Updated by "Cleanup translation files" hook in Weblate.

Co-authored-by: Hosted Weblate <hosted@weblate.org>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/
Translation: InvokeAI/Web UI
2024-07-25 06:09:04 +10:00
gallegonovato
b59825edc0 translationBot(ui): update translation (Spanish)
Currently translated at 34.4% (448 of 1300 strings)

Co-authored-by: gallegonovato <fran-carro@hotmail.es>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/es/
Translation: InvokeAI/Web UI
2024-07-25 06:09:04 +10:00
Riccardo Giovanetti
25788f6869 translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1289 of 1307 strings)

translationBot(ui): update translation (Italian)

Currently translated at 98.5% (1277 of 1296 strings)

Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-07-25 06:09:04 +10:00
Sergey Borisov
0ccb304b8b Ruff format 2024-07-24 16:01:29 +03:00
Sergey Borisov
ab0bfa709a Handle loras in modular denoise 2024-07-24 05:07:29 +03:00
Sergey Borisov
6af659b1da Handle t2i adapter in modular denoise 2024-07-24 02:55:33 +03:00
Sergey Borisov
416d29fb83 Ruff format 2024-07-24 01:17:28 +03:00
Sergey Borisov
19c00241c6 Use non-inverted mask generally(except inpaint model handling) 2024-07-24 00:59:13 +03:00
Sergey Borisov
c323a760a5 Suggested changes
Co-Authored-By: Ryan Dick <14897797+RyanJDick@users.noreply.github.com>
2024-07-23 23:34:28 +03:00
Sergey Borisov
9d1fcba415 Fix create gradient mask node output 2024-07-23 23:29:28 +03:00
Sergey Borisov
ca21996a97 Remove old seamless class 2024-07-23 18:04:33 +03:00
Sergey Borisov
62aa064e56 Handle seamless in modular denoise 2024-07-23 18:03:59 +03:00
Sergey Borisov
87eb018380 Revert debug change 2024-07-22 23:49:20 +03:00
Sergey Borisov
5003e5d763 Same changes as in other PRs, add check for running inpainting on inpaint model without source image
Co-Authored-By: Ryan Dick <14897797+RyanJDick@users.noreply.github.com>
2024-07-22 23:47:39 +03:00
Sergey Borisov
58f3072b91 Handle inpainting on normal models 2024-07-21 22:17:29 +03:00
Sergey Borisov
9e7b470189 Handle inpaint models 2024-07-21 20:45:55 +03:00
250 changed files with 28758 additions and 22839 deletions

View File

@@ -62,7 +62,7 @@ jobs:
- name: install ruff
if: ${{ steps.changed-files.outputs.python_any_changed == 'true' || inputs.always_run == true }}
run: pip install ruff
run: pip install ruff==0.6.0
shell: bash
- name: ruff check

View File

@@ -55,6 +55,7 @@ RUN --mount=type=cache,target=/root/.cache/pip \
FROM node:20-slim AS web-builder
ENV PNPM_HOME="/pnpm"
ENV PATH="$PNPM_HOME:$PATH"
RUN corepack use pnpm@8.x
RUN corepack enable
WORKDIR /build

View File

@@ -17,7 +17,7 @@
set -eu
# Ensure we're in the correct folder in case user's CWD is somewhere else
scriptdir=$(dirname "$0")
scriptdir=$(dirname $(readlink -f "$0"))
cd "$scriptdir"
. .venv/bin/activate

View File

@@ -1,5 +1,6 @@
# Copyright (c) 2022 Kyle Schouviller (https://github.com/kyle0654)
import asyncio
from logging import Logger
import torch
@@ -31,6 +32,8 @@ from invokeai.app.services.session_processor.session_processor_default import (
)
from invokeai.app.services.session_queue.session_queue_sqlite import SqliteSessionQueue
from invokeai.app.services.shared.sqlite.sqlite_util import init_db
from invokeai.app.services.style_preset_images.style_preset_images_disk import StylePresetImageFileStorageDisk
from invokeai.app.services.style_preset_records.style_preset_records_sqlite import SqliteStylePresetRecordsStorage
from invokeai.app.services.urls.urls_default import LocalUrlService
from invokeai.app.services.workflow_records.workflow_records_sqlite import SqliteWorkflowRecordsStorage
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import ConditioningFieldData
@@ -63,7 +66,12 @@ class ApiDependencies:
invoker: Invoker
@staticmethod
def initialize(config: InvokeAIAppConfig, event_handler_id: int, logger: Logger = logger) -> None:
def initialize(
config: InvokeAIAppConfig,
event_handler_id: int,
loop: asyncio.AbstractEventLoop,
logger: Logger = logger,
) -> None:
logger.info(f"InvokeAI version {__version__}")
logger.info(f"Root directory = {str(config.root_path)}")
@@ -74,6 +82,7 @@ class ApiDependencies:
image_files = DiskImageFileStorage(f"{output_folder}/images")
model_images_folder = config.models_path
style_presets_folder = config.style_presets_path
db = init_db(config=config, logger=logger, image_files=image_files)
@@ -84,7 +93,7 @@ class ApiDependencies:
board_images = BoardImagesService()
board_records = SqliteBoardRecordStorage(db=db)
boards = BoardService()
events = FastAPIEventService(event_handler_id)
events = FastAPIEventService(event_handler_id, loop=loop)
bulk_download = BulkDownloadService()
image_records = SqliteImageRecordStorage(db=db)
images = ImageService()
@@ -109,6 +118,8 @@ class ApiDependencies:
session_queue = SqliteSessionQueue(db=db)
urls = LocalUrlService()
workflow_records = SqliteWorkflowRecordsStorage(db=db)
style_preset_records = SqliteStylePresetRecordsStorage(db=db)
style_preset_image_files = StylePresetImageFileStorageDisk(style_presets_folder / "images")
services = InvocationServices(
board_image_records=board_image_records,
@@ -134,6 +145,8 @@ class ApiDependencies:
workflow_records=workflow_records,
tensors=tensors,
conditioning=conditioning,
style_preset_records=style_preset_records,
style_preset_image_files=style_preset_image_files,
)
ApiDependencies.invoker = Invoker(services)

View File

@@ -218,9 +218,8 @@ async def get_image_workflow(
raise HTTPException(status_code=404)
@images_router.api_route(
@images_router.get(
"/i/{image_name}/full",
methods=["GET", "HEAD"],
operation_id="get_image_full",
response_class=Response,
responses={
@@ -231,6 +230,18 @@ async def get_image_workflow(
404: {"description": "Image not found"},
},
)
@images_router.head(
"/i/{image_name}/full",
operation_id="get_image_full_head",
response_class=Response,
responses={
200: {
"description": "Return the full-resolution image",
"content": {"image/png": {}},
},
404: {"description": "Image not found"},
},
)
async def get_image_full(
image_name: str = Path(description="The name of full-resolution image file to get"),
) -> Response:
@@ -242,6 +253,7 @@ async def get_image_full(
content = f.read()
response = Response(content, media_type="image/png")
response.headers["Cache-Control"] = f"max-age={IMAGE_MAX_AGE}"
response.headers["Content-Disposition"] = f'inline; filename="{image_name}"'
return response
except Exception:
raise HTTPException(status_code=404)

View File

@@ -0,0 +1,274 @@
import csv
import io
import json
import traceback
from typing import Optional
import pydantic
from fastapi import APIRouter, File, Form, HTTPException, Path, Response, UploadFile
from fastapi.responses import FileResponse
from PIL import Image
from pydantic import BaseModel, Field
from invokeai.app.api.dependencies import ApiDependencies
from invokeai.app.api.routers.model_manager import IMAGE_MAX_AGE
from invokeai.app.services.style_preset_images.style_preset_images_common import StylePresetImageFileNotFoundException
from invokeai.app.services.style_preset_records.style_preset_records_common import (
InvalidPresetImportDataError,
PresetData,
PresetType,
StylePresetChanges,
StylePresetNotFoundError,
StylePresetRecordWithImage,
StylePresetWithoutId,
UnsupportedFileTypeError,
parse_presets_from_file,
)
class StylePresetFormData(BaseModel):
name: str = Field(description="Preset name")
positive_prompt: str = Field(description="Positive prompt")
negative_prompt: str = Field(description="Negative prompt")
type: PresetType = Field(description="Preset type")
style_presets_router = APIRouter(prefix="/v1/style_presets", tags=["style_presets"])
@style_presets_router.get(
"/i/{style_preset_id}",
operation_id="get_style_preset",
responses={
200: {"model": StylePresetRecordWithImage},
},
)
async def get_style_preset(
style_preset_id: str = Path(description="The style preset to get"),
) -> StylePresetRecordWithImage:
"""Gets a style preset"""
try:
image = ApiDependencies.invoker.services.style_preset_image_files.get_url(style_preset_id)
style_preset = ApiDependencies.invoker.services.style_preset_records.get(style_preset_id)
return StylePresetRecordWithImage(image=image, **style_preset.model_dump())
except StylePresetNotFoundError:
raise HTTPException(status_code=404, detail="Style preset not found")
@style_presets_router.patch(
"/i/{style_preset_id}",
operation_id="update_style_preset",
responses={
200: {"model": StylePresetRecordWithImage},
},
)
async def update_style_preset(
image: Optional[UploadFile] = File(description="The image file to upload", default=None),
style_preset_id: str = Path(description="The id of the style preset to update"),
data: str = Form(description="The data of the style preset to update"),
) -> StylePresetRecordWithImage:
"""Updates a style preset"""
if image is not None:
if not image.content_type or not image.content_type.startswith("image"):
raise HTTPException(status_code=415, detail="Not an image")
contents = await image.read()
try:
pil_image = Image.open(io.BytesIO(contents))
except Exception:
ApiDependencies.invoker.services.logger.error(traceback.format_exc())
raise HTTPException(status_code=415, detail="Failed to read image")
try:
ApiDependencies.invoker.services.style_preset_image_files.save(style_preset_id, pil_image)
except ValueError as e:
raise HTTPException(status_code=409, detail=str(e))
else:
try:
ApiDependencies.invoker.services.style_preset_image_files.delete(style_preset_id)
except StylePresetImageFileNotFoundException:
pass
try:
parsed_data = json.loads(data)
validated_data = StylePresetFormData(**parsed_data)
name = validated_data.name
type = validated_data.type
positive_prompt = validated_data.positive_prompt
negative_prompt = validated_data.negative_prompt
except pydantic.ValidationError:
raise HTTPException(status_code=400, detail="Invalid preset data")
preset_data = PresetData(positive_prompt=positive_prompt, negative_prompt=negative_prompt)
changes = StylePresetChanges(name=name, preset_data=preset_data, type=type)
style_preset_image = ApiDependencies.invoker.services.style_preset_image_files.get_url(style_preset_id)
style_preset = ApiDependencies.invoker.services.style_preset_records.update(
style_preset_id=style_preset_id, changes=changes
)
return StylePresetRecordWithImage(image=style_preset_image, **style_preset.model_dump())
@style_presets_router.delete(
"/i/{style_preset_id}",
operation_id="delete_style_preset",
)
async def delete_style_preset(
style_preset_id: str = Path(description="The style preset to delete"),
) -> None:
"""Deletes a style preset"""
try:
ApiDependencies.invoker.services.style_preset_image_files.delete(style_preset_id)
except StylePresetImageFileNotFoundException:
pass
ApiDependencies.invoker.services.style_preset_records.delete(style_preset_id)
@style_presets_router.post(
"/",
operation_id="create_style_preset",
responses={
200: {"model": StylePresetRecordWithImage},
},
)
async def create_style_preset(
image: Optional[UploadFile] = File(description="The image file to upload", default=None),
data: str = Form(description="The data of the style preset to create"),
) -> StylePresetRecordWithImage:
"""Creates a style preset"""
try:
parsed_data = json.loads(data)
validated_data = StylePresetFormData(**parsed_data)
name = validated_data.name
type = validated_data.type
positive_prompt = validated_data.positive_prompt
negative_prompt = validated_data.negative_prompt
except pydantic.ValidationError:
raise HTTPException(status_code=400, detail="Invalid preset data")
preset_data = PresetData(positive_prompt=positive_prompt, negative_prompt=negative_prompt)
style_preset = StylePresetWithoutId(name=name, preset_data=preset_data, type=type)
new_style_preset = ApiDependencies.invoker.services.style_preset_records.create(style_preset=style_preset)
if image is not None:
if not image.content_type or not image.content_type.startswith("image"):
raise HTTPException(status_code=415, detail="Not an image")
contents = await image.read()
try:
pil_image = Image.open(io.BytesIO(contents))
except Exception:
ApiDependencies.invoker.services.logger.error(traceback.format_exc())
raise HTTPException(status_code=415, detail="Failed to read image")
try:
ApiDependencies.invoker.services.style_preset_image_files.save(new_style_preset.id, pil_image)
except ValueError as e:
raise HTTPException(status_code=409, detail=str(e))
preset_image = ApiDependencies.invoker.services.style_preset_image_files.get_url(new_style_preset.id)
return StylePresetRecordWithImage(image=preset_image, **new_style_preset.model_dump())
@style_presets_router.get(
"/",
operation_id="list_style_presets",
responses={
200: {"model": list[StylePresetRecordWithImage]},
},
)
async def list_style_presets() -> list[StylePresetRecordWithImage]:
"""Gets a page of style presets"""
style_presets_with_image: list[StylePresetRecordWithImage] = []
style_presets = ApiDependencies.invoker.services.style_preset_records.get_many()
for preset in style_presets:
image = ApiDependencies.invoker.services.style_preset_image_files.get_url(preset.id)
style_preset_with_image = StylePresetRecordWithImage(image=image, **preset.model_dump())
style_presets_with_image.append(style_preset_with_image)
return style_presets_with_image
@style_presets_router.get(
"/i/{style_preset_id}/image",
operation_id="get_style_preset_image",
responses={
200: {
"description": "The style preset image was fetched successfully",
},
400: {"description": "Bad request"},
404: {"description": "The style preset image could not be found"},
},
status_code=200,
)
async def get_style_preset_image(
style_preset_id: str = Path(description="The id of the style preset image to get"),
) -> FileResponse:
"""Gets an image file that previews the model"""
try:
path = ApiDependencies.invoker.services.style_preset_image_files.get_path(style_preset_id)
response = FileResponse(
path,
media_type="image/png",
filename=style_preset_id + ".png",
content_disposition_type="inline",
)
response.headers["Cache-Control"] = f"max-age={IMAGE_MAX_AGE}"
return response
except Exception:
raise HTTPException(status_code=404)
@style_presets_router.get(
"/export",
operation_id="export_style_presets",
responses={200: {"content": {"text/csv": {}}, "description": "A CSV file with the requested data."}},
status_code=200,
)
async def export_style_presets():
# Create an in-memory stream to store the CSV data
output = io.StringIO()
writer = csv.writer(output)
# Write the header
writer.writerow(["name", "prompt", "negative_prompt"])
style_presets = ApiDependencies.invoker.services.style_preset_records.get_many(type=PresetType.User)
for preset in style_presets:
writer.writerow([preset.name, preset.preset_data.positive_prompt, preset.preset_data.negative_prompt])
csv_data = output.getvalue()
output.close()
return Response(
content=csv_data,
media_type="text/csv",
headers={"Content-Disposition": "attachment; filename=prompt_templates.csv"},
)
@style_presets_router.post(
"/import",
operation_id="import_style_presets",
)
async def import_style_presets(file: UploadFile = File(description="The file to import")):
try:
style_presets = await parse_presets_from_file(file)
ApiDependencies.invoker.services.style_preset_records.create_many(style_presets)
except InvalidPresetImportDataError as e:
ApiDependencies.invoker.services.logger.error(traceback.format_exc())
raise HTTPException(status_code=400, detail=str(e))
except UnsupportedFileTypeError as e:
ApiDependencies.invoker.services.logger.error(traceback.format_exc())
raise HTTPException(status_code=415, detail=str(e))

View File

@@ -30,6 +30,7 @@ from invokeai.app.api.routers import (
images,
model_manager,
session_queue,
style_presets,
utilities,
workflows,
)
@@ -55,11 +56,13 @@ mimetypes.add_type("text/css", ".css")
torch_device_name = TorchDevice.get_torch_device_name()
logger.info(f"Using torch device: {torch_device_name}")
loop = asyncio.new_event_loop()
@asynccontextmanager
async def lifespan(app: FastAPI):
# Add startup event to load dependencies
ApiDependencies.initialize(config=app_config, event_handler_id=event_handler_id, logger=logger)
ApiDependencies.initialize(config=app_config, event_handler_id=event_handler_id, loop=loop, logger=logger)
yield
# Shut down threads
ApiDependencies.shutdown()
@@ -106,6 +109,7 @@ app.include_router(board_images.board_images_router, prefix="/api")
app.include_router(app_info.app_router, prefix="/api")
app.include_router(session_queue.session_queue_router, prefix="/api")
app.include_router(workflows.workflows_router, prefix="/api")
app.include_router(style_presets.style_presets_router, prefix="/api")
app.openapi = get_openapi_func(app)
@@ -184,8 +188,6 @@ def invoke_api() -> None:
check_cudnn(logger)
# Start our own event loop for eventing usage
loop = asyncio.new_event_loop()
config = uvicorn.Config(
app=app,
host=app_config.host,

View File

@@ -80,12 +80,12 @@ class CompelInvocation(BaseInvocation):
with (
# apply all patches while the model is on the target device
text_encoder_info.model_on_device() as (model_state_dict, text_encoder),
text_encoder_info.model_on_device() as (cached_weights, text_encoder),
tokenizer_info as tokenizer,
ModelPatcher.apply_lora_text_encoder(
text_encoder,
loras=_lora_loader(),
model_state_dict=model_state_dict,
cached_weights=cached_weights,
),
# Apply CLIP Skip after LoRA to prevent LoRA application from failing on skipped layers.
ModelPatcher.apply_clip_skip(text_encoder, self.clip.skipped_layers),
@@ -175,13 +175,13 @@ class SDXLPromptInvocationBase:
with (
# apply all patches while the model is on the target device
text_encoder_info.model_on_device() as (state_dict, text_encoder),
text_encoder_info.model_on_device() as (cached_weights, text_encoder),
tokenizer_info as tokenizer,
ModelPatcher.apply_lora(
text_encoder,
loras=_lora_loader(),
prefix=lora_prefix,
model_state_dict=state_dict,
cached_weights=cached_weights,
),
# Apply CLIP Skip after LoRA to prevent LoRA application from failing on skipped layers.
ModelPatcher.apply_clip_skip(text_encoder, clip_field.skipped_layers),

View File

@@ -21,6 +21,8 @@ from controlnet_aux import (
from controlnet_aux.util import HWC3, ade_palette
from PIL import Image
from pydantic import BaseModel, Field, field_validator, model_validator
from transformers import pipeline
from transformers.pipelines import DepthEstimationPipeline
from invokeai.app.invocations.baseinvocation import (
BaseInvocation,
@@ -44,13 +46,12 @@ from invokeai.app.invocations.util import validate_begin_end_step, validate_weig
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.controlnet_utils import CONTROLNET_MODE_VALUES, CONTROLNET_RESIZE_VALUES, heuristic_resize
from invokeai.backend.image_util.canny import get_canny_edges
from invokeai.backend.image_util.depth_anything import DEPTH_ANYTHING_MODELS, DepthAnythingDetector
from invokeai.backend.image_util.depth_anything.depth_anything_pipeline import DepthAnythingPipeline
from invokeai.backend.image_util.dw_openpose import DWPOSE_MODELS, DWOpenposeDetector
from invokeai.backend.image_util.hed import HEDProcessor
from invokeai.backend.image_util.lineart import LineartProcessor
from invokeai.backend.image_util.lineart_anime import LineartAnimeProcessor
from invokeai.backend.image_util.util import np_to_pil, pil_to_np
from invokeai.backend.util.devices import TorchDevice
class ControlField(BaseModel):
@@ -592,7 +593,14 @@ class ColorMapImageProcessorInvocation(ImageProcessorInvocation):
return color_map
DEPTH_ANYTHING_MODEL_SIZES = Literal["large", "base", "small"]
DEPTH_ANYTHING_MODEL_SIZES = Literal["large", "base", "small", "small_v2"]
# DepthAnything V2 Small model is licensed under Apache 2.0 but not the base and large models.
DEPTH_ANYTHING_MODELS = {
"large": "LiheYoung/depth-anything-large-hf",
"base": "LiheYoung/depth-anything-base-hf",
"small": "LiheYoung/depth-anything-small-hf",
"small_v2": "depth-anything/Depth-Anything-V2-Small-hf",
}
@invocation(
@@ -600,28 +608,33 @@ DEPTH_ANYTHING_MODEL_SIZES = Literal["large", "base", "small"]
title="Depth Anything Processor",
tags=["controlnet", "depth", "depth anything"],
category="controlnet",
version="1.1.2",
version="1.1.3",
)
class DepthAnythingImageProcessorInvocation(ImageProcessorInvocation):
"""Generates a depth map based on the Depth Anything algorithm"""
model_size: DEPTH_ANYTHING_MODEL_SIZES = InputField(
default="small", description="The size of the depth model to use"
default="small_v2", description="The size of the depth model to use"
)
resolution: int = InputField(default=512, ge=1, description=FieldDescriptions.image_res)
def run_processor(self, image: Image.Image) -> Image.Image:
def loader(model_path: Path):
return DepthAnythingDetector.load_model(
model_path, model_size=self.model_size, device=TorchDevice.choose_torch_device()
)
def load_depth_anything(model_path: Path):
depth_anything_pipeline = pipeline(model=str(model_path), task="depth-estimation", local_files_only=True)
assert isinstance(depth_anything_pipeline, DepthEstimationPipeline)
return DepthAnythingPipeline(depth_anything_pipeline)
with self._context.models.load_remote_model(
source=DEPTH_ANYTHING_MODELS[self.model_size], loader=loader
) as model:
depth_anything_detector = DepthAnythingDetector(model, TorchDevice.choose_torch_device())
processed_image = depth_anything_detector(image=image, resolution=self.resolution)
return processed_image
source=DEPTH_ANYTHING_MODELS[self.model_size], loader=load_depth_anything
) as depth_anything_detector:
assert isinstance(depth_anything_detector, DepthAnythingPipeline)
depth_map = depth_anything_detector.generate_depth(image)
# Resizing to user target specified size
new_height = int(image.size[1] * (self.resolution / image.size[0]))
depth_map = depth_map.resize((self.resolution, new_height))
return depth_map
@invocation(

View File

@@ -39,7 +39,7 @@ class GradientMaskOutput(BaseInvocationOutput):
title="Create Gradient Mask",
tags=["mask", "denoise"],
category="latents",
version="1.1.0",
version="1.2.0",
)
class CreateGradientMaskInvocation(BaseInvocation):
"""Creates mask for denoising model run."""
@@ -93,6 +93,7 @@ class CreateGradientMaskInvocation(BaseInvocation):
# redistribute blur so that the original edges are 0 and blur outwards to 1
blur_tensor = (blur_tensor - 0.5) * 2
blur_tensor[blur_tensor < 0] = 0.0
threshold = 1 - self.minimum_denoise

View File

@@ -37,9 +37,9 @@ from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.controlnet_utils import prepare_control_image
from invokeai.backend.ip_adapter.ip_adapter import IPAdapter
from invokeai.backend.lora import LoRAModelRaw
from invokeai.backend.model_manager import BaseModelType
from invokeai.backend.model_manager import BaseModelType, ModelVariantType
from invokeai.backend.model_patcher import ModelPatcher
from invokeai.backend.stable_diffusion import PipelineIntermediateState, set_seamless
from invokeai.backend.stable_diffusion import PipelineIntermediateState
from invokeai.backend.stable_diffusion.denoise_context import DenoiseContext, DenoiseInputs
from invokeai.backend.stable_diffusion.diffusers_pipeline import (
ControlNetData,
@@ -60,8 +60,13 @@ from invokeai.backend.stable_diffusion.diffusion_backend import StableDiffusionB
from invokeai.backend.stable_diffusion.extension_callback_type import ExtensionCallbackType
from invokeai.backend.stable_diffusion.extensions.controlnet import ControlNetExt
from invokeai.backend.stable_diffusion.extensions.freeu import FreeUExt
from invokeai.backend.stable_diffusion.extensions.inpaint import InpaintExt
from invokeai.backend.stable_diffusion.extensions.inpaint_model import InpaintModelExt
from invokeai.backend.stable_diffusion.extensions.lora import LoRAExt
from invokeai.backend.stable_diffusion.extensions.preview import PreviewExt
from invokeai.backend.stable_diffusion.extensions.rescale_cfg import RescaleCFGExt
from invokeai.backend.stable_diffusion.extensions.seamless import SeamlessExt
from invokeai.backend.stable_diffusion.extensions.t2i_adapter import T2IAdapterExt
from invokeai.backend.stable_diffusion.extensions_manager import ExtensionsManager
from invokeai.backend.stable_diffusion.schedulers import SCHEDULER_MAP
from invokeai.backend.stable_diffusion.schedulers.schedulers import SCHEDULER_NAME_VALUES
@@ -498,6 +503,33 @@ class DenoiseLatentsInvocation(BaseInvocation):
)
)
@staticmethod
def parse_t2i_adapter_field(
exit_stack: ExitStack,
context: InvocationContext,
t2i_adapters: Optional[Union[T2IAdapterField, list[T2IAdapterField]]],
ext_manager: ExtensionsManager,
) -> None:
if t2i_adapters is None:
return
# Handle the possibility that t2i_adapters could be a list or a single T2IAdapterField.
if isinstance(t2i_adapters, T2IAdapterField):
t2i_adapters = [t2i_adapters]
for t2i_adapter_field in t2i_adapters:
ext_manager.add_extension(
T2IAdapterExt(
node_context=context,
model_id=t2i_adapter_field.t2i_adapter_model,
image=context.images.get_pil(t2i_adapter_field.image.image_name),
weight=t2i_adapter_field.weight,
begin_step_percent=t2i_adapter_field.begin_step_percent,
end_step_percent=t2i_adapter_field.end_step_percent,
resize_mode=t2i_adapter_field.resize_mode,
)
)
def prep_ip_adapter_image_prompts(
self,
context: InvocationContext,
@@ -707,7 +739,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
else:
masked_latents = torch.where(mask < 0.5, 0.0, latents)
return 1 - mask, masked_latents, self.denoise_mask.gradient
return mask, masked_latents, self.denoise_mask.gradient
@staticmethod
def prepare_noise_and_latents(
@@ -765,10 +797,6 @@ class DenoiseLatentsInvocation(BaseInvocation):
dtype = TorchDevice.choose_torch_dtype()
seed, noise, latents = self.prepare_noise_and_latents(context, self.noise, self.latents)
latents = latents.to(device=device, dtype=dtype)
if noise is not None:
noise = noise.to(device=device, dtype=dtype)
_, _, latent_height, latent_width = latents.shape
conditioning_data = self.get_conditioning_data(
@@ -801,21 +829,6 @@ class DenoiseLatentsInvocation(BaseInvocation):
denoising_end=self.denoising_end,
)
denoise_ctx = DenoiseContext(
inputs=DenoiseInputs(
orig_latents=latents,
timesteps=timesteps,
init_timestep=init_timestep,
noise=noise,
seed=seed,
scheduler_step_kwargs=scheduler_step_kwargs,
conditioning_data=conditioning_data,
attention_processor_cls=CustomAttnProcessor2_0,
),
unet=None,
scheduler=scheduler,
)
# get the unet's config so that we can pass the base to sd_step_callback()
unet_config = context.models.get_config(self.unet.unet.key)
@@ -833,6 +846,50 @@ class DenoiseLatentsInvocation(BaseInvocation):
if self.unet.freeu_config:
ext_manager.add_extension(FreeUExt(self.unet.freeu_config))
### lora
if self.unet.loras:
for lora_field in self.unet.loras:
ext_manager.add_extension(
LoRAExt(
node_context=context,
model_id=lora_field.lora,
weight=lora_field.weight,
)
)
### seamless
if self.unet.seamless_axes:
ext_manager.add_extension(SeamlessExt(self.unet.seamless_axes))
### inpaint
mask, masked_latents, is_gradient_mask = self.prep_inpaint_mask(context, latents)
# NOTE: We used to identify inpainting models by inpecting the shape of the loaded UNet model weights. Now we
# use the ModelVariantType config. During testing, there was a report of a user with models that had an
# incorrect ModelVariantType value. Re-installing the model fixed the issue. If this issue turns out to be
# prevalent, we will have to revisit how we initialize the inpainting extensions.
if unet_config.variant == ModelVariantType.Inpaint:
ext_manager.add_extension(InpaintModelExt(mask, masked_latents, is_gradient_mask))
elif mask is not None:
ext_manager.add_extension(InpaintExt(mask, is_gradient_mask))
# Initialize context for modular denoise
latents = latents.to(device=device, dtype=dtype)
if noise is not None:
noise = noise.to(device=device, dtype=dtype)
denoise_ctx = DenoiseContext(
inputs=DenoiseInputs(
orig_latents=latents,
timesteps=timesteps,
init_timestep=init_timestep,
noise=noise,
seed=seed,
scheduler_step_kwargs=scheduler_step_kwargs,
conditioning_data=conditioning_data,
attention_processor_cls=CustomAttnProcessor2_0,
),
unet=None,
scheduler=scheduler,
)
# context for loading additional models
with ExitStack() as exit_stack:
# later should be smth like:
@@ -840,6 +897,7 @@ class DenoiseLatentsInvocation(BaseInvocation):
# ext = extension_field.to_extension(exit_stack, context, ext_manager)
# ext_manager.add_extension(ext)
self.parse_controlnet_field(exit_stack, context, self.control, ext_manager)
self.parse_t2i_adapter_field(exit_stack, context, self.t2i_adapter, ext_manager)
# ext: t2i/ip adapter
ext_manager.run_callback(ExtensionCallbackType.SETUP, denoise_ctx)
@@ -871,6 +929,10 @@ class DenoiseLatentsInvocation(BaseInvocation):
seed, noise, latents = self.prepare_noise_and_latents(context, self.noise, self.latents)
mask, masked_latents, gradient_mask = self.prep_inpaint_mask(context, latents)
# At this point, the mask ranges from 0 (leave unchanged) to 1 (inpaint).
# We invert the mask here for compatibility with the old backend implementation.
if mask is not None:
mask = 1 - mask
# TODO(ryand): I have hard-coded `do_classifier_free_guidance=True` to mirror the behaviour of ControlNets,
# below. Investigate whether this is appropriate.
@@ -913,14 +975,14 @@ class DenoiseLatentsInvocation(BaseInvocation):
assert isinstance(unet_info.model, UNet2DConditionModel)
with (
ExitStack() as exit_stack,
unet_info.model_on_device() as (model_state_dict, unet),
unet_info.model_on_device() as (cached_weights, unet),
ModelPatcher.apply_freeu(unet, self.unet.freeu_config),
set_seamless(unet, self.unet.seamless_axes), # FIXME
SeamlessExt.static_patch_model(unet, self.unet.seamless_axes), # FIXME
# Apply the LoRA after unet has been moved to its target device for faster patching.
ModelPatcher.apply_lora_unet(
unet,
loras=_lora_loader(),
model_state_dict=model_state_dict,
cached_weights=cached_weights,
),
):
assert isinstance(unet, UNet2DConditionModel)

View File

@@ -1,7 +1,7 @@
from enum import Enum
from typing import Any, Callable, Optional, Tuple
from pydantic import BaseModel, ConfigDict, Field, RootModel, TypeAdapter
from pydantic import BaseModel, ConfigDict, Field, RootModel, TypeAdapter, model_validator
from pydantic.fields import _Unset
from pydantic_core import PydanticUndefined
@@ -242,6 +242,31 @@ class ConditioningField(BaseModel):
)
class BoundingBoxField(BaseModel):
"""A bounding box primitive value."""
x_min: int = Field(ge=0, description="The minimum x-coordinate of the bounding box (inclusive).")
x_max: int = Field(ge=0, description="The maximum x-coordinate of the bounding box (exclusive).")
y_min: int = Field(ge=0, description="The minimum y-coordinate of the bounding box (inclusive).")
y_max: int = Field(ge=0, description="The maximum y-coordinate of the bounding box (exclusive).")
score: Optional[float] = Field(
default=None,
ge=0.0,
le=1.0,
description="The score associated with the bounding box. In the range [0, 1]. This value is typically set "
"when the bounding box was produced by a detector and has an associated confidence score.",
)
@model_validator(mode="after")
def check_coords(self):
if self.x_min > self.x_max:
raise ValueError(f"x_min ({self.x_min}) is greater than x_max ({self.x_max}).")
if self.y_min > self.y_max:
raise ValueError(f"y_min ({self.y_min}) is greater than y_max ({self.y_max}).")
return self
class MetadataField(RootModel[dict[str, Any]]):
"""
Pydantic model for metadata with custom root of type dict[str, Any].

View File

@@ -0,0 +1,100 @@
from pathlib import Path
from typing import Literal
import torch
from PIL import Image
from transformers import pipeline
from transformers.pipelines import ZeroShotObjectDetectionPipeline
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import BoundingBoxField, ImageField, InputField
from invokeai.app.invocations.primitives import BoundingBoxCollectionOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.image_util.grounding_dino.detection_result import DetectionResult
from invokeai.backend.image_util.grounding_dino.grounding_dino_pipeline import GroundingDinoPipeline
GroundingDinoModelKey = Literal["grounding-dino-tiny", "grounding-dino-base"]
GROUNDING_DINO_MODEL_IDS: dict[GroundingDinoModelKey, str] = {
"grounding-dino-tiny": "IDEA-Research/grounding-dino-tiny",
"grounding-dino-base": "IDEA-Research/grounding-dino-base",
}
@invocation(
"grounding_dino",
title="Grounding DINO (Text Prompt Object Detection)",
tags=["prompt", "object detection"],
category="image",
version="1.0.0",
)
class GroundingDinoInvocation(BaseInvocation):
"""Runs a Grounding DINO model. Performs zero-shot bounding-box object detection from a text prompt."""
# Reference:
# - https://arxiv.org/pdf/2303.05499
# - https://huggingface.co/docs/transformers/v4.43.3/en/model_doc/grounding-dino#grounded-sam
# - https://github.com/NielsRogge/Transformers-Tutorials/blob/a39f33ac1557b02ebfb191ea7753e332b5ca933f/Grounding%20DINO/GroundingDINO_with_Segment_Anything.ipynb
model: GroundingDinoModelKey = InputField(description="The Grounding DINO model to use.")
prompt: str = InputField(description="The prompt describing the object to segment.")
image: ImageField = InputField(description="The image to segment.")
detection_threshold: float = InputField(
description="The detection threshold for the Grounding DINO model. All detected bounding boxes with scores above this threshold will be returned.",
ge=0.0,
le=1.0,
default=0.3,
)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> BoundingBoxCollectionOutput:
# The model expects a 3-channel RGB image.
image_pil = context.images.get_pil(self.image.image_name, mode="RGB")
detections = self._detect(
context=context, image=image_pil, labels=[self.prompt], threshold=self.detection_threshold
)
# Convert detections to BoundingBoxCollectionOutput.
bounding_boxes: list[BoundingBoxField] = []
for detection in detections:
bounding_boxes.append(
BoundingBoxField(
x_min=detection.box.xmin,
x_max=detection.box.xmax,
y_min=detection.box.ymin,
y_max=detection.box.ymax,
score=detection.score,
)
)
return BoundingBoxCollectionOutput(collection=bounding_boxes)
@staticmethod
def _load_grounding_dino(model_path: Path):
grounding_dino_pipeline = pipeline(
model=str(model_path),
task="zero-shot-object-detection",
local_files_only=True,
# TODO(ryand): Setting the torch_dtype here doesn't work. Investigate whether fp16 is supported by the
# model, and figure out how to make it work in the pipeline.
# torch_dtype=TorchDevice.choose_torch_dtype(),
)
assert isinstance(grounding_dino_pipeline, ZeroShotObjectDetectionPipeline)
return GroundingDinoPipeline(grounding_dino_pipeline)
def _detect(
self,
context: InvocationContext,
image: Image.Image,
labels: list[str],
threshold: float = 0.3,
) -> list[DetectionResult]:
"""Use Grounding DINO to detect bounding boxes for a set of labels in an image."""
# TODO(ryand): I copied this "."-handling logic from the transformers example code. Test it and see if it
# actually makes a difference.
labels = [label if label.endswith(".") else label + "." for label in labels]
with context.models.load_remote_model(
source=GROUNDING_DINO_MODEL_IDS[self.model], loader=GroundingDinoInvocation._load_grounding_dino
) as detector:
assert isinstance(detector, GroundingDinoPipeline)
return detector.detect(image=image, candidate_labels=labels, threshold=threshold)

View File

@@ -24,7 +24,7 @@ from invokeai.app.invocations.fields import (
from invokeai.app.invocations.model import VAEField
from invokeai.app.invocations.primitives import ImageOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.stable_diffusion import set_seamless
from invokeai.backend.stable_diffusion.extensions.seamless import SeamlessExt
from invokeai.backend.stable_diffusion.vae_tiling import patch_vae_tiling_params
from invokeai.backend.util.devices import TorchDevice
@@ -59,7 +59,7 @@ class LatentsToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
vae_info = context.models.load(self.vae.vae)
assert isinstance(vae_info.model, (AutoencoderKL, AutoencoderTiny))
with set_seamless(vae_info.model, self.vae.seamless_axes), vae_info as vae:
with SeamlessExt.static_patch_model(vae_info.model, self.vae.seamless_axes), vae_info as vae:
assert isinstance(vae, (AutoencoderKL, AutoencoderTiny))
latents = latents.to(vae.device)
if self.fp32:

View File

@@ -1,9 +1,10 @@
import numpy as np
import torch
from PIL import Image
from invokeai.app.invocations.baseinvocation import BaseInvocation, Classification, InvocationContext, invocation
from invokeai.app.invocations.fields import ImageField, InputField, TensorField, WithMetadata
from invokeai.app.invocations.primitives import MaskOutput
from invokeai.app.invocations.fields import ImageField, InputField, TensorField, WithBoard, WithMetadata
from invokeai.app.invocations.primitives import ImageOutput, MaskOutput
@invocation(
@@ -118,3 +119,27 @@ class ImageMaskToTensorInvocation(BaseInvocation, WithMetadata):
height=mask.shape[1],
width=mask.shape[2],
)
@invocation(
"tensor_mask_to_image",
title="Tensor Mask to Image",
tags=["mask"],
category="mask",
version="1.0.0",
)
class MaskTensorToImageInvocation(BaseInvocation, WithMetadata, WithBoard):
"""Convert a mask tensor to an image."""
mask: TensorField = InputField(description="The mask tensor to convert.")
def invoke(self, context: InvocationContext) -> ImageOutput:
mask = context.tensors.load(self.mask.tensor_name)
# Ensure that the mask is binary.
if mask.dtype != torch.bool:
mask = mask > 0.5
mask_np = (mask.float() * 255).byte().cpu().numpy()
mask_pil = Image.fromarray(mask_np, mode="L")
image_dto = context.images.save(image=mask_pil)
return ImageOutput.build(image_dto)

View File

@@ -7,6 +7,7 @@ import torch
from invokeai.app.invocations.baseinvocation import BaseInvocation, BaseInvocationOutput, invocation, invocation_output
from invokeai.app.invocations.constants import LATENT_SCALE_FACTOR
from invokeai.app.invocations.fields import (
BoundingBoxField,
ColorField,
ConditioningField,
DenoiseMaskField,
@@ -469,3 +470,42 @@ class ConditioningCollectionInvocation(BaseInvocation):
# endregion
# region BoundingBox
@invocation_output("bounding_box_output")
class BoundingBoxOutput(BaseInvocationOutput):
"""Base class for nodes that output a single bounding box"""
bounding_box: BoundingBoxField = OutputField(description="The output bounding box.")
@invocation_output("bounding_box_collection_output")
class BoundingBoxCollectionOutput(BaseInvocationOutput):
"""Base class for nodes that output a collection of bounding boxes"""
collection: list[BoundingBoxField] = OutputField(description="The output bounding boxes.", title="Bounding Boxes")
@invocation(
"bounding_box",
title="Bounding Box",
tags=["primitives", "segmentation", "collection", "bounding box"],
category="primitives",
version="1.0.0",
)
class BoundingBoxInvocation(BaseInvocation):
"""Create a bounding box manually by supplying box coordinates"""
x_min: int = InputField(default=0, description="x-coordinate of the bounding box's top left vertex")
y_min: int = InputField(default=0, description="y-coordinate of the bounding box's top left vertex")
x_max: int = InputField(default=0, description="x-coordinate of the bounding box's bottom right vertex")
y_max: int = InputField(default=0, description="y-coordinate of the bounding box's bottom right vertex")
def invoke(self, context: InvocationContext) -> BoundingBoxOutput:
bounding_box = BoundingBoxField(x_min=self.x_min, y_min=self.y_min, x_max=self.x_max, y_max=self.y_max)
return BoundingBoxOutput(bounding_box=bounding_box)
# endregion

View File

@@ -1,85 +0,0 @@
from pathlib import Path
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import (
InputField,
)
from invokeai.app.invocations.primitives import StringOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.util.devices import TorchDevice
AUGMENT_PROMPT_INSTRUCTION = """Your task is to translate a short image caption and a style caption to a more detailed caption for the same image. The detailed caption should adhere to the following:
- be 1 sentence long
- use descriptive language that relates to the subject of interest
- it may add new details, but shouldn't change the subject of the original caption
Here are some examples:
Original caption: "A cat on a table"
Detailed caption: "A fluffy cat with a curious expression, sitting on a wooden table next to a vase of flowers."
Original caption: "medieval armor"
Detailed caption: "The gleaming suit of medieval armor stands proudly in the museum, its intricate engravings telling tales of long-forgotten battles and chivalry."
Original caption: "A panda bear as a mad scientist"
Detailed caption: "Clad in a tiny lab coat and goggles, the panda bear feverishly mixes colorful potions, embodying the eccentricity of a mad scientist in its whimsical laboratory."
Here is the prompt to translate:
Original caption: "{}"
Detailed caption:"""
@invocation("promp_augment", title="Prompt Augmentation", tags=["prompt"], category="conditioning", version="1.0.0")
class PrompAugmentationInvocation(BaseInvocation):
"""Use an LLM to augment a text prompt."""
prompt: str = InputField(description="The text prompt to augment.")
@torch.inference_mode()
def invoke(self, context: InvocationContext) -> StringOutput:
# TODO(ryand): Address the following situations in the input prompt:
# - Prompt contains a TI embeddings.
# - Prompt contains .and() compel syntax. (Is ther any other compel syntax we need to handle?)
# - Prompt contains quotation marks that could cause confusion when embedded in an LLM instruct prompt.
# Load the model and tokenizer.
model_source = "microsoft/Phi-3-mini-4k-instruct"
def model_loader(model_path: Path):
return AutoModelForCausalLM.from_pretrained(
model_path, torch_dtype=TorchDevice.choose_torch_dtype(), local_files_only=True
)
def tokenizer_loader(model_path: Path):
return AutoTokenizer.from_pretrained(model_path, local_files_only=True)
with (
context.models.load_remote_model(source=model_source, loader=model_loader) as model,
context.models.load_remote_model(source=model_source, loader=tokenizer_loader) as tokenizer,
):
# Tokenize the input prompt.
augmented_prompt = self._run_instruct_model(model, tokenizer, self.prompt)
return StringOutput(value=augmented_prompt)
def _run_instruct_model(self, model: AutoModelForCausalLM, tokenizer: AutoTokenizer, prompt: str) -> str:
messages = [
{
"role": "user",
"content": AUGMENT_PROMPT_INSTRUCTION.format(prompt),
}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
inputs = inputs.to(model.device)
outputs = model.generate(
inputs,
max_new_tokens=200,
temperature=0.9,
do_sample=True,
)
text = tokenizer.batch_decode(outputs)[0]
assert isinstance(text, str)
output = text.split("<|assistant|>")[-1].strip()
output = output.split("<|end|>")[0].strip()
return output

View File

@@ -0,0 +1,161 @@
from pathlib import Path
from typing import Literal
import numpy as np
import torch
from PIL import Image
from transformers import AutoModelForMaskGeneration, AutoProcessor
from transformers.models.sam import SamModel
from transformers.models.sam.processing_sam import SamProcessor
from invokeai.app.invocations.baseinvocation import BaseInvocation, invocation
from invokeai.app.invocations.fields import BoundingBoxField, ImageField, InputField, TensorField
from invokeai.app.invocations.primitives import MaskOutput
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.image_util.segment_anything.mask_refinement import mask_to_polygon, polygon_to_mask
from invokeai.backend.image_util.segment_anything.segment_anything_pipeline import SegmentAnythingPipeline
SegmentAnythingModelKey = Literal["segment-anything-base", "segment-anything-large", "segment-anything-huge"]
SEGMENT_ANYTHING_MODEL_IDS: dict[SegmentAnythingModelKey, str] = {
"segment-anything-base": "facebook/sam-vit-base",
"segment-anything-large": "facebook/sam-vit-large",
"segment-anything-huge": "facebook/sam-vit-huge",
}
@invocation(
"segment_anything",
title="Segment Anything",
tags=["prompt", "segmentation"],
category="segmentation",
version="1.0.0",
)
class SegmentAnythingInvocation(BaseInvocation):
"""Runs a Segment Anything Model."""
# Reference:
# - https://arxiv.org/pdf/2304.02643
# - https://huggingface.co/docs/transformers/v4.43.3/en/model_doc/grounding-dino#grounded-sam
# - https://github.com/NielsRogge/Transformers-Tutorials/blob/a39f33ac1557b02ebfb191ea7753e332b5ca933f/Grounding%20DINO/GroundingDINO_with_Segment_Anything.ipynb
model: SegmentAnythingModelKey = InputField(description="The Segment Anything model to use.")
image: ImageField = InputField(description="The image to segment.")
bounding_boxes: list[BoundingBoxField] = InputField(description="The bounding boxes to prompt the SAM model with.")
apply_polygon_refinement: bool = InputField(
description="Whether to apply polygon refinement to the masks. This will smooth the edges of the masks slightly and ensure that each mask consists of a single closed polygon (before merging).",
default=True,
)
mask_filter: Literal["all", "largest", "highest_box_score"] = InputField(
description="The filtering to apply to the detected masks before merging them into a final output.",
default="all",
)
@torch.no_grad()
def invoke(self, context: InvocationContext) -> MaskOutput:
# The models expect a 3-channel RGB image.
image_pil = context.images.get_pil(self.image.image_name, mode="RGB")
if len(self.bounding_boxes) == 0:
combined_mask = torch.zeros(image_pil.size[::-1], dtype=torch.bool)
else:
masks = self._segment(context=context, image=image_pil)
masks = self._filter_masks(masks=masks, bounding_boxes=self.bounding_boxes)
# masks contains bool values, so we merge them via max-reduce.
combined_mask, _ = torch.stack(masks).max(dim=0)
mask_tensor_name = context.tensors.save(combined_mask)
height, width = combined_mask.shape
return MaskOutput(mask=TensorField(tensor_name=mask_tensor_name), width=width, height=height)
@staticmethod
def _load_sam_model(model_path: Path):
sam_model = AutoModelForMaskGeneration.from_pretrained(
model_path,
local_files_only=True,
# TODO(ryand): Setting the torch_dtype here doesn't work. Investigate whether fp16 is supported by the
# model, and figure out how to make it work in the pipeline.
# torch_dtype=TorchDevice.choose_torch_dtype(),
)
assert isinstance(sam_model, SamModel)
sam_processor = AutoProcessor.from_pretrained(model_path, local_files_only=True)
assert isinstance(sam_processor, SamProcessor)
return SegmentAnythingPipeline(sam_model=sam_model, sam_processor=sam_processor)
def _segment(
self,
context: InvocationContext,
image: Image.Image,
) -> list[torch.Tensor]:
"""Use Segment Anything (SAM) to generate masks given an image + a set of bounding boxes."""
# Convert the bounding boxes to the SAM input format.
sam_bounding_boxes = [[bb.x_min, bb.y_min, bb.x_max, bb.y_max] for bb in self.bounding_boxes]
with (
context.models.load_remote_model(
source=SEGMENT_ANYTHING_MODEL_IDS[self.model], loader=SegmentAnythingInvocation._load_sam_model
) as sam_pipeline,
):
assert isinstance(sam_pipeline, SegmentAnythingPipeline)
masks = sam_pipeline.segment(image=image, bounding_boxes=sam_bounding_boxes)
masks = self._process_masks(masks)
if self.apply_polygon_refinement:
masks = self._apply_polygon_refinement(masks)
return masks
def _process_masks(self, masks: torch.Tensor) -> list[torch.Tensor]:
"""Convert the tensor output from the Segment Anything model from a tensor of shape
[num_masks, channels, height, width] to a list of tensors of shape [height, width].
"""
assert masks.dtype == torch.bool
# [num_masks, channels, height, width] -> [num_masks, height, width]
masks, _ = masks.max(dim=1)
# Split the first dimension into a list of masks.
return list(masks.cpu().unbind(dim=0))
def _apply_polygon_refinement(self, masks: list[torch.Tensor]) -> list[torch.Tensor]:
"""Apply polygon refinement to the masks.
Convert each mask to a polygon, then back to a mask. This has the following effect:
- Smooth the edges of the mask slightly.
- Ensure that each mask consists of a single closed polygon
- Removes small mask pieces.
- Removes holes from the mask.
"""
# Convert tensor masks to np masks.
np_masks = [mask.cpu().numpy().astype(np.uint8) for mask in masks]
# Apply polygon refinement.
for idx, mask in enumerate(np_masks):
shape = mask.shape
assert len(shape) == 2 # Assert length to satisfy type checker.
polygon = mask_to_polygon(mask)
mask = polygon_to_mask(polygon, shape)
np_masks[idx] = mask
# Convert np masks back to tensor masks.
masks = [torch.tensor(mask, dtype=torch.bool) for mask in np_masks]
return masks
def _filter_masks(self, masks: list[torch.Tensor], bounding_boxes: list[BoundingBoxField]) -> list[torch.Tensor]:
"""Filter the detected masks based on the specified mask filter."""
assert len(masks) == len(bounding_boxes)
if self.mask_filter == "all":
return masks
elif self.mask_filter == "largest":
# Find the largest mask.
return [max(masks, key=lambda x: float(x.sum()))]
elif self.mask_filter == "highest_box_score":
# Find the index of the bounding box with the highest score.
# Note that we fallback to -1.0 if the score is None. This is mainly to satisfy the type checker. In most
# cases the scores should all be non-None when using this filtering mode. That being said, -1.0 is a
# reasonable fallback since the expected score range is [0.0, 1.0].
max_score_idx = max(range(len(bounding_boxes)), key=lambda i: bounding_boxes[i].score or -1.0)
return [masks[max_score_idx]]
else:
raise ValueError(f"Invalid mask filter: {self.mask_filter}")

View File

@@ -91,6 +91,7 @@ class InvokeAIAppConfig(BaseSettings):
db_dir: Path to InvokeAI databases directory.
outputs_dir: Path to directory for outputs.
custom_nodes_dir: Path to directory for custom nodes.
style_presets_dir: Path to directory for style presets.
log_handlers: Log handler. Valid options are "console", "file=<path>", "syslog=path|address:host:port", "http=<url>".
log_format: Log format. Use "plain" for text-only, "color" for colorized output, "legacy" for 2.3-style logging and "syslog" for syslog-style.<br>Valid values: `plain`, `color`, `syslog`, `legacy`
log_level: Emit logging messages at this level or higher.<br>Valid values: `debug`, `info`, `warning`, `error`, `critical`
@@ -153,6 +154,7 @@ class InvokeAIAppConfig(BaseSettings):
db_dir: Path = Field(default=Path("databases"), description="Path to InvokeAI databases directory.")
outputs_dir: Path = Field(default=Path("outputs"), description="Path to directory for outputs.")
custom_nodes_dir: Path = Field(default=Path("nodes"), description="Path to directory for custom nodes.")
style_presets_dir: Path = Field(default=Path("style_presets"), description="Path to directory for style presets.")
# LOGGING
log_handlers: list[str] = Field(default=["console"], description='Log handler. Valid options are "console", "file=<path>", "syslog=path|address:host:port", "http=<url>".')
@@ -300,6 +302,11 @@ class InvokeAIAppConfig(BaseSettings):
"""Path to the models directory, resolved to an absolute path.."""
return self._resolve(self.models_dir)
@property
def style_presets_path(self) -> Path:
"""Path to the style presets directory, resolved to an absolute path.."""
return self._resolve(self.style_presets_dir)
@property
def convert_cache_path(self) -> Path:
"""Path to the converted cache models directory, resolved to an absolute path.."""

View File

@@ -1,46 +1,44 @@
# Copyright (c) 2022 Kyle Schouviller (https://github.com/kyle0654)
import asyncio
import threading
from queue import Empty, Queue
from fastapi_events.dispatcher import dispatch
from invokeai.app.services.events.events_base import EventServiceBase
from invokeai.app.services.events.events_common import (
EventBase,
)
from invokeai.app.services.events.events_common import EventBase
class FastAPIEventService(EventServiceBase):
def __init__(self, event_handler_id: int) -> None:
def __init__(self, event_handler_id: int, loop: asyncio.AbstractEventLoop) -> None:
self.event_handler_id = event_handler_id
self._queue = Queue[EventBase | None]()
self._queue = asyncio.Queue[EventBase | None]()
self._stop_event = threading.Event()
asyncio.create_task(self._dispatch_from_queue(stop_event=self._stop_event))
self._loop = loop
# We need to store a reference to the task so it doesn't get GC'd
# See: https://docs.python.org/3/library/asyncio-task.html#creating-tasks
self._background_tasks: set[asyncio.Task[None]] = set()
task = self._loop.create_task(self._dispatch_from_queue(stop_event=self._stop_event))
self._background_tasks.add(task)
task.add_done_callback(self._background_tasks.remove)
super().__init__()
def stop(self, *args, **kwargs):
self._stop_event.set()
self._queue.put(None)
self._loop.call_soon_threadsafe(self._queue.put_nowait, None)
def dispatch(self, event: EventBase) -> None:
self._queue.put(event)
self._loop.call_soon_threadsafe(self._queue.put_nowait, event)
async def _dispatch_from_queue(self, stop_event: threading.Event):
"""Get events on from the queue and dispatch them, from the correct thread"""
while not stop_event.is_set():
try:
event = self._queue.get(block=False)
event = await self._queue.get()
if not event: # Probably stopping
continue
# Leave the payloads as live pydantic models
dispatch(event, middleware_id=self.event_handler_id, payload_schema_dump=False)
except Empty:
await asyncio.sleep(0.1)
pass
except asyncio.CancelledError as e:
raise e # Raise a proper error

View File

@@ -1,11 +1,10 @@
# Copyright (c) 2022 Kyle Schouviller (https://github.com/kyle0654) and the InvokeAI Team
from pathlib import Path
from queue import Queue
from typing import Dict, Optional, Union
from typing import Optional, Union
from PIL import Image, PngImagePlugin
from PIL.Image import Image as PILImageType
from send2trash import send2trash
from invokeai.app.services.image_files.image_files_base import ImageFileStorageBase
from invokeai.app.services.image_files.image_files_common import (
@@ -20,18 +19,12 @@ from invokeai.app.util.thumbnails import get_thumbnail_name, make_thumbnail
class DiskImageFileStorage(ImageFileStorageBase):
"""Stores images on disk"""
__output_folder: Path
__cache_ids: Queue # TODO: this is an incredibly naive cache
__cache: Dict[Path, PILImageType]
__max_cache_size: int
__invoker: Invoker
def __init__(self, output_folder: Union[str, Path]):
self.__cache = {}
self.__cache_ids = Queue()
self.__cache: dict[Path, PILImageType] = {}
self.__cache_ids = Queue[Path]()
self.__max_cache_size = 10 # TODO: get this from config
self.__output_folder: Path = output_folder if isinstance(output_folder, Path) else Path(output_folder)
self.__output_folder = output_folder if isinstance(output_folder, Path) else Path(output_folder)
self.__thumbnails_folder = self.__output_folder / "thumbnails"
# Validate required output folders at launch
self.__validate_storage_folders()
@@ -103,7 +96,7 @@ class DiskImageFileStorage(ImageFileStorageBase):
image_path = self.get_path(image_name)
if image_path.exists():
send2trash(image_path)
image_path.unlink()
if image_path in self.__cache:
del self.__cache[image_path]
@@ -111,7 +104,7 @@ class DiskImageFileStorage(ImageFileStorageBase):
thumbnail_path = self.get_path(thumbnail_name, True)
if thumbnail_path.exists():
send2trash(thumbnail_path)
thumbnail_path.unlink()
if thumbnail_path in self.__cache:
del self.__cache[thumbnail_path]
except Exception as e:

View File

@@ -4,6 +4,8 @@ from __future__ import annotations
from typing import TYPE_CHECKING
from invokeai.app.services.object_serializer.object_serializer_base import ObjectSerializerBase
from invokeai.app.services.style_preset_images.style_preset_images_base import StylePresetImageFileStorageBase
from invokeai.app.services.style_preset_records.style_preset_records_base import StylePresetRecordsStorageBase
if TYPE_CHECKING:
from logging import Logger
@@ -61,6 +63,8 @@ class InvocationServices:
workflow_records: "WorkflowRecordsStorageBase",
tensors: "ObjectSerializerBase[torch.Tensor]",
conditioning: "ObjectSerializerBase[ConditioningFieldData]",
style_preset_records: "StylePresetRecordsStorageBase",
style_preset_image_files: "StylePresetImageFileStorageBase",
):
self.board_images = board_images
self.board_image_records = board_image_records
@@ -85,3 +89,5 @@ class InvocationServices:
self.workflow_records = workflow_records
self.tensors = tensors
self.conditioning = conditioning
self.style_preset_records = style_preset_records
self.style_preset_image_files = style_preset_image_files

View File

@@ -2,7 +2,6 @@ from pathlib import Path
from PIL import Image
from PIL.Image import Image as PILImageType
from send2trash import send2trash
from invokeai.app.services.invoker import Invoker
from invokeai.app.services.model_images.model_images_base import ModelImageFileStorageBase
@@ -70,7 +69,7 @@ class ModelImageFileStorageDisk(ModelImageFileStorageBase):
if not self._validate_path(path):
raise ModelImageFileNotFoundException
send2trash(path)
path.unlink()
except Exception as e:
raise ModelImageFileDeleteException from e

View File

@@ -16,6 +16,7 @@ from invokeai.app.services.shared.sqlite_migrator.migrations.migration_10 import
from invokeai.app.services.shared.sqlite_migrator.migrations.migration_11 import build_migration_11
from invokeai.app.services.shared.sqlite_migrator.migrations.migration_12 import build_migration_12
from invokeai.app.services.shared.sqlite_migrator.migrations.migration_13 import build_migration_13
from invokeai.app.services.shared.sqlite_migrator.migrations.migration_14 import build_migration_14
from invokeai.app.services.shared.sqlite_migrator.sqlite_migrator_impl import SqliteMigrator
@@ -49,6 +50,7 @@ def init_db(config: InvokeAIAppConfig, logger: Logger, image_files: ImageFileSto
migrator.register_migration(build_migration_11(app_config=config, logger=logger))
migrator.register_migration(build_migration_12(app_config=config))
migrator.register_migration(build_migration_13())
migrator.register_migration(build_migration_14())
migrator.run_migrations()
return db

View File

@@ -0,0 +1,61 @@
import sqlite3
from invokeai.app.services.shared.sqlite_migrator.sqlite_migrator_common import Migration
class Migration14Callback:
def __call__(self, cursor: sqlite3.Cursor) -> None:
self._create_style_presets(cursor)
def _create_style_presets(self, cursor: sqlite3.Cursor) -> None:
"""Create the table used to store style presets."""
tables = [
"""--sql
CREATE TABLE IF NOT EXISTS style_presets (
id TEXT NOT NULL PRIMARY KEY,
name TEXT NOT NULL,
preset_data TEXT NOT NULL,
type TEXT NOT NULL DEFAULT "user",
created_at DATETIME NOT NULL DEFAULT(STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW')),
-- Updated via trigger
updated_at DATETIME NOT NULL DEFAULT(STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW'))
);
"""
]
# Add trigger for `updated_at`.
triggers = [
"""--sql
CREATE TRIGGER IF NOT EXISTS style_presets
AFTER UPDATE
ON style_presets FOR EACH ROW
BEGIN
UPDATE style_presets SET updated_at = STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW')
WHERE id = old.id;
END;
"""
]
# Add indexes for searchable fields
indices = [
"CREATE INDEX IF NOT EXISTS idx_style_presets_name ON style_presets(name);",
]
for stmt in tables + indices + triggers:
cursor.execute(stmt)
def build_migration_14() -> Migration:
"""
Build the migration from database version 13 to 14..
This migration does the following:
- Create the table used to store style presets.
"""
migration_14 = Migration(
from_version=13,
to_version=14,
callback=Migration14Callback(),
)
return migration_14

Binary file not shown.

After

Width:  |  Height:  |  Size: 98 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 138 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 122 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 123 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 160 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 146 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 119 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 117 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 110 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 46 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 79 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 156 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 141 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 96 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 91 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 88 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 107 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 132 KiB

View File

@@ -0,0 +1,33 @@
from abc import ABC, abstractmethod
from pathlib import Path
from PIL.Image import Image as PILImageType
class StylePresetImageFileStorageBase(ABC):
"""Low-level service responsible for storing and retrieving image files."""
@abstractmethod
def get(self, style_preset_id: str) -> PILImageType:
"""Retrieves a style preset image as PIL Image."""
pass
@abstractmethod
def get_path(self, style_preset_id: str) -> Path:
"""Gets the internal path to a style preset image."""
pass
@abstractmethod
def get_url(self, style_preset_id: str) -> str | None:
"""Gets the URL to fetch a style preset image."""
pass
@abstractmethod
def save(self, style_preset_id: str, image: PILImageType) -> None:
"""Saves a style preset image."""
pass
@abstractmethod
def delete(self, style_preset_id: str) -> None:
"""Deletes a style preset image."""
pass

View File

@@ -0,0 +1,19 @@
class StylePresetImageFileNotFoundException(Exception):
"""Raised when an image file is not found in storage."""
def __init__(self, message: str = "Style preset image file not found"):
super().__init__(message)
class StylePresetImageFileSaveException(Exception):
"""Raised when an image cannot be saved."""
def __init__(self, message: str = "Style preset image file not saved"):
super().__init__(message)
class StylePresetImageFileDeleteException(Exception):
"""Raised when an image cannot be deleted."""
def __init__(self, message: str = "Style preset image file not deleted"):
super().__init__(message)

View File

@@ -0,0 +1,88 @@
from pathlib import Path
from PIL import Image
from PIL.Image import Image as PILImageType
from invokeai.app.services.invoker import Invoker
from invokeai.app.services.style_preset_images.style_preset_images_base import StylePresetImageFileStorageBase
from invokeai.app.services.style_preset_images.style_preset_images_common import (
StylePresetImageFileDeleteException,
StylePresetImageFileNotFoundException,
StylePresetImageFileSaveException,
)
from invokeai.app.services.style_preset_records.style_preset_records_common import PresetType
from invokeai.app.util.misc import uuid_string
from invokeai.app.util.thumbnails import make_thumbnail
class StylePresetImageFileStorageDisk(StylePresetImageFileStorageBase):
"""Stores images on disk"""
def __init__(self, style_preset_images_folder: Path):
self._style_preset_images_folder = style_preset_images_folder
self._validate_storage_folders()
def start(self, invoker: Invoker) -> None:
self._invoker = invoker
def get(self, style_preset_id: str) -> PILImageType:
try:
path = self.get_path(style_preset_id)
return Image.open(path)
except FileNotFoundError as e:
raise StylePresetImageFileNotFoundException from e
def save(self, style_preset_id: str, image: PILImageType) -> None:
try:
self._validate_storage_folders()
image_path = self._style_preset_images_folder / (style_preset_id + ".webp")
thumbnail = make_thumbnail(image, 256)
thumbnail.save(image_path, format="webp")
except Exception as e:
raise StylePresetImageFileSaveException from e
def get_path(self, style_preset_id: str) -> Path:
style_preset = self._invoker.services.style_preset_records.get(style_preset_id)
if style_preset.type is PresetType.Default:
default_images_dir = Path(__file__).parent / Path("default_style_preset_images")
path = default_images_dir / (style_preset.name + ".png")
else:
path = self._style_preset_images_folder / (style_preset_id + ".webp")
return path
def get_url(self, style_preset_id: str) -> str | None:
path = self.get_path(style_preset_id)
if not self._validate_path(path):
return
url = self._invoker.services.urls.get_style_preset_image_url(style_preset_id)
# The image URL never changes, so we must add random query string to it to prevent caching
url += f"?{uuid_string()}"
return url
def delete(self, style_preset_id: str) -> None:
try:
path = self.get_path(style_preset_id)
if not self._validate_path(path):
raise StylePresetImageFileNotFoundException
path.unlink()
except StylePresetImageFileNotFoundException as e:
raise StylePresetImageFileNotFoundException from e
except Exception as e:
raise StylePresetImageFileDeleteException from e
def _validate_path(self, path: Path) -> bool:
"""Validates the path given for an image."""
return path.exists()
def _validate_storage_folders(self) -> None:
"""Checks if the required folders exist and create them if they don't"""
self._style_preset_images_folder.mkdir(parents=True, exist_ok=True)

View File

@@ -0,0 +1,146 @@
[
{
"name": "Photography (General)",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt}. photography. f/2.8 macro photo, bokeh, photorealism",
"negative_prompt": "painting, digital art. sketch, blurry"
}
},
{
"name": "Photography (Studio Lighting)",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt}, photography. f/8 photo. centered subject, studio lighting.",
"negative_prompt": "painting, digital art. sketch, blurry"
}
},
{
"name": "Photography (Landscape)",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt}, landscape photograph, f/12, lifelike, highly detailed.",
"negative_prompt": "painting, digital art. sketch, blurry"
}
},
{
"name": "Photography (Portrait)",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt}. photography. portraiture. catch light in eyes. one flash. rembrandt lighting. Soft box. dark shadows. High contrast. 80mm lens. F2.8.",
"negative_prompt": "painting, digital art. sketch, blurry"
}
},
{
"name": "Photography (Black and White)",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt} photography. natural light. 80mm lens. F1.4. strong contrast, hard light. dark contrast. blurred background. black and white",
"negative_prompt": "painting, digital art. sketch, colour+"
}
},
{
"name": "Architectural Visualization",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt}. architectural photography, f/12, luxury, aesthetically pleasing form and function.",
"negative_prompt": "painting, digital art. sketch, blurry"
}
},
{
"name": "Concept Art (Fantasy)",
"type": "default",
"preset_data": {
"positive_prompt": "concept artwork of a {prompt}. (digital painterly art style)++, mythological, (textured 2d dry media brushpack)++, glazed brushstrokes, otherworldly. painting+, illustration+",
"negative_prompt": "photo. distorted, blurry, out of focus. sketch. (cgi, 3d.)++"
}
},
{
"name": "Concept Art (Sci-Fi)",
"type": "default",
"preset_data": {
"positive_prompt": "(concept art)++, {prompt}, (sleek futurism)++, (textured 2d dry media)++, metallic highlights, digital painting style",
"negative_prompt": "photo. distorted, blurry, out of focus. sketch. (cgi, 3d.)++"
}
},
{
"name": "Concept Art (Character)",
"type": "default",
"preset_data": {
"positive_prompt": "(character concept art)++, stylized painterly digital painting of {prompt}, (painterly, impasto. Dry brush.)++",
"negative_prompt": "photo. distorted, blurry, out of focus. sketch. (cgi, 3d.)++"
}
},
{
"name": "Concept Art (Painterly)",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt} oil painting. high contrast. impasto. sfumato. chiaroscuro. Palette knife.",
"negative_prompt": "photo. smooth. border. frame"
}
},
{
"name": "Environment Art",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt} environment artwork, hyper-realistic digital painting style with cinematic composition, atmospheric, depth and detail, voluminous. textured dry brush 2d media",
"negative_prompt": "photo, distorted, blurry, out of focus. sketch."
}
},
{
"name": "Interior Design (Visualization)",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt} interior design photo, gentle shadows, light mid-tones, dimension, mix of smooth and textured surfaces, focus on negative space and clean lines, focus",
"negative_prompt": "photo, distorted. sketch."
}
},
{
"name": "Product Rendering",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt} high quality product photography, 3d rendering with key lighting, shallow depth of field, simple plain background, studio lighting.",
"negative_prompt": "blurry, sketch, messy, dirty. unfinished."
}
},
{
"name": "Sketch",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt} black and white pencil drawing, off-center composition, cross-hatching for shadows, bold strokes, textured paper. sketch+++",
"negative_prompt": "blurry, photo, painting, color. messy, dirty. unfinished. frame, borders."
}
},
{
"name": "Line Art",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt} Line art. bold outline. simplistic. white background. 2d",
"negative_prompt": "photo. digital art. greyscale. solid black. painting"
}
},
{
"name": "Anime",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt} anime++, bold outline, cel-shaded coloring, shounen, seinen",
"negative_prompt": "(photo)+++. greyscale. solid black. painting"
}
},
{
"name": "Illustration",
"type": "default",
"preset_data": {
"positive_prompt": "{prompt} illustration, bold linework, illustrative details, vector art style, flat coloring",
"negative_prompt": "(photo)+++. greyscale. painting, black and white."
}
},
{
"name": "Vehicles",
"type": "default",
"preset_data": {
"positive_prompt": "A weird futuristic normal auto, {prompt} elegant design, nice color, nice wheels",
"negative_prompt": "sketch. digital art. greyscale. painting"
}
}
]

View File

@@ -0,0 +1,42 @@
from abc import ABC, abstractmethod
from invokeai.app.services.style_preset_records.style_preset_records_common import (
PresetType,
StylePresetChanges,
StylePresetRecordDTO,
StylePresetWithoutId,
)
class StylePresetRecordsStorageBase(ABC):
"""Base class for style preset storage services."""
@abstractmethod
def get(self, style_preset_id: str) -> StylePresetRecordDTO:
"""Get style preset by id."""
pass
@abstractmethod
def create(self, style_preset: StylePresetWithoutId) -> StylePresetRecordDTO:
"""Creates a style preset."""
pass
@abstractmethod
def create_many(self, style_presets: list[StylePresetWithoutId]) -> None:
"""Creates many style presets."""
pass
@abstractmethod
def update(self, style_preset_id: str, changes: StylePresetChanges) -> StylePresetRecordDTO:
"""Updates a style preset."""
pass
@abstractmethod
def delete(self, style_preset_id: str) -> None:
"""Deletes a style preset."""
pass
@abstractmethod
def get_many(self, type: PresetType | None = None) -> list[StylePresetRecordDTO]:
"""Gets many workflows."""
pass

View File

@@ -0,0 +1,139 @@
import codecs
import csv
import json
from enum import Enum
from typing import Any, Optional
import pydantic
from fastapi import UploadFile
from pydantic import AliasChoices, BaseModel, ConfigDict, Field, TypeAdapter
from invokeai.app.util.metaenum import MetaEnum
class StylePresetNotFoundError(Exception):
"""Raised when a style preset is not found"""
class PresetData(BaseModel, extra="forbid"):
positive_prompt: str = Field(description="Positive prompt")
negative_prompt: str = Field(description="Negative prompt")
PresetDataValidator = TypeAdapter(PresetData)
class PresetType(str, Enum, metaclass=MetaEnum):
User = "user"
Default = "default"
Project = "project"
class StylePresetChanges(BaseModel, extra="forbid"):
name: Optional[str] = Field(default=None, description="The style preset's new name.")
preset_data: Optional[PresetData] = Field(default=None, description="The updated data for style preset.")
type: Optional[PresetType] = Field(description="The updated type of the style preset")
class StylePresetWithoutId(BaseModel):
name: str = Field(description="The name of the style preset.")
preset_data: PresetData = Field(description="The preset data")
type: PresetType = Field(description="The type of style preset")
class StylePresetRecordDTO(StylePresetWithoutId):
id: str = Field(description="The style preset ID.")
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "StylePresetRecordDTO":
data["preset_data"] = PresetDataValidator.validate_json(data.get("preset_data", ""))
return StylePresetRecordDTOValidator.validate_python(data)
StylePresetRecordDTOValidator = TypeAdapter(StylePresetRecordDTO)
class StylePresetRecordWithImage(StylePresetRecordDTO):
image: Optional[str] = Field(description="The path for image")
class StylePresetImportRow(BaseModel):
name: str = Field(min_length=1, description="The name of the preset.")
positive_prompt: str = Field(
default="",
description="The positive prompt for the preset.",
validation_alias=AliasChoices("positive_prompt", "prompt"),
)
negative_prompt: str = Field(default="", description="The negative prompt for the preset.")
model_config = ConfigDict(str_strip_whitespace=True, extra="forbid")
StylePresetImportList = list[StylePresetImportRow]
StylePresetImportListTypeAdapter = TypeAdapter(StylePresetImportList)
class UnsupportedFileTypeError(ValueError):
"""Raised when an unsupported file type is encountered"""
pass
class InvalidPresetImportDataError(ValueError):
"""Raised when invalid preset import data is encountered"""
pass
async def parse_presets_from_file(file: UploadFile) -> list[StylePresetWithoutId]:
"""Parses style presets from a file. The file must be a CSV or JSON file.
If CSV, the file must have the following columns:
- name
- prompt (or positive_prompt)
- negative_prompt
If JSON, the file must be a list of objects with the following keys:
- name
- prompt (or positive_prompt)
- negative_prompt
Args:
file (UploadFile): The file to parse.
Returns:
list[StylePresetWithoutId]: The parsed style presets.
Raises:
UnsupportedFileTypeError: If the file type is not supported.
InvalidPresetImportDataError: If the data in the file is invalid.
"""
if file.content_type not in ["text/csv", "application/json"]:
raise UnsupportedFileTypeError()
if file.content_type == "text/csv":
csv_reader = csv.DictReader(codecs.iterdecode(file.file, "utf-8"))
data = list(csv_reader)
else: # file.content_type == "application/json":
json_data = await file.read()
data = json.loads(json_data)
try:
imported_presets = StylePresetImportListTypeAdapter.validate_python(data)
style_presets: list[StylePresetWithoutId] = []
for imported in imported_presets:
preset_data = PresetData(positive_prompt=imported.positive_prompt, negative_prompt=imported.negative_prompt)
style_preset = StylePresetWithoutId(name=imported.name, preset_data=preset_data, type=PresetType.User)
style_presets.append(style_preset)
except pydantic.ValidationError as e:
if file.content_type == "text/csv":
msg = "Invalid CSV format: must include columns 'name', 'prompt', and 'negative_prompt' and name cannot be blank"
else: # file.content_type == "application/json":
msg = "Invalid JSON format: must be a list of objects with keys 'name', 'prompt', and 'negative_prompt' and name cannot be blank"
raise InvalidPresetImportDataError(msg) from e
finally:
file.file.close()
return style_presets

View File

@@ -0,0 +1,215 @@
import json
from pathlib import Path
from invokeai.app.services.invoker import Invoker
from invokeai.app.services.shared.sqlite.sqlite_database import SqliteDatabase
from invokeai.app.services.style_preset_records.style_preset_records_base import StylePresetRecordsStorageBase
from invokeai.app.services.style_preset_records.style_preset_records_common import (
PresetType,
StylePresetChanges,
StylePresetNotFoundError,
StylePresetRecordDTO,
StylePresetWithoutId,
)
from invokeai.app.util.misc import uuid_string
class SqliteStylePresetRecordsStorage(StylePresetRecordsStorageBase):
def __init__(self, db: SqliteDatabase) -> None:
super().__init__()
self._lock = db.lock
self._conn = db.conn
self._cursor = self._conn.cursor()
def start(self, invoker: Invoker) -> None:
self._invoker = invoker
self._sync_default_style_presets()
def get(self, style_preset_id: str) -> StylePresetRecordDTO:
"""Gets a style preset by ID."""
try:
self._lock.acquire()
self._cursor.execute(
"""--sql
SELECT *
FROM style_presets
WHERE id = ?;
""",
(style_preset_id,),
)
row = self._cursor.fetchone()
if row is None:
raise StylePresetNotFoundError(f"Style preset with id {style_preset_id} not found")
return StylePresetRecordDTO.from_dict(dict(row))
except Exception:
self._conn.rollback()
raise
finally:
self._lock.release()
def create(self, style_preset: StylePresetWithoutId) -> StylePresetRecordDTO:
style_preset_id = uuid_string()
try:
self._lock.acquire()
self._cursor.execute(
"""--sql
INSERT OR IGNORE INTO style_presets (
id,
name,
preset_data,
type
)
VALUES (?, ?, ?, ?);
""",
(
style_preset_id,
style_preset.name,
style_preset.preset_data.model_dump_json(),
style_preset.type,
),
)
self._conn.commit()
except Exception:
self._conn.rollback()
raise
finally:
self._lock.release()
return self.get(style_preset_id)
def create_many(self, style_presets: list[StylePresetWithoutId]) -> None:
style_preset_ids = []
try:
self._lock.acquire()
for style_preset in style_presets:
style_preset_id = uuid_string()
style_preset_ids.append(style_preset_id)
self._cursor.execute(
"""--sql
INSERT OR IGNORE INTO style_presets (
id,
name,
preset_data,
type
)
VALUES (?, ?, ?, ?);
""",
(
style_preset_id,
style_preset.name,
style_preset.preset_data.model_dump_json(),
style_preset.type,
),
)
self._conn.commit()
except Exception:
self._conn.rollback()
raise
finally:
self._lock.release()
return None
def update(self, style_preset_id: str, changes: StylePresetChanges) -> StylePresetRecordDTO:
try:
self._lock.acquire()
# Change the name of a style preset
if changes.name is not None:
self._cursor.execute(
"""--sql
UPDATE style_presets
SET name = ?
WHERE id = ?;
""",
(changes.name, style_preset_id),
)
# Change the preset data for a style preset
if changes.preset_data is not None:
self._cursor.execute(
"""--sql
UPDATE style_presets
SET preset_data = ?
WHERE id = ?;
""",
(changes.preset_data.model_dump_json(), style_preset_id),
)
self._conn.commit()
except Exception:
self._conn.rollback()
raise
finally:
self._lock.release()
return self.get(style_preset_id)
def delete(self, style_preset_id: str) -> None:
try:
self._lock.acquire()
self._cursor.execute(
"""--sql
DELETE from style_presets
WHERE id = ?;
""",
(style_preset_id,),
)
self._conn.commit()
except Exception:
self._conn.rollback()
raise
finally:
self._lock.release()
return None
def get_many(self, type: PresetType | None = None) -> list[StylePresetRecordDTO]:
try:
self._lock.acquire()
main_query = """
SELECT
*
FROM style_presets
"""
if type is not None:
main_query += "WHERE type = ? "
main_query += "ORDER BY LOWER(name) ASC"
if type is not None:
self._cursor.execute(main_query, (type,))
else:
self._cursor.execute(main_query)
rows = self._cursor.fetchall()
style_presets = [StylePresetRecordDTO.from_dict(dict(row)) for row in rows]
return style_presets
except Exception:
self._conn.rollback()
raise
finally:
self._lock.release()
def _sync_default_style_presets(self) -> None:
"""Syncs default style presets to the database. Internal use only."""
# First delete all existing default style presets
try:
self._lock.acquire()
self._cursor.execute(
"""--sql
DELETE FROM style_presets
WHERE type = "default";
"""
)
self._conn.commit()
except Exception:
self._conn.rollback()
raise
finally:
self._lock.release()
# Next, parse and create the default style presets
with self._lock, open(Path(__file__).parent / Path("default_style_presets.json"), "r") as file:
presets = json.load(file)
for preset in presets:
style_preset = StylePresetWithoutId.model_validate(preset)
self.create(style_preset)

View File

@@ -13,3 +13,8 @@ class UrlServiceBase(ABC):
def get_model_image_url(self, model_key: str) -> str:
"""Gets the URL for a model image"""
pass
@abstractmethod
def get_style_preset_image_url(self, style_preset_id: str) -> str:
"""Gets the URL for a style preset image"""
pass

View File

@@ -19,3 +19,6 @@ class LocalUrlService(UrlServiceBase):
def get_model_image_url(self, model_key: str) -> str:
return f"{self._base_url_v2}/models/i/{model_key}/image"
def get_style_preset_image_url(self, style_preset_id: str) -> str:
return f"{self._base_url}/style_presets/i/{style_preset_id}/image"

View File

@@ -81,7 +81,7 @@ def get_openapi_func(
# Add the output map to the schema
openapi_schema["components"]["schemas"]["InvocationOutputMap"] = {
"type": "object",
"properties": invocation_output_map_properties,
"properties": dict(sorted(invocation_output_map_properties.items())),
"required": invocation_output_map_required,
}

View File

@@ -1,90 +0,0 @@
from pathlib import Path
from typing import Literal
import cv2
import numpy as np
import torch
import torch.nn.functional as F
from einops import repeat
from PIL import Image
from torchvision.transforms import Compose
from invokeai.app.services.config.config_default import get_config
from invokeai.backend.image_util.depth_anything.model.dpt import DPT_DINOv2
from invokeai.backend.image_util.depth_anything.utilities.util import NormalizeImage, PrepareForNet, Resize
from invokeai.backend.util.logging import InvokeAILogger
config = get_config()
logger = InvokeAILogger.get_logger(config=config)
DEPTH_ANYTHING_MODELS = {
"large": "https://huggingface.co/spaces/LiheYoung/Depth-Anything/resolve/main/checkpoints/depth_anything_vitl14.pth?download=true",
"base": "https://huggingface.co/spaces/LiheYoung/Depth-Anything/resolve/main/checkpoints/depth_anything_vitb14.pth?download=true",
"small": "https://huggingface.co/spaces/LiheYoung/Depth-Anything/resolve/main/checkpoints/depth_anything_vits14.pth?download=true",
}
transform = Compose(
[
Resize(
width=518,
height=518,
resize_target=False,
keep_aspect_ratio=True,
ensure_multiple_of=14,
resize_method="lower_bound",
image_interpolation_method=cv2.INTER_CUBIC,
),
NormalizeImage(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
PrepareForNet(),
]
)
class DepthAnythingDetector:
def __init__(self, model: DPT_DINOv2, device: torch.device) -> None:
self.model = model
self.device = device
@staticmethod
def load_model(
model_path: Path, device: torch.device, model_size: Literal["large", "base", "small"] = "small"
) -> DPT_DINOv2:
match model_size:
case "small":
model = DPT_DINOv2(encoder="vits", features=64, out_channels=[48, 96, 192, 384])
case "base":
model = DPT_DINOv2(encoder="vitb", features=128, out_channels=[96, 192, 384, 768])
case "large":
model = DPT_DINOv2(encoder="vitl", features=256, out_channels=[256, 512, 1024, 1024])
model.load_state_dict(torch.load(model_path.as_posix(), map_location="cpu"))
model.eval()
model.to(device)
return model
def __call__(self, image: Image.Image, resolution: int = 512) -> Image.Image:
if not self.model:
logger.warn("DepthAnything model was not loaded. Returning original image")
return image
np_image = np.array(image, dtype=np.uint8)
np_image = np_image[:, :, ::-1] / 255.0
image_height, image_width = np_image.shape[:2]
np_image = transform({"image": np_image})["image"]
tensor_image = torch.from_numpy(np_image).unsqueeze(0).to(self.device)
with torch.no_grad():
depth = self.model(tensor_image)
depth = F.interpolate(depth[None], (image_height, image_width), mode="bilinear", align_corners=False)[0, 0]
depth = (depth - depth.min()) / (depth.max() - depth.min()) * 255.0
depth_map = repeat(depth, "h w -> h w 3").cpu().numpy().astype(np.uint8)
depth_map = Image.fromarray(depth_map)
new_height = int(image_height * (resolution / image_width))
depth_map = depth_map.resize((resolution, new_height))
return depth_map

View File

@@ -0,0 +1,31 @@
from typing import Optional
import torch
from PIL import Image
from transformers.pipelines import DepthEstimationPipeline
from invokeai.backend.raw_model import RawModel
class DepthAnythingPipeline(RawModel):
"""Custom wrapper for the Depth Estimation pipeline from transformers adding compatibility
for Invoke's Model Management System"""
def __init__(self, pipeline: DepthEstimationPipeline) -> None:
self._pipeline = pipeline
def generate_depth(self, image: Image.Image) -> Image.Image:
depth_map = self._pipeline(image)["depth"]
assert isinstance(depth_map, Image.Image)
return depth_map
def to(self, device: Optional[torch.device] = None, dtype: Optional[torch.dtype] = None):
if device is not None and device.type not in {"cpu", "cuda"}:
device = None
self._pipeline.model.to(device=device, dtype=dtype)
self._pipeline.device = self._pipeline.model.device
def calc_size(self) -> int:
from invokeai.backend.model_manager.load.model_util import calc_module_size
return calc_module_size(self._pipeline.model)

View File

@@ -1,145 +0,0 @@
import torch.nn as nn
def _make_scratch(in_shape, out_shape, groups=1, expand=False):
scratch = nn.Module()
out_shape1 = out_shape
out_shape2 = out_shape
out_shape3 = out_shape
if len(in_shape) >= 4:
out_shape4 = out_shape
if expand:
out_shape1 = out_shape
out_shape2 = out_shape * 2
out_shape3 = out_shape * 4
if len(in_shape) >= 4:
out_shape4 = out_shape * 8
scratch.layer1_rn = nn.Conv2d(
in_shape[0], out_shape1, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
scratch.layer2_rn = nn.Conv2d(
in_shape[1], out_shape2, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
scratch.layer3_rn = nn.Conv2d(
in_shape[2], out_shape3, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
if len(in_shape) >= 4:
scratch.layer4_rn = nn.Conv2d(
in_shape[3], out_shape4, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
return scratch
class ResidualConvUnit(nn.Module):
"""Residual convolution module."""
def __init__(self, features, activation, bn):
"""Init.
Args:
features (int): number of features
"""
super().__init__()
self.bn = bn
self.groups = 1
self.conv1 = nn.Conv2d(features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups)
self.conv2 = nn.Conv2d(features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups)
if self.bn:
self.bn1 = nn.BatchNorm2d(features)
self.bn2 = nn.BatchNorm2d(features)
self.activation = activation
self.skip_add = nn.quantized.FloatFunctional()
def forward(self, x):
"""Forward pass.
Args:
x (tensor): input
Returns:
tensor: output
"""
out = self.activation(x)
out = self.conv1(out)
if self.bn:
out = self.bn1(out)
out = self.activation(out)
out = self.conv2(out)
if self.bn:
out = self.bn2(out)
if self.groups > 1:
out = self.conv_merge(out)
return self.skip_add.add(out, x)
class FeatureFusionBlock(nn.Module):
"""Feature fusion block."""
def __init__(self, features, activation, deconv=False, bn=False, expand=False, align_corners=True, size=None):
"""Init.
Args:
features (int): number of features
"""
super(FeatureFusionBlock, self).__init__()
self.deconv = deconv
self.align_corners = align_corners
self.groups = 1
self.expand = expand
out_features = features
if self.expand:
out_features = features // 2
self.out_conv = nn.Conv2d(features, out_features, kernel_size=1, stride=1, padding=0, bias=True, groups=1)
self.resConfUnit1 = ResidualConvUnit(features, activation, bn)
self.resConfUnit2 = ResidualConvUnit(features, activation, bn)
self.skip_add = nn.quantized.FloatFunctional()
self.size = size
def forward(self, *xs, size=None):
"""Forward pass.
Returns:
tensor: output
"""
output = xs[0]
if len(xs) == 2:
res = self.resConfUnit1(xs[1])
output = self.skip_add.add(output, res)
output = self.resConfUnit2(output)
if (size is None) and (self.size is None):
modifier = {"scale_factor": 2}
elif size is None:
modifier = {"size": self.size}
else:
modifier = {"size": size}
output = nn.functional.interpolate(output, **modifier, mode="bilinear", align_corners=self.align_corners)
output = self.out_conv(output)
return output

View File

@@ -1,183 +0,0 @@
from pathlib import Path
import torch
import torch.nn as nn
import torch.nn.functional as F
from invokeai.backend.image_util.depth_anything.model.blocks import FeatureFusionBlock, _make_scratch
torchhub_path = Path(__file__).parent.parent / "torchhub"
def _make_fusion_block(features, use_bn, size=None):
return FeatureFusionBlock(
features,
nn.ReLU(False),
deconv=False,
bn=use_bn,
expand=False,
align_corners=True,
size=size,
)
class DPTHead(nn.Module):
def __init__(self, nclass, in_channels, features, out_channels, use_bn=False, use_clstoken=False):
super(DPTHead, self).__init__()
self.nclass = nclass
self.use_clstoken = use_clstoken
self.projects = nn.ModuleList(
[
nn.Conv2d(
in_channels=in_channels,
out_channels=out_channel,
kernel_size=1,
stride=1,
padding=0,
)
for out_channel in out_channels
]
)
self.resize_layers = nn.ModuleList(
[
nn.ConvTranspose2d(
in_channels=out_channels[0], out_channels=out_channels[0], kernel_size=4, stride=4, padding=0
),
nn.ConvTranspose2d(
in_channels=out_channels[1], out_channels=out_channels[1], kernel_size=2, stride=2, padding=0
),
nn.Identity(),
nn.Conv2d(
in_channels=out_channels[3], out_channels=out_channels[3], kernel_size=3, stride=2, padding=1
),
]
)
if use_clstoken:
self.readout_projects = nn.ModuleList()
for _ in range(len(self.projects)):
self.readout_projects.append(nn.Sequential(nn.Linear(2 * in_channels, in_channels), nn.GELU()))
self.scratch = _make_scratch(
out_channels,
features,
groups=1,
expand=False,
)
self.scratch.stem_transpose = None
self.scratch.refinenet1 = _make_fusion_block(features, use_bn)
self.scratch.refinenet2 = _make_fusion_block(features, use_bn)
self.scratch.refinenet3 = _make_fusion_block(features, use_bn)
self.scratch.refinenet4 = _make_fusion_block(features, use_bn)
head_features_1 = features
head_features_2 = 32
if nclass > 1:
self.scratch.output_conv = nn.Sequential(
nn.Conv2d(head_features_1, head_features_1, kernel_size=3, stride=1, padding=1),
nn.ReLU(True),
nn.Conv2d(head_features_1, nclass, kernel_size=1, stride=1, padding=0),
)
else:
self.scratch.output_conv1 = nn.Conv2d(
head_features_1, head_features_1 // 2, kernel_size=3, stride=1, padding=1
)
self.scratch.output_conv2 = nn.Sequential(
nn.Conv2d(head_features_1 // 2, head_features_2, kernel_size=3, stride=1, padding=1),
nn.ReLU(True),
nn.Conv2d(head_features_2, 1, kernel_size=1, stride=1, padding=0),
nn.ReLU(True),
nn.Identity(),
)
def forward(self, out_features, patch_h, patch_w):
out = []
for i, x in enumerate(out_features):
if self.use_clstoken:
x, cls_token = x[0], x[1]
readout = cls_token.unsqueeze(1).expand_as(x)
x = self.readout_projects[i](torch.cat((x, readout), -1))
else:
x = x[0]
x = x.permute(0, 2, 1).reshape((x.shape[0], x.shape[-1], patch_h, patch_w))
x = self.projects[i](x)
x = self.resize_layers[i](x)
out.append(x)
layer_1, layer_2, layer_3, layer_4 = out
layer_1_rn = self.scratch.layer1_rn(layer_1)
layer_2_rn = self.scratch.layer2_rn(layer_2)
layer_3_rn = self.scratch.layer3_rn(layer_3)
layer_4_rn = self.scratch.layer4_rn(layer_4)
path_4 = self.scratch.refinenet4(layer_4_rn, size=layer_3_rn.shape[2:])
path_3 = self.scratch.refinenet3(path_4, layer_3_rn, size=layer_2_rn.shape[2:])
path_2 = self.scratch.refinenet2(path_3, layer_2_rn, size=layer_1_rn.shape[2:])
path_1 = self.scratch.refinenet1(path_2, layer_1_rn)
out = self.scratch.output_conv1(path_1)
out = F.interpolate(out, (int(patch_h * 14), int(patch_w * 14)), mode="bilinear", align_corners=True)
out = self.scratch.output_conv2(out)
return out
class DPT_DINOv2(nn.Module):
def __init__(
self,
features,
out_channels,
encoder="vitl",
use_bn=False,
use_clstoken=False,
):
super(DPT_DINOv2, self).__init__()
assert encoder in ["vits", "vitb", "vitl"]
# # in case the Internet connection is not stable, please load the DINOv2 locally
# if use_local:
# self.pretrained = torch.hub.load(
# torchhub_path / "facebookresearch_dinov2_main",
# "dinov2_{:}14".format(encoder),
# source="local",
# pretrained=False,
# )
# else:
# self.pretrained = torch.hub.load(
# "facebookresearch/dinov2",
# "dinov2_{:}14".format(encoder),
# )
self.pretrained = torch.hub.load(
"facebookresearch/dinov2",
"dinov2_{:}14".format(encoder),
)
dim = self.pretrained.blocks[0].attn.qkv.in_features
self.depth_head = DPTHead(1, dim, features, out_channels=out_channels, use_bn=use_bn, use_clstoken=use_clstoken)
def forward(self, x):
h, w = x.shape[-2:]
features = self.pretrained.get_intermediate_layers(x, 4, return_class_token=True)
patch_h, patch_w = h // 14, w // 14
depth = self.depth_head(features, patch_h, patch_w)
depth = F.interpolate(depth, size=(h, w), mode="bilinear", align_corners=True)
depth = F.relu(depth)
return depth.squeeze(1)

View File

@@ -1,227 +0,0 @@
import math
import cv2
import numpy as np
import torch
import torch.nn.functional as F
def apply_min_size(sample, size, image_interpolation_method=cv2.INTER_AREA):
"""Rezise the sample to ensure the given size. Keeps aspect ratio.
Args:
sample (dict): sample
size (tuple): image size
Returns:
tuple: new size
"""
shape = list(sample["disparity"].shape)
if shape[0] >= size[0] and shape[1] >= size[1]:
return sample
scale = [0, 0]
scale[0] = size[0] / shape[0]
scale[1] = size[1] / shape[1]
scale = max(scale)
shape[0] = math.ceil(scale * shape[0])
shape[1] = math.ceil(scale * shape[1])
# resize
sample["image"] = cv2.resize(sample["image"], tuple(shape[::-1]), interpolation=image_interpolation_method)
sample["disparity"] = cv2.resize(sample["disparity"], tuple(shape[::-1]), interpolation=cv2.INTER_NEAREST)
sample["mask"] = cv2.resize(
sample["mask"].astype(np.float32),
tuple(shape[::-1]),
interpolation=cv2.INTER_NEAREST,
)
sample["mask"] = sample["mask"].astype(bool)
return tuple(shape)
class Resize(object):
"""Resize sample to given size (width, height)."""
def __init__(
self,
width,
height,
resize_target=True,
keep_aspect_ratio=False,
ensure_multiple_of=1,
resize_method="lower_bound",
image_interpolation_method=cv2.INTER_AREA,
):
"""Init.
Args:
width (int): desired output width
height (int): desired output height
resize_target (bool, optional):
True: Resize the full sample (image, mask, target).
False: Resize image only.
Defaults to True.
keep_aspect_ratio (bool, optional):
True: Keep the aspect ratio of the input sample.
Output sample might not have the given width and height, and
resize behaviour depends on the parameter 'resize_method'.
Defaults to False.
ensure_multiple_of (int, optional):
Output width and height is constrained to be multiple of this parameter.
Defaults to 1.
resize_method (str, optional):
"lower_bound": Output will be at least as large as the given size.
"upper_bound": Output will be at max as large as the given size. (Output size might be smaller
than given size.)
"minimal": Scale as least as possible. (Output size might be smaller than given size.)
Defaults to "lower_bound".
"""
self.__width = width
self.__height = height
self.__resize_target = resize_target
self.__keep_aspect_ratio = keep_aspect_ratio
self.__multiple_of = ensure_multiple_of
self.__resize_method = resize_method
self.__image_interpolation_method = image_interpolation_method
def constrain_to_multiple_of(self, x, min_val=0, max_val=None):
y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int)
if max_val is not None and y > max_val:
y = (np.floor(x / self.__multiple_of) * self.__multiple_of).astype(int)
if y < min_val:
y = (np.ceil(x / self.__multiple_of) * self.__multiple_of).astype(int)
return y
def get_size(self, width, height):
# determine new height and width
scale_height = self.__height / height
scale_width = self.__width / width
if self.__keep_aspect_ratio:
if self.__resize_method == "lower_bound":
# scale such that output size is lower bound
if scale_width > scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "upper_bound":
# scale such that output size is upper bound
if scale_width < scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "minimal":
# scale as least as possbile
if abs(1 - scale_width) < abs(1 - scale_height):
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
else:
raise ValueError(f"resize_method {self.__resize_method} not implemented")
if self.__resize_method == "lower_bound":
new_height = self.constrain_to_multiple_of(scale_height * height, min_val=self.__height)
new_width = self.constrain_to_multiple_of(scale_width * width, min_val=self.__width)
elif self.__resize_method == "upper_bound":
new_height = self.constrain_to_multiple_of(scale_height * height, max_val=self.__height)
new_width = self.constrain_to_multiple_of(scale_width * width, max_val=self.__width)
elif self.__resize_method == "minimal":
new_height = self.constrain_to_multiple_of(scale_height * height)
new_width = self.constrain_to_multiple_of(scale_width * width)
else:
raise ValueError(f"resize_method {self.__resize_method} not implemented")
return (new_width, new_height)
def __call__(self, sample):
width, height = self.get_size(sample["image"].shape[1], sample["image"].shape[0])
# resize sample
sample["image"] = cv2.resize(
sample["image"],
(width, height),
interpolation=self.__image_interpolation_method,
)
if self.__resize_target:
if "disparity" in sample:
sample["disparity"] = cv2.resize(
sample["disparity"],
(width, height),
interpolation=cv2.INTER_NEAREST,
)
if "depth" in sample:
sample["depth"] = cv2.resize(sample["depth"], (width, height), interpolation=cv2.INTER_NEAREST)
if "semseg_mask" in sample:
# sample["semseg_mask"] = cv2.resize(
# sample["semseg_mask"], (width, height), interpolation=cv2.INTER_NEAREST
# )
sample["semseg_mask"] = F.interpolate(
torch.from_numpy(sample["semseg_mask"]).float()[None, None, ...], (height, width), mode="nearest"
).numpy()[0, 0]
if "mask" in sample:
sample["mask"] = cv2.resize(
sample["mask"].astype(np.float32),
(width, height),
interpolation=cv2.INTER_NEAREST,
)
# sample["mask"] = sample["mask"].astype(bool)
# print(sample['image'].shape, sample['depth'].shape)
return sample
class NormalizeImage(object):
"""Normlize image by given mean and std."""
def __init__(self, mean, std):
self.__mean = mean
self.__std = std
def __call__(self, sample):
sample["image"] = (sample["image"] - self.__mean) / self.__std
return sample
class PrepareForNet(object):
"""Prepare sample for usage as network input."""
def __init__(self):
pass
def __call__(self, sample):
image = np.transpose(sample["image"], (2, 0, 1))
sample["image"] = np.ascontiguousarray(image).astype(np.float32)
if "mask" in sample:
sample["mask"] = sample["mask"].astype(np.float32)
sample["mask"] = np.ascontiguousarray(sample["mask"])
if "depth" in sample:
depth = sample["depth"].astype(np.float32)
sample["depth"] = np.ascontiguousarray(depth)
if "semseg_mask" in sample:
sample["semseg_mask"] = sample["semseg_mask"].astype(np.float32)
sample["semseg_mask"] = np.ascontiguousarray(sample["semseg_mask"])
return sample

View File

@@ -0,0 +1,22 @@
from pydantic import BaseModel, ConfigDict
class BoundingBox(BaseModel):
"""Bounding box helper class."""
xmin: int
ymin: int
xmax: int
ymax: int
class DetectionResult(BaseModel):
"""Detection result from Grounding DINO."""
score: float
label: str
box: BoundingBox
model_config = ConfigDict(
# Allow arbitrary types for mask, since it will be a numpy array.
arbitrary_types_allowed=True
)

View File

@@ -0,0 +1,37 @@
from typing import Optional
import torch
from PIL import Image
from transformers.pipelines import ZeroShotObjectDetectionPipeline
from invokeai.backend.image_util.grounding_dino.detection_result import DetectionResult
from invokeai.backend.raw_model import RawModel
class GroundingDinoPipeline(RawModel):
"""A wrapper class for a ZeroShotObjectDetectionPipeline that makes it compatible with the model manager's memory
management system.
"""
def __init__(self, pipeline: ZeroShotObjectDetectionPipeline):
self._pipeline = pipeline
def detect(self, image: Image.Image, candidate_labels: list[str], threshold: float = 0.1) -> list[DetectionResult]:
results = self._pipeline(image=image, candidate_labels=candidate_labels, threshold=threshold)
assert results is not None
results = [DetectionResult.model_validate(result) for result in results]
return results
def to(self, device: Optional[torch.device] = None, dtype: Optional[torch.dtype] = None):
# HACK(ryand): The GroundingDinoPipeline does not work on MPS devices. We only allow it to be moved to CPU or
# CUDA.
if device is not None and device.type not in {"cpu", "cuda"}:
device = None
self._pipeline.model.to(device=device, dtype=dtype)
self._pipeline.device = self._pipeline.model.device
def calc_size(self) -> int:
# HACK(ryand): Fix the circular import issue.
from invokeai.backend.model_manager.load.model_util import calc_module_size
return calc_module_size(self._pipeline.model)

View File

@@ -0,0 +1,50 @@
# This file contains utilities for Grounded-SAM mask refinement based on:
# https://github.com/NielsRogge/Transformers-Tutorials/blob/a39f33ac1557b02ebfb191ea7753e332b5ca933f/Grounding%20DINO/GroundingDINO_with_Segment_Anything.ipynb
import cv2
import numpy as np
import numpy.typing as npt
def mask_to_polygon(mask: npt.NDArray[np.uint8]) -> list[tuple[int, int]]:
"""Convert a binary mask to a polygon.
Returns:
list[list[int]]: List of (x, y) coordinates representing the vertices of the polygon.
"""
# Find contours in the binary mask.
contours, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Find the contour with the largest area.
largest_contour = max(contours, key=cv2.contourArea)
# Extract the vertices of the contour.
polygon = largest_contour.reshape(-1, 2).tolist()
return polygon
def polygon_to_mask(
polygon: list[tuple[int, int]], image_shape: tuple[int, int], fill_value: int = 1
) -> npt.NDArray[np.uint8]:
"""Convert a polygon to a segmentation mask.
Args:
polygon (list): List of (x, y) coordinates representing the vertices of the polygon.
image_shape (tuple): Shape of the image (height, width) for the mask.
fill_value (int): Value to fill the polygon with.
Returns:
np.ndarray: Segmentation mask with the polygon filled (with value 255).
"""
# Create an empty mask.
mask = np.zeros(image_shape, dtype=np.uint8)
# Convert polygon to an array of points.
pts = np.array(polygon, dtype=np.int32)
# Fill the polygon with white color (255).
cv2.fillPoly(mask, [pts], color=(fill_value,))
return mask

View File

@@ -0,0 +1,53 @@
from typing import Optional
import torch
from PIL import Image
from transformers.models.sam import SamModel
from transformers.models.sam.processing_sam import SamProcessor
from invokeai.backend.raw_model import RawModel
class SegmentAnythingPipeline(RawModel):
"""A wrapper class for the transformers SAM model and processor that makes it compatible with the model manager."""
def __init__(self, sam_model: SamModel, sam_processor: SamProcessor):
self._sam_model = sam_model
self._sam_processor = sam_processor
def to(self, device: Optional[torch.device] = None, dtype: Optional[torch.dtype] = None):
# HACK(ryand): The SAM pipeline does not work on MPS devices. We only allow it to be moved to CPU or CUDA.
if device is not None and device.type not in {"cpu", "cuda"}:
device = None
self._sam_model.to(device=device, dtype=dtype)
def calc_size(self) -> int:
# HACK(ryand): Fix the circular import issue.
from invokeai.backend.model_manager.load.model_util import calc_module_size
return calc_module_size(self._sam_model)
def segment(self, image: Image.Image, bounding_boxes: list[list[int]]) -> torch.Tensor:
"""Run the SAM model.
Args:
image (Image.Image): The image to segment.
bounding_boxes (list[list[int]]): The bounding box prompts. Each bounding box is in the format
[xmin, ymin, xmax, ymax].
Returns:
torch.Tensor: The segmentation masks. dtype: torch.bool. shape: [num_masks, channels, height, width].
"""
# Add batch dimension of 1 to the bounding boxes.
boxes = [bounding_boxes]
inputs = self._sam_processor(images=image, input_boxes=boxes, return_tensors="pt").to(self._sam_model.device)
outputs = self._sam_model(**inputs)
masks = self._sam_processor.post_process_masks(
masks=outputs.pred_masks,
original_sizes=inputs.original_sizes,
reshaped_input_sizes=inputs.reshaped_input_sizes,
)
# There should be only one batch.
assert len(masks) == 1
return masks[0]

View File

@@ -3,12 +3,13 @@
import bisect
from pathlib import Path
from typing import Dict, List, Optional, Tuple, Union
from typing import Dict, List, Optional, Set, Tuple, Union
import torch
from safetensors.torch import load_file
from typing_extensions import Self
import invokeai.backend.util.logging as logger
from invokeai.backend.model_manager import BaseModelType
from invokeai.backend.raw_model import RawModel
@@ -46,9 +47,19 @@ class LoRALayerBase:
self.rank = None # set in layer implementation
self.layer_key = layer_key
def get_weight(self, orig_weight: Optional[torch.Tensor]) -> torch.Tensor:
def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
raise NotImplementedError()
def get_bias(self, orig_bias: torch.Tensor) -> Optional[torch.Tensor]:
return self.bias
def get_parameters(self, orig_module: torch.nn.Module) -> Dict[str, torch.Tensor]:
params = {"weight": self.get_weight(orig_module.weight)}
bias = self.get_bias(orig_module.bias)
if bias is not None:
params["bias"] = bias
return params
def calc_size(self) -> int:
model_size = 0
for val in [self.bias]:
@@ -60,6 +71,17 @@ class LoRALayerBase:
if self.bias is not None:
self.bias = self.bias.to(device=device, dtype=dtype)
def check_keys(self, values: Dict[str, torch.Tensor], known_keys: Set[str]):
"""Log a warning if values contains unhandled keys."""
# {"alpha", "bias_indices", "bias_values", "bias_size"} are hard-coded, because they are handled by
# `LoRALayerBase`. Sub-classes should provide the known_keys that they handled.
all_known_keys = known_keys | {"alpha", "bias_indices", "bias_values", "bias_size"}
unknown_keys = set(values.keys()) - all_known_keys
if unknown_keys:
logger.warning(
f"Unexpected keys found in LoRA/LyCORIS layer, model might work incorrectly! Keys: {unknown_keys}"
)
# TODO: find and debug lora/locon with bias
class LoRALayer(LoRALayerBase):
@@ -76,14 +98,19 @@ class LoRALayer(LoRALayerBase):
self.up = values["lora_up.weight"]
self.down = values["lora_down.weight"]
if "lora_mid.weight" in values:
self.mid: Optional[torch.Tensor] = values["lora_mid.weight"]
else:
self.mid = None
self.mid = values.get("lora_mid.weight", None)
self.rank = self.down.shape[0]
self.check_keys(
values,
{
"lora_up.weight",
"lora_down.weight",
"lora_mid.weight",
},
)
def get_weight(self, orig_weight: Optional[torch.Tensor]) -> torch.Tensor:
def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
if self.mid is not None:
up = self.up.reshape(self.up.shape[0], self.up.shape[1])
down = self.down.reshape(self.down.shape[0], self.down.shape[1])
@@ -125,20 +152,23 @@ class LoHALayer(LoRALayerBase):
self.w1_b = values["hada_w1_b"]
self.w2_a = values["hada_w2_a"]
self.w2_b = values["hada_w2_b"]
if "hada_t1" in values:
self.t1: Optional[torch.Tensor] = values["hada_t1"]
else:
self.t1 = None
if "hada_t2" in values:
self.t2: Optional[torch.Tensor] = values["hada_t2"]
else:
self.t2 = None
self.t1 = values.get("hada_t1", None)
self.t2 = values.get("hada_t2", None)
self.rank = self.w1_b.shape[0]
self.check_keys(
values,
{
"hada_w1_a",
"hada_w1_b",
"hada_w2_a",
"hada_w2_b",
"hada_t1",
"hada_t2",
},
)
def get_weight(self, orig_weight: Optional[torch.Tensor]) -> torch.Tensor:
def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
if self.t1 is None:
weight: torch.Tensor = (self.w1_a @ self.w1_b) * (self.w2_a @ self.w2_b)
@@ -186,37 +216,45 @@ class LoKRLayer(LoRALayerBase):
):
super().__init__(layer_key, values)
if "lokr_w1" in values:
self.w1: Optional[torch.Tensor] = values["lokr_w1"]
self.w1_a = None
self.w1_b = None
else:
self.w1 = None
self.w1 = values.get("lokr_w1", None)
if self.w1 is None:
self.w1_a = values["lokr_w1_a"]
self.w1_b = values["lokr_w1_b"]
if "lokr_w2" in values:
self.w2: Optional[torch.Tensor] = values["lokr_w2"]
self.w2_a = None
self.w2_b = None
else:
self.w2 = None
self.w1_b = None
self.w1_a = None
self.w2 = values.get("lokr_w2", None)
if self.w2 is None:
self.w2_a = values["lokr_w2_a"]
self.w2_b = values["lokr_w2_b"]
if "lokr_t2" in values:
self.t2: Optional[torch.Tensor] = values["lokr_t2"]
else:
self.t2 = None
self.w2_a = None
self.w2_b = None
if "lokr_w1_b" in values:
self.rank = values["lokr_w1_b"].shape[0]
elif "lokr_w2_b" in values:
self.rank = values["lokr_w2_b"].shape[0]
self.t2 = values.get("lokr_t2", None)
if self.w1_b is not None:
self.rank = self.w1_b.shape[0]
elif self.w2_b is not None:
self.rank = self.w2_b.shape[0]
else:
self.rank = None # unscaled
def get_weight(self, orig_weight: Optional[torch.Tensor]) -> torch.Tensor:
self.check_keys(
values,
{
"lokr_w1",
"lokr_w1_a",
"lokr_w1_b",
"lokr_w2",
"lokr_w2_a",
"lokr_w2_b",
"lokr_t2",
},
)
def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
w1: Optional[torch.Tensor] = self.w1
if w1 is None:
assert self.w1_a is not None
@@ -272,7 +310,9 @@ class LoKRLayer(LoRALayerBase):
class FullLayer(LoRALayerBase):
# bias handled in LoRALayerBase(calc_size, to)
# weight: torch.Tensor
# bias: Optional[torch.Tensor]
def __init__(
self,
@@ -282,15 +322,12 @@ class FullLayer(LoRALayerBase):
super().__init__(layer_key, values)
self.weight = values["diff"]
if len(values.keys()) > 1:
_keys = list(values.keys())
_keys.remove("diff")
raise NotImplementedError(f"Unexpected keys in lora diff layer: {_keys}")
self.bias = values.get("diff_b", None)
self.rank = None # unscaled
self.check_keys(values, {"diff", "diff_b"})
def get_weight(self, orig_weight: Optional[torch.Tensor]) -> torch.Tensor:
def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
return self.weight
def calc_size(self) -> int:
@@ -319,8 +356,9 @@ class IA3Layer(LoRALayerBase):
self.on_input = values["on_input"]
self.rank = None # unscaled
self.check_keys(values, {"weight", "on_input"})
def get_weight(self, orig_weight: Optional[torch.Tensor]) -> torch.Tensor:
def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
weight = self.weight
if not self.on_input:
weight = weight.reshape(-1, 1)
@@ -340,7 +378,39 @@ class IA3Layer(LoRALayerBase):
self.on_input = self.on_input.to(device=device, dtype=dtype)
AnyLoRALayer = Union[LoRALayer, LoHALayer, LoKRLayer, FullLayer, IA3Layer]
class NormLayer(LoRALayerBase):
# bias handled in LoRALayerBase(calc_size, to)
# weight: torch.Tensor
# bias: Optional[torch.Tensor]
def __init__(
self,
layer_key: str,
values: Dict[str, torch.Tensor],
):
super().__init__(layer_key, values)
self.weight = values["w_norm"]
self.bias = values.get("b_norm", None)
self.rank = None # unscaled
self.check_keys(values, {"w_norm", "b_norm"})
def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
return self.weight
def calc_size(self) -> int:
model_size = super().calc_size()
model_size += self.weight.nelement() * self.weight.element_size()
return model_size
def to(self, device: Optional[torch.device] = None, dtype: Optional[torch.dtype] = None) -> None:
super().to(device=device, dtype=dtype)
self.weight = self.weight.to(device=device, dtype=dtype)
AnyLoRALayer = Union[LoRALayer, LoHALayer, LoKRLayer, FullLayer, IA3Layer, NormLayer]
class LoRAModelRaw(RawModel): # (torch.nn.Module):
@@ -458,16 +528,19 @@ class LoRAModelRaw(RawModel): # (torch.nn.Module):
state_dict = cls._convert_sdxl_keys_to_diffusers_format(state_dict)
for layer_key, values in state_dict.items():
# Detect layers according to LyCORIS detection logic(`weight_list_det`)
# https://github.com/KohakuBlueleaf/LyCORIS/tree/8ad8000efb79e2b879054da8c9356e6143591bad/lycoris/modules
# lora and locon
if "lora_down.weight" in values:
if "lora_up.weight" in values:
layer: AnyLoRALayer = LoRALayer(layer_key, values)
# loha
elif "hada_w1_b" in values:
elif "hada_w1_a" in values:
layer = LoHALayer(layer_key, values)
# lokr
elif "lokr_w1_b" in values or "lokr_w1" in values:
elif "lokr_w1" in values or "lokr_w1_a" in values:
layer = LoKRLayer(layer_key, values)
# diff
@@ -475,9 +548,13 @@ class LoRAModelRaw(RawModel): # (torch.nn.Module):
layer = FullLayer(layer_key, values)
# ia3
elif "weight" in values and "on_input" in values:
elif "on_input" in values:
layer = IA3Layer(layer_key, values)
# norms
elif "w_norm" in values:
layer = NormLayer(layer_key, values)
else:
print(f">> Encountered unknown lora layer module in {model.name}: {layer_key} - {list(values.keys())}")
raise Exception("Unknown lora format!")

View File

@@ -98,6 +98,9 @@ class StableDiffusionDiffusersModel(GenericDiffusersLoader):
ModelVariantType.Normal: StableDiffusionXLPipeline,
ModelVariantType.Inpaint: StableDiffusionXLInpaintPipeline,
},
BaseModelType.StableDiffusionXLRefiner: {
ModelVariantType.Normal: StableDiffusionXLPipeline,
},
}
assert isinstance(config, MainCheckpointConfig)
try:

View File

@@ -11,6 +11,9 @@ from diffusers.pipelines.pipeline_utils import DiffusionPipeline
from diffusers.schedulers.scheduling_utils import SchedulerMixin
from transformers import CLIPTokenizer
from invokeai.backend.image_util.depth_anything.depth_anything_pipeline import DepthAnythingPipeline
from invokeai.backend.image_util.grounding_dino.grounding_dino_pipeline import GroundingDinoPipeline
from invokeai.backend.image_util.segment_anything.segment_anything_pipeline import SegmentAnythingPipeline
from invokeai.backend.ip_adapter.ip_adapter import IPAdapter
from invokeai.backend.lora import LoRAModelRaw
from invokeai.backend.model_manager.config import AnyModel
@@ -34,7 +37,18 @@ def calc_model_size_by_data(logger: logging.Logger, model: AnyModel) -> int:
elif isinstance(model, CLIPTokenizer):
# TODO(ryand): Accurately calculate the tokenizer's size. It's small enough that it shouldn't matter for now.
return 0
elif isinstance(model, (TextualInversionModelRaw, IPAdapter, LoRAModelRaw, SpandrelImageToImageModel)):
elif isinstance(
model,
(
TextualInversionModelRaw,
IPAdapter,
LoRAModelRaw,
SpandrelImageToImageModel,
GroundingDinoPipeline,
SegmentAnythingPipeline,
DepthAnythingPipeline,
),
):
return model.calc_size()
else:
# TODO(ryand): Promote this from a log to an exception once we are confident that we are handling all of the

View File

@@ -187,164 +187,171 @@ STARTER_MODELS: list[StarterModel] = [
# endregion
# region ControlNet
StarterModel(
name="QRCode Monster",
name="QRCode Monster v2 (SD1.5)",
base=BaseModelType.StableDiffusion1,
source="monster-labs/control_v1p_sd15_qrcode_monster",
description="Controlnet model that generates scannable creative QR codes",
source="monster-labs/control_v1p_sd15_qrcode_monster::v2",
description="ControlNet model that generates scannable creative QR codes",
type=ModelType.ControlNet,
),
StarterModel(
name="QRCode Monster (SDXL)",
base=BaseModelType.StableDiffusionXL,
source="monster-labs/control_v1p_sdxl_qrcode_monster",
description="ControlNet model that generates scannable creative QR codes",
type=ModelType.ControlNet,
),
StarterModel(
name="canny",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_canny",
description="Controlnet weights trained on sd-1.5 with canny conditioning.",
description="ControlNet weights trained on sd-1.5 with canny conditioning.",
type=ModelType.ControlNet,
),
StarterModel(
name="inpaint",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_inpaint",
description="Controlnet weights trained on sd-1.5 with canny conditioning, inpaint version",
description="ControlNet weights trained on sd-1.5 with canny conditioning, inpaint version",
type=ModelType.ControlNet,
),
StarterModel(
name="mlsd",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_mlsd",
description="Controlnet weights trained on sd-1.5 with canny conditioning, MLSD version",
description="ControlNet weights trained on sd-1.5 with canny conditioning, MLSD version",
type=ModelType.ControlNet,
),
StarterModel(
name="depth",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11f1p_sd15_depth",
description="Controlnet weights trained on sd-1.5 with depth conditioning",
description="ControlNet weights trained on sd-1.5 with depth conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="normal_bae",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_normalbae",
description="Controlnet weights trained on sd-1.5 with normalbae image conditioning",
description="ControlNet weights trained on sd-1.5 with normalbae image conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="seg",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_seg",
description="Controlnet weights trained on sd-1.5 with seg image conditioning",
description="ControlNet weights trained on sd-1.5 with seg image conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="lineart",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_lineart",
description="Controlnet weights trained on sd-1.5 with lineart image conditioning",
description="ControlNet weights trained on sd-1.5 with lineart image conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="lineart_anime",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15s2_lineart_anime",
description="Controlnet weights trained on sd-1.5 with anime image conditioning",
description="ControlNet weights trained on sd-1.5 with anime image conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="openpose",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_openpose",
description="Controlnet weights trained on sd-1.5 with openpose image conditioning",
description="ControlNet weights trained on sd-1.5 with openpose image conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="scribble",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_scribble",
description="Controlnet weights trained on sd-1.5 with scribble image conditioning",
description="ControlNet weights trained on sd-1.5 with scribble image conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="softedge",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11p_sd15_softedge",
description="Controlnet weights trained on sd-1.5 with soft edge conditioning",
description="ControlNet weights trained on sd-1.5 with soft edge conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="shuffle",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11e_sd15_shuffle",
description="Controlnet weights trained on sd-1.5 with shuffle image conditioning",
description="ControlNet weights trained on sd-1.5 with shuffle image conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="tile",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11f1e_sd15_tile",
description="Controlnet weights trained on sd-1.5 with tiled image conditioning",
description="ControlNet weights trained on sd-1.5 with tiled image conditioning",
type=ModelType.ControlNet,
),
StarterModel(
name="ip2p",
base=BaseModelType.StableDiffusion1,
source="lllyasviel/control_v11e_sd15_ip2p",
description="Controlnet weights trained on sd-1.5 with ip2p conditioning.",
description="ControlNet weights trained on sd-1.5 with ip2p conditioning.",
type=ModelType.ControlNet,
),
StarterModel(
name="canny-sdxl",
base=BaseModelType.StableDiffusionXL,
source="xinsir/controlnet-canny-sdxl-1.0",
description="Controlnet weights trained on sdxl-1.0 with canny conditioning, by Xinsir.",
source="xinsir/controlNet-canny-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 with canny conditioning, by Xinsir.",
type=ModelType.ControlNet,
),
StarterModel(
name="depth-sdxl",
base=BaseModelType.StableDiffusionXL,
source="diffusers/controlnet-depth-sdxl-1.0",
description="Controlnet weights trained on sdxl-1.0 with depth conditioning.",
source="diffusers/controlNet-depth-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 with depth conditioning.",
type=ModelType.ControlNet,
),
StarterModel(
name="softedge-dexined-sdxl",
base=BaseModelType.StableDiffusionXL,
source="SargeZT/controlnet-sd-xl-1.0-softedge-dexined",
description="Controlnet weights trained on sdxl-1.0 with dexined soft edge preprocessing.",
source="SargeZT/controlNet-sd-xl-1.0-softedge-dexined",
description="ControlNet weights trained on sdxl-1.0 with dexined soft edge preprocessing.",
type=ModelType.ControlNet,
),
StarterModel(
name="depth-16bit-zoe-sdxl",
base=BaseModelType.StableDiffusionXL,
source="SargeZT/controlnet-sd-xl-1.0-depth-16bit-zoe",
description="Controlnet weights trained on sdxl-1.0 with Zoe's preprocessor (16 bits).",
source="SargeZT/controlNet-sd-xl-1.0-depth-16bit-zoe",
description="ControlNet weights trained on sdxl-1.0 with Zoe's preprocessor (16 bits).",
type=ModelType.ControlNet,
),
StarterModel(
name="depth-zoe-sdxl",
base=BaseModelType.StableDiffusionXL,
source="diffusers/controlnet-zoe-depth-sdxl-1.0",
description="Controlnet weights trained on sdxl-1.0 with Zoe's preprocessor (32 bits).",
source="diffusers/controlNet-zoe-depth-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 with Zoe's preprocessor (32 bits).",
type=ModelType.ControlNet,
),
StarterModel(
name="openpose-sdxl",
base=BaseModelType.StableDiffusionXL,
source="xinsir/controlnet-openpose-sdxl-1.0",
description="Controlnet weights trained on sdxl-1.0 compatible with the DWPose processor by Xinsir.",
source="xinsir/controlNet-openpose-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 compatible with the DWPose processor by Xinsir.",
type=ModelType.ControlNet,
),
StarterModel(
name="scribble-sdxl",
base=BaseModelType.StableDiffusionXL,
source="xinsir/controlnet-scribble-sdxl-1.0",
description="Controlnet weights trained on sdxl-1.0 compatible with various lineart processors and black/white sketches by Xinsir.",
source="xinsir/controlNet-scribble-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 compatible with various lineart processors and black/white sketches by Xinsir.",
type=ModelType.ControlNet,
),
StarterModel(
name="tile-sdxl",
base=BaseModelType.StableDiffusionXL,
source="xinsir/controlnet-tile-sdxl-1.0",
description="Controlnet weights trained on sdxl-1.0 with tiled image conditioning",
source="xinsir/controlNet-tile-sdxl-1.0",
description="ControlNet weights trained on sdxl-1.0 with tiled image conditioning",
type=ModelType.ControlNet,
),
# endregion

View File

@@ -62,13 +62,7 @@ def filter_files(
# downloading random checkpoints that might also be in the repo. However there is no guarantee
# that a checkpoint doesn't contain "model" in its name, and no guarantee that future diffusers models
# will adhere to this naming convention, so this is an area to be careful of.
#
# On July 24, 2024, this regex filter was modified to support downloading the `microsoft/Phi-3-mini-4k-instruct`
# model. I am making this note in case it is relevant as we continue to improve this logic and make it less
# brittle.
# - Before: r"model(\.[^.]+)?\.(safetensors|bin|onnx|xml|pth|pt|ckpt|msgpack)$"
# - After: r"model.*\.(safetensors|bin|onnx|xml|pth|pt|ckpt|msgpack)$"
elif re.search(r"model.*\.(safetensors|bin|onnx|xml|pth|pt|ckpt|msgpack)$", file.name):
elif re.search(r"model(\.[^.]+)?\.(safetensors|bin|onnx|xml|pth|pt|ckpt|msgpack)$", file.name):
paths.append(file)
# limit search to subfolder if requested

View File

@@ -17,8 +17,9 @@ from invokeai.backend.lora import LoRAModelRaw
from invokeai.backend.model_manager import AnyModel
from invokeai.backend.model_manager.load.optimizations import skip_torch_weight_init
from invokeai.backend.onnx.onnx_runtime import IAIOnnxRuntimeModel
from invokeai.backend.stable_diffusion.extensions.lora import LoRAExt
from invokeai.backend.textual_inversion import TextualInversionManager, TextualInversionModelRaw
from invokeai.backend.util.devices import TorchDevice
from invokeai.backend.util.original_weights_storage import OriginalWeightsStorage
"""
loras = [
@@ -85,13 +86,13 @@ class ModelPatcher:
cls,
unet: UNet2DConditionModel,
loras: Iterator[Tuple[LoRAModelRaw, float]],
model_state_dict: Optional[Dict[str, torch.Tensor]] = None,
cached_weights: Optional[Dict[str, torch.Tensor]] = None,
) -> Generator[None, None, None]:
with cls.apply_lora(
unet,
loras=loras,
prefix="lora_unet_",
model_state_dict=model_state_dict,
cached_weights=cached_weights,
):
yield
@@ -101,9 +102,9 @@ class ModelPatcher:
cls,
text_encoder: CLIPTextModel,
loras: Iterator[Tuple[LoRAModelRaw, float]],
model_state_dict: Optional[Dict[str, torch.Tensor]] = None,
cached_weights: Optional[Dict[str, torch.Tensor]] = None,
) -> Generator[None, None, None]:
with cls.apply_lora(text_encoder, loras=loras, prefix="lora_te_", model_state_dict=model_state_dict):
with cls.apply_lora(text_encoder, loras=loras, prefix="lora_te_", cached_weights=cached_weights):
yield
@classmethod
@@ -113,7 +114,7 @@ class ModelPatcher:
model: AnyModel,
loras: Iterator[Tuple[LoRAModelRaw, float]],
prefix: str,
model_state_dict: Optional[Dict[str, torch.Tensor]] = None,
cached_weights: Optional[Dict[str, torch.Tensor]] = None,
) -> Generator[None, None, None]:
"""
Apply one or more LoRAs to a model.
@@ -121,66 +122,26 @@ class ModelPatcher:
:param model: The model to patch.
:param loras: An iterator that returns the LoRA to patch in and its patch weight.
:param prefix: A string prefix that precedes keys used in the LoRAs weight layers.
:model_state_dict: Read-only copy of the model's state dict in CPU, for unpatching purposes.
:cached_weights: Read-only copy of the model's state dict in CPU, for unpatching purposes.
"""
original_weights = {}
original_weights = OriginalWeightsStorage(cached_weights)
try:
with torch.no_grad():
for lora, lora_weight in loras:
# assert lora.device.type == "cpu"
for layer_key, layer in lora.layers.items():
if not layer_key.startswith(prefix):
continue
for lora_model, lora_weight in loras:
LoRAExt.patch_model(
model=model,
prefix=prefix,
lora=lora_model,
lora_weight=lora_weight,
original_weights=original_weights,
)
del lora_model
# TODO(ryand): A non-negligible amount of time is currently spent resolving LoRA keys. This
# should be improved in the following ways:
# 1. The key mapping could be more-efficiently pre-computed. This would save time every time a
# LoRA model is applied.
# 2. From an API perspective, there's no reason that the `ModelPatcher` should be aware of the
# intricacies of Stable Diffusion key resolution. It should just expect the input LoRA
# weights to have valid keys.
assert isinstance(model, torch.nn.Module)
module_key, module = cls._resolve_lora_key(model, layer_key, prefix)
# All of the LoRA weight calculations will be done on the same device as the module weight.
# (Performance will be best if this is a CUDA device.)
device = module.weight.device
dtype = module.weight.dtype
if module_key not in original_weights:
if model_state_dict is not None: # we were provided with the CPU copy of the state dict
original_weights[module_key] = model_state_dict[module_key + ".weight"]
else:
original_weights[module_key] = module.weight.detach().to(device="cpu", copy=True)
layer_scale = layer.alpha / layer.rank if (layer.alpha and layer.rank) else 1.0
# We intentionally move to the target device first, then cast. Experimentally, this was found to
# be significantly faster for 16-bit CPU tensors being moved to a CUDA device than doing the
# same thing in a single call to '.to(...)'.
layer.to(device=device)
layer.to(dtype=torch.float32)
# TODO(ryand): Using torch.autocast(...) over explicit casting may offer a speed benefit on CUDA
# devices here. Experimentally, it was found to be very slow on CPU. More investigation needed.
layer_weight = layer.get_weight(module.weight) * (lora_weight * layer_scale)
layer.to(device=TorchDevice.CPU_DEVICE)
assert isinstance(layer_weight, torch.Tensor) # mypy thinks layer_weight is a float|Any ??!
if module.weight.shape != layer_weight.shape:
# TODO: debug on lycoris
assert hasattr(layer_weight, "reshape")
layer_weight = layer_weight.reshape(module.weight.shape)
assert isinstance(layer_weight, torch.Tensor) # mypy thinks layer_weight is a float|Any ??!
module.weight += layer_weight.to(dtype=dtype)
yield # wait for context manager exit
yield
finally:
assert hasattr(model, "get_submodule") # mypy not picking up fact that torch.nn.Module has get_submodule()
with torch.no_grad():
for module_key, weight in original_weights.items():
model.get_submodule(module_key).weight.copy_(weight)
for param_key, weight in original_weights.get_changed_weights():
model.get_parameter(param_key).copy_(weight)
@classmethod
@contextmanager

View File

@@ -7,11 +7,9 @@ from invokeai.backend.stable_diffusion.diffusers_pipeline import ( # noqa: F401
StableDiffusionGeneratorPipeline,
)
from invokeai.backend.stable_diffusion.diffusion import InvokeAIDiffuserComponent # noqa: F401
from invokeai.backend.stable_diffusion.seamless import set_seamless # noqa: F401
__all__ = [
"PipelineIntermediateState",
"StableDiffusionGeneratorPipeline",
"InvokeAIDiffuserComponent",
"set_seamless",
]

View File

@@ -2,14 +2,14 @@ from __future__ import annotations
from contextlib import contextmanager
from dataclasses import dataclass
from typing import TYPE_CHECKING, Callable, Dict, List, Optional
from typing import TYPE_CHECKING, Callable, Dict, List
import torch
from diffusers import UNet2DConditionModel
if TYPE_CHECKING:
from invokeai.backend.stable_diffusion.denoise_context import DenoiseContext
from invokeai.backend.stable_diffusion.extension_callback_type import ExtensionCallbackType
from invokeai.backend.util.original_weights_storage import OriginalWeightsStorage
@dataclass
@@ -56,5 +56,17 @@ class ExtensionBase:
yield None
@contextmanager
def patch_unet(self, unet: UNet2DConditionModel, cached_weights: Optional[Dict[str, torch.Tensor]] = None):
yield None
def patch_unet(self, unet: UNet2DConditionModel, original_weights: OriginalWeightsStorage):
"""A context manager for applying patches to the UNet model. The context manager's lifetime spans the entire
diffusion process. Weight unpatching is handled upstream, and is achieved by saving unchanged weights by
`original_weights.save` function. Note that this enables some performance optimization by avoiding redundant
operations. All other patches (e.g. changes to tensor shapes, function monkey-patches, etc.) should be unpatched
by this context manager.
Args:
unet (UNet2DConditionModel): The UNet model on execution device to patch.
original_weights (OriginalWeightsStorage): A storage with copy of the model's original weights in CPU, for
unpatching purposes. Extension should save tensor which being modified in this storage, also extensions
can access original weights values.
"""
yield

View File

@@ -1,15 +1,15 @@
from __future__ import annotations
from contextlib import contextmanager
from typing import TYPE_CHECKING, Dict, Optional
from typing import TYPE_CHECKING
import torch
from diffusers import UNet2DConditionModel
from invokeai.backend.stable_diffusion.extensions.base import ExtensionBase
if TYPE_CHECKING:
from invokeai.app.shared.models import FreeUConfig
from invokeai.backend.util.original_weights_storage import OriginalWeightsStorage
class FreeUExt(ExtensionBase):
@@ -21,7 +21,7 @@ class FreeUExt(ExtensionBase):
self._freeu_config = freeu_config
@contextmanager
def patch_unet(self, unet: UNet2DConditionModel, cached_weights: Optional[Dict[str, torch.Tensor]] = None):
def patch_unet(self, unet: UNet2DConditionModel, original_weights: OriginalWeightsStorage):
unet.enable_freeu(
b1=self._freeu_config.b1,
b2=self._freeu_config.b2,

View File

@@ -0,0 +1,120 @@
from __future__ import annotations
from typing import TYPE_CHECKING, Optional
import einops
import torch
from diffusers import UNet2DConditionModel
from invokeai.backend.stable_diffusion.extension_callback_type import ExtensionCallbackType
from invokeai.backend.stable_diffusion.extensions.base import ExtensionBase, callback
if TYPE_CHECKING:
from invokeai.backend.stable_diffusion.denoise_context import DenoiseContext
class InpaintExt(ExtensionBase):
"""An extension for inpainting with non-inpainting models. See `InpaintModelExt` for inpainting with inpainting
models.
"""
def __init__(
self,
mask: torch.Tensor,
is_gradient_mask: bool,
):
"""Initialize InpaintExt.
Args:
mask (torch.Tensor): The inpainting mask. Shape: (1, 1, latent_height, latent_width). Values are
expected to be in the range [0, 1]. A value of 1 means that the corresponding 'pixel' should not be
inpainted.
is_gradient_mask (bool): If True, mask is interpreted as a gradient mask meaning that the mask values range
from 0 to 1. If False, mask is interpreted as binary mask meaning that the mask values are either 0 or
1.
"""
super().__init__()
self._mask = mask
self._is_gradient_mask = is_gradient_mask
# Noise, which used to noisify unmasked part of image
# if noise provided to context, then it will be used
# if no noise provided, then noise will be generated based on seed
self._noise: Optional[torch.Tensor] = None
@staticmethod
def _is_normal_model(unet: UNet2DConditionModel):
"""Checks if the provided UNet belongs to a regular model.
The `in_channels` of a UNet vary depending on model type:
- normal - 4
- depth - 5
- inpaint - 9
"""
return unet.conv_in.in_channels == 4
def _apply_mask(self, ctx: DenoiseContext, latents: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
batch_size = latents.size(0)
mask = einops.repeat(self._mask, "b c h w -> (repeat b) c h w", repeat=batch_size)
if t.dim() == 0:
# some schedulers expect t to be one-dimensional.
# TODO: file diffusers bug about inconsistency?
t = einops.repeat(t, "-> batch", batch=batch_size)
# Noise shouldn't be re-randomized between steps here. The multistep schedulers
# get very confused about what is happening from step to step when we do that.
mask_latents = ctx.scheduler.add_noise(ctx.inputs.orig_latents, self._noise, t)
# TODO: Do we need to also apply scheduler.scale_model_input? Or is add_noise appropriately scaled already?
# mask_latents = self.scheduler.scale_model_input(mask_latents, t)
mask_latents = einops.repeat(mask_latents, "b c h w -> (repeat b) c h w", repeat=batch_size)
if self._is_gradient_mask:
threshold = (t.item()) / ctx.scheduler.config.num_train_timesteps
mask_bool = mask < 1 - threshold
masked_input = torch.where(mask_bool, latents, mask_latents)
else:
masked_input = torch.lerp(latents, mask_latents.to(dtype=latents.dtype), mask.to(dtype=latents.dtype))
return masked_input
@callback(ExtensionCallbackType.PRE_DENOISE_LOOP)
def init_tensors(self, ctx: DenoiseContext):
if not self._is_normal_model(ctx.unet):
raise ValueError(
"InpaintExt should be used only on normal (non-inpainting) models. This could be caused by an "
"inpainting model that was incorrectly marked as a non-inpainting model. In some cases, this can be "
"fixed by removing and re-adding the model (so that it gets re-probed)."
)
self._mask = self._mask.to(device=ctx.latents.device, dtype=ctx.latents.dtype)
self._noise = ctx.inputs.noise
# 'noise' might be None if the latents have already been noised (e.g. when running the SDXL refiner).
# We still need noise for inpainting, so we generate it from the seed here.
if self._noise is None:
self._noise = torch.randn(
ctx.latents.shape,
dtype=torch.float32,
device="cpu",
generator=torch.Generator(device="cpu").manual_seed(ctx.seed),
).to(device=ctx.latents.device, dtype=ctx.latents.dtype)
# Use negative order to make extensions with default order work with patched latents
@callback(ExtensionCallbackType.PRE_STEP, order=-100)
def apply_mask_to_initial_latents(self, ctx: DenoiseContext):
ctx.latents = self._apply_mask(ctx, ctx.latents, ctx.timestep)
# TODO: redo this with preview events rewrite
# Use negative order to make extensions with default order work with patched latents
@callback(ExtensionCallbackType.POST_STEP, order=-100)
def apply_mask_to_step_output(self, ctx: DenoiseContext):
timestep = ctx.scheduler.timesteps[-1]
if hasattr(ctx.step_output, "denoised"):
ctx.step_output.denoised = self._apply_mask(ctx, ctx.step_output.denoised, timestep)
elif hasattr(ctx.step_output, "pred_original_sample"):
ctx.step_output.pred_original_sample = self._apply_mask(ctx, ctx.step_output.pred_original_sample, timestep)
else:
ctx.step_output.pred_original_sample = self._apply_mask(ctx, ctx.step_output.prev_sample, timestep)
# Restore unmasked part after the last step is completed
@callback(ExtensionCallbackType.POST_DENOISE_LOOP)
def restore_unmasked(self, ctx: DenoiseContext):
if self._is_gradient_mask:
ctx.latents = torch.where(self._mask < 1, ctx.latents, ctx.inputs.orig_latents)
else:
ctx.latents = torch.lerp(ctx.latents, ctx.inputs.orig_latents, self._mask)

View File

@@ -0,0 +1,88 @@
from __future__ import annotations
from typing import TYPE_CHECKING, Optional
import torch
from diffusers import UNet2DConditionModel
from invokeai.backend.stable_diffusion.extension_callback_type import ExtensionCallbackType
from invokeai.backend.stable_diffusion.extensions.base import ExtensionBase, callback
if TYPE_CHECKING:
from invokeai.backend.stable_diffusion.denoise_context import DenoiseContext
class InpaintModelExt(ExtensionBase):
"""An extension for inpainting with inpainting models. See `InpaintExt` for inpainting with non-inpainting
models.
"""
def __init__(
self,
mask: Optional[torch.Tensor],
masked_latents: Optional[torch.Tensor],
is_gradient_mask: bool,
):
"""Initialize InpaintModelExt.
Args:
mask (Optional[torch.Tensor]): The inpainting mask. Shape: (1, 1, latent_height, latent_width). Values are
expected to be in the range [0, 1]. A value of 1 means that the corresponding 'pixel' should not be
inpainted.
masked_latents (Optional[torch.Tensor]): Latents of initial image, with masked out by black color inpainted area.
If mask provided, then too should be provided. Shape: (1, 1, latent_height, latent_width)
is_gradient_mask (bool): If True, mask is interpreted as a gradient mask meaning that the mask values range
from 0 to 1. If False, mask is interpreted as binary mask meaning that the mask values are either 0 or
1.
"""
super().__init__()
if mask is not None and masked_latents is None:
raise ValueError("Source image required for inpaint mask when inpaint model used!")
# Inverse mask, because inpaint models treat mask as: 0 - remain same, 1 - inpaint
self._mask = None
if mask is not None:
self._mask = 1 - mask
self._masked_latents = masked_latents
self._is_gradient_mask = is_gradient_mask
@staticmethod
def _is_inpaint_model(unet: UNet2DConditionModel):
"""Checks if the provided UNet belongs to a regular model.
The `in_channels` of a UNet vary depending on model type:
- normal - 4
- depth - 5
- inpaint - 9
"""
return unet.conv_in.in_channels == 9
@callback(ExtensionCallbackType.PRE_DENOISE_LOOP)
def init_tensors(self, ctx: DenoiseContext):
if not self._is_inpaint_model(ctx.unet):
raise ValueError("InpaintModelExt should be used only on inpaint models!")
if self._mask is None:
self._mask = torch.ones_like(ctx.latents[:1, :1])
self._mask = self._mask.to(device=ctx.latents.device, dtype=ctx.latents.dtype)
if self._masked_latents is None:
self._masked_latents = torch.zeros_like(ctx.latents[:1])
self._masked_latents = self._masked_latents.to(device=ctx.latents.device, dtype=ctx.latents.dtype)
# Do last so that other extensions works with normal latents
@callback(ExtensionCallbackType.PRE_UNET, order=1000)
def append_inpaint_layers(self, ctx: DenoiseContext):
batch_size = ctx.unet_kwargs.sample.shape[0]
b_mask = torch.cat([self._mask] * batch_size)
b_masked_latents = torch.cat([self._masked_latents] * batch_size)
ctx.unet_kwargs.sample = torch.cat(
[ctx.unet_kwargs.sample, b_mask, b_masked_latents],
dim=1,
)
# Restore unmasked part as inpaint model can change unmasked part slightly
@callback(ExtensionCallbackType.POST_DENOISE_LOOP)
def restore_unmasked(self, ctx: DenoiseContext):
if self._is_gradient_mask:
ctx.latents = torch.where(self._mask > 0, ctx.latents, ctx.inputs.orig_latents)
else:
ctx.latents = torch.lerp(ctx.inputs.orig_latents, ctx.latents, self._mask)

View File

@@ -0,0 +1,137 @@
from __future__ import annotations
from contextlib import contextmanager
from typing import TYPE_CHECKING, Tuple
import torch
from diffusers import UNet2DConditionModel
from invokeai.backend.stable_diffusion.extensions.base import ExtensionBase
from invokeai.backend.util.devices import TorchDevice
if TYPE_CHECKING:
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.backend.lora import LoRAModelRaw
from invokeai.backend.util.original_weights_storage import OriginalWeightsStorage
class LoRAExt(ExtensionBase):
def __init__(
self,
node_context: InvocationContext,
model_id: ModelIdentifierField,
weight: float,
):
super().__init__()
self._node_context = node_context
self._model_id = model_id
self._weight = weight
@contextmanager
def patch_unet(self, unet: UNet2DConditionModel, original_weights: OriginalWeightsStorage):
lora_model = self._node_context.models.load(self._model_id).model
self.patch_model(
model=unet,
prefix="lora_unet_",
lora=lora_model,
lora_weight=self._weight,
original_weights=original_weights,
)
del lora_model
yield
@classmethod
@torch.no_grad()
def patch_model(
cls,
model: torch.nn.Module,
prefix: str,
lora: LoRAModelRaw,
lora_weight: float,
original_weights: OriginalWeightsStorage,
):
"""
Apply one or more LoRAs to a model.
:param model: The model to patch.
:param lora: LoRA model to patch in.
:param lora_weight: LoRA patch weight.
:param prefix: A string prefix that precedes keys used in the LoRAs weight layers.
:param original_weights: Storage with original weights, filled by weights which lora patches, used for unpatching.
"""
if lora_weight == 0:
return
# assert lora.device.type == "cpu"
for layer_key, layer in lora.layers.items():
if not layer_key.startswith(prefix):
continue
# TODO(ryand): A non-negligible amount of time is currently spent resolving LoRA keys. This
# should be improved in the following ways:
# 1. The key mapping could be more-efficiently pre-computed. This would save time every time a
# LoRA model is applied.
# 2. From an API perspective, there's no reason that the `ModelPatcher` should be aware of the
# intricacies of Stable Diffusion key resolution. It should just expect the input LoRA
# weights to have valid keys.
assert isinstance(model, torch.nn.Module)
module_key, module = cls._resolve_lora_key(model, layer_key, prefix)
# All of the LoRA weight calculations will be done on the same device as the module weight.
# (Performance will be best if this is a CUDA device.)
device = module.weight.device
dtype = module.weight.dtype
layer_scale = layer.alpha / layer.rank if (layer.alpha and layer.rank) else 1.0
# We intentionally move to the target device first, then cast. Experimentally, this was found to
# be significantly faster for 16-bit CPU tensors being moved to a CUDA device than doing the
# same thing in a single call to '.to(...)'.
layer.to(device=device)
layer.to(dtype=torch.float32)
# TODO(ryand): Using torch.autocast(...) over explicit casting may offer a speed benefit on CUDA
# devices here. Experimentally, it was found to be very slow on CPU. More investigation needed.
for param_name, lora_param_weight in layer.get_parameters(module).items():
param_key = module_key + "." + param_name
module_param = module.get_parameter(param_name)
# save original weight
original_weights.save(param_key, module_param)
if module_param.shape != lora_param_weight.shape:
# TODO: debug on lycoris
lora_param_weight = lora_param_weight.reshape(module_param.shape)
lora_param_weight *= lora_weight * layer_scale
module_param += lora_param_weight.to(dtype=dtype)
layer.to(device=TorchDevice.CPU_DEVICE)
@staticmethod
def _resolve_lora_key(model: torch.nn.Module, lora_key: str, prefix: str) -> Tuple[str, torch.nn.Module]:
assert "." not in lora_key
if not lora_key.startswith(prefix):
raise Exception(f"lora_key with invalid prefix: {lora_key}, {prefix}")
module = model
module_key = ""
key_parts = lora_key[len(prefix) :].split("_")
submodule_name = key_parts.pop(0)
while len(key_parts) > 0:
try:
module = module.get_submodule(submodule_name)
module_key += "." + submodule_name
submodule_name = key_parts.pop(0)
except Exception:
submodule_name += "_" + key_parts.pop(0)
module = module.get_submodule(submodule_name)
module_key = (module_key + "." + submodule_name).lstrip(".")
return (module_key, module)

View File

@@ -0,0 +1,71 @@
from __future__ import annotations
from contextlib import contextmanager
from typing import Callable, Dict, List, Optional, Tuple
import torch
import torch.nn as nn
from diffusers import UNet2DConditionModel
from diffusers.models.lora import LoRACompatibleConv
from invokeai.backend.stable_diffusion.extensions.base import ExtensionBase
class SeamlessExt(ExtensionBase):
def __init__(
self,
seamless_axes: List[str],
):
super().__init__()
self._seamless_axes = seamless_axes
@contextmanager
def patch_unet(self, unet: UNet2DConditionModel, cached_weights: Optional[Dict[str, torch.Tensor]] = None):
with self.static_patch_model(
model=unet,
seamless_axes=self._seamless_axes,
):
yield
@staticmethod
@contextmanager
def static_patch_model(
model: torch.nn.Module,
seamless_axes: List[str],
):
if not seamless_axes:
yield
return
x_mode = "circular" if "x" in seamless_axes else "constant"
y_mode = "circular" if "y" in seamless_axes else "constant"
# override conv_forward
# https://github.com/huggingface/diffusers/issues/556#issuecomment-1993287019
def _conv_forward_asymmetric(
self, input: torch.Tensor, weight: torch.Tensor, bias: Optional[torch.Tensor] = None
):
self.paddingX = (self._reversed_padding_repeated_twice[0], self._reversed_padding_repeated_twice[1], 0, 0)
self.paddingY = (0, 0, self._reversed_padding_repeated_twice[2], self._reversed_padding_repeated_twice[3])
working = torch.nn.functional.pad(input, self.paddingX, mode=x_mode)
working = torch.nn.functional.pad(working, self.paddingY, mode=y_mode)
return torch.nn.functional.conv2d(
working, weight, bias, self.stride, torch.nn.modules.utils._pair(0), self.dilation, self.groups
)
original_layers: List[Tuple[nn.Conv2d, Callable]] = []
try:
for layer in model.modules():
if not isinstance(layer, torch.nn.Conv2d):
continue
if isinstance(layer, LoRACompatibleConv) and layer.lora_layer is None:
layer.lora_layer = lambda *x: 0
original_layers.append((layer, layer._conv_forward))
layer._conv_forward = _conv_forward_asymmetric.__get__(layer, torch.nn.Conv2d)
yield
finally:
for layer, orig_conv_forward in original_layers:
layer._conv_forward = orig_conv_forward

View File

@@ -0,0 +1,120 @@
from __future__ import annotations
import math
from typing import TYPE_CHECKING, List, Optional, Union
import torch
from diffusers import T2IAdapter
from PIL.Image import Image
from invokeai.app.util.controlnet_utils import prepare_control_image
from invokeai.backend.model_manager import BaseModelType
from invokeai.backend.stable_diffusion.diffusion.conditioning_data import ConditioningMode
from invokeai.backend.stable_diffusion.extension_callback_type import ExtensionCallbackType
from invokeai.backend.stable_diffusion.extensions.base import ExtensionBase, callback
if TYPE_CHECKING:
from invokeai.app.invocations.model import ModelIdentifierField
from invokeai.app.services.shared.invocation_context import InvocationContext
from invokeai.app.util.controlnet_utils import CONTROLNET_RESIZE_VALUES
from invokeai.backend.stable_diffusion.denoise_context import DenoiseContext
class T2IAdapterExt(ExtensionBase):
def __init__(
self,
node_context: InvocationContext,
model_id: ModelIdentifierField,
image: Image,
weight: Union[float, List[float]],
begin_step_percent: float,
end_step_percent: float,
resize_mode: CONTROLNET_RESIZE_VALUES,
):
super().__init__()
self._node_context = node_context
self._model_id = model_id
self._image = image
self._weight = weight
self._resize_mode = resize_mode
self._begin_step_percent = begin_step_percent
self._end_step_percent = end_step_percent
self._adapter_state: Optional[List[torch.Tensor]] = None
# The max_unet_downscale is the maximum amount that the UNet model downscales the latent image internally.
model_config = self._node_context.models.get_config(self._model_id.key)
if model_config.base == BaseModelType.StableDiffusion1:
self._max_unet_downscale = 8
elif model_config.base == BaseModelType.StableDiffusionXL:
self._max_unet_downscale = 4
else:
raise ValueError(f"Unexpected T2I-Adapter base model type: '{model_config.base}'.")
@callback(ExtensionCallbackType.SETUP)
def setup(self, ctx: DenoiseContext):
t2i_model: T2IAdapter
with self._node_context.models.load(self._model_id) as t2i_model:
_, _, latents_height, latents_width = ctx.inputs.orig_latents.shape
self._adapter_state = self._run_model(
model=t2i_model,
image=self._image,
latents_height=latents_height,
latents_width=latents_width,
)
def _run_model(
self,
model: T2IAdapter,
image: Image,
latents_height: int,
latents_width: int,
):
# Resize the T2I-Adapter input image.
# We select the resize dimensions so that after the T2I-Adapter's total_downscale_factor is applied, the
# result will match the latent image's dimensions after max_unet_downscale is applied.
input_height = latents_height // self._max_unet_downscale * model.total_downscale_factor
input_width = latents_width // self._max_unet_downscale * model.total_downscale_factor
# Note: We have hard-coded `do_classifier_free_guidance=False`. This is because we only want to prepare
# a single image. If CFG is enabled, we will duplicate the resultant tensor after applying the
# T2I-Adapter model.
#
# Note: We re-use the `prepare_control_image(...)` from ControlNet for T2I-Adapter, because it has many
# of the same requirements (e.g. preserving binary masks during resize).
t2i_image = prepare_control_image(
image=image,
do_classifier_free_guidance=False,
width=input_width,
height=input_height,
num_channels=model.config["in_channels"],
device=model.device,
dtype=model.dtype,
resize_mode=self._resize_mode,
)
return model(t2i_image)
@callback(ExtensionCallbackType.PRE_UNET)
def pre_unet_step(self, ctx: DenoiseContext):
# skip if model not active in current step
total_steps = len(ctx.inputs.timesteps)
first_step = math.floor(self._begin_step_percent * total_steps)
last_step = math.ceil(self._end_step_percent * total_steps)
if ctx.step_index < first_step or ctx.step_index > last_step:
return
weight = self._weight
if isinstance(weight, list):
weight = weight[ctx.step_index]
adapter_state = self._adapter_state
if ctx.conditioning_mode == ConditioningMode.Both:
adapter_state = [torch.cat([v] * 2) for v in adapter_state]
if ctx.unet_kwargs.down_intrablock_additional_residuals is None:
ctx.unet_kwargs.down_intrablock_additional_residuals = [v * weight for v in adapter_state]
else:
for i, value in enumerate(adapter_state):
ctx.unet_kwargs.down_intrablock_additional_residuals[i] += value * weight

View File

@@ -7,6 +7,7 @@ import torch
from diffusers import UNet2DConditionModel
from invokeai.app.services.session_processor.session_processor_common import CanceledException
from invokeai.backend.util.original_weights_storage import OriginalWeightsStorage
if TYPE_CHECKING:
from invokeai.backend.stable_diffusion.denoise_context import DenoiseContext
@@ -67,9 +68,15 @@ class ExtensionsManager:
if self._is_canceled and self._is_canceled():
raise CanceledException
# TODO: create weight patch logic in PR with extension which uses it
with ExitStack() as exit_stack:
for ext in self._extensions:
exit_stack.enter_context(ext.patch_unet(unet, cached_weights))
original_weights = OriginalWeightsStorage(cached_weights)
try:
with ExitStack() as exit_stack:
for ext in self._extensions:
exit_stack.enter_context(ext.patch_unet(unet, original_weights))
yield None
yield None
finally:
with torch.no_grad():
for param_key, weight in original_weights.get_changed_weights():
unet.get_parameter(param_key).copy_(weight)

View File

@@ -20,10 +20,14 @@ from diffusers import (
)
from diffusers.schedulers.scheduling_utils import SchedulerMixin
# TODO: add dpmpp_3s/dpmpp_3s_k when fix released
# https://github.com/huggingface/diffusers/issues/9007
SCHEDULER_NAME_VALUES = Literal[
"ddim",
"ddpm",
"deis",
"deis_k",
"lms",
"lms_k",
"pndm",
@@ -33,16 +37,21 @@ SCHEDULER_NAME_VALUES = Literal[
"euler_k",
"euler_a",
"kdpm_2",
"kdpm_2_k",
"kdpm_2_a",
"kdpm_2_a_k",
"dpmpp_2s",
"dpmpp_2s_k",
"dpmpp_2m",
"dpmpp_2m_k",
"dpmpp_2m_sde",
"dpmpp_2m_sde_k",
"dpmpp_3m",
"dpmpp_3m_k",
"dpmpp_sde",
"dpmpp_sde_k",
"unipc",
"unipc_k",
"lcm",
"tcd",
]
@@ -50,7 +59,8 @@ SCHEDULER_NAME_VALUES = Literal[
SCHEDULER_MAP: dict[SCHEDULER_NAME_VALUES, tuple[Type[SchedulerMixin], dict[str, Any]]] = {
"ddim": (DDIMScheduler, {}),
"ddpm": (DDPMScheduler, {}),
"deis": (DEISMultistepScheduler, {}),
"deis": (DEISMultistepScheduler, {"use_karras_sigmas": False}),
"deis_k": (DEISMultistepScheduler, {"use_karras_sigmas": True}),
"lms": (LMSDiscreteScheduler, {"use_karras_sigmas": False}),
"lms_k": (LMSDiscreteScheduler, {"use_karras_sigmas": True}),
"pndm": (PNDMScheduler, {}),
@@ -59,17 +69,28 @@ SCHEDULER_MAP: dict[SCHEDULER_NAME_VALUES, tuple[Type[SchedulerMixin], dict[str,
"euler": (EulerDiscreteScheduler, {"use_karras_sigmas": False}),
"euler_k": (EulerDiscreteScheduler, {"use_karras_sigmas": True}),
"euler_a": (EulerAncestralDiscreteScheduler, {}),
"kdpm_2": (KDPM2DiscreteScheduler, {}),
"kdpm_2_a": (KDPM2AncestralDiscreteScheduler, {}),
"dpmpp_2s": (DPMSolverSinglestepScheduler, {"use_karras_sigmas": False}),
"dpmpp_2s_k": (DPMSolverSinglestepScheduler, {"use_karras_sigmas": True}),
"dpmpp_2m": (DPMSolverMultistepScheduler, {"use_karras_sigmas": False}),
"dpmpp_2m_k": (DPMSolverMultistepScheduler, {"use_karras_sigmas": True}),
"dpmpp_2m_sde": (DPMSolverMultistepScheduler, {"use_karras_sigmas": False, "algorithm_type": "sde-dpmsolver++"}),
"dpmpp_2m_sde_k": (DPMSolverMultistepScheduler, {"use_karras_sigmas": True, "algorithm_type": "sde-dpmsolver++"}),
"kdpm_2": (KDPM2DiscreteScheduler, {"use_karras_sigmas": False}),
"kdpm_2_k": (KDPM2DiscreteScheduler, {"use_karras_sigmas": True}),
"kdpm_2_a": (KDPM2AncestralDiscreteScheduler, {"use_karras_sigmas": False}),
"kdpm_2_a_k": (KDPM2AncestralDiscreteScheduler, {"use_karras_sigmas": True}),
"dpmpp_2s": (DPMSolverSinglestepScheduler, {"use_karras_sigmas": False, "solver_order": 2}),
"dpmpp_2s_k": (DPMSolverSinglestepScheduler, {"use_karras_sigmas": True, "solver_order": 2}),
"dpmpp_2m": (DPMSolverMultistepScheduler, {"use_karras_sigmas": False, "solver_order": 2}),
"dpmpp_2m_k": (DPMSolverMultistepScheduler, {"use_karras_sigmas": True, "solver_order": 2}),
"dpmpp_2m_sde": (
DPMSolverMultistepScheduler,
{"use_karras_sigmas": False, "solver_order": 2, "algorithm_type": "sde-dpmsolver++"},
),
"dpmpp_2m_sde_k": (
DPMSolverMultistepScheduler,
{"use_karras_sigmas": True, "solver_order": 2, "algorithm_type": "sde-dpmsolver++"},
),
"dpmpp_3m": (DPMSolverMultistepScheduler, {"use_karras_sigmas": False, "solver_order": 3}),
"dpmpp_3m_k": (DPMSolverMultistepScheduler, {"use_karras_sigmas": True, "solver_order": 3}),
"dpmpp_sde": (DPMSolverSDEScheduler, {"use_karras_sigmas": False, "noise_sampler_seed": 0}),
"dpmpp_sde_k": (DPMSolverSDEScheduler, {"use_karras_sigmas": True, "noise_sampler_seed": 0}),
"unipc": (UniPCMultistepScheduler, {"cpu_only": True}),
"unipc": (UniPCMultistepScheduler, {"use_karras_sigmas": False, "cpu_only": True}),
"unipc_k": (UniPCMultistepScheduler, {"use_karras_sigmas": True, "cpu_only": True}),
"lcm": (LCMScheduler, {}),
"tcd": (TCDScheduler, {}),
}

View File

@@ -1,51 +0,0 @@
from contextlib import contextmanager
from typing import Callable, List, Optional, Tuple, Union
import torch
import torch.nn as nn
from diffusers.models.autoencoders.autoencoder_kl import AutoencoderKL
from diffusers.models.autoencoders.autoencoder_tiny import AutoencoderTiny
from diffusers.models.lora import LoRACompatibleConv
from diffusers.models.unets.unet_2d_condition import UNet2DConditionModel
@contextmanager
def set_seamless(model: Union[UNet2DConditionModel, AutoencoderKL, AutoencoderTiny], seamless_axes: List[str]):
if not seamless_axes:
yield
return
# override conv_forward
# https://github.com/huggingface/diffusers/issues/556#issuecomment-1993287019
def _conv_forward_asymmetric(self, input: torch.Tensor, weight: torch.Tensor, bias: Optional[torch.Tensor] = None):
self.paddingX = (self._reversed_padding_repeated_twice[0], self._reversed_padding_repeated_twice[1], 0, 0)
self.paddingY = (0, 0, self._reversed_padding_repeated_twice[2], self._reversed_padding_repeated_twice[3])
working = torch.nn.functional.pad(input, self.paddingX, mode=x_mode)
working = torch.nn.functional.pad(working, self.paddingY, mode=y_mode)
return torch.nn.functional.conv2d(
working, weight, bias, self.stride, torch.nn.modules.utils._pair(0), self.dilation, self.groups
)
original_layers: List[Tuple[nn.Conv2d, Callable]] = []
try:
x_mode = "circular" if "x" in seamless_axes else "constant"
y_mode = "circular" if "y" in seamless_axes else "constant"
conv_layers: List[torch.nn.Conv2d] = []
for module in model.modules():
if isinstance(module, torch.nn.Conv2d):
conv_layers.append(module)
for layer in conv_layers:
if isinstance(layer, LoRACompatibleConv) and layer.lora_layer is None:
layer.lora_layer = lambda *x: 0
original_layers.append((layer, layer._conv_forward))
layer._conv_forward = _conv_forward_asymmetric.__get__(layer, torch.nn.Conv2d)
yield
finally:
for layer, orig_conv_forward in original_layers:
layer._conv_forward = orig_conv_forward

View File

@@ -0,0 +1,39 @@
from __future__ import annotations
from typing import Dict, Iterator, Optional, Tuple
import torch
from invokeai.backend.util.devices import TorchDevice
class OriginalWeightsStorage:
"""A class for tracking the original weights of a model for patch/unpatch operations."""
def __init__(self, cached_weights: Optional[Dict[str, torch.Tensor]] = None):
# The original weights of the model.
self._weights: dict[str, torch.Tensor] = {}
# The keys of the weights that have been changed (via `save()`) during the lifetime of this instance.
self._changed_weights: set[str] = set()
if cached_weights:
self._weights.update(cached_weights)
def save(self, key: str, weight: torch.Tensor, copy: bool = True):
self._changed_weights.add(key)
if key in self._weights:
return
self._weights[key] = weight.detach().to(device=TorchDevice.CPU_DEVICE, copy=copy)
def get(self, key: str, copy: bool = False) -> Optional[torch.Tensor]:
weight = self._weights.get(key, None)
if weight is not None and copy:
weight = weight.clone()
return weight
def contains(self, key: str) -> bool:
return key in self._weights
def get_changed_weights(self) -> Iterator[Tuple[str, torch.Tensor]]:
for key in self._changed_weights:
yield key, self._weights[key]

View File

@@ -53,64 +53,63 @@
},
"dependencies": {
"@chakra-ui/react-use-size": "^2.1.0",
"@dagrejs/dagre": "^1.1.2",
"@dagrejs/graphlib": "^2.2.2",
"@dagrejs/dagre": "^1.1.3",
"@dagrejs/graphlib": "^2.2.3",
"@dnd-kit/core": "^6.1.0",
"@dnd-kit/sortable": "^8.0.0",
"@dnd-kit/utilities": "^3.2.2",
"@fontsource-variable/inter": "^5.0.18",
"@invoke-ai/ui-library": "^0.0.25",
"@nanostores/react": "^0.7.2",
"@fontsource-variable/inter": "^5.0.20",
"@invoke-ai/ui-library": "^0.0.29",
"@nanostores/react": "^0.7.3",
"@reduxjs/toolkit": "2.2.3",
"@roarr/browser-log-writer": "^1.3.0",
"chakra-react-select": "^4.7.6",
"compare-versions": "^6.1.0",
"chakra-react-select": "^4.9.1",
"compare-versions": "^6.1.1",
"dateformat": "^5.0.3",
"fracturedjsonjs": "^4.0.1",
"framer-motion": "^11.1.8",
"i18next": "^23.11.3",
"i18next-http-backend": "^2.5.1",
"fracturedjsonjs": "^4.0.2",
"framer-motion": "^11.3.24",
"i18next": "^23.12.2",
"i18next-http-backend": "^2.5.2",
"idb-keyval": "^6.2.1",
"jsondiffpatch": "^0.6.0",
"konva": "^9.3.6",
"konva": "^9.3.14",
"lodash-es": "^4.17.21",
"nanostores": "^0.10.3",
"nanostores": "^0.11.2",
"new-github-issue-url": "^1.0.0",
"overlayscrollbars": "^2.7.3",
"overlayscrollbars": "^2.10.0",
"overlayscrollbars-react": "^0.5.6",
"query-string": "^9.0.0",
"query-string": "^9.1.0",
"react": "^18.3.1",
"react-colorful": "^5.6.1",
"react-dom": "^18.3.1",
"react-dropzone": "^14.2.3",
"react-error-boundary": "^4.0.13",
"react-hook-form": "^7.51.4",
"react-hook-form": "^7.52.2",
"react-hotkeys-hook": "4.5.0",
"react-i18next": "^14.1.1",
"react-icons": "^5.2.0",
"react-i18next": "^14.1.3",
"react-icons": "^5.2.1",
"react-konva": "^18.2.10",
"react-redux": "9.1.2",
"react-resizable-panels": "^2.0.19",
"react-resizable-panels": "^2.0.23",
"react-select": "5.8.0",
"react-use": "^17.5.0",
"react-virtuoso": "^4.7.10",
"reactflow": "^11.11.3",
"react-use": "^17.5.1",
"react-virtuoso": "^4.9.0",
"reactflow": "^11.11.4",
"redux-dynamic-middlewares": "^2.2.0",
"redux-remember": "^5.1.0",
"redux-undo": "^1.1.0",
"rfdc": "^1.3.1",
"rfdc": "^1.4.1",
"roarr": "^7.21.1",
"serialize-error": "^11.0.3",
"socket.io-client": "^4.7.5",
"use-debounce": "^10.0.0",
"use-debounce": "^10.0.2",
"use-device-pixel-ratio": "^1.1.2",
"use-image": "^1.1.1",
"uuid": "^9.0.1",
"zod": "^3.23.6",
"zod-validation-error": "^3.2.0"
"uuid": "^10.0.0",
"zod": "^3.23.8",
"zod-validation-error": "^3.3.1"
},
"peerDependencies": {
"@chakra-ui/react": "^2.8.2",
"react": "^18.2.0",
"react-dom": "^18.2.0",
"ts-toolbelt": "^9.6.0"
@@ -118,38 +117,38 @@
"devDependencies": {
"@invoke-ai/eslint-config-react": "^0.0.14",
"@invoke-ai/prettier-config-react": "^0.0.7",
"@storybook/addon-essentials": "^8.0.10",
"@storybook/addon-interactions": "^8.0.10",
"@storybook/addon-links": "^8.0.10",
"@storybook/addon-storysource": "^8.0.10",
"@storybook/manager-api": "^8.0.10",
"@storybook/react": "^8.0.10",
"@storybook/react-vite": "^8.0.10",
"@storybook/theming": "^8.0.10",
"@storybook/addon-essentials": "^8.2.8",
"@storybook/addon-interactions": "^8.2.8",
"@storybook/addon-links": "^8.2.8",
"@storybook/addon-storysource": "^8.2.8",
"@storybook/manager-api": "^8.2.8",
"@storybook/react": "^8.2.8",
"@storybook/react-vite": "^8.2.8",
"@storybook/theming": "^8.2.8",
"@types/dateformat": "^5.0.2",
"@types/lodash-es": "^4.17.12",
"@types/node": "^20.12.10",
"@types/react": "^18.3.1",
"@types/node": "^20.14.15",
"@types/react": "^18.3.3",
"@types/react-dom": "^18.3.0",
"@types/uuid": "^9.0.8",
"@vitejs/plugin-react-swc": "^3.6.0",
"@types/uuid": "^10.0.0",
"@vitejs/plugin-react-swc": "^3.7.0",
"@vitest/coverage-v8": "^1.5.0",
"@vitest/ui": "^1.5.0",
"concurrently": "^8.2.2",
"dpdm": "^3.14.0",
"eslint": "^8.57.0",
"eslint-plugin-i18next": "^6.0.3",
"eslint-plugin-i18next": "^6.0.9",
"eslint-plugin-path": "^1.3.0",
"knip": "^5.12.3",
"knip": "^5.27.2",
"openapi-types": "^12.1.3",
"openapi-typescript": "^6.7.5",
"prettier": "^3.2.5",
"openapi-typescript": "^7.3.0",
"prettier": "^3.3.3",
"rollup-plugin-visualizer": "^5.12.0",
"storybook": "^8.0.10",
"storybook": "^8.2.8",
"ts-toolbelt": "^9.6.0",
"tsafe": "^1.6.6",
"typescript": "^5.4.5",
"vite": "^5.2.11",
"tsafe": "^1.7.2",
"typescript": "^5.5.4",
"vite": "^5.4.0",
"vite-plugin-css-injected-by-js": "^3.5.1",
"vite-plugin-dts": "^3.9.1",
"vite-plugin-eslint": "^1.8.1",

File diff suppressed because it is too large Load Diff

View File

@@ -77,10 +77,6 @@
"title": "استعادة الوجوه",
"desc": "استعادة الصورة الحالية"
},
"upscale": {
"title": "تحسين الحجم",
"desc": "تحسين حجم الصورة الحالية"
},
"showInfo": {
"title": "عرض المعلومات",
"desc": "عرض معلومات البيانات الخاصة بالصورة الحالية"
@@ -255,8 +251,6 @@
"type": "نوع",
"strength": "قوة",
"upscaling": "تصغير",
"upscale": "تصغير",
"upscaleImage": "تصغير الصورة",
"scale": "مقياس",
"imageFit": "ملائمة الصورة الأولية لحجم الخرج",
"scaleBeforeProcessing": "تحجيم قبل المعالجة",

View File

@@ -91,7 +91,8 @@
"viewingDesc": "Bilder in großer Galerie ansehen",
"tab": "Tabulator",
"enabled": "Aktiviert",
"disabled": "Ausgeschaltet"
"disabled": "Ausgeschaltet",
"dontShowMeThese": "Zeig mir diese nicht"
},
"gallery": {
"galleryImageSize": "Bildgröße",
@@ -106,7 +107,6 @@
"download": "Runterladen",
"setCurrentImage": "Setze aktuelle Bild",
"featuresWillReset": "Wenn Sie dieses Bild löschen, werden diese Funktionen sofort zurückgesetzt.",
"deleteImageBin": "Gelöschte Bilder werden an den Papierkorb Ihres Betriebssystems gesendet.",
"unableToLoad": "Galerie kann nicht geladen werden",
"downloadSelection": "Auswahl herunterladen",
"currentlyInUse": "Dieses Bild wird derzeit in den folgenden Funktionen verwendet:",
@@ -187,10 +187,6 @@
"title": "Gesicht restaurieren",
"desc": "Das aktuelle Bild restaurieren"
},
"upscale": {
"title": "Hochskalieren",
"desc": "Das aktuelle Bild hochskalieren"
},
"showInfo": {
"title": "Info anzeigen",
"desc": "Metadaten des aktuellen Bildes anzeigen"
@@ -433,8 +429,6 @@
"type": "Art",
"strength": "Stärke",
"upscaling": "Hochskalierung",
"upscale": "Hochskalieren (Shift + U)",
"upscaleImage": "Bild hochskalieren",
"scale": "Maßstab",
"imageFit": "Ausgangsbild an Ausgabegröße anpassen",
"scaleBeforeProcessing": "Skalieren vor der Verarbeitung",
@@ -634,7 +628,10 @@
"private": "Private Ordner",
"shared": "Geteilte Ordner",
"archiveBoard": "Ordner archivieren",
"archived": "Archiviert"
"archived": "Archiviert",
"noBoards": "Kein {boardType}} Ordner",
"hideBoards": "Ordner verstecken",
"viewBoards": "Ordner ansehen"
},
"controlnet": {
"showAdvanced": "Zeige Erweitert",
@@ -949,6 +946,21 @@
"paragraphs": [
"Reduziert das Ausgangsbild auf die Breite und Höhe des Ausgangsbildes. Empfohlen zu aktivieren."
]
},
"structure": {
"paragraphs": [
"Die Struktur steuert, wie genau sich das Ausgabebild an das Layout des Originals hält. Eine niedrige Struktur erlaubt größere Änderungen, während eine hohe Struktur die ursprüngliche Komposition und das Layout strikter beibehält."
]
},
"creativity": {
"paragraphs": [
"Die Kreativität bestimmt den Grad der Freiheit, die dem Modell beim Hinzufügen von Details gewährt wird. Eine niedrige Kreativität hält sich eng an das Originalbild, während eine hohe Kreativität mehr Veränderungen zulässt. Bei der Verwendung eines Prompts erhöht eine hohe Kreativität den Einfluss des Prompts."
]
},
"scale": {
"paragraphs": [
"Die Skalierung steuert die Größe des Ausgabebildes und basiert auf einem Vielfachen der Auflösung des Originalbildes. So würde z. B. eine 2-fache Hochskalierung eines 1024x1024px Bildes eine 2048x2048px große Ausgabe erzeugen."
]
}
},
"invocationCache": {

View File

@@ -31,7 +31,8 @@
"deleteBoard": "Delete Board",
"deleteBoardAndImages": "Delete Board and Images",
"deleteBoardOnly": "Delete Board Only",
"deletedBoardsCannotbeRestored": "Deleted boards cannot be restored",
"deletedBoardsCannotbeRestored": "Deleted boards cannot be restored. Selecting 'Delete Board Only' will move images to an uncategorized state.",
"deletedPrivateBoardsCannotbeRestored": "Deleted boards cannot be restored. Selecting 'Delete Board Only' will move images to a private uncategorized state for the image's creator.",
"hideBoards": "Hide Boards",
"loading": "Loading...",
"menuItemAutoAdd": "Auto-add to this Board",
@@ -105,6 +106,7 @@
"negativePrompt": "Negative Prompt",
"discordLabel": "Discord",
"dontAskMeAgain": "Don't ask me again",
"dontShowMeThese": "Don't show me these",
"editor": "Editor",
"error": "Error",
"file": "File",
@@ -198,6 +200,7 @@
"delete": "Delete",
"depthAnything": "Depth Anything",
"depthAnythingDescription": "Depth map generation using the Depth Anything technique",
"depthAnythingSmallV2": "Small V2",
"depthMidas": "Depth (Midas)",
"depthMidasDescription": "Depth map generation using Midas",
"depthZoe": "Depth (Zoe)",
@@ -371,7 +374,6 @@
"dropToUpload": "$t(gallery.drop) to Upload",
"deleteImage_one": "Delete Image",
"deleteImage_other": "Delete {{count}} Images",
"deleteImageBin": "Deleted images will be sent to your operating system's Bin.",
"deleteImagePermanent": "Deleted images cannot be restored.",
"displayBoardSearch": "Display Board Search",
"displaySearch": "Display Search",
@@ -381,7 +383,9 @@
"featuresWillReset": "If you delete this image, those features will immediately be reset.",
"galleryImageSize": "Image Size",
"gallerySettings": "Gallery Settings",
"go": "Go",
"image": "image",
"jump": "Jump",
"loading": "Loading",
"loadMore": "Load More",
"newestFirst": "Newest First",
@@ -1049,11 +1053,7 @@
"remixImage": "Remix Image",
"usePrompt": "Use Prompt",
"useSeed": "Use Seed",
"width": "Width",
"isAllowedToUpscale": {
"useX2Model": "Image is too large to upscale with x4 model, use x2 model",
"tooLarge": "Image is too large to upscale, select smaller image"
}
"width": "Width"
},
"dynamicPrompts": {
"showDynamicPrompts": "Show Dynamic Prompts",
@@ -1097,6 +1097,8 @@
"displayInProgress": "Display Progress Images",
"enableImageDebugging": "Enable Image Debugging",
"enableInformationalPopovers": "Enable Informational Popovers",
"informationalPopoversDisabled": "Informational Popovers Disabled",
"informationalPopoversDisabledDesc": "Informational popovers have been disabled. Enable them in Settings.",
"enableInvisibleWatermark": "Enable Invisible Watermark",
"enableNSFWChecker": "Enable NSFW Checker",
"general": "General",
@@ -1139,6 +1141,8 @@
"imageSavingFailed": "Image Saving Failed",
"imageUploaded": "Image Uploaded",
"imageUploadFailed": "Image Upload Failed",
"importFailed": "Import Failed",
"importSuccessful": "Import Successful",
"invalidUpload": "Invalid Upload",
"loadedWithWarnings": "Workflow Loaded with Warnings",
"maskSavedAssets": "Mask Saved to Assets",
@@ -1504,6 +1508,30 @@
"seamlessTilingYAxis": {
"heading": "Seamless Tiling Y Axis",
"paragraphs": ["Seamlessly tile an image along the vertical axis."]
},
"upscaleModel": {
"heading": "Upscale Model",
"paragraphs": [
"The upscale model scales the image to the output size before details are added. Any supported upscale model may be used, but some are specialized for different kinds of images, like photos or line drawings."
]
},
"scale": {
"heading": "Scale",
"paragraphs": [
"Scale controls the output image size, and is based on a multiple of the input image resolution. For example a 2x upscale on a 1024x1024 image would produce a 2048 x 2048 output."
]
},
"creativity": {
"heading": "Creativity",
"paragraphs": [
"Creativity controls the amount of freedom granted to the model when adding details. Low creativity stays close to the original image, while high creativity allows for more change. When using a prompt, high creativity increases the influence of the prompt."
]
},
"structure": {
"heading": "Structure",
"paragraphs": [
"Structure controls how closely the output image will keep to the layout of the original. Low structure allows major changes, while high structure strictly maintains the original composition and layout."
]
}
},
"unifiedCanvas": {
@@ -1647,7 +1675,10 @@
"layers_other": "Layers"
},
"upscaling": {
"upscale": "Upscale",
"creativity": "Creativity",
"exceedsMaxSize": "Upscale settings exceed max size limit",
"exceedsMaxSizeDetails": "Max upscale limit is {{maxUpscaleDimension}}x{{maxUpscaleDimension}} pixels. Please try a smaller image or decrease your scale selection.",
"structure": "Structure",
"upscaleModel": "Upscale Model",
"postProcessingModel": "Post-Processing Model",
@@ -1661,6 +1692,53 @@
"missingUpscaleModel": "Missing upscale model",
"missingTileControlNetModel": "No valid tile ControlNet models installed"
},
"stylePresets": {
"active": "Active",
"choosePromptTemplate": "Choose Prompt Template",
"clearTemplateSelection": "Clear Template Selection",
"copyTemplate": "Copy Template",
"createPromptTemplate": "Create Prompt Template",
"defaultTemplates": "Default Templates",
"deleteImage": "Delete Image",
"deleteTemplate": "Delete Template",
"deleteTemplate2": "Are you sure you want to delete this template? This cannot be undone.",
"exportPromptTemplates": "Export My Prompt Templates (CSV)",
"editTemplate": "Edit Template",
"exportDownloaded": "Export Downloaded",
"exportFailed": "Unable to generate and download CSV",
"flatten": "Flatten selected template into current prompt",
"importTemplates": "Import Prompt Templates (CSV/JSON)",
"acceptedColumnsKeys": "Accepted columns/keys:",
"nameColumn": "'name'",
"positivePromptColumn": "'prompt' or 'positive_prompt'",
"negativePromptColumn": "'negative_prompt'",
"insertPlaceholder": "Insert placeholder",
"myTemplates": "My Templates",
"name": "Name",
"negativePrompt": "Negative Prompt",
"noTemplates": "No templates",
"noMatchingTemplates": "No matching templates",
"promptTemplatesDesc1": "Prompt templates add text to the prompts you write in the prompt box.",
"promptTemplatesDesc2": "Use the placeholder string <Pre>{{placeholder}}</Pre> to specify where your prompt should be included in the template.",
"promptTemplatesDesc3": "If you omit the placeholder, the template will be appended to the end of your prompt.",
"positivePrompt": "Positive Prompt",
"preview": "Preview",
"private": "Private",
"promptTemplateCleared": "Prompt Template Cleared",
"searchByName": "Search by name",
"shared": "Shared",
"sharedTemplates": "Shared Templates",
"templateActions": "Template Actions",
"templateDeleted": "Prompt template deleted",
"toggleViewMode": "Toggle View Mode",
"type": "Type",
"unableToDeleteTemplate": "Unable to delete prompt template",
"updatePromptTemplate": "Update Prompt Template",
"uploadImage": "Upload Image",
"useForTemplate": "Use For Prompt Template",
"viewList": "View Template List",
"viewModeTooltip": "This is how your prompt will look with your currently selected template. To edit your prompt, click anywhere in the text box."
},
"upsell": {
"inviteTeammates": "Invite Teammates",
"professional": "Professional",

View File

@@ -88,7 +88,6 @@
"deleteImage_one": "Eliminar Imagen",
"deleteImage_many": "",
"deleteImage_other": "",
"deleteImageBin": "Las imágenes eliminadas se enviarán a la papelera de tu sistema operativo.",
"deleteImagePermanent": "Las imágenes eliminadas no se pueden restaurar.",
"assets": "Activos",
"autoAssignBoardOnClick": "Asignación automática de tableros al hacer clic"
@@ -151,10 +150,6 @@
"title": "Restaurar rostros",
"desc": "Restaurar rostros en la imagen actual"
},
"upscale": {
"title": "Aumentar resolución",
"desc": "Aumentar la resolución de la imagen actual"
},
"showInfo": {
"title": "Mostrar información",
"desc": "Mostar metadatos de la imagen actual"
@@ -360,8 +355,6 @@
"type": "Tipo",
"strength": "Fuerza",
"upscaling": "Aumento de resolución",
"upscale": "Aumentar resolución",
"upscaleImage": "Aumentar la resolución de la imagen",
"scale": "Escala",
"imageFit": "Ajuste tamaño de imagen inicial al tamaño objetivo",
"scaleBeforeProcessing": "Redimensionar antes de procesar",
@@ -408,7 +401,12 @@
"showProgressInViewer": "Mostrar las imágenes del progreso en el visor",
"ui": "Interfaz del usuario",
"generation": "Generación",
"beta": "Beta"
"beta": "Beta",
"reloadingIn": "Recargando en",
"intermediatesClearedFailed": "Error limpiando los intermediarios",
"intermediatesCleared_one": "Borrado {{count}} intermediario",
"intermediatesCleared_many": "Borrados {{count}} intermediarios",
"intermediatesCleared_other": "Borrados {{count}} intermediarios"
},
"toast": {
"uploadFailed": "Error al subir archivo",
@@ -426,7 +424,12 @@
"parameterSet": "Conjunto de parámetros",
"parameterNotSet": "Parámetro no configurado",
"problemCopyingImage": "No se puede copiar la imagen",
"errorCopied": "Error al copiar"
"errorCopied": "Error al copiar",
"baseModelChanged": "Modelo base cambiado",
"addedToBoard": "Añadido al tablero",
"baseModelChangedCleared_one": "Borrado o desactivado {{count}} submodelo incompatible",
"baseModelChangedCleared_many": "Borrados o desactivados {{count}} submodelos incompatibles",
"baseModelChangedCleared_other": "Borrados o desactivados {{count}} submodelos incompatibles"
},
"tooltip": {
"feature": {
@@ -540,7 +543,13 @@
"downloadBoard": "Descargar panel",
"deleteBoardOnly": "Borrar solo el panel",
"myBoard": "Mi panel",
"noMatching": "No hay paneles que coincidan"
"noMatching": "No hay paneles que coincidan",
"imagesWithCount_one": "{{count}} imagen",
"imagesWithCount_many": "{{count}} imágenes",
"imagesWithCount_other": "{{count}} imágenes",
"assetsWithCount_one": "{{count}} activo",
"assetsWithCount_many": "{{count}} activos",
"assetsWithCount_other": "{{count}} activos"
},
"accordions": {
"compositing": {
@@ -590,6 +599,27 @@
"balanced": "Equilibrado",
"beginEndStepPercent": "Inicio / Final Porcentaje de pasos",
"detectResolution": "Detectar resolución",
"beginEndStepPercentShort": "Inicio / Final %"
"beginEndStepPercentShort": "Inicio / Final %",
"t2i_adapter": "$t(controlnet.controlAdapter_one) #{{number}} ($t(common.t2iAdapter))",
"controlnet": "$t(controlnet.controlAdapter_one) #{{number}} ($t(common.controlNet))",
"ip_adapter": "$t(controlnet.controlAdapter_one) #{{number}} ($t(common.ipAdapter))",
"addControlNet": "Añadir $t(common.controlNet)",
"addIPAdapter": "Añadir $t(common.ipAdapter)",
"controlAdapter_one": "Adaptador de control",
"controlAdapter_many": "Adaptadores de control",
"controlAdapter_other": "Adaptadores de control",
"addT2IAdapter": "Añadir $t(common.t2iAdapter)"
},
"queue": {
"back": "Atrás",
"front": "Delante",
"batchQueuedDesc_one": "Se agregó {{count}} sesión a {{direction}} la cola",
"batchQueuedDesc_many": "Se agregaron {{count}} sesiones a {{direction}} la cola",
"batchQueuedDesc_other": "Se agregaron {{count}} sesiones a {{direction}} la cola"
},
"upsell": {
"inviteTeammates": "Invitar compañeros de equipo",
"shareAccess": "Compartir acceso",
"professionalUpsell": "Disponible en la edición profesional de Invoke. Haz clic aquí o visita invoke.com/pricing para obtener más detalles."
}
}

View File

@@ -130,10 +130,6 @@
"title": "Restaurer les visages",
"desc": "Restaurer l'image actuelle"
},
"upscale": {
"title": "Agrandir",
"desc": "Agrandir l'image actuelle"
},
"showInfo": {
"title": "Afficher les informations",
"desc": "Afficher les informations de métadonnées de l'image actuelle"
@@ -308,8 +304,6 @@
"type": "Type",
"strength": "Force",
"upscaling": "Agrandissement",
"upscale": "Agrandir",
"upscaleImage": "Image en Agrandissement",
"scale": "Echelle",
"imageFit": "Ajuster Image Initiale à la Taille de Sortie",
"scaleBeforeProcessing": "Echelle Avant Traitement",

View File

@@ -90,10 +90,6 @@
"desc": "שחזור התמונה הנוכחית",
"title": "שחזור פרצופים"
},
"upscale": {
"title": "הגדלת קנה מידה",
"desc": "הגדל את התמונה הנוכחית"
},
"showInfo": {
"title": "הצג מידע",
"desc": "הצגת פרטי מטא-נתונים של התמונה הנוכחית"
@@ -263,8 +259,6 @@
"seed": "זרע",
"type": "סוג",
"strength": "חוזק",
"upscale": "הגדלת קנה מידה",
"upscaleImage": "הגדלת קנה מידת התמונה",
"denoisingStrength": "חוזק מנטרל הרעש",
"scaleBeforeProcessing": "שנה קנה מידה לפני עיבוד",
"scaledWidth": "קנה מידה לאחר שינוי W",

View File

@@ -89,7 +89,8 @@
"enabled": "Abilitato",
"disabled": "Disabilitato",
"comparingDesc": "Confronta due immagini",
"comparing": "Confronta"
"comparing": "Confronta",
"dontShowMeThese": "Non mostrare più"
},
"gallery": {
"galleryImageSize": "Dimensione dell'immagine",
@@ -101,7 +102,6 @@
"deleteImage_many": "Elimina {{count}} immagini",
"deleteImage_other": "Elimina {{count}} immagini",
"deleteImagePermanent": "Le immagini eliminate non possono essere ripristinate.",
"deleteImageBin": "Le immagini eliminate verranno spostate nel cestino del tuo sistema operativo.",
"assets": "Risorse",
"autoAssignBoardOnClick": "Assegna automaticamente la bacheca al clic",
"featuresWillReset": "Se elimini questa immagine, quelle funzionalità verranno immediatamente ripristinate.",
@@ -150,7 +150,13 @@
"showArchivedBoards": "Mostra le bacheche archiviate",
"searchImages": "Ricerca per metadati",
"displayBoardSearch": "Mostra la ricerca nelle Bacheche",
"displaySearch": "Mostra la ricerca"
"displaySearch": "Mostra la ricerca",
"selectAllOnPage": "Seleziona tutto nella pagina",
"selectAllOnBoard": "Seleziona tutto nella bacheca",
"exitBoardSearch": "Esci da Ricerca bacheca",
"exitSearch": "Esci dalla ricerca",
"go": "Vai",
"jump": "Salta"
},
"hotkeys": {
"keyboardShortcuts": "Tasti di scelta rapida",
@@ -210,10 +216,6 @@
"title": "Restaura volti",
"desc": "Restaura l'immagine corrente"
},
"upscale": {
"title": "Amplia",
"desc": "Amplia l'immagine corrente"
},
"showInfo": {
"title": "Mostra informazioni",
"desc": "Mostra le informazioni sui metadati dell'immagine corrente"
@@ -377,6 +379,10 @@
"toggleViewer": {
"title": "Attiva/disattiva il visualizzatore di immagini",
"desc": "Passa dal visualizzatore immagini all'area di lavoro per la scheda corrente."
},
"postProcess": {
"desc": "Elabora l'immagine corrente utilizzando il modello di post-elaborazione selezionato",
"title": "Elabora immagine"
}
},
"modelManager": {
@@ -505,8 +511,6 @@
"type": "Tipo",
"strength": "Forza",
"upscaling": "Ampliamento",
"upscale": "Amplia (Shift + U)",
"upscaleImage": "Amplia Immagine",
"scale": "Scala",
"imageFit": "Adatta l'immagine iniziale alle dimensioni di output",
"scaleBeforeProcessing": "Scala prima dell'elaborazione",
@@ -569,10 +573,6 @@
},
"useCpuNoise": "Usa la CPU per generare rumore",
"iterations": "Iterazioni",
"isAllowedToUpscale": {
"useX2Model": "L'immagine è troppo grande per l'ampliamento con il modello x4, utilizza il modello x2",
"tooLarge": "L'immagine è troppo grande per l'ampliamento, seleziona un'immagine più piccola"
},
"imageActions": "Azioni Immagine",
"cfgRescaleMultiplier": "Moltiplicatore riscala CFG",
"useSize": "Usa Dimensioni",
@@ -591,7 +591,10 @@
"infillColorValue": "Colore di riempimento",
"globalSettings": "Impostazioni globali",
"globalPositivePromptPlaceholder": "Prompt positivo globale",
"globalNegativePromptPlaceholder": "Prompt negativo globale"
"globalNegativePromptPlaceholder": "Prompt negativo globale",
"processImage": "Elabora Immagine",
"sendToUpscale": "Invia a Ampliare",
"postProcessing": "Post-elaborazione (Shift + U)"
},
"settings": {
"models": "Modelli",
@@ -625,7 +628,9 @@
"enableNSFWChecker": "Abilita controllo NSFW",
"enableInvisibleWatermark": "Abilita filigrana invisibile",
"enableInformationalPopovers": "Abilita testo informativo a comparsa",
"reloadingIn": "Ricaricando in"
"reloadingIn": "Ricaricando in",
"informationalPopoversDisabled": "Testo informativo a comparsa disabilitato",
"informationalPopoversDisabledDesc": "I testi informativi a comparsa sono disabilitati. Attivali nelle impostazioni."
},
"toast": {
"uploadFailed": "Caricamento fallito",
@@ -696,7 +701,9 @@
"baseModelChanged": "Modello base modificato",
"sessionRef": "Sessione: {{sessionId}}",
"somethingWentWrong": "Qualcosa è andato storto",
"outOfMemoryErrorDesc": "Le impostazioni della generazione attuale superano la capacità del sistema. Modifica le impostazioni e riprova."
"outOfMemoryErrorDesc": "Le impostazioni della generazione attuale superano la capacità del sistema. Modifica le impostazioni e riprova.",
"importFailed": "Importazione non riuscita",
"importSuccessful": "Importazione riuscita"
},
"tooltip": {
"feature": {
@@ -922,7 +929,7 @@
"missingInvocationTemplate": "Modello di invocazione mancante",
"missingFieldTemplate": "Modello di campo mancante",
"singleFieldType": "{{name}} (Singola)",
"imageAccessError": "Impossibile trovare l'immagine {{image_name}}, ripristino delle impostazioni predefinite",
"imageAccessError": "Impossibile trovare l'immagine {{image_name}}, ripristino ai valori predefiniti",
"boardAccessError": "Impossibile trovare la bacheca {{board_id}}, ripristino ai valori predefiniti",
"modelAccessError": "Impossibile trovare il modello {{key}}, ripristino ai valori predefiniti"
},
@@ -946,7 +953,7 @@
"deleteBoardOnly": "solo la Bacheca",
"deleteBoard": "Elimina Bacheca",
"deleteBoardAndImages": "Bacheca e Immagini",
"deletedBoardsCannotbeRestored": "Le bacheche eliminate non possono essere ripristinate",
"deletedBoardsCannotbeRestored": "Le bacheche eliminate non possono essere ripristinate. Selezionando \"Elimina solo bacheca\" le immagini verranno spostate nella bacheca \"Non categorizzato\".",
"movingImagesToBoard_one": "Spostare {{count}} immagine nella bacheca:",
"movingImagesToBoard_many": "Spostare {{count}} immagini nella bacheca:",
"movingImagesToBoard_other": "Spostare {{count}} immagini nella bacheca:",
@@ -964,7 +971,11 @@
"boards": "Bacheche",
"private": "Bacheche private",
"shared": "Bacheche condivise",
"addPrivateBoard": "Aggiungi una Bacheca Privata"
"addPrivateBoard": "Aggiungi una Bacheca Privata",
"noBoards": "Nessuna bacheca {{boardType}}",
"hideBoards": "Nascondi bacheche",
"viewBoards": "Visualizza bacheche",
"deletedPrivateBoardsCannotbeRestored": "Le bacheche cancellate non possono essere ripristinate. Selezionando 'Cancella solo bacheca', le immagini verranno spostate nella bacheca \"Non categorizzato\" privata dell'autore dell'immagine."
},
"controlnet": {
"contentShuffleDescription": "Rimescola il contenuto di un'immagine",
@@ -1508,6 +1519,30 @@
"paragraphs": [
"Metodo con cui applicare l'adattatore IP corrente."
]
},
"scale": {
"heading": "Scala",
"paragraphs": [
"La scala controlla la dimensione dell'immagine di uscita e si basa su un multiplo della risoluzione dell'immagine di ingresso. Ad esempio, un ampliamento 2x su un'immagine 1024x1024 produrrebbe in uscita a 2048x2048."
]
},
"upscaleModel": {
"paragraphs": [
"Il modello di ampliamento (Upscale), scala l'immagine alle dimensioni di uscita prima di aggiungere i dettagli. È possibile utilizzare qualsiasi modello di ampliamento supportato, ma alcuni sono specializzati per diversi tipi di immagini, come foto o disegni al tratto."
],
"heading": "Modello di ampliamento"
},
"creativity": {
"heading": "Creatività",
"paragraphs": [
"La creatività controlla quanta libertà è concessa al modello quando si aggiungono dettagli. Una creatività bassa rimane vicina all'immagine originale, mentre una creatività alta consente più cambiamenti. Quando si usa un prompt, una creatività alta aumenta l'influenza del prompt."
]
},
"structure": {
"heading": "Struttura",
"paragraphs": [
"La struttura determina quanto l'immagine finale rispecchierà il layout dell'originale. Una struttura bassa permette cambiamenti significativi, mentre una struttura alta conserva la composizione e il layout originali."
]
}
},
"sdxl": {
@@ -1684,7 +1719,76 @@
"models": "Modelli",
"modelsTab": "$t(ui.tabs.models) $t(common.tab)",
"queue": "Coda",
"queueTab": "$t(ui.tabs.queue) $t(common.tab)"
"queueTab": "$t(ui.tabs.queue) $t(common.tab)",
"upscaling": "Ampliamento",
"upscalingTab": "$t(ui.tabs.upscaling) $t(common.tab)"
}
},
"upscaling": {
"creativity": "Creatività",
"structure": "Struttura",
"upscaleModel": "Modello di Ampliamento",
"scale": "Scala",
"missingModelsWarning": "Visita <LinkComponent>Gestione modelli</LinkComponent> per installare i modelli richiesti:",
"mainModelDesc": "Modello principale (architettura SD1.5 o SDXL)",
"tileControlNetModelDesc": "Modello Tile ControlNet per l'architettura del modello principale scelto",
"upscaleModelDesc": "Modello per l'ampliamento (da immagine a immagine)",
"missingUpscaleInitialImage": "Immagine iniziale mancante per l'ampliamento",
"missingUpscaleModel": "Modello per lampliamento mancante",
"missingTileControlNetModel": "Nessun modello ControlNet Tile valido installato",
"postProcessingModel": "Modello di post-elaborazione",
"postProcessingMissingModelWarning": "Visita <LinkComponent>Gestione modelli</LinkComponent> per installare un modello di post-elaborazione (da immagine a immagine).",
"exceedsMaxSize": "Le impostazioni di ampliamento superano il limite massimo delle dimensioni",
"exceedsMaxSizeDetails": "Il limite massimo di ampliamento è {{maxUpscaleDimension}}x{{maxUpscaleDimension}} pixel. Prova un'immagine più piccola o diminuisci la scala selezionata."
},
"upsell": {
"inviteTeammates": "Invita collaboratori",
"shareAccess": "Condividi l'accesso",
"professional": "Professionale",
"professionalUpsell": "Disponibile nell'edizione Professional di Invoke. Fai clic qui o visita invoke.com/pricing per ulteriori dettagli."
},
"stylePresets": {
"active": "Attivo",
"choosePromptTemplate": "Scegli un modello di prompt",
"clearTemplateSelection": "Cancella selezione modello",
"copyTemplate": "Copia modello",
"createPromptTemplate": "Crea modello di prompt",
"defaultTemplates": "Modelli predefiniti",
"deleteImage": "Elimina immagine",
"deleteTemplate": "Elimina modello",
"editTemplate": "Modifica modello",
"flatten": "Unisci il modello selezionato al prompt corrente",
"insertPlaceholder": "Inserisci segnaposto",
"myTemplates": "I miei modelli",
"name": "Nome",
"negativePrompt": "Prompt Negativo",
"noMatchingTemplates": "Nessun modello corrispondente",
"promptTemplatesDesc1": "I modelli di prompt aggiungono testo ai prompt che scrivi nelle caselle dei prompt.",
"promptTemplatesDesc3": "Se si omette il segnaposto, il modello verrà aggiunto alla fine del prompt.",
"positivePrompt": "Prompt Positivo",
"preview": "Anteprima",
"private": "Privato",
"searchByName": "Cerca per nome",
"shared": "Condiviso",
"sharedTemplates": "Modelli condivisi",
"templateDeleted": "Modello di prompt eliminato",
"toggleViewMode": "Attiva/disattiva visualizzazione",
"uploadImage": "Carica immagine",
"useForTemplate": "Usa per modello di prompt",
"viewList": "Visualizza l'elenco dei modelli",
"viewModeTooltip": "Ecco come apparirà il tuo prompt con il modello attualmente selezionato. Per modificare il tuo prompt, clicca in un punto qualsiasi della casella di testo.",
"deleteTemplate2": "Vuoi davvero eliminare questo modello? Questa operazione non può essere annullata.",
"unableToDeleteTemplate": "Impossibile eliminare il modello di prompt",
"updatePromptTemplate": "Aggiorna il modello di prompt",
"type": "Tipo",
"promptTemplatesDesc2": "Utilizza la stringa segnaposto <Pre>{{placeholder}}</Pre> per specificare dove inserire il tuo prompt nel modello.",
"importTemplates": "Importa modelli di prompt (CSV/JSON)",
"exportDownloaded": "Esportazione completata",
"exportFailed": "Impossibile generare e scaricare il file CSV",
"exportPromptTemplates": "Esporta i miei modelli di prompt (CSV)",
"positivePromptColumn": "'prompt' o 'positive_prompt'",
"noTemplates": "Nessun modello",
"acceptedColumnsKeys": "Colonne/chiavi accettate:",
"templateActions": "Azioni modello"
}
}

View File

@@ -109,7 +109,6 @@
"drop": "ドロップ",
"dropOrUpload": "$t(gallery.drop) またはアップロード",
"deleteImage_other": "画像を削除",
"deleteImageBin": "削除された画像はOSのゴミ箱に送られます。",
"deleteImagePermanent": "削除された画像は復元できません。",
"download": "ダウンロード",
"unableToLoad": "ギャラリーをロードできません",
@@ -199,10 +198,6 @@
"title": "顔の修復",
"desc": "現在の画像を修復"
},
"upscale": {
"title": "アップスケール",
"desc": "現在の画像をアップスケール"
},
"showInfo": {
"title": "情報を見る",
"desc": "現在の画像のメタデータ情報を表示"
@@ -427,8 +422,6 @@
"shuffle": "シャッフル",
"strength": "強度",
"upscaling": "アップスケーリング",
"upscale": "アップスケール",
"upscaleImage": "画像をアップスケール",
"scale": "Scale",
"scaleBeforeProcessing": "処理前のスケール",
"scaledWidth": "幅のスケール",

View File

@@ -70,7 +70,6 @@
"gallerySettings": "갤러리 설정",
"deleteSelection": "선택 항목 삭제",
"featuresWillReset": "이 이미지를 삭제하면 해당 기능이 즉시 재설정됩니다.",
"deleteImageBin": "삭제된 이미지는 운영 체제의 Bin으로 전송됩니다.",
"assets": "자산",
"problemDeletingImagesDesc": "하나 이상의 이미지를 삭제할 수 없습니다",
"noImagesInGallery": "보여줄 이미지가 없음",
@@ -258,10 +257,6 @@
"desc": "캔버스 브러시를 선택",
"title": "브러시 선택"
},
"upscale": {
"desc": "현재 이미지를 업스케일",
"title": "업스케일"
},
"previousImage": {
"title": "이전 이미지",
"desc": "갤러리에 이전 이미지 표시"

View File

@@ -97,7 +97,6 @@
"noImagesInGallery": "Geen afbeeldingen om te tonen",
"deleteImage_one": "Verwijder afbeelding",
"deleteImage_other": "",
"deleteImageBin": "Verwijderde afbeeldingen worden naar de prullenbak van je besturingssysteem gestuurd.",
"deleteImagePermanent": "Verwijderde afbeeldingen kunnen niet worden hersteld.",
"assets": "Eigen onderdelen",
"autoAssignBoardOnClick": "Ken automatisch bord toe bij klikken",
@@ -168,10 +167,6 @@
"title": "Herstel gezichten",
"desc": "Herstelt de huidige afbeelding"
},
"upscale": {
"title": "Schaal op",
"desc": "Schaalt de huidige afbeelding op"
},
"showInfo": {
"title": "Toon info",
"desc": "Toont de metagegevens van de huidige afbeelding"
@@ -412,8 +407,6 @@
"type": "Soort",
"strength": "Sterkte",
"upscaling": "Opschalen",
"upscale": "Vergroot (Shift + U)",
"upscaleImage": "Schaal afbeelding op",
"scale": "Schaal",
"imageFit": "Pas initiële afbeelding in uitvoergrootte",
"scaleBeforeProcessing": "Schalen voor verwerking",
@@ -473,10 +466,6 @@
},
"imageNotProcessedForControlAdapter": "De afbeelding van controle-adapter #{{number}} is niet verwerkt"
},
"isAllowedToUpscale": {
"useX2Model": "Afbeelding is te groot om te vergroten met het x4-model. Gebruik hiervoor het x2-model",
"tooLarge": "Afbeelding is te groot om te vergoten. Kies een kleinere afbeelding"
},
"patchmatchDownScaleSize": "Verklein",
"useCpuNoise": "Gebruik CPU-ruis",
"imageActions": "Afbeeldingshandeling",

View File

@@ -78,10 +78,6 @@
"title": "Popraw twarze",
"desc": "Uruchamia proces poprawiania twarzy dla aktywnego obrazu"
},
"upscale": {
"title": "Powiększ",
"desc": "Uruchamia proces powiększania aktywnego obrazu"
},
"showInfo": {
"title": "Pokaż informacje",
"desc": "Pokazuje metadane zapisane w aktywnym obrazie"
@@ -232,8 +228,6 @@
"type": "Metoda",
"strength": "Siła",
"upscaling": "Powiększanie",
"upscale": "Powiększ",
"upscaleImage": "Powiększ obraz",
"scale": "Skala",
"imageFit": "Przeskaluj oryginalny obraz",
"scaleBeforeProcessing": "Tryb skalowania",

View File

@@ -160,10 +160,6 @@
"title": "Restaurar Rostos",
"desc": "Restaurar a imagem atual"
},
"upscale": {
"title": "Redimensionar",
"desc": "Redimensionar a imagem atual"
},
"showInfo": {
"title": "Mostrar Informações",
"desc": "Mostrar metadados de informações da imagem atual"
@@ -275,8 +271,6 @@
"showOptionsPanel": "Mostrar Painel de Opções",
"strength": "Força",
"upscaling": "Redimensionando",
"upscale": "Redimensionar",
"upscaleImage": "Redimensionar Imagem",
"scaleBeforeProcessing": "Escala Antes do Processamento",
"images": "Imagems",
"steps": "Passos",

Some files were not shown because too many files have changed in this diff Show More