In #7688 we optimized queuing preparation logic. This inadvertently broke retrying queue items.
Previously, a `NamedTuple` was used to store the values to insert in the DB when enqueuing. This handy class provides an API similar to a dataclass: you can instantiate it with kwargs in any order, and the resulting tuple stores the values in the order they are declared in the class definition.
For example, consider this `NamedTuple`:
```py
from typing import NamedTuple

class SessionQueueValueToInsert(NamedTuple):
    foo: str
    bar: str
```
When instantiating it, no matter the order of the kwargs, if you make a normal tuple out of it, the tuple values are in the same order as in the class definition:
```py
t1 = SessionQueueValueToInsert(foo="foo", bar="bar")
print(tuple(t1)) # -> ('foo', 'bar')
t2 = SessionQueueValueToInsert(bar="bar", foo="foo")
print(tuple(t2)) # -> ('foo', 'bar')
```
So, in the old code, when we used the `NamedTuple`, it implicitly normalized the order of the values we insert into the DB.
In the retry logic, the values of the tuple were not ordered correctly, but the use of `NamedTuple` had secretly fixed the order for us.
In the linked PR, `NamedTuple` was dropped for a normal tuple, after profiling showed `NamedTuple` to be meaningfully slower than a normal tuple.
The implicit order normalization behaviour wasn't understood, so the order wasn't fixed when changing the retry logic to use a normal tuple instead of `NamedTuple`. This resulted in a bug where we created queue items in the DB with values in the wrong columns. For example, we stored the `destination` in the `field_values` column.
When such an incorrectly-created queue item is dequeued, it fails pydantic validation and causes what appears to be an endless loop of errors.
The only user-facing solution is to add this line to `invokeai.yaml` and restart the app:
```yaml
clear_queue_on_startup: true
```
On next startup, the queue is forcibly cleared before the error loop is triggered. Then the user should remove this line so their queue is persisted across app launches per usual.
The solution is simple - fix the ordering of the tuple. I also added a type annotation and comment to the tuple type alias definition.
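For illustration, a minimal sketch of the shape of the fix, with hypothetical field and column names (the real alias has more columns): the annotated type alias and its comment pin the expected order, and the builder must return values in exactly that order, since a plain tuple no longer reorders anything for us.
```py
# Column order must match the INSERT statement (hypothetical columns):
# (queue_id, session, destination, field_values, priority)
SessionQueueValueToInsert = tuple[str, str, str | None, str | None, int]

def build_value_to_insert(
    queue_id: str,
    session_json: str,
    destination: str | None,
    field_values_json: str | None,
    priority: int,
) -> SessionQueueValueToInsert:
    # Unlike NamedTuple, a plain tuple gives no implicit reordering -
    # the order here must mirror the column order documented above.
    return (queue_id, session_json, destination, field_values_json, priority)
```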
Note: The endless error loop, as a general problem, will take some thinking to fix. The queue service methods to cancel and fail a queue item still retrieve it and parse it. And the list queue items methods parse the queue items. Bit of a catch-22; maybe the solution is to simply delete totally borked queue items and log an error.
Currently translated at 98.7% (1800 of 1822 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1798 of 1820 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1796 of 1818 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
There is now a single entrypoint for loading a workflow - `useLoadWorkflowWithDialog`.
The hook:
Handles loading workflows from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow.
It returns a function that:
Loads a workflow from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow. The workflow will be loaded immediately if there are no unsaved changes. On success, error or completion, the corresponding callback will be called.
WHEW
- Replace `get_counts` method with `get_tag_counts_with_filter`, which gets the counts for a list of tags, filtering by a list of selected tags (see the sketch after this list)
- Update `get_many` logic to apply tag filtering with AND logic, to match the new `get_tag_counts_with_filter` method
- Update workflow library router
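A rough sketch of the AND-filtered counting, assuming a simple `workflow_tags` junction table (the actual schema and query differ):
```py
import sqlite3

def get_tag_counts_with_filter(
    conn: sqlite3.Connection,
    tags_to_count: list[str],
    selected_tags: list[str],
) -> dict[str, int]:
    """Count workflows per tag, restricted to workflows that have ALL selected tags."""
    counts: dict[str, int] = {}
    for tag in tags_to_count:
        # AND logic: a workflow qualifies only if it has every selected tag
        # plus the tag being counted. HAVING COUNT enforces the "all" condition.
        required = sorted(set(selected_tags) | {tag})
        placeholders = ", ".join("?" for _ in required)
        row = conn.execute(
            f"""
            SELECT COUNT(*) FROM (
                SELECT workflow_id FROM workflow_tags
                WHERE tag IN ({placeholders})
                GROUP BY workflow_id
                HAVING COUNT(DISTINCT tag) = ?
            ) AS matching
            """,
            (*required, len(required)),
        ).fetchone()
        counts[tag] = row[0]
    return counts
```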
User facing:
When a FLUX main model is selected, users may now add Regional Reference Image layers.
When switching between FLUX Redux and FLUX IP Adapter, the settings will change to match the model type. (IP Adapter has weight, begin/end step, but Redux does not.) The image will be retained when switching between the two.
Otherwise it works the same way as IP Adapter - both in Global and Regional Reference Image layers.
---
Internal state handling:
Slightly awkward, but it was easiest to make FLUX Redux a second type of IP Adapter in redux state.
Global and regional reference images still have a single `ipAdapter` field, but it can have a type of `ip_adapter` or `flux_redux`.
Ideally, this field would be called `config` or `settings` or something, but we are past that point. We _could_ do a migration to rename it, but I don't think it's worth the effort.
---
Other changes:
- Updated canvas layer validators to handle FLUX Redux.
- Updated model list loading logic to un-set FLUX Redux models in Canvas if they are not in the list (e.g. if the user deletes the model in the main app).
- Updated graph builders - new `addFLUXRedux` util & updated `addRegions` util.
- Updated the `buildModelsHook` util to return a hook that accepts a filter callback. This handles a discrepancy: FLUX IP Adapter does not support regional guidance, but FLUX Redux does. The Regional Guidance settings pass a filter callback to exclude FLUX IP Adapter models from the combined list of IP Adapter and Redux models.
This follows the same pattern as IP Adapter w/ its CLIP Vision model. The SigLIP model is unlikely to ever change and we don't want to force the user to select it anywhere. Hardcoding it is safe and makes the UX much nicer.
The alternative is a model dropdown that will likely only ever have one valid choice in it.
- We don't need to copy the init file. Just crawl the custom nodes dir for modules and import them all (see the sketch after this list). Dunno why I didn't do this initially.
- Pass the logger in as an arg. There was a race condition where if we got the logger directly in the load_custom_nodes function, the config would not have been loaded fully yet and we'd end up with the wrong custom nodes path!
- Remove permissions-setting logic, I do not believe it is relevant for custom nodes
- Minor cleanup of the utility
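A rough sketch of the crawl-and-import approach (hypothetical function shape; the real utility also receives the logger as an argument, per the race-condition note above):
```py
import importlib.util
from logging import Logger
from pathlib import Path

def load_custom_nodes(custom_nodes_path: Path, logger: Logger) -> None:
    """Crawl the custom nodes dir and import every package/module found."""
    for entry in sorted(custom_nodes_path.iterdir()):
        # Packages are directories containing an __init__.py; bare .py files also count.
        is_package = entry.is_dir() and (entry / "__init__.py").exists()
        is_module = entry.is_file() and entry.suffix == ".py" and entry.name != "__init__.py"
        if not (is_package or is_module):
            continue
        module_path = entry / "__init__.py" if is_package else entry
        spec = importlib.util.spec_from_file_location(entry.stem, module_path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        logger.info(f"Loaded custom node module: {entry.stem}")
```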
There's a pydantic thing that causes the graphs to fail validation erroneously. Details in the comments - not a high priority to fix but we should figure it out someday.
This method simply sets the `opened_at` attribute to the current time.
Previously `opened_at` was set when calling `get`, but that is not correct. We `get` workflows often, even when not opening them, so this needs to be a separate method.
Get the counts of workflows for the given tags and/or categories. Made a separate method because `get_many` will deserialize all matching workflows, which is unnecessary for this use case.
This big chungus reworks and simplifies much of the logic around loading and saving workflows. It also makes some minor changes to how we store the current workflow and determine whether it is a draft, user workflow or default workflow.
---
The lower-level hooks to save a workflow have been revised:
- `useSaveLibraryWorkflow`: Saves a user or project workflow that has had changes made to it.
- `useCreateNewWorkflow`: Saves a workflow as a new entity.
A new higher-level hook `useSaveOrSaveAsWorkflow` is intended to be used by components. It returns a single function that:
- Constructs the workflow payload to be sent to the server
- Checks if the workflow is an existing user workflow. If so, it immediately saves (updates) that workflow.
- If it's not an existing user workflow, it opens the save as dialog so the user can choose a name for it and create a new workflow. This occurs for both draft workflows and loaded default workflows.
---
The logic to build the current redux state into a workflow - either to be saved as JSON, to update an existing user workflow, or save as - was a bit convoluted.
Changes to redux state triggered a debounced function to build the workflow, setting it in a global nanostores atom. Then, all of the functions that consumed the "built workflow" referenced this atom.
Now, this logic is strictly imperative. When a consumer wants to save a workflow, we build it on the spot. This removes a layer of indirection.
The logic is in the `useBuildWorkflowFast` hook.
---
The logic for loading a workflow is also revised. Previously, it happened in an RTK listener. You'd need to dispatch an action to load a workflow, and wouldn't know if it succeeded or not (though the listener would make a toast if the load failed).
This is now done in a callback, outside redux middleware. The callback is returned from the `useLoadWorkflow` hook.
---
Previously, we stripped the id from default workflows when loading them. Then, when saving the workflow, we built a workflow object from redux state and hit the API with it.
This has two issues:
- It relies on redux state never having an ID set when a default workflow is loaded. If we somehow ended up with a default workflow's ID in redux, when we go to save the workflow, we'd get an error or it wouldn't work, because you cannot save a default workflow. You can only save-as it.
- We do not know the default workflow from which the current workflow was loaded. And because we don't know the default workflow, we cannot show a thumbnail image.
The responsibilities have been shifted around a bit.
Now, when we load a workflow, we load it as-is. The default workflow IDs are saved in redux state. We can render the thumbnail, and if the user goes to save the workflow, we detect that it is a default workflow and save-as it.
---
In `App.tsx`, the long list of modals is moved into its own "isolator" component to ensure any re-renders there do not affect the rest of the app.
---
The save-workflow-as modal is restructured to be a bit simpler. Still works the same. On commercial, "save to project" will be enabled by default.
---
The workflow JSON tab uses a debounced version of "buildWorkflow" to build the workflow as JSON.
---
`buildWorkflowFast` is updated to deep-copy its _whole_ output, preventing issues where field types could accidentally get mutated. I don't think this has ever happened but we may as well be safe.
---
Fixed an issue where the edit button in the workflow list didn't open the workflow in edit mode.
It's only by misunderstanding the pydantic API that this field was typed as optional. Workflows must _always_ have a category, and indeed they do.
Fixing this allows the generated types in the frontend to be easier to work with.
There was a bit of wonk with default workflows. On every app startup, we wiped them all out and recreated them with new IDs. This is a quick-and-dirty way to ensure default workflows are always in sync.
Unfortunately, it also means default workflows are newly-created entities on every app load. Any thumbnails associated with them will be lost (because they have new IDs), and `updated_at` doesn't work.
This change makes default workflows stable entities.
The workflows we bundle in the python package in JSON format are still the source of truth for default workflows, but the startup logic that syncs them to the user DB is a bit smarter.
- All bundled workflows have an ID. It is prefixed with "default_" for clarity.
- Any default workflows in the user's DB that are not in the bundled default workflows are deleted from the DB.
- Any bundled default workflows that are not in the user's DB are added to the DB.
- If a default workflow in the user's DB does not match the content of its corresponding bundled workflow, it is updated in the DB.
The end result is that default workflows are still kept in sync for the user, but they don't change their identity.
We may now add thumbnails to default workflows, and sorting by `updated_at` is now meaningful.
## Summary
Upgrade ruff version to 0.9.9 and format existing code.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
This allows our tests to run in an isolated environment. For tests that implicitly depend on import behaviour, this can prevent side-effects.
The function should only be used for tests.
by adding a layer with all the pytorch dependencies that don't change
most of the time.
## Summary
Every time the [`main` docker
images](https://github.com/invoke-ai/InvokeAI/pkgs/container/invokeai)
rebuild and I pull `main-cuda`, it gets another 3+ GB, which seems like
about a zillion times too much since most things don't change from one
commit on `main` to the next.
This is an attempt to follow the guidance in [Using uv in Docker:
Intermediate
Layers](https://docs.astral.sh/uv/guides/integration/docker/#intermediate-layers)
so there's one layer that installs all the dependencies—including
PyTorch with its bundled nvidia libraries—_before_ the project's own
frequently-changing files are copied in to the image.
## Related Issues / Discussions
- [Improved docker layer cache with
uv](https://discord.com/channels/1020123559063990373/1329975172022927370)
- [astral: Can `uv pip install` torch, but not `uv sync`
it](https://discord.com/channels/1039017663004942429/1329986610770612347)
## QA Instructions
Hopefully the CI system building the docker images is sufficient.
But there is one change to `pyproject.toml` related to xformers, so it'd
be worth checking that `python -m xformers.info` still says it has
triton on the platforms that expect it.
## Merge Plan
I don't expect this to be a disruptive merge.
(An earlier revision of this PR moved the venv, but I've reverted that
change at ebr's recommendation.)
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
SQLite cursors are meant to be lightweight and not reused. For whatever reason, we reuse one per service for the entire app lifecycle.
This can cause issues where a cursor is used twice at the same time in different transactions.
This experiment makes the session queue use a fresh cursor for each method, hopefully fixing the issue.
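A minimal sketch of the per-method cursor pattern (hypothetical class/method shape):
```py
import sqlite3

class SessionQueueSqlite:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def get_queue_item(self, item_id: int):
        # A fresh, short-lived cursor per call instead of one shared
        # cursor for the entire app lifecycle.
        cursor = self._conn.cursor()
        try:
            cursor.execute("SELECT * FROM session_queue WHERE item_id = ?", (item_id,))
            return cursor.fetchone()
        finally:
            cursor.close()
```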
This allows tags to be invalidated while mutations are executing, resolving an issue in this situation:
- A long-running mutation starts.
- A tag is invalidated; for example, user edits a board name, and the boards list query tag is invalidated.
- The boards list query isn't fired, and the board name isn't updated.
- The long-running mutation finishes.
- Finally, the boards list query fires and the board name is updated.
This is the "delayed" behaviour. The "immediately" behaviour has the fires requests from tag invalidation immediately, without waiting for all mutations to finish.
It may cause extra network requests and stale data if we are mutating a lot of things very quickly. I don't think it will be an issue in practice and the improved responsiveness will be a net benefit.
Rely on WAL mode and the busy timeout.
Also changed:
- Remove extraneous rollbacks when we were only doing a `SELECT`
- Remove try/catch blocks that were made extraneous when removing the extraneous rollbacks
This allows for read and write concurrency without using a global mutex. Operations may still fail if they take longer than the busy timeout (5s).
If we get a database lock error after waiting 5s for an operation, we have a problem. So, I think it's actually better to use a busy timeout instead of a global mutex.
Alternatively, we could add a timeout to the global mutex.
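For reference, a minimal sketch of the connection setup this relies on (exact pragmas and values may differ from the actual code):
```py
import sqlite3

def make_connection(db_path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path, check_same_thread=False)
    conn.row_factory = sqlite3.Row
    # WAL lets readers proceed while a writer is active.
    conn.execute("PRAGMA journal_mode=WAL;")
    # Writers retry for up to 5s on a locked database instead of failing immediately.
    conn.execute("PRAGMA busy_timeout = 5000;")
    return conn
```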
Fixes an issue where fields like control weight on ControlNet nodes and image on IP Adapter nodes didn't render.
These are "single or collection" fields. They accept a single input object, or collection. They are supposed to render the UI input for a single object.
In a7a71ca935 a performance optimisation for a hot code-path inadvertently broke this.
The determination of which UI component to render for a given field was done using a type guard function for the field's template. Previously, this used a zod schema to parse the template. This was very slow, especially when the template was not the expected type.
The optimization changed the type guards to check the field's type name (integer, image, etc.) and cardinality directly, without any zod parsing.
It's much faster, but subtly changed the behaviour because it was a bit stricter. For some fields, it rejected "single or collection" cardinalities when it should have accepted them.
When these fields - like the aforementioned Control Weight and Image - were being rendered, none of the type guards passed and they rendered nothing.
The fix here updates the type guard functions to support multiple cardinalities. So now, when we go to render a "single or collection" field, we will render the "single" input component as it should be.
## Summary
This PR adds a `pytorch_cuda_alloc_conf` config flag to control the
torch memory allocator behavior.
- `pytorch_cuda_alloc_conf` defaults to `None`, preserving the current
behavior.
- The configuration options are explained here:
https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf.
Tuning this configuration can reduce peak reserved VRAM and improve
performance.
- Setting `pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"` in
`invokeai.yaml` is expected to work well on many systems. This is a good
first step for those looking to tune this config. (We may make this the
default in the future.)
- The optimal configuration seems to be dependent on a number of factors
such as device version, VRAM, CUDA kernel version, etc. For now, users
will have to experiment with this config to see if it hurts or helps on
their systems. In most cases, I expect it to help.
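Under the hood, this kind of setting is applied via the `PYTORCH_CUDA_ALLOC_CONF` environment variable, which must be in place before the first CUDA allocation. A rough sketch of the mechanism (not the exact Invoke implementation):
```py
import os

def configure_torch_cuda_alloc(pytorch_cuda_alloc_conf: str | None) -> None:
    # Must run before the CUDA caching allocator is initialized, i.e. before
    # the first CUDA allocation; later changes are ignored.
    if pytorch_cuda_alloc_conf:
        os.environ["PYTORCH_CUDA_ALLOC_CONF"] = pytorch_cuda_alloc_conf

configure_torch_cuda_alloc("backend:cudaMallocAsync")
```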
### Memory Tests
```
VAE decode memory usage comparison:
- SDXL, fp16, 1024x1024:
- `cudaMallocAsync`: allocated=2593 MB, reserved=3200 MB
- `native`: allocated=2595 MB, reserved=4418 MB
- SDXL, fp32, 1024x1024:
- `cudaMallocAsync`: allocated=3982 MB, reserved=5536 MB
- `native`: allocated=3982 MB, reserved=7276 MB
- SDXL, fp32, 1536x1536:
- `cudaMallocAsync`: allocated=8643 MB, reserved=12032 MB
- `native`: allocated=8643 MB, reserved=15900 MB
```
## Related Issues / Discussions
N/A
## QA Instructions
- [x] Performance tests with `pytorch_cuda_alloc_conf` unset.
- [x] Performance tests with `pytorch_cuda_alloc_conf:
"backend:cudaMallocAsync"`.
## Merge Plan
- [x] Merge #7668 first and change target branch to `main`
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
Prior to this PR, most of the app setup was being done in `api_app.py`
at import time. This PR cleans this up, by:
- Splitting app setup into more modular functions
- Narrower responsibility for the `api_app.py` file - it just
initializes the `FastAPI` app
The main motivation for this changes is to make it easier to support an
upcoming torch configuration feature that requires more careful ordering
of app initialization steps.
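A rough sketch of the intended shape (function names are illustrative, not the actual Invoke API):
```py
from fastapi import FastAPI

def configure_torch() -> None:
    """Torch/allocator setup that must run before any models are touched."""
    ...

def register_routes(app: FastAPI) -> None:
    """Attach routers; previously this happened as an import-time side effect."""
    ...

def create_app() -> FastAPI:
    # Explicit, ordered setup instead of work done at import time.
    configure_torch()
    app = FastAPI()
    register_routes(app)
    return app
```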
## Related Issues / Discussions
N/A
## QA Instructions
- [x] Launch the app via invokeai-web.py and smoke test it.
- [ ] Launch the app via the installer and smoke test it.
- [x] Test that generate_openapi_schema.py produces the same result
before and after the change.
- [x] No regression in unit tests that directly interact with the app.
(test_images.py)
## Merge Plan
- [x] Check to see if there are any commercial implications to modifying
the app entrypoint.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
On the Canvas tab, when we made the network request to enqueue a batch, we were immediately resetting the request. This effectively disabled RTKQ's tracking of the request - including the loading state.
As a result, when you clicked the Invoke button on the Canvas tab, it didn't show a spinner, and it was not clear that anything was happening.
The solution is simple - just await the enqueue request before resetting the tracking, same as we already did on the workflows and upscaling tabs.
I also added some extra logging messages for enqueuing, so we get the same JS console logs for each tab on success or failure.
Currently translated at 40.3% (727 of 1801 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 37.7% (680 of 1801 strings)
Co-authored-by: Hiroto N <hironow365@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Previously, custom node loading occurred _during module imports_. A consequence of this is that when a custom node import fails (e.g. its type clobbers an existing node), the app fails to start up.
In fact, any time we import basically anything from the app, we trigger custom node imports! Not good.
This logic is now in its own function, called as the API app starts up.
If a custom node load fails for any reason, it no longer prevents the app from starting up.
One other bonus we get from this is that we can now ensure custom nodes are loaded _after_ core nodes.
Any clobbering that may occur while loading custom nodes is now guaranteed to be a custom node clobbering a core node's type - and not the other way round.
When deleting a board w/ images, the image usage checking logic was not checking image collection fields. This could result in a nonexistent image lingering in a node.
We already handle single image fields correctly; it's only the image collection fields that were affected.
Found another place where we deepcopy a dict, but it is safe to mutate.
Restructured the prep logic a bit to support this. Updated tests to use the new structure.
- Avoid pydantic models when dict manipulation works
- Avoid extraneous deep copies when we can safely mutate
- Avoid NamedTuple construct and its overhead
- Fix tests to use altered function signatures
- Remove extraneous populate_graph function
The method and route now support:
- "none" as a board ID, sentinel value for uncategorized
- Optionally specify image categories
- Optionally specify is_intermediate
This fixes the broken readiness checks introduced in the previous commit.
To support async batch generators, all of the validation of the generators needs to be async. This is problematic because a lot of the validation logic was in redux selectors, which are necessarily synchronous.
To resolve this, the readiness checks and related logic are restructured to be run async in response to redux state changes via `useEffect` (another option is to directly subscribe to redux store). These async functions then set some react state. The checks are debounced to prevent thrashing the UI.
See #7580 for more context about this issue.
Other changes:
- Fix a minor issue where empty collections were also checked against their min and max sizes, and errors were shown for all the checks. If a collection is empty, we now skip the min/max checks and do not report those errors to the user.
- When a field is connected, do not attempt to check its value. This fixes an issue where collection fields with a connection could erroneously appear to be invalid.
- Improved error messages for batch nodes.
Board fields in the workflow editor now default to the auto-add board.
**This is a change in behaviour - previously, we defaulted to no board (i.e. Uncategorized).**
There is some translation needed between the UI field values for a board and what the graph expects.
A "BoardField" is an object in the shape of `{board_id: string}`.
Valid board field values in the graph:
- undefined
- a BoardField
Valid UI values and their mapping to the graph values:
- 'none' -> undefined
- 'auto' -> BoardField for the auto-add board, or if the auto-add board is Uncategorized, undefined
- undefined -> undefined (this is a fallback case with the new logic)
- a BoardField -> the same BoardField
Currently translated at 98.9% (1737 of 1755 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1735 of 1753 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1726 of 1749 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
All but the core `vitest` package were updated recently. Tests still ran but the test UI dashboard didn't. After updating, all tests still run, seems fine.
Also tested building in app and package mode.
- Support transparency w/ color picker. To do this, we need to hide the bg layer before sampling. In testing, this has a negligible performance impact.
- Add an RGBA value readout next to the color picker ring.
Unfortunately I couldn't reliably reproduce the issue, so I'm not 100% sure this fixes it. But I think there is a race condition that results in `updateCompositingRectSize` erroneously seeing the layer has no objects and skipping the update.
To address this, the compositing rect fill/size/pos are all now force-updated when the fill/objects are changed. Theoretically it should be impossible for the issue to occur now.
- Fix an issue where the cursor disappeared when selecting a non-renderable entity. For example, when selecting a reference image layer and certain tools, the cursor would disappear.
- Ensure color picker works no matter what layer types are selected.
The logic for showing/hiding the cursor needed to be rearranged a bit for this fix.
Retrying a queue item means cloning it, resetting all execution-related state. Retried queue items reference the item they were retried from by id. This relationship is not enforced by any DB constraints.
- Add `retried_from_item_id` to `session_queue` table in DB in a migration (sketched below).
- Add `retry_items_by_id` method to session queue service. Accepts a list of queue item IDs and clones them (minus execution state). Returns a list of retried items. Items that are not in a canceled or failed state are skipped.
- Add `retry_items_by_id` HTTP endpoint that maps 1-to-1 to the queue service method.
- Add `queue_items_retried` event, which includes the list of retried items.
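A sketch of the migration and the retry logic (column names and types beyond `retried_from_item_id` are illustrative):
```py
# Migration: link a retried item back to the item it was cloned from.
ADD_RETRIED_FROM_COLUMN = "ALTER TABLE session_queue ADD COLUMN retried_from_item_id INTEGER;"

RETRYABLE_STATUSES = ("canceled", "failed")

def retry_items_by_id(cursor, item_ids: list[int]) -> list[int]:
    """Clone canceled/failed items as fresh pending items; skip anything else."""
    retried: list[int] = []
    for item_id in item_ids:
        row = cursor.execute(
            "SELECT status FROM session_queue WHERE item_id = ?", (item_id,)
        ).fetchone()
        if row is None or row[0] not in RETRYABLE_STATUSES:
            continue  # only canceled or failed items may be retried
        cursor.execute(
            """
            INSERT INTO session_queue
                (queue_id, session, priority, status, retried_from_item_id)
            SELECT queue_id, session, priority, 'pending', item_id
            FROM session_queue WHERE item_id = ?
            """,
            (item_id,),
        )
        retried.append(cursor.lastrowid)
    return retried
```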
- Optimize component and hook structure for input fields to reduce rerenders of component tree
- Remove memoization on some selectors where it serves no purpose (because the object will have a stable identity until it changes, at which point we need to re-render anyways)
- Shift the connection error selector logic around to rely more on the stable identity of pending connection objects
Including just invokeai/version seems sufficient to appease uv sync here. Including everything else would invalidate the cache we're trying to establish.
- Simplify and de-insane-ify component structure, hooks, selectors, etc.
- Some perf improvements by using data attributes for styling instead of dynamic CSS-in-JS.
- Add field notes and start of linear view config, got blocked when I ran into deeper layout issues that made it very difficult to handle field configs. So those are WIP in this commit.
Currently translated at 98.9% (1735 of 1753 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1726 of 1749 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 99.2% (1695 of 1708 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.2% (1692 of 1705 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.2% (1691 of 1704 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
## Summary
This PR adds support for the FLUX LoRA model format produced by
OneTrainer.
Specifically, this PR adds:
- Support for DoRA patches
- Support for patch models that modify the FLUX T5 encoder
- Probing / loading support for OneTrainer models
## Known limitations
- DoRA patches cannot currently be applied to base weights that are
quantized with `bitsandbytes`. The DoRA algorithm requires accessing the
original model weight in order to compute the patch diff, and the
bitsandbytes quantization layers make this difficult. DoRA patches can
be applied to non-quantized and GGUF-quantized layers without issue (see the sketch after this list).
- This PR results in a slight speed regression for a very particular
inference combination: quantized base model + LoRA with diffusers keys
(i.e. uses the `MergedLayerPatch`). Now that more LoRA formats are using
the `MergedLayerPatch`, it was becoming too much work to maintain this
optimization. Regression from ~1.7 it/s to ~1.4 it/s.
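To see why the base weight matters for DoRA, here is a rough sketch of the weight reconstruction (shapes and conventions simplified; not the exact patch code):
```py
import torch

def dora_patched_weight(
    w_orig: torch.Tensor,      # base weight - must be dequantized/accessible
    lora_a: torch.Tensor,      # (rank, in_features)
    lora_b: torch.Tensor,      # (out_features, rank)
    magnitude: torch.Tensor,   # learned per-column magnitude vector
    scale: float = 1.0,
) -> torch.Tensor:
    # DoRA re-normalizes the (base + low-rank) direction and rescales it by a
    # learned magnitude, so the *original* weight is required. Plain LoRA's
    # diff (B @ A) can be computed without ever touching W.
    w_prime = w_orig + scale * (lora_b @ lora_a)
    column_norm = w_prime.norm(dim=0, keepdim=True)
    return magnitude * (w_prime / column_norm)
```
With bitsandbytes-quantized layers, `w_orig` isn't directly readable as a plain tensor, which is why those layers are currently excluded.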
## Future Notes
- We may want to consider dropping support for bitsandbytes
quantization. It is very difficult to maintain compatibility across
features like partial-loading and LoRA patching.
- At a future time, we should refactor the LoRA parsing logic to be more
generalized rather than handling each format independently.
- There are some redundant device casts and dequantizations in
`autocast_linear_forward_sidecar_patches(...)` (and its sub-calls).
Optimizing this is left for future work.
## Related Issues / Discussions
- This PR should address a handful of the LoRAs reported in
https://github.com/invoke-ai/InvokeAI/issues/7131 (specifically, most of
the `envy*` LoRAs).
- This PR should address the example in
https://github.com/invoke-ai/InvokeAI/issues/6912 (though the intended
effect of that LoRA is not totally clear, so it's hard to verify with
full confidence).
## QA Instructions
OneTrainer test models:
-
https://civitai.com/models/844821/envy-flux-dark-watercolor-01?modelVersionId=945159
(DoRA, transformer only)
-
https://civitai.com/models/836757/envy-flux-digital-brush-01?modelVersionId=936167
(hada, transformer only)
- ball_flux from https://github.com/invoke-ai/InvokeAI/issues/6912
(DoRA, transformer/clip/t5)
The following tests were repeated with each of the OneTrainer test
models:
- [x] Test with non-quantized base model
- [x] Test with GGUF-quantized base model
- [x] Test with BnB-quantized base model
- [x] Test with non-quantized base model that is partially-loaded onto
the GPU
Other regression test:
- [x] Test some SD1 LoRAs
- [x] Test some SDXL LoRAs
- [x] Test a variety of existing FLUX LoRA formats
- [x] Test a FLUX Control LoRA on all base model quantization formats.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR fixes an issue with mask dimension consistency. Prior to this
change, the following workflow would fail with `tuple out of range`
error:
<img width="1072" alt="image"
src="https://github.com/user-attachments/assets/d0a9e658-1d64-4db4-adee-973bbdaca745"
/>
### Before this PR
Dimension compatibility for invocations that take a mask input:
- `ApplyMaskTensorToImageInvocation`: 2 or 3
- `MaskTensorToImageInvocation`: 2 or 3
- `InvertTensorMaskInvocation`: 3
Mask dimension for invocations that produce a MaskOutput:
- `RectangleMaskInvocation`: 3
- `AlphaMaskToTensorInvocation`: 3
- `InvertTensorMaskInvocation`: 3
- `ImageMaskToTensorInvocation`: 3
- `SegmentAnythingInvocation`: 2
### After this PR (changes in bold)
Dimension compatibility for invocations that take a mask input:
- `ApplyMaskTensorToImageInvocation`: 2 or 3
- `MaskTensorToImageInvocation`: 2 or 3
- `InvertTensorMaskInvocation`: **2 or 3** <----------------
Mask dimension for invocations that produce a MaskOutput:
- `RectangleMaskInvocation`: 3
- `AlphaMaskToTensorInvocation`: 3
- `InvertTensorMaskInvocation`: 3
- `ImageMaskToTensorInvocation`: 3
- `SegmentAnythingInvocation`: **3** <-------------------
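The general shape of the fix is to normalize between the two accepted dimensionalities, e.g. (a minimal sketch, not the exact invocation code):
```py
import torch

def normalize_mask_dims(mask: torch.Tensor) -> torch.Tensor:
    """Accept a 2D (H, W) or 3D (1, H, W) mask and always return a 3D tensor."""
    if mask.dim() == 2:
        return mask.unsqueeze(0)  # (H, W) -> (1, H, W)
    if mask.dim() == 3:
        return mask
    raise ValueError(f"Unexpected mask shape: {tuple(mask.shape)}")
```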
## QA Instructions
I tested the workflow in the PR description and this workflow:
<img width="872" alt="image"
src="https://github.com/user-attachments/assets/20496860-ce81-47c0-a46a-a611b73faa22"
/>
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Currently translated at 100.0% (1697 of 1697 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.2% (1684 of 1697 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.7% (1676 of 1681 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.3% (1670 of 1681 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.5% (1658 of 1666 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1652 of 1652 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Dynamic prompts string generators can cause an infinite feedback loop when added to the linear view.
The root cause is how these generators handle "resolving" their collections. They hit the dynamic prompts HTTP API within the view component to get the prompts, then set the batch node's internal state with those values.
When the same generator is rendered in both the node editor view and linear view and the timing is just right, that state update causes an infinite feedback loop between the two components as they respond to the state updates from the other component.
The other generators never store the generated values in the batch node's internal state. The values are "resolved" just-in-time as they are needed.
To fix this, the batch value "resolver" utilities could be made async and hit the API. But there's a problem - the resolver utilities are used within the "are we ready to invoke? are there any problems with the current settings?" redux selectors, which are strictly synchronous. To fix that, we can refactor that "are we ready to invoke?" logic to not use redux selectors, so the whole thing could be async.
It's not a big change but I'm not going to spend time on it at the moment.
So, until I address this, the dynamic prompts generators are disabled.
- Add JS Mersenne Twister implementation dependency to use as seeded PRNG. This is not a cryptographically secure algorithm.
- Add nullish seed field to float and integer random generators.
- Add UI to control the seed.
- When seed is not set, behaviour is unchanged - the values are randomized when you Invoke. When seed is set, the random distribution is deterministic depending on the seed. In this case, we can display the values to the user.
Unfortunately we cannot do strict floats or ints.
The batch data models don't specify the value types; they instead rely on pydantic parsing. JSON doesn't differentiate between float and int, so a float `1.0` gets parsed as `1` in Python.
As a result, we _must_ accept mixed floats and ints for BatchDatum.items.
Tests and validation updated to handle this.
Maybe we should update the BatchDatum model to have a `type` field? Then we could parse as float or int, depending on the inputs...
## Summary
This PR revises the logic for calculating the model cache RAM limit. See
the code for thorough documentation of the change.
The updated logic is more conservative in the amount of RAM that it will
use. This will likely be a better default for more users. Of course,
users can still choose to set a more aggressive limit by overriding the
logic with `max_cache_ram_gb`.
## Related Issues / Discussions
- Should help with https://github.com/invoke-ai/InvokeAI/issues/7563
## QA Instructions
Exercise all heuristics:
- [x] Heuristic 1
- [x] Heuristic 2
- [x] Heuristic 3
- [x] Heuristic 4
## Merge Plan
- [x] Merge https://github.com/invoke-ai/InvokeAI/pull/7565 first and
update the target branch
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR adds a `keep_ram_copy_of_weights` config option. The default (and
legacy) behavior is `true`. The tradeoffs for this setting are as
follows:
- `keep_ram_copy_of_weights: true`: Faster model switching and LoRA
patching.
- `keep_ram_copy_of_weights: false`: Lower average RAM load (may not
help significantly with peak RAM).
## Related Issues / Discussions
- Helps with https://github.com/invoke-ai/InvokeAI/issues/7563
- The Low-VRAM docs are updated to include this feature in
https://github.com/invoke-ai/InvokeAI/pull/7566
## QA Instructions
- Test with `enable_partial_load: false` and `keep_ram_copy_of_weights:
false`.
- [x] RAM usage when model is loaded is reduced.
- [x] Model loading / unloading works as expected.
- [x] LoRA patching still works.
- Test with `enable_partial_load: false` and `keep_ram_copy_of_weights:
true`.
- [x] Behavior should be unchanged.
- Test with `enable_partial_load: true` and `keep_ram_copy_of_weights:
false`.
- [x] RAM usage when model is loaded is reduced.
- [x] Model loading / unloading works as expected.
- [x] LoRA patching still works.
- Test with `enable_partial_load: true` and `keep_ram_copy_of_weights:
true`.
- [x] Behavior should be unchanged.
- [x] Smoke test CPU-only and MPS with default configs.
## Merge Plan
- [x] Merge https://github.com/invoke-ai/InvokeAI/pull/7564 first and
change target branch.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
Prior to this change, there were several cases where we initialized the
weights of a FLUX model before loading its state dict (and, to make
things worse, in some cases the weights were in float32). This PR fixes
a handful of these cases. (I think I found all instances for the FLUX
family of models.)
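One common pattern for this (a sketch, not necessarily the exact approach taken here) is to construct the module on the meta device and materialize weights directly from the state dict:
```py
import torch

def load_module_without_init(module_cls, state_dict, **kwargs):
    # No real (potentially float32) weight buffers are allocated up front;
    # assign=True re-uses the state-dict tensors instead of copying them into
    # freshly-allocated parameters.
    with torch.device("meta"):
        model = module_cls(**kwargs)
    model.load_state_dict(state_dict, assign=True)
    return model
```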
## Related Issues / Discussions
- Helps with https://github.com/invoke-ai/InvokeAI/issues/7563
## QA Instructions
I tested that model loading still works and that there is no
virtual memory reservation on model initialization for the following
models:
- [x] FLUX VAE
- [x] Full T5 Encoder
- [x] Full FLUX checkpoint
- [x] GGUF FLUX checkpoint
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Previously, when previewing a filter on a layer with some transparency or a filter that changes the alpha, the preview was rendered on top of the layer. The preview blended with the layer, which isn't right.
In this change, the layer is hidden during the preview, and when the filter finishes (having been applied or canceled - the two possible paths), the layer is shown.
Technically, we are hiding and showing the layer's object renderer's konva group, which contains the layer's "real" data.
Another small change was made to prevent a flash of empty layer, by waiting to destroy a previous filter preview image until the new preview image is ready to display.
Due to limited floating-point precision and konva's `scale` properties, it is possible for the relative rect of an object to have non-integer coordinates and dimensions.
When we go to rasterize and otherwise export images, the HTML canvas API truncates these numbers.
So, we can end up with situations where the relative width and height of a layer are very close to the "real" value, but slightly off.
For example, width and height might be 512px, but the relative rect is calculated to be something like 512.000000003 or 511.9999999997.
In the first case, the truncation results in 512x512 for the dimensions - which is correct. But in the second case, it results in 511x511!
One place where this causes issues is the image action `New Canvas from image -> As Raster Layer (resize)`. For certain input image sizes, this results in an incorrectly resized image. For example, a 1496x1946 input image is resized to 511x511 pixels when the bbox is 512x512.
To fix this, we can round both coords and dimensions of rects when rasterizing.
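The difference in a nutshell:
```py
import math

w = 511.9999999997   # relative width after konva scale math
print(math.trunc(w)) # 511 - what truncation gives us
print(round(w))      # 512 - the intended dimension
```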
I've thought through the implications and done some testing. I believe this change will not cause any regressions and only fix edge cases. But, it's possible that something was inadvertently relying on the old behavior.
There's a bug where preset image tooltips get stuck open in the list.
After much fiddling, debugging, and review of upstream dependencies, I have determined that this is a bug in Chakra-UI v2.
Specifically, it appears to be a race condition related to the Tooltip component's internal use of the `useDisclosure` hook to manage tooltip open state, and the react render cycle.
Unfortunately, Chakra v2 is no longer being updated, and it's a pain in the butt to vendor and fix that component given its dependencies. Not 100% sure I could easily fix it, anyways.
Fortunately, there is a workaround - reduce the tooltip openDelay to 0ms. I prefer the current 500ms delay, but too-quick tooltips are preferable to too-sticky tooltips...
## Summary
Changes:
- Deprecate `ram` and `vram` configs. If these are set in invokeai.yaml,
they will be ignored.
- Create new `max_cache_ram_gb` and `max_cache_vram_gb` configs with the
same definitions as the old configs.
The main motivation of this change is to make the migration path
smoother for users who had previously added `ram` /`vram` to their
config files. Now, these users will be automatically migrated into the
new dynamic limit behavior (which is better in most cases). These users
will have to manually re-add `max_cache_ram_gb` and `max_cache_vram_gb`
to their configs if they wish to go back to specifying manual limits.
## Related Issues / Discussions
See the release notes for RC v5.6.0rc1 for the old migration behavior
that we are trying to improve:
https://github.com/invoke-ai/InvokeAI/releases/tag/v5.6.0rc1
## QA Instructions
- [x] Test that if `ram` or `vram` are present in a user's
`invokeai.yaml`, these values are ignored.
- [x] Test that `max_cache_ram_gb` and `max_cache_vram_gb` are applied,
if set.
## Merge Plan
- Don't forget to update the RC release notes accordingly.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR contains a bugfix for an edge case with model unloading (from
VRAM to RAM). Thanks to @JPPhoto for finding it.
The bug was triggered under the following conditions:
- A GGML-quantized model is loaded in VRAM
- We run a Spandrel image-to-image invocation (which is wrapped in a
`torch.inference_mode()` context manager).
- The model cache attempts to unload the GGML-quantized model from VRAM
to RAM.
- Doing this inside of the `torch.inference_mode()` cm results in the
following error:
```
[2025-01-07 15:48:17,744]::[InvokeAI]::ERROR --> Error while invoking session 98a07259-0c03-4111-a8d8-107041cb86f9, invocation d8daa90b-7e4c-4fc4-807c-50ba9be1a4ed (spandrel_image_to_image): Cannot set version_counter for inference tensor
[2025-01-07 15:48:17,744]::[InvokeAI]::ERROR --> Traceback (most recent call last):
File "/home/ryan/src/InvokeAI/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
output = invocation.invoke_internal(context=context, services=self._services)
File "/home/ryan/src/InvokeAI/invokeai/app/invocations/baseinvocation.py", line 300, in invoke_internal
output = self.invoke(context)
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ryan/src/InvokeAI/invokeai/app/invocations/spandrel_image_to_image.py", line 167, in invoke
with context.models.load(self.image_to_image_model) as spandrel_model:
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/load_base.py", line 60, in __enter__
self._cache.lock(self._cache_record, None)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 224, in lock
self._load_locked_model(cache_entry, working_mem_bytes)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 272, in _load_locked_model
vram_bytes_freed = self._offload_unlocked_models(model_vram_needed, working_mem_bytes)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 458, in _offload_unlocked_models
cache_entry_bytes_freed = self._move_model_to_ram(cache_entry, vram_bytes_to_free)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 330, in _move_model_to_ram
return cache_entry.cached_model.partial_unload_from_vram(
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/cached_model/cached_model_with_partial_load.py", line 182, in partial_unload_from_vram
cur_state_dict = self._model.state_dict()
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1939, in state_dict
module.state_dict(destination=destination, prefix=prefix + name + '.', keep_vars=keep_vars)
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1936, in state_dict
self._save_to_state_dict(destination, prefix, keep_vars)
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1843, in _save_to_state_dict
destination[prefix + name] = param if keep_vars else param.detach()
RuntimeError: Cannot set version_counter for inference tensor
```
### Explanation
From the `torch.inference_mode()` docs:
> Code run under this mode gets better performance by disabling view
tracking and version counter bumps.
Disabling version counter bumps results in the aforementioned error when
saving `GGMLTensor`s to a state_dict.
This incompatibility between `GGMLTensors` and `torch.inference_mode()`
is likely caused by the custom tensor type implementation. There may
very well be a way to get these to cooperate, but for now it is much
simpler to remove the `torch.inference_mode()` contexts.
Note that there are several other uses of `torch.inference_mode()` in
the Invoke codebase, but they are all tight wrappers around the
inference forward pass and do not contain the model load/unload process.
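A minimal illustration of the swap (not the actual Spandrel invocation code):
```py
import torch

# Before: the whole invocation - including any model cache load/unload that
# happens inside it - ran under inference_mode, so tensors created there are
# "inference tensors" and later state_dict()/detach() calls can fail.
@torch.inference_mode()
def invoke_before(model, x):
    return model(x)

# After: no_grad still skips autograd bookkeeping for the forward pass, but
# does not create inference tensors, so serializing a model's state dict
# during unload keeps working.
@torch.no_grad()
def invoke_after(model, x):
    return model(x)
```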
## Related Issues / Discussions
Original discussion:
https://discord.com/channels/1020123559063990373/1149506274971631688/1326180753159094303
## QA Instructions
Find a sequence of operations that triggers the condition. For me, this
was:
- Reserve VRAM in a separate process so that there was ~12GB left.
- Fresh start of Invoke
- Run FLUX inference with a GGML 8K model
- Run Spandrel upscaling
Tests:
- [x] Confirmed that I can reproduce the error and that it is no longer
hit after the change
- [x] Confirm that there is no speed regression from switching from
`torch.inference_mode()` to `torch.no_grad()`.
- Before: `50.354s`, After: `51.536s`
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Currently translated at 16.5% (273 of 1645 strings)
translationBot(ui): update translation (Polish)
Currently translated at 15.4% (254 of 1645 strings)
translationBot(ui): update translation (Polish)
Currently translated at 10.8% (178 of 1645 strings)
Co-authored-by: Nik Nikovsky <zejdzztegomaila@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/pl/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1649 of 1649 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1645 of 1645 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1645 of 1645 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1645 of 1645 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
## Summary
This PR enables RAM/VRAM cache size limits to be determined dynamically
based on availability.
**Config Changes**
This PR modifies the app configs in the following ways:
- A new `device_working_mem_gb` config was added. This is the amount of
non-model working memory to keep available on the execution device (i.e.
GPU) when using dynamic cache limits. It defaults to 3GB.
- The `ram` and `vram` configs now default to `None`. If these configs
are set, they will take precedence over the dynamic limits. **Note: Some
users may have previously overridden the `ram` and `vram` values in their
`invokeai.yaml`. They will need to remove these configs to enable the
new dynamic limit feature.**
**Working Memory**
In addition to the new `device_working_mem_gb` config described above,
memory-intensive operations can estimate the amount of working memory
that they will need and request it from the model cache. This is
currently applied to the VAE decoding step for all models. In the
future, we may apply this to other operations as we work out which ops
tend to exceed the default working memory reservation.
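As a rough illustration (the numbers and call shape are hypothetical, not the actual heuristics), an op might estimate and request working memory like this:
```py
def estimate_vae_decode_working_memory(
    latent_height: int,
    latent_width: int,
    scale_factor: int = 8,   # latent -> pixel upscaling
    channels: int = 4,       # rough bound on intermediate activations
    element_size: int = 2,   # fp16 bytes per element
    fudge: float = 2.5,      # headroom for temporaries
) -> int:
    out_h = latent_height * scale_factor
    out_w = latent_width * scale_factor
    return int(out_h * out_w * channels * element_size * fudge)

# The model cache is then asked to reserve this much VRAM before the decode
# runs, e.g. (hypothetical call shape):
#   cache.lock(cache_record, working_mem_bytes=estimate_vae_decode_working_memory(h, w))
```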
**Mitigations for https://github.com/invoke-ai/InvokeAI/issues/7513**
This PR includes some mitigations for the issue described in
https://github.com/invoke-ai/InvokeAI/issues/7513. Without these
mitigations, it would occur with higher frequency when dynamic RAM
limits are used and the RAM is close to maxed-out.
## Limitations / Future Work
- Only _models_ can be offloaded to RAM to conserve VRAM. I.e. if VAE
decoding requires more working VRAM than available, the best we can do
is keep the full model on the CPU, but we will still hit an OOM error.
In the future, we could detect this ahead of time and switch to running
inference on the CPU for those ops.
- There is often a non-negligible amount of VRAM 'reserved' by the torch
CUDA allocator, but not used by any allocated tensors. We may be able to
tune the torch CUDA allocator to work better for our use case.
Reference:
https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf
- There may be some ops that require high working memory that haven't
been updated to request extra memory yet. We will update these as we
uncover them.
- If a model is 'locked' in VRAM, it won't be partially unloaded if a
later model load requests extra working memory. This should be uncommon,
but I can think of cases where it would matter.
## Related Issues / Discussions
- #7492
- #7494
- #7500
- #7505
## QA Instructions
Run a variety of models near the cache limits to ensure that model
switching works properly for the following configurations:
- [x] CUDA, `enable_partial_loading=true`, all other configs default
(i.e. dynamic memory limits)
- [x] CUDA, `enable_partial_loading=true`, CPU and CUDA memory reserved
in another process so there is limited RAM/VRAM remaining, all other
configs default (i.e. dynamic memory limits)
- [x] CUDA, `enable_partial_loading=false`, all other configs default
(i.e. dynamic memory limits)
- [x] CUDA, ram/vram limits set (these should take precedence over the
dynamic limits)
- [x] MPS, all other default (i.e. dynamic memory limits)
- [x] CPU, all other default (i.e. dynamic memory limits)
## Merge Plan
- [x] Merge #7505 first and change target branch to main
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR adds support for partial loading of models onto the GPU. This
enables models to run with much lower peak VRAM requirements (e.g. full
FLUX dev with 8GB of VRAM).
The partial loading feature is enabled behind a new config flag:
`enable_partial_loading=True`. This flag defaults to `False`.
**Note about performance:**
The `ram` and `vram` config limits are still applied when
`enable_partial_loading=True` is set. This can result in significant
slowdowns compared to the 'old' behaviour. Consider the case where the
VRAM limit is set to `vram=0.75` (GB) and we are trying to run an 8GB
model. When `enable_partial_loading=False`, we attempt to load the
entire model into VRAM, and if it fits (no OOM error) then it will run
at full speed. When `enable_partial_loading=True`, since we have the
option to partially load the model we will only load 0.75 GB into VRAM
and leave the remaining 7.25 GB in RAM. This will cause inference to be
much slower than before. To work around this, it is important that your
`ram` and `vram` configs are carefully tuned. In a future PR, we will
add the ability to dynamically set the RAM/VRAM limits based on the
available memory / VRAM.
## Related Issues / Discussions
- #7492
- #7494
- #7500
## QA Instructions
Tests with `enable_partial_loading=True`, `vram=2`, on CUDA device:
For all tests, we expect model memory to stay below 2 GB. Peak working
memory will be higher.
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests
Tests with `enable_partial_loading=True`, and hack to force all models
to load 10%, on CUDA device:
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests
Tests with `enable_partial_loading=False`, `vram=30`:
We expect no change in behaviour when `enable_partial_loading=False`.
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests
Other platforms:
- [x] No change in behavior on MPS, even if
`enable_partial_loading=True`.
- [x] No change in behavior on CPU-only systems, even if
`enable_partial_loading=True`.
## Merge Plan
- [x] Merge #7500 first, and change the target branch to main
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This is an unplanned fix between PR3 and PR4 in the sequence of partial
loading (i.e. low-VRAM) PRs. This PR restores the 'Current Workaround'
documented in https://github.com/invoke-ai/InvokeAI/issues/7513. In
other words, to work around a flaw in the model cache API, this fix
allows models to be loaded into VRAM _even if_ they have been dropped
from the RAM cache.
This PR also adds an info log each time that this workaround is hit. In
a future PR (#7509), we will eliminate the places in the application
code that are capable of triggering this condition.
## Related Issues / Discussions
- #7492
- #7494
- #7500
- https://github.com/invoke-ai/InvokeAI/issues/7513
## QA Instructions
- Set RAM cache limit to a small value. E.g. `ram: 4`
- Run FLUX text-to-image with the full T5 encoder, which exceeds 4GB.
This will trigger the error condition.
- Before the fix, this test configuration would cause a `KeyError`.
After the fix, we should see an info-level log explaining that the
condition was hit, but that generation should continue successfully.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Previously, we didn't differentiate between model install errors for different types of model install sources, resulting in a buggy UX:
- If a HF model install failed, but it was a HF URL install and not a repo id install, the link to the HF model page was incorrect.
- If a non-HF URL install (e.g. civitai) failed, we treated it as a HF URL install. In this case, if the user's HF token was invalid or unset, we directed the user to set it. If the HF token was valid, we displayed an empty red toast. If it's not a HF URL install, then of course neither of these are correct.
Also, the logic for handling the toasts was a bit complicated.
This change does a few things:
- Consolidate the model install error toasts into one place - the socket.io event handler for the model install error event. There is no more global state for the toasts and there are no hooks managing them.
- Handle the different cases for errors, including all combinations of HF/non-HF and unauthorized/forbidden/unknown.
This is required to fix an issue with the MM UI's error handling.
Previously, we only included the model source as a string. That could be an arbitrary URL, file path or HF repo id, but the frontend has no parsing logic to differentiate between these different model sources.
Without access to the type of model source, it is difficult to determine how the user should proceed. For example, if it's HF URL with an HTTP unauthorized error, we should direct the user to log in to HF. But if it's a civitai URL with the same error, we should not direct the user to HF.
There are a variety of related edge cases.
With this change, the full `ModelSource` object is included in each model install event, including error events.
I had to fix some circular import issues, hence the import changes to files other than `events_common.py`.
## Summary
This PR is the third in a sequence of PRs working towards support for
partial loading of models onto the compute device (for low-VRAM
operation). This PR updates the LoRA patching code so that the following
features can cooperate fully:
- Partial loading of weights onto the GPU
- Quantized layers / weights
- Model patches (e.g. LoRA)
Note that this PR does not yet enable partial loading. It adds support
in the model patching code so that partial loading can be enabled in a
future PR.
## Technical Design Decisions
The layer patching logic has been integrated into the custom layers (via
`CustomModuleMixin`) rather than keeping it in a separate set of wrapper
layers, as before. This has the following advantages:
- It makes it easier to calculate the modified weights on the fly and
then reuse the normal forward() logic.
- In the future, it makes it possible to pass original parameters that
have been cast to the device down to the LoRA calculation without having
to re-cast (but the current implementation hasn't fully taken advantage
of this yet).
## Known Limitations
1. I haven't fully solved device management for patch types that require
the original layer value to calculate the patch. These aren't very
common, and are not compatible with some quantized layers, so I'm leaving this for the future if there's demand.
2. There is a small speed regression for models that have CPU
bottlenecks. This seems to be caused by slightly slower method
resolution on the custom layers sub-classes. The regression does not
show up on larger models, like FLUX, that are almost entirely
GPU-limited. I think this small regression is tolerable, but if we
decide that it's not, then the slowdown can easily be reclaimed by
optimizing other CPU operations (e.g. if we only sent every 2nd progress
image, we'd see a much more significant speedup).
## Related Issues / Discussions
- https://github.com/invoke-ai/InvokeAI/pull/7492
- https://github.com/invoke-ai/InvokeAI/pull/7494
## QA Instructions
Speed tests:
- Vanilla SD1 speed regression
- Before: 3.156s (8.78 it/s)
- After: 3.54s (8.35 it/s)
- Vanilla SDXL speed regression
- Before: 6.23s (4.46 it/s)
- After: 6.45s (4.31 it/s)
- Vanilla FLUX speed regression
- Before: 12.02s (2.27 it/s)
- After: 11.91s (2.29 it/s)
LoRA tests with default configuration:
- [x] SD1: A handful of LoRA variants
- [x] SDXL: A handful of LoRA variants
- [x] FLUX non-quantized: multiple LoRA variants
- [x] FLUX BnB-quantized: multiple LoRA variants
- [x] FLUX GGML-quantized: multiple LoRA variants
- [x] FLUX non-quantized: FLUX control LoRA
- [x] FLUX BnB-quantized: FLUX control LoRA
- [x] FLUX GGML-quantized: FLUX control LoRA
LoRA tests with sidecar patching forced:
- [x] SD1: A handful of LoRA variants
- [x] SDXL: A handful of LoRA variants
- [x] FLUX non-quantized: multiple LoRA variants
- [x] FLUX BnB-quantized: multiple LoRA variants
- [x] FLUX GGML-quantized: multiple LoRA variants
- [x] FLUX non-quantized: FLUX control LoRA
- [x] FLUX BnB-quantized: FLUX control LoRA
- [x] FLUX GGML-quantized: FLUX control LoRA
Other:
- [x] Smoke testing of IP-Adapter, ControlNet
All tests repeated on:
- [x] cuda
- [x] cpu (only test SD1, because larger models are prohibitively slow)
- [x] mps (skipped FLUX tests, because my Mac doesn't have enough memory
to run them in a reasonable amount of time)
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR adds utilities to support partial loading of models from CPU to
GPU. The new utilities are not yet being used by the ModelCache, so
there should be no functional behavior changes in this PR.
Detailed changes:
- Add autocast modules that are designed to wrap common
`torch.nn.Module`s and enable them to run with automatic device casting.
E.g. a linear layer on the CPU can be executed with an input tensor on the GPU by streaming the weights to the GPU at runtime (a rough sketch of this idea follows this list).
- Add unit tests for the aforementioned autocast modules to verify that
they work for all supported quantization formats (GGUF, BnB NF4, BnB
LLM.int8()).
- Add `CachedModelWithPartialLoad` and `CachedModelOnlyFullLoad` classes
to manage partial loading at the model level.
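As promised above, here is a rough Python sketch of the autocast-module idea. It is illustrative only: the class name is made up, and it ignores the quantized formats (GGUF, BnB) that the real modules must also handle.

```py
import torch

class AutocastLinear(torch.nn.Linear):
    """Hypothetical sketch: the weights may live on the CPU while the input
    lives on the GPU. Stream the params to the input's device at call time."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.weight.to(x.device, non_blocking=True)
        bias = self.bias.to(x.device, non_blocking=True) if self.bias is not None else None
        return torch.nn.functional.linear(x, weight, bias)
```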
## Alternative Implementations
Several options were explored for supporting inference on
partially-loaded models. The pros/cons of the explored options are
summarized here for reference. In the end, wrapper modules were selected
as the best overall solution for our use case.
Option 1: Re-implement the .forward() methods of modules to add support
for device conversions
- This is the option implemented in this PR.
- This approach is the most manual of the three, but as a result offers
the broadest compatibility with unusual model types. It is manual in
that we have to explicitly add support for all module types that we wish
to support. Fortunately, the list of foundational module types is
relatively small (e.g. the current set of implemented layers covers all but 0.04 MB of the full FLUX model).
Option 2: Implement a custom Tensor type that casts tensors to a
`target_device` each time the tensor is used
- This approach has the nice property that it is injected at the tensor
level, and the model does not need to be modified in any way.
- One challenge with this approach is handling interactions with other
custom tensor types (e.g. GGMLTensor). This problem is solvable, but
definitely introduces a layer of complexity. (There are likely to also
be some similar issues with interactions with the BnB quantization, but
I didn't get as far as testing BnB.)
Option 3: Override the `__torch_function__` dispatch calls globally and
cast all params to the execution device.
- This approach is nice and simple: just apply a global context manager
and all operations will happen on the compute device regardless of the
device of the participating tensors.
- Challenges:
- Overriding the `__torch_function__` dispatch calls introduces some
overhead even if the tensors are already on the correct device.
- It is difficult to manage the autocasting context manager. E.g. it is
tempting to apply it to the model's `.forward(...)` method, but we use
some models with non-standard entrypoints. And we don't want to end up
with nested autocasting context managers.
- BnB applies quantization side effects when a param is moved to the GPU
- this interacts in unexpected ways with a global context manager.
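As a rough illustration of Option 3 (not code from this PR), a global `__torch_function__` mode could look something like the hypothetical sketch below. The naive casting of every top-level tensor argument is exactly where the overhead and context-management challenges listed above come from.

```py
import torch
from torch.overrides import TorchFunctionMode

class CastToDeviceMode(TorchFunctionMode):
    """Hypothetical sketch of Option 3: move every top-level tensor argument
    to a target device before dispatching the original torch function."""

    def __init__(self, device: torch.device):
        super().__init__()
        self._device = device

    def __torch_function__(self, func, types, args=(), kwargs=None):
        def cast(a):
            return a.to(self._device) if isinstance(a, torch.Tensor) else a

        args = tuple(cast(a) for a in args)
        kwargs = {k: cast(v) for k, v in (kwargs or {}).items()}
        return func(*args, **kwargs)

# Usage sketch: every tensor op inside the context runs on the target device.
# with CastToDeviceMode(torch.device("cuda")):
#     output = model(input_tensor)
```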
## QA Instructions
Most of the changes in this PR should not impact active code, and thus
should not cause any changes to behavior. The main risks come from
bumping the bitsandbytes dependency and some minor modifications to the
bitsandbytes quantization code.
- [x] Regression test bitsandbytes NF4 quantization
- [x] Regression test bitsandbytes LLM.int8() quantization
- [x] Regression test on MacOS (to ensure that there are no lingering
bitsandbytes import errors)
I also tested the new utilities for inference on full models in another
branch to validate that there were not major issues. This functionality
will be tested more thoroughly in a future PR.
## Merge Plan
- [x] #7492 should be merged first so that the target branch can be
updated to main.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR tidies up the model cache code in preparation for further
refactoring to support partial loading of models onto the GPU. **These
code changes should not change the functional behavior in any way.**
Changes:
- Remove the `ModelCacheBase` class. `ModelCache` is the only
implementation, so there is no benefit to the separate abstract class.
- Split `CacheRecord` and `CacheStats` out into their own files.
- Remove the `ModelLocker` class. This extra layer of indirection was
not providing any benefit. Locking is now done directly with the
`ModelCache`.
- Tidy up relative imports that were contributing to circular import
issues.
- Pull the 'submodel' concern out of the `ModelCache`. The `ModelCache`
should not need to be aware of the model manager submodel system.
- Delete unused properties from the `ModelCache` (e.g.
`.lazy_offloading`, `.storage_device`, etc.)
## QA Instructions
I ran smoke tests with a variety of SD1, SDXL and FLUX models. No change
to behavior is expected.
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Uvicorn's logging is rather verbose. This change adds a `log_level_network` config setting to independently control uvicorn's log output. The setting defaults to `warning`.
The change hides the helpful startup message that says the host and port we are running on.
For example: `Uvicorn running on http://0.0.0.0:9090 (Press CTRL+C to quit)`
The ASGI lifespan handler is updated to log an equivalent message on startup, regardless of log level settings.
Besides being helpful to users, this message is what the launcher relies on to open the app. So, previously, if the user set their log level to anything above info (e.g. warning or error), the launcher would fail to open the app. This change prevents that edge case.
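A minimal sketch of how a network log level could be applied to uvicorn's loggers, assuming the standard `logging` names; this is illustrative, not the app's actual configuration code.

```py
import logging

def set_network_log_level(level_name: str = "WARNING") -> None:
    # Uvicorn logs through these standard logger names.
    level = logging.getLevelName(level_name.upper())
    for name in ("uvicorn", "uvicorn.error", "uvicorn.access"):
        logging.getLogger(name).setLevel(level)
```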
Currently translated at 100.0% (1644 of 1644 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1643 of 1643 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1643 of 1643 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
## Summary
This PR refactors the LoRA handling code to enable the use of FLUX
control LoRAs on top of quantized transformers.
Changes:
- Renamed a bunch of the model patching utilities to reflect that they
are not LoRA-specific
- Improved the unit test coverage.
- Refactored the handling of 'sidecar' patch layers to make them work
with more layer patch types. (This was necessary to get FLUX control
LoRAs working on top of quantized models.)
- Removed `ONNXModelPatcher`. It is out-of-date and hasn't been used in
a while.
## QA Instructions
I completed the following tests.
**These should be repeated after changing the target branch to main.**
**Due to the large surface area of this PR, reviewers should do
regression tests on a range of LoRA formats. There is a risk of
regression on a specific format that was missed during the
refactoring.**
- [x] FLUX Control LoRA + full FLUX transformer
- [x] FLUX Control LoRA + BnB NF4 quantized transformer
- [x] FLUX Control LoRA + GGUF quantized transformer
- [x] FLUX Control LoRA + non-control LoRA + full FLUX transformer
- [x] FLUX Control LoRA + non-control LoRA + BnB quantized transformer
- [x] FLUX Control LoRA + non-control LoRA + GGUF quantized transformer
- Test the following cases for regression:
- [x] Misc SD1/SDXL LoRA variants (LoRA, LoKr, IA3)
- [x] FLUX, non-quantized, variety of LoRA formats
- [x] FLUX, quantized, variety of LoRA formats
## Merge Plan
**_Don't merge this PR yet._**
Merge plan:
1. First merge brandon/flux-tools-loras into main
2. Change the target branch of this PR to main
3. Review / test / merge this PR
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
- Ensure the currently-rasterizing adapter is reset to `null` on success or failure of a rasterization operation. In case of failure, this prevents the UI from getting stuck with a disabled Invoke button and tooltip message "Canvas is busy (rasterizing)".
- Log the error if there is one.
## Summary
https://github.com/invoke-ai/InvokeAI/issues/7422
As reported in the above ticket, a recent FLUX performance improvement
caused a regression on MacOS. This PR reverts the offending part of the
change.
## Related Issues / Discussions
- Closes #7422
- Original perf improvement:
https://github.com/invoke-ai/InvokeAI/pull/7399
## QA Instructions
I don't have a Mac capable of running this test, so trusting the report
in #7422 that this fixes the problem.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
The `is` operator compares references, not values. Thanks to a wonderfully unintuitive quirk of python, `is` works on integers from `-5` to `256`, inclusive.
Whenever integers in this range are used for a value, internally python returns a reference to a stable object in memory. When integers outside this range are used as a value, python creates a new object in memory for that integer.
See `PyLong_FromLong` documentation here: https://docs.python.org/3/c-api/long.html
Tying this back to our session processor, we were using `is` to compare the queue item ids for equality. Our queue item ids start at 0, and each queue item created increments this by one. So this comparison works only for the first 256 queue items on the machine.
Starting with the 257th queue item, the comparison starts returning `False`, and cancelation gets weird.
Easy fix - use `!=` instead of `is not`.
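A quick demonstration of the quirk (the ints are constructed at runtime so CPython can't fold them into the same constant):

```py
a = 256
b = int("256")
print(a is b)  # True - CPython caches int objects for values -5 through 256
c = 257
d = int("257")
print(c is d)  # False - outside the cached range, each value is a new object
print(c == d)  # True - value equality is what the session processor actually needs
```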
The "adding to" text indicates if images are going to the gallery or staging area. This info is relevant only to the canvas tab, but was displayed on Upscaling and Workflows tabs. Removed it from those tabs.
A redux selector is used to get the "default" IP Adapter. The selector uses the model list query result to select an IP Adapter model to be preset by default.
The selector is memoized, so if we mutate the returned default IP Adapter state, it mutates the result of the selector for all consumers.
For example, the `image` property of the default IP Adapter selector result is `null`. When we set the `image` property of the selector result while creating an IP Adapter, this does not trigger the selector to recompute its result. We end up setting the image for the selector result directly, and all other consumers now have that same image set.
Solution - we need to clone the selector result everywhere it is used. This was missed in a few spots, causing the issue.
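The same pitfall, sketched in Python rather than the actual redux/reselect code, since the mechanism is identical: a memoized function hands every caller the same object, so an in-place mutation leaks to all consumers.

```py
from functools import lru_cache

@lru_cache(maxsize=1)
def default_ip_adapter() -> dict:
    # Memoized: every caller receives the same dict object.
    return {"model": "some-model", "image": None}

a = default_ip_adapter()
a["image"] = "cat.png"          # mutate the shared result...
print(default_ip_adapter())     # {'model': 'some-model', 'image': 'cat.png'} - leaked to everyone

# The fix mirrors the frontend change: copy the result before modifying it.
b = dict(default_ip_adapter())
b["image"] = None
```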
It was easy to misunderstand the empty state for a regional guidance reference image. There was no label, so it seemed like it was the whole region that was empty.
This small change adds the "Reference Image" heading to the empty state, so it's clear that the empty state messaging refers to this reference image, not the whole regional guidance layer.
## Summary
This PR adds support for regional prompting with FLUX.
### Example 1
Global prompt: `An architecture rendering of the reception area of a
corporate office with modern decor.`
<img width="1386" alt="image"
src="https://github.com/user-attachments/assets/c8169bdb-49a9-44bc-bd9e-58d98e09094b">

## QA Instructions
- [x] Test that there is no slowdown in the base case with a single
global prompt.
- [x] Test image fully covered by regional masks.
- [x] Test image covered by region masks with small gaps.
- [x] Test region masks with large unmasked ‘background’ regions
- [x] Test region masks with significant overlap
- [x] Test multiple global prompts.
- [x] Test no global prompt.
- [x] Test regional negative prompts (It runs... but results are not
great. Needs more tuning to be useful.)
- Test compatibility with:
- [x] ControlNet
- [x] LoRA
- [x] IP-Adapter
## Remaining TODO
- [x] Disable the following UI features for FLUX prompt regions:
negative prompts, reference images, auto-negative.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
These helpers consolidate layer validation checks. For example, checking that the layer has content drawn, is compatible with the selected main model, has valid reference images, etc.
There's a technical challenge with outputting these values directly. `ImageField` does not store them, so the batch's `ImageField` collection does not have width and height for each image.
In order to set up the batch and pass along width and height for each image, we'd need to make a network request for each image when the user clicks Invoke. It would often be cached, but this will eventually create a scaling issue and poor user experience.
As a very simple workaround, users can output the batch image output into an `Image Primitive` node to access the width and height.
This change is implemented by adding some simple special handling when parsing the output fields for the `image_batch` node.
I'll keep this situation in mind when extending the batching system to other field types.
- Split up logic to determine reason why the user cannot invoke for each tab.
- Fix issue where the workflows tab would show reasons related to canvas/upscale tab. The tooltip now only shows information relevant to the current tab.
- Add calculation for batch size to the queue count prediction.
- Use a constant for the enqueue mutation's fixed cache key, instead of a string. Just some typo protection.
Currently translated at 42.3% (672 of 1588 strings)
translationBot(ui): update translation (Spanish)
Currently translated at 28.0% (445 of 1588 strings)
Co-authored-by: gallegonovato <fran-carro@hotmail.es>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/es/
Translation: InvokeAI/Web UI
- Add special handling for `ImageBatchInvocation`
- Add input component for image collections, supporting multi-image upload and dnd
- Minor rework of some hooks for accessing node data
The canvas react components pass canvas entity identifiers around, then redux selectors are used to access that entity. This is good for perf - entity states may rapidly change. Passing only the identifiers allows components and other logic to have more granular state updates.
Unfortunately, this design opens the possibility for an entity identifier to point to an entity that does not exist.
To get around this, I had created a redux selector `selectEntityOrThrow` for canvas entities. As the name implies, it throws if the entity is not found.
While it prevents components/hooks from needing to deal with missing entities, it results in mysterious errors if an entity is missing. Without sourcemaps, it's very difficult to determine what component or hook couldn't find the entity.
Refactoring the app to not depend on this behaviour is tricky. We could pass the entity state around directly as a prop or via context, but as mentioned, this could cause performance issues with rapidly changing entities.
As a workaround, I've made two changes:
- `<CanvasEntityStateGate/>` is a component that takes an entity identifier, returning its children if the entity state exists, or null if not. This component wraps every usage of `selectEntityOrThrow`. Theoretically, this should prevent the entity not found errors.
- Add a `caller: string` arg to `selectEntityOrThrow`. This string is now added to the error message when the assertion fails, so we can more easily track the source of the errors.
In the future we can work out a way to not use this throwing selector and retain perf. The app has changed quite a bit since that selector was created - so we may not have to worry about perf at all.
When we added more progress events during generation, we indirectly broke the logic that controls when the progress bar throbs.
Co-authored-by: Mary Hipp Rogers <maryhipp@gmail.com>
Currently translated at 33.6% (533 of 1583 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 30.3% (481 of 1583 strings)
Co-authored-by: Gohsuke Shimada <ghoskay@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Currently translated at 17.6% (278 of 1575 strings)
translationBot(ui): update translation (Spanish)
Currently translated at 17.3% (274 of 1575 strings)
Co-authored-by: gallegonovato <fran-carro@hotmail.es>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/es/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1581 of 1581 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1576 of 1576 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1575 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 85.0% (1340 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 78.7% (1240 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 73.1% (1152 of 1575 strings)
translationBot(ui): update translation (English)
Currently translated at 99.9% (1574 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 57.9% (913 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 37.0% (584 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 3.2% (51 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 3.2% (51 of 1575 strings)
Co-authored-by: Linos <tt250208@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/en/
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Currently translated at 79.9% (1266 of 1583 strings)
translationBot(ui): update translation (Chinese (Simplified Han script))
Currently translated at 74.4% (1171 of 1573 strings)
Co-authored-by: aidawanglion <youjayjeel@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/zh_Hans/
Translation: InvokeAI/Web UI
Currently translated at 99.6% (1569 of 1575 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.4% (1567 of 1575 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.4% (1565 of 1573 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Turns out a gallery image's `imageDTO` object can actually be a different object by reference. I thought this was not possible thanks to how we have a quasi-normalized cache.
Need to check against image name instead of reference equality when deciding whether or not to use the single image or the gallery selection for the dnd payload.
Rework uploadImage and uploadImages helpers and the RTK listener, ensuring gallery view isn't changed unexpectedly and preventing extraneous toasts.
Fix staging area save to gallery button to essentially make a copy of the image, instead of changing its intermediate status.
- New name: "Output only Generated Regions"
- New default: true (this was the intention, but at some point the behaviour of the setting was inverted without the default being changed)
The styling in gallery for selected vs hovered was very similar, leading users to think that the hovered image was also selected.
Reducing the borders for hovered images to a single pixel makes it easier to distinguish between selected and hovered.
- Tweak layout/styling of alerts for consistent spacing
- Add percentage to message if it has percentage
- Only show events if the destination is canvas (so workflows events are hidden for example)
- Pass in the `UtilInterface` to the `ModelsInterface` so we can call the simple `signal_progress` method instead of the complicated `emit_invocation_progress` method.
- Only emit load events when starting to load - not after.
- Add more detail to the messages, like submodel type
## Summary
Add support for SD3 image-to-image and inpainting. Similar to FLUX, the
implementation supports fractional denoise_start/denoise_end for more
fine-grained denoise strength control, and a gradient mask adjustment
schedule for smoother inpainting seams.
## Example
Workflow
<img width="1016" alt="image"
src="https://github.com/user-attachments/assets/ee598d77-be80-4ca7-9355-c3cbefa2ef43">
Result

## QA Instructions
- [x] Regression test of text-to-image
- [x] Test image-to-image without mask
- [x] Test that adjusting denoising_start allows fine-grained control of
amount of change in image-to-image
- [x] Test inpainting with mask
- [x] Smoke test SD1, SDXL, FLUX image-to-image to make sure there was
no regression with the frontend changes.
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
The FLUX VAE, like many VAEs, is broken if run using float16 inputs, returning black images due to NaNs.
This PR fixes the issue by forcing the VAE to run in bfloat16 or float32 where compatible.
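A minimal sketch of the dtype selection described above (illustrative only - the function name and structure are assumptions, not the PR's code):

```py
import torch

def choose_vae_dtype(requested: torch.dtype, device: torch.device) -> torch.dtype:
    # Never run the VAE in float16 (NaNs / black images); prefer bfloat16 where available.
    if requested != torch.float16:
        return requested
    if device.type == "cuda" and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float32
```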
## Related Issues / Discussions
Fix for issue https://github.com/invoke-ai/InvokeAI/issues/7208
## QA Instructions
Tested on macOS: the VAE works with float16 set in `invokeai.yaml` and when left at the default.
I also briefly forced it down the float32 route to check that too.
Needs testing on CUDA / ROCm
## Merge Plan
It should be a straightforward merge.
When an unsupported model architecture is selected, show that warning only, without the extra warnings (i.e. no "missing tile controlnet" warning)
Update Invoke tooltip warnings accordingly
Closes #7239, Closes #7177
- Add `withToast` flag to `uploadImage` util
- Skip the toast if this is not set
- Use the flag to disable toasts when canvas does internal image-uploading stuff that should be invisible to user
We don't need a "dnd" image system. We need an "image action" system. We need to execute specific flows with images from various "origins":
- internal dnd e.g. from gallery
- external dnd e.g. user drags an image file into the browser
- direct file upload e.g. user clicks an upload button
- some other internal app button e.g. a context menu
The actions are now generalized to better support these various use-cases.
## Summary
Nodes to support SD3.5 txt2img generations
* adds SD3.5 to starter models
* adds default workflow for SD3.5 txt2img
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
In a8de6406c5 a change was made to many menus in an effort to improve performance. The menus were made to be lazy, so that they are mounted only while open.
This causes unexpected behaviour when there is some logic in the menu that may need to execute after the user selects a menu item.
In this case, when you click to load a workflow from file, the file picker opens but then the menuitem unmounts, taking the input element and all uploading logic with it. When you select a file, nothing happens because we've nuked the handlers by unmounting everything.
Easy fix - un-lazy-fy the menu.
Closes #7240
The validation on this node causes graph validation to fail. It must be validated _after_ instantiation.
Also, it was a bit too strict. The only case we explicitly do not handle is when both bboxes and points are provided. It's acceptable if neither are provided.
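For illustration, the relaxed check might look like the following sketch (the class and field names are hypothetical): validation runs after the node is instantiated, and only the ambiguous "both provided" case is rejected.

```py
from typing import Optional

class SegmentAnythingLikeNode:
    """Hypothetical stand-in for the node - field names are illustrative only."""

    def __init__(self, bboxes: Optional[list] = None, points: Optional[list] = None):
        self.bboxes = bboxes
        self.points = points

    def invoke(self):
        # Validate after instantiation, and only reject the ambiguous case:
        # both inputs provided. Providing neither is acceptable.
        if self.bboxes is not None and self.points is not None:
            raise ValueError("Provide bboxes or points, not both")
        # ... run segmentation ...
```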
Closes #7248
When filtering, we use a listener to trigger processing the image whenever a filter setting changes. For example, if the user changes from canny to depth, and auto-process is enabled, we re-process the layer with new filter settings.
The filterer has a method to reset its ephemeral state. This includes the filter settings, so resetting the ephemeral state is expected to trigger processing of the filter.
When we exit filtering, we reset the ephemeral state before resetting everything else, like the listeners.
This can cause problem when we exit filtering. The sequence:
- Start filtering a layer.
- Auto-process the filter in response to starting the filter process.
- Change the filter settings.
- Auto-process the filter in response to the changed settings.
- Apply the filter.
- Exit filtering, first by resetting the ephemeral state.
- Auto-process the filter in response to the reset settings.*
- Finish exiting, including unsubscribing from listeners.
*Whoops! That last auto-process has now borked the layer's rendering by processing a filter when we shouldn't be processing a filter.
We need to first unsubscribe from listeners, so we don't react to that change to the filter settings and erroneously process the layer.
Also, add a check to the `processImmediate` method to prevent processing if that method is accidentally called without first starting the filterer.
The same issue could affect the segment anything module - the same fixes are implemented there.
The root issue is the compositing cache. When we save the canvas to gallery, we need to first composite raster layers together and then upload the image.
The compositor makes extensive use of caching to reduce the number of images created and improve performance. There are two "layers" of caching:
1. Caching the composite canvas element, which is used both for uploading the canvas and for generation mode analysis.
2. Caching the uploaded composite canvas element as an image.
The combination of these caches allows for the various processes that require composite canvases to do minimal work.
But this causes a problem in this situation, because the user expects a new image to be uploaded when they click save to gallery.
For example, suppose we have already composited and uploaded the raster layer state for use in a generation. Then, we ask the compositor to save the canvas to gallery.
The compositor sees that we are requesting an image for the current canvas state, and instead of recompositing and uploading the image again, it just returns the cached image.
In this case, no image is uploaded and the button does nothing.
We need to be able to opt out of the caching at some level, for certain actions. A `forceUpload` arg is added to the compositor's high-level `getCompositeImageDTO` method to do this.
When true, we ignore the uppermost caching layer (the uploaded image layer), but still use the lower caching layer (the canvas element layer). So we don't recompute the canvas element, but we do upload it as a new image to the server.
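A rough sketch of the two cache layers and the `forceUpload` escape hatch, in Python pseudocode rather than the actual canvas TypeScript (the function names and stubs are made up):

```py
_canvas_cache: dict[str, str] = {}   # lower layer: composited canvas element per cache key
_upload_cache: dict[str, str] = {}   # upper layer: uploaded image per cache key

def _composite(cache_key: str) -> str:
    return f"canvas-element:{cache_key}"    # stand-in for the expensive compositing work

def _upload(canvas_element: str) -> str:
    return f"image-dto:{canvas_element}"    # stand-in for the upload request

def get_composite_image_dto(cache_key: str, force_upload: bool = False) -> str:
    if cache_key not in _canvas_cache:
        _canvas_cache[cache_key] = _composite(cache_key)   # still cached even when forcing upload
    if not force_upload and cache_key in _upload_cache:
        return _upload_cache[cache_key]
    image = _upload(_canvas_cache[cache_key])              # force_upload=True always re-uploads
    _upload_cache[cache_key] = image
    return image
```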
Previously, we cleared the canvas progress image when the canvas had no active generations. This allowed for a brief flash of canvas state between the last progress image for a given generation, and when the output image for that generation rendered. Here's the sequence:
- Progress images are received and rendered
- Generation completes - no active canvas generations
- Clear the progress image -> canvas layers visible unexpectedly, creating an awkward jarring change
- Generation output image is rendered -> output image overlaid on canvas layers
In 83538c4b2b I attempted to fix this by only clearing the progress image while we were not staging.
This isn't quite right, though. We are often staging with no active generations - for example, you have a few images completed and are waiting to choose one.
In this situation, if you cancel a pending generation, the logic to clear the progress image doesn't fire because it sees staging is in progress.
What we really need is:
- Staging area module clears the progress image once it has rendered an output image.
- Progress image module clears the progress image when a generation is canceled or failed, in which case there will be no output image.
To do this, we can add an event listener to the progress image module to listen for queue item status changes, and when we get a cancelation or failure, clear the progress image.
pip's dependency resolution doesn't take into account transitive
dependencies when choosing package versions for download.
Even though `torch~=2.4.1` is required by `diffusers`, pip will
download 2.5.0 and higher, but only install 2.4.1.
Pinning torch to <2.5.0 prevents this behaviour.
## Summary
This change mimics the unet padding strategy to align T2I featuremaps
with the latents during denoising. It also slightly adjusts the crop and
scale logic so that the control will match the input image without
shifting when it needs to pad.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
Image generated at 1032x1024

Image generated at 1080x1040 to prove feature alignment.

Edge artifacts on the bottom and right are a result of SDXL's unet
padding, and t2i influence will be cut off in those regions.
## Merge Plan
Contingent on #7205
Currently the Canvas UI prevents users from generating at resolutions that aren't multiples of 64 while t2i adapter layers are active. Will leave this as a draft until fixing that.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
Previously we maintained an `isInteractable` flag, which was derived from these layer flags:
- Locked/unlocked
- Enabled/disabled
- Layer's type visible/hidden
When a layer was not interactable, we blocked all layer actions.
After comparing to the behaviour in Affinity and considering user feedback, I've loosened these restrictions while maintaining safety. First, some definitions.
There are two kinds of layer actions - mutating actions and non-mutating actions.
- Mutating actions are drawing on the layer, cropping it, filtering it, converting it, etc. Anything that changes the layer.
- Non-mutating actions are copying the layer, saving the layer to gallery, etc. Anything that _uses_ the layer.
Then, there are two broad canvas states - busy and not busy. "Busy" means the canvas is actively filtering, staging, compositing layers together, etc - something that is "single-threaded" by nature.
And here are the revised restrictions:
- When canvas is busy, you cannot initiate any layer actions.
- When the canvas is not busy, and the layer is locked, you cannot initiate any mutating actions.
- When the canvas is not busy and the layer is not locked, you can initiate any layer action.
Besides safely giving users more freedom, it also fixes an issue where the context menu for a layer was disabled if it was not the selected layer.
- Add method to force a rebuild of the pydantic type adapter for the union of invocations, which is used to validate graphs.
- Update the xfail'd test.
Had missed several of these, which means we were invalidating caches far too often. For example, when you changed a RG prompt, we were invalidating the cached canvas for that entity, even though changing the prompt doesn't affect the canvas at all.
Previously, merge visible deleted all other visible layers. This is not how Affinity works; I should have confirmed before making it work like this in the first place.
`CanvasCompositorModule` had a fairly inflexible API, only supporting compositing all raster layers or inpaint masks.
The API has been generalized to work with a list of canvas entities. This enables `Merge Down` and `Merge Selected` functionality (though `Merge Selected` is not part of this set of changes).
Let the parent module adopt the filtered/segmented image instead of destroying it and making the parent re-create it, which results in a brief flash of the parent layer's original objects before the new image is rendered.
We were scaling the unscaled image and mask down before doing the paste-back, but this adds an extraneous step & image output.
We can do the paste-back first, then scale to output size after. So instead of 2 resizes before the paste-back, we have 1 resize after.
The end result is the same.
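In PIL terms (purely illustrative - not the actual canvas code), the reordering looks like this: one paste at the working resolution, then a single resize to the output size.

```py
from PIL import Image

def paste_back_then_scale(
    base: Image.Image, patch: Image.Image, mask: Image.Image,
    box: tuple[int, int], out_size: tuple[int, int]
) -> Image.Image:
    result = base.copy()
    result.paste(patch, box, mask)                             # paste-back first, at full size
    return result.resize(out_size, Image.Resampling.LANCZOS)   # then one resize to output size
```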
- Restore dedicated `Apply` buttons
- Remove icons from the buttons, too much noise when the words are short and clear
- Update loading state to show a spinner next to the `Process` button instead of on _every_ button
A blue button is begging to be clicked, but clicking it will do nothing. Instead, we should communicate that no action is needed by disabling the button when the default settings are already in use.
Using `&&` will result in false negatives for settings where a falsy value might be valid. For example, any setting for which 0 is a valid number. To be on the safe side, just use an explicit null check on all values.
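The same idea in Python terms (illustrative only): a truthiness check silently drops valid falsy values like `0`, while an explicit `None` check does not.

```py
def matches_default_bad(value, default) -> bool:
    return bool(value and value == default)        # value=0, default=0 -> False (false negative)

def matches_default_good(value, default) -> bool:
    return value is not None and value == default  # value=0, default=0 -> True
```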
We use an in-memory cache for PIL images to reduce I/O. If a node mutates the image in any way, the cached image object is also updated (but the on-disk image file is not).
We've lucked out that this hasn't caused major issues in the past (well, maybe it has but we didn't understand them?) mainly because of a happy accident. When you call `context.images.get_pil` in a node, if you provide an image mode (e.g. `mode="RGB"`), we call `convert` on the image. This returns a copy. The node can do whatever it wants to that copy and nothing breaks.
However, when mode is not specified, we return the image directly. This is where we get in trouble - nodes that load the image like this, and then mutate the image, update the cache. Other nodes that reference that same image will now get the mutated version of it.
The fix is super simple - we make sure to return only copies from `get_pil`.
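A minimal sketch of the fix (the cache structure and names here are simplified assumptions, not the actual `context.images` implementation):

```py
from PIL import Image

_pil_cache: dict[str, Image.Image] = {}

def get_pil(image_name: str, mode: str | None = None) -> Image.Image:
    image = _pil_cache[image_name]
    if mode is not None:
        return image.convert(mode)  # convert() already returns a new copy
    return image.copy()             # always hand out a copy so callers can't mutate the cache
```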
| [Installation and Updates][installation docs] - [Documentation and Tutorials][docs home] - [Bug Reports][github issues] - [Contributing][contributing docs] |
</div>
# Installation
## Quick Start
To get started with Invoke, [Download the Installer](https://www.invoke.com/downloads).
1. Download and unzip the installer from the bottom of the [latest release][latest release link].
2. Run the installer script.
   - **Windows**: Double-click on the `install.bat` script.
   - **macOS**: Open a Terminal window, drag the file `install.sh` from Finder into the Terminal, and press enter.
   - **Linux**: Run `install.sh`.
3. When prompted, enter a location for the install and select your GPU type.
4. Once the install finishes, find the directory you selected during install. The default location is `C:\Users\Username\invokeai` for Windows or `~/invokeai` for Linux/macOS.
5. Run the launcher script (`invoke.bat` for Windows, `invoke.sh` for macOS and Linux) the same way you ran the installer script in step 2.
6. Select option 1 to start the application. Once it starts up, open your browser and go to <http://localhost:9090>.
7. Open the model manager tab to install a starter model and then you'll be ready to generate.
For detailed step-by-step instructions, including hardware requirements and manual or docker installations, visit our documentation on [Installation and Updates][installation docs].
## Docker Container
We publish official container images in Github Container Registry: https://github.com/invoke-ai/InvokeAI/pkgs/container/invokeai. Both CUDA and ROCm images are available. Check the above link for relevant tags.
> [!IMPORTANT]
> Ensure that Docker is set up to use the GPU. Refer to [NVIDIA][nvidia docker docs] or [AMD][amd docker docs] documentation.
### Generate!
Run the container, modifying the command as necessary:
```bash
docker run --runtime=nvidia --gpus=all --publish 9090:9090 ghcr.io/invoke-ai/invokeai
```
Then open `http://localhost:9090` and install some models using the Model Manager tab to begin generating.
For ROCm, add `--device /dev/kfd --device /dev/dri` to the `docker run` command.
### Persist your data
You will likely want to persist your workspace outside of the container. Use the `--volume /home/myuser/invokeai:/invokeai` flag to mount some local directory (using its **absolute** path) to the `/invokeai` path inside the container. Your generated images and models will reside there. You can use this directory with other InvokeAI installations, or switch between runtime directories as needed.
### DIY
Build your own image and customize the environment to match your needs using our `docker-compose` stack. See [README.md](./docker/README.md) in the [docker](./docker) directory.
The app is published in two build formats:
- A [PyPI] distribution. This includes both a source distribution and built distribution (a wheel). Most users install it with the [Launcher](https://github.com/invoke-ai/launcher/), others with `pip`. The updater uses this build.
- An installer on the [InvokeAI Releases Page]. This is a zip file with install scripts and a wheel. This is only used for new installs.
The launcher uses GitHub as the source of truth for available releases.
## Broad Strokes
- Merge all changes and bump the version in the codebase.
- Tag the release commit.
- Wait for the release workflow to complete.
- Approve the PyPI publish jobs.
- Write GH release notes.
## General Prep
While the release workflow does not include end-to-end tests, it does pause before publishing so you can download and test the final build.
Make a developer call-out for PRs to merge. Merge and test things out. Bump the version by editing `invokeai/version/invokeai_version.py`.
## Release Workflow
The `release.yml` workflow runs a number of jobs to handle code checks, tests, build and publish on PyPI.
It is triggered on **tag push**, when the tag matches `v*`. It doesn't matter if you've prepped a release branch like `release/v3.5.0` or are releasing from `main` - it works the same.
> Because commits are reference-counted, it is safe to create a release branch, tag it, let the workflow run, then delete the branch. So long as the tag exists, that commit will exist.
### Triggering the Workflow
Ensure all commits that should be in the release are merged, and you have pulled them locally.
Double-check that you have checked out the commit that will represent the release (typically the latest commit on `main`).
Run `make tag-release` to tag the current commit and kick off the workflow. You will be prompted to provide a message - use the version specifier.
If this version's tag already exists for some reason (maybe you had to make a last minute change), the script will overwrite it.
> In case you cannot use the Make target, the release may also be dispatched [manually] via GH.
### Workflow Jobs and Process
The workflow consists of a number of concurrently-run checks and tests, then two final publish jobs.
The publish jobs require manual approval and are only run if the other jobs succeed.
#### `check-version` Job
This job checks that the git ref matches the app version. It matches the ref against the `__version__` variable in `invokeai/version/invokeai_version.py`.
When the workflow is triggered by tag push, the ref is the tag. If the workflow is run manually, the ref is the target selected from the **Use workflow from** dropdown.
This job uses [samuelcolvin/check-python-version].
#### Check and Test Jobs
Next, these jobs run and must pass. They are the same jobs that are run for every PR.
- **`python-tests`**: runs `pytest` on matrix of platforms
- **`python-checks`**: runs `ruff` (format and lint)
- **`frontend-tests`**: runs `vitest`
- **`frontend-checks`**: runs `prettier` (format), `eslint` (lint), `dpdm` (circular refs), `tsc` (static type check) and `knip` (unused imports)
> **TODO** We should add `mypy` or `pyright` to the **`python-checks`** job.
> **TODO** We should add an end-to-end test job that generates an image.
- **`typegen-checks`**: ensures the frontend and backend types are synced
#### `build-installer` Job
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `installer/create_installer.sh` and uploads two artifacts:
- **`dist`**: the python distribution, to be published on PyPI
- **`InvokeAI-installer-${VERSION}.zip`**: the legacy install scripts
You don't need to download either of these files.
> The legacy install scripts are no longer used, but we haven't updated the workflow to skip building them.
#### Sanity Check & Smoke Test
At this point, the release workflow pauses as the remaining publish jobs require approval.
It's possible to test the python package before it gets published to PyPI. We've never had problems with it, so it's not necessary to do this.
But, if you want to be extra-super careful, here's how to test it:
- Download the `dist.zip` build artifact from the `build-installer` job
- Unzip it and find the wheel file
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/) - but instead of installing from PyPI, install from the wheel
- Test the app
##### Something isn't right
If testing reveals any issues, no worries. Cancel the workflow, which will cancel the pending publish jobs (you didn't approve them prematurely, right?).
Now you can start from the top:
- Fix the issues and PR the fixes per usual
- Get the PR approved and merged per usual
- Switch to `main` and pull in the fixes
- Run `make tag-release` to move the tag to `HEAD` (which has the fixes) and kick off the release workflow again
- Re-do the sanity check
#### PyPI Publish Jobs
The publish jobs will not run if any of the previous jobs fail.
They use [GitHub environments], which are configured as [trusted publishers] on PyPI.
Both jobs require @hipsterusername or @psychedelicious to approve them from the workflow's **Summary** tab.
- Click the **Review deployments** button
- Select the environment (either `testpypi` or `pypi` - typically you select both)
- Click **Approve and deploy**
> **If the version already exists on PyPI, the publish jobs will fail.** PyPI only allows a given version to be published once - you cannot change it. If version published on PyPI has a problem, you'll need to "fail forward" by bumping the app version and publishing a followup release.
If there are no incidents, contact @hipsterusername or @lstein, who have owner access.
Publishes the distribution on the [Test PyPI] index, using the `testpypi` GitHub environment.
This job is not required for the production PyPI publish, but included just in case you want to test the PyPI release for some reason:
- Approve this publish job without approving the prod publish
- Let it finish
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/), making sure to use the Test PyPI index URL: `https://test.pypi.org/simple/`
- Test the app
#### `publish-pypi` Job
Publishes the distribution on the production PyPI index, using the `pypi` GitHub environment.
It's a good idea to wait to approve and run this job until you have the release notes ready!
Once the release is published to PyPI, it's time to publish the GitHub release.
## Prep and publish the GitHub Release
1. [Draft a new release] on GitHub, choosing the tag that triggered the release.
2. The **Generate release notes** button automatically inserts the changelog and new contributors. Make sure to select the correct tags for this release and the last stable release. GH often selects the wrong tags - do this manually.
3. Write the release notes, describing important changes. Contributions from community members should be shouted out. Use the GH-generated changelog to see all contributors. If there are Weblate translation updates, open that PR and shout out every person who contributed a translation.
4. Check **Set as a pre-release** if it's a pre-release.
5. Approve and wait for the `publish-pypi` job to finish if you haven't already.
6. Publish the GH release.
7. Post the release in Discord in the [releases](https://discord.com/channels/1020123559063990373/1149260708098359327) channel with abbreviated notes. For example:
> It's a pretty big one - Form Builder, Metadata Nodes (thanks @SkunkWorxDark!), and much more.
8. Right click the message in releases and copy the link to it. Then, post that link in the [new-release-discussion](https://discord.com/channels/1020123559063990373/1149506274971631688) channel. For example:
## Manual Build
The `build installer` workflow can be dispatched manually. This is useful to test the installer for a given branch or tag.
No checks are run, it just builds.
It has two sections - one for internal use and one for user settings:
```yaml
# Internal metadata - do not edit:
schema_version: 4.0.2
# Put user settings here - see https://invoke-ai.github.io/InvokeAI/features/CONFIGURATION/:
host: 0.0.0.0 # serve the app on your local network
A subset of settings may be specified using CLI args:
- `--root`: specify the root directory
- `--config`: override the default `invokeai.yaml` file location
### Low-VRAM Mode
See the [Low-VRAM mode docs][low-vram] for details on enabling this feature.
### All Settings
Following the table are additional explanations for certain settings.
@@ -114,6 +118,10 @@ remote_api_tokens:
The provided token will be added as a `Bearer` token to the network requests to download the model files. As far as we know, this works for all model marketplaces that require authorization.
!!! tip "HuggingFace Models"
If you get an error when installing a HF model using a URL instead of repo id, you may need to [set up a HF API token](https://huggingface.co/settings/tokens) and add an entry for it under `remote_api_tokens`. Use `huggingface.co` for `url_regex`.
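For illustration, a HuggingFace entry could look like this - the token value is a placeholder and the key names follow the `remote_api_tokens` schema described above:

```yaml
remote_api_tokens:
  # Requests to URLs matching this regex get the token attached as a Bearer token.
  - url_regex: huggingface.co
    token: my_secret_hf_token
```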
#### Model Hashing
Models are hashed during installation, providing a stable identifier for models across all platforms. Hashing is a one-time operation.
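If you want to control how hashing is done, there is a config setting for the algorithm; the setting name and value below are assumptions based on the configuration docs, so verify them before relying on this:

```yaml
# Assumed setting name and value - see the configuration docs for the supported algorithms.
hashing_algorithm: blake3_single
```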
@@ -181,3 +189,4 @@ The `log_format` option provides several alternative formats:
[basic guide to yaml files]: https://circleci.com/blog/what-is-yaml-a-beginner-s-guide/
[Model Marketplace API Keys]: #model-marketplace-api-keys
@@ -50,7 +50,7 @@ Applications are built on top of the invoke framework. They should construct `in
### Web UI
The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.tiangolo.com/) and [Socket.IO](https://socket.io/). The frontend code is found in `/invokeai/frontend` and the backend code is found in `/invokeai/app/api_app.py` and `/invokeai/app/api/`. The code is further organized as such:
| Component | Description |
| --- | --- |
@@ -62,7 +62,7 @@ The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.t
### CLI
The CLI is built automatically from invocation metadata, and also supports invocation piping and auto-linking. Code is available in `/invokeai/frontend/cli`.
## Invoke
@@ -70,7 +70,7 @@ The Invoke framework provides the interface to the underlying AI systems and is
### Invoker
The invoker (`/invokeai/app/services/invoker.py`) is the primary interface through which applications interact with the framework. Its primary purpose is to create, manage, and invoke sessions. It also maintains two sets of services:
- **invocation services**, which are used by invocations to interact with core functionality.
- **invoker services**, which are used by the invoker to manage sessions and manage the invocation queue.
@@ -82,12 +82,12 @@ The session graph does not support looping. This is left as an application probl
### Invocations
Invocations represent individual units of execution, with inputs and outputs. All invocations are located in `/invokeai/app/invocations`, and are all automatically discovered and made available in the applications. These are the primary way to expose new functionality in Invoke.AI, and the [implementation guide](INVOCATIONS.md) explains how to add new invocations.
### Services
Services provide invocations access to AI Core functionality and other necessary functionality (e.g. image storage). These are available in `/invokeai/app/services`. As a general rule, new services should provide an interface as an abstract base class, and may provide a lightweight local implementation by default in their module. The goal for all services should be to enable the usage of different implementations (e.g. using cloud storage for image storage), but should not load any module dependencies unless that implementation has been used (i.e. don't import anything that won't be used, especially if it's expensive to import).
## AI Core
The AI Core is represented by the rest of the code base (i.e. the code outside of `/invokeai/app/`).
We use `pytest` to run the backend python tests. (See [pyproject.toml](https://github.com/invoke-ai/InvokeAI/blob/main/pyproject.toml) for the default `pytest` options.)
## Fast vs. Slow
All tests are categorized as either 'fast' (no test annotation) or 'slow' (annotated with the `@pytest.mark.slow` decorator).
@@ -33,7 +33,7 @@ pytest tests -m ""
## Test Organization
All backend tests are in the [`tests/`](https://github.com/invoke-ai/InvokeAI/tree/main/tests) directory. This directory mirrors the organization of the `invokeai/` directory. For example, tests for `invokeai/model_management/model_manager.py` would be found in `tests/model_management/test_model_manager.py`.
TODO: The above statement is aspirational. A re-organization of legacy tests is required to make it true.
If you are looking to help with a code contribution, InvokeAI uses several different technologies under the hood: Python (Pydantic, FastAPI, diffusers) and Typescript (React, Redux Toolkit, ChakraUI, Mantine, Konva). Familiarity with StableDiffusion and image generation concepts is helpful, but not essential.
## **Get Started**
@@ -12,7 +12,7 @@ To get started, take a look at our [new contributors checklist](newContributorCh
Once you're setup, for more information, you can review the documentation specific to your area of interest:
@@ -20,15 +20,15 @@ Once you're setup, for more information, you can review the documentation specif
If you don't feel ready to make a code contribution yet, no problem! You can also help out in other ways, such as [documentation](documentation.md), [translation](translation.md) or helping support other users and triage issues as they're reported in GitHub.
There are two paths to making a development contribution:
1. Choosing an open issue to address. Open issues can be found in the [Issues](https://github.com/invoke-ai/InvokeAI/issues?q=is%3Aissue+is%3Aopen) section of the InvokeAI repository. These are tagged by the issue type (bug, enhancement, etc.) along with the “good first issues” tag denoting if they are suitable for first time contributors.
1. Additional items can be found on our [roadmap](https://github.com/orgs/invoke-ai/projects/7). The roadmap is organized in terms of priority, and contains features of varying size and complexity. If there is an inflight item you’d like to help with, reach out to the contributor assigned to the item to see how you can help.
2. Opening a new issue or feature to add. **Please make sure you have searched through existing issues before creating new ones.**
*Regardless of what you choose, please post in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord before you start development in order to confirm that the issue or feature is aligned with the current direction of the project. We value our contributors time and effort and want to ensure that no one’s time is being misspent.*
## Best Practices:
* Keep your pull requests small. Smaller pull requests are more likely to be accepted and merged
* Comments! Commenting your code helps reviewers easily understand your contribution
* Use Python and Typescript’s typing systems, and consider using an editor with [LSP](https://microsoft.github.io/language-server-protocol/) support to streamline development
@@ -38,7 +38,7 @@ There are two paths to making a development contribution:
If you need help, you can ask questions in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord.
For frontend related work, **@psychedelicious** is the best person to reach out to.
For backend related work, please reach out to **@blessedcoolant**, **@lstein**, **@StAlKeR7779** or **@psychedelicious**.
@@ -5,7 +5,7 @@ If you're a new contributor to InvokeAI or Open Source Projects, this is the gui
## New Contributor Checklist
- [x] Set up your local development environment & fork of InvokeAI by following [the steps outlined here](../dev-environment.md)
- [x] Set up your local tooling with [this guide](../LOCAL_DEVELOPMENT.md). Feel free to skip this step if you already have tooling you're comfortable with.
- [x] Familiarize yourself with [Git](https://www.atlassian.com/git) & our project structure by reading through the [development documentation](development.md)
- [x] Join the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord
- [x] Choose an issue to work on! This can be achieved by asking in the #dev-chat channel, tackling a [good first issue](https://github.com/invoke-ai/InvokeAI/contribute) or finding an item on the [roadmap](https://github.com/orgs/invoke-ai/projects/7). If nothing in any of those places catches your eye, feel free to work on something of interest to you!
@@ -22,15 +22,15 @@ Before starting these steps, ensure you have your local environment [configured
2. Fork the [InvokeAI](https://github.com/invoke-ai/InvokeAI) repository to your GitHub profile. This means that you will have a copy of the repository under **your-GitHub-username/InvokeAI**.
3. Clone the repository to your local machine using:
If you're unfamiliar with using Git through the command line, [GitHub Desktop](https://desktop.github.com) is an easy-to-use alternative with a UI. You can do all the same steps listed here, but through the interface.

4. Create a new branch for your fix using:
```bash
git checkout -b branch-name-here
```
5. Make the appropriate changes for the issue you are trying to address or the feature that you want to add.
6. Add the file contents of the changed files to the "snapshot" git uses to manage the state of the project, also known as the index:
To make changes to Invoke's backend, frontend, or documentation, you'll need to set up a dev environment.
If you only want to make changes to the docs site, you can skip the frontend dev environment setup as described in the below guide.
!!! info "Why do I need the frontend toolchain?"
The repo doesn't contain a build of the frontend. You'll be responsible for rebuilding it every time you pull in new changes, or run it in dev mode (which incurs a substantial performance penalty).
If you just want to use Invoke, you should use the [launcher][launcher link].
!!! warning
@@ -17,81 +15,66 @@ If you just want to use Invoke, you should use the [installer][installer link].
## Setup
1. Run through the [requirements][requirements link].
1. [Fork and clone][forking link] the [InvokeAI repo][repo link].
1. Create a directory for user data (images, models, db, etc). This is typically at `~/invokeai`, but if you already have a non-dev install, you may want to create a separate directory for the dev install.
1. Create a python virtual environment inside the directory you just created:
```sh
python3 -m venv .venv --prompt InvokeAI-Dev
```
1. Activate the venv (you'll need to do this every time you want to run the app):
```sh
source .venv/bin/activate
```
4. Follow the [manual install][manual install link] guide, with some modifications to the install command:
1. Install the repo as an [editable install][editable install link]:
- Use `.` instead of `invokeai` to install from the current directory. You don't need to specify the version.
- Add `-e` after the `install` operation to make this an [editable install][editable install link]. That means your changes to the python code will be reflected when you restart the Invoke server.
Refer to the [manual installation][manual install link] instructions when determining the correct install options. `xformers` is optional, but `dev` and `test` are not.
- When installing the `invokeai` package, add the `dev`, `test` and `docs` package options to the package specifier. You may or may not need the `xformers` option - follow the manual install guide to figure that out. So, your package specifier will be either `".[dev,test,docs]"` or `".[dev,test,docs,xformers]"`. Note the quotes!
1. Install the frontend dev toolchain:
With the modifications made, the install command should look something like this:
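Putting those modifications together, the command would look roughly like this - a sketch assuming the `dev`, `test` and `docs` extras, with `xformers` added only if the manual install guide says you need it:

```sh
pip install -e ".[dev,test,docs]" --use-pep517
```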
5. At this point, you should have Invoke installed, a venv set up and activated, and the server running. But you will see a warning in the terminal that no UI was found. If you go to the URL for the server, you won't get a UI.
```sh
pnpm build
```
This is because the UI build is not distributed with the source code. You need to build it manually. End the running server instance.
1. Start the application:
If you only want to edit the docs, you can stop here and skip to the **Documentation** section below.
```sh
python scripts/invokeai-web.py
```
6. Install the frontend dev toolchain:
1. Access the UI at `localhost:9090`.
- [`nodejs`](https://nodejs.org/) (v20+)
- [`pnpm`](https://pnpm.io/8.x/installation) (must be v8 - not v9!)
7. Do a production build of the frontend:
```sh
cd <PATH_TO_INVOKEAI_REPO>/invokeai/frontend/web
pnpm i
pnpm build
```
8. Restart the server and navigate to the URL. You should get a UI. After making changes to the python code, restart the server to see those changes.
## Updating the UI
You'll need to run `pnpm build` every time you pull in new changes.
Another option is to skip the build and instead run the UI in dev mode:
```sh
pnpm dev
```
This starts a vite dev server for the UI at `127.0.0.1:5173`, which you will use instead of `127.0.0.1:9090`.
The dev mode is substantially slower than the production build but may be more convenient if you just need to test things out. It will hot-reload the UI as you make changes to the frontend code. Sometimes the hot-reload doesn't work, and you need to manually refresh the browser tab.
## Documentation
The documentation is built with `mkdocs`. To preview it locally, you need an additional set of packages installed.
```sh
# after activating the venv
pip install -e ".[docs]"
```
Then, you can start a live docs dev server, which will auto-refresh when you edit the docs:
```sh
mkdocs serve
```
On macOS and Linux, there is a `make` target for this:
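The target name below is an assumption - check the repository's Makefile for the exact name:

```sh
# Assumed make target wrapping `mkdocs serve`:
make docs
```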
@@ -34,11 +34,11 @@ Please reach out to @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy)
## Contributors
This project is a combined effort of dedicated people from across the world. [Check out the list of all these amazing people](contributors.md). We thank them for their time, hard work and effort.
## Code of Conduct
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](../CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
By making a contribution to this project, you certify that:
Many issues can be resolved by re-installing the application. You won't lose any data by re-installing. We suggest downloading the [latest release](https://github.com/invoke-ai/InvokeAI/releases/latest) and using it to re-install the application. Consult the [installer guide](./installation/installer.md) for more information.
When you run the installer, you'll have an option to select the version to install. If you aren't ready to upgrade, you can choose the current version to fix a broken install.
If the troubleshooting steps on this page don't get you up and running, please either [create an issue] or hop on [discord] for help.
## How to Install
You can download the latest installers [here](https://github.com/invoke-ai/InvokeAI/releases).
Note that any releases marked as _pre-release_ are in a beta state. You may experience some issues, but we appreciate your help testing those! For stable/reliable installations, please install the [latest release].
Follow the [Quick Start guide](./installation/quick_start.md) to install Invoke.
## Downloading models and using existing models
The Model Manager tab in the UI provides a few ways to install models, including using your already-downloaded models. You'll see a popup directing you there on first startup. For more information, see the [model install docs].
## Missing models after updating from v3
If you find some models are missing after updating from v3, it's likely they weren't correctly registered before the update and didn't get picked up in the migration.
You can use the `Scan Folder` tab in the Model Manager UI to fix this. The models will either be in the old, now-unused `autoimport` folder, or your `models` folder.
@@ -37,115 +29,27 @@ Follow the same steps to scan and import the missing models.
## Slow generation
- Check the [system requirements] to ensure that your system is capable of generating images.
- Check the `ram` setting in `invokeai.yaml`. This setting tells Invoke how much of your system RAM can be used to cache models. Having this too high or too low can slow things down. That said, it's generally safest to not set this at all and instead let Invoke manage it.
- Check the `vram` setting in `invokeai.yaml`. This setting tells Invoke how much of your GPU VRAM can be used to cache models. Counter-intuitively, if this setting is too high, Invoke will need to do a lot of shuffling of models as it juggles the VRAM cache and the currently-loaded model. The default value of 0.25 generally works well for GPUs with less than 16GB of VRAM. Even on a 24GB card, the default works well.
- Check that your generations are happening on your GPU (if you have one). InvokeAI will log what is being used for generation upon startup. If your GPU isn't used, re-install to ensure the correct versions of torch get installed.
- If you are on Windows, you may have exceeded your GPU's VRAM capacity and are using slower [shared GPU memory](#shared-gpu-memory-windows). There's a guide to opt out of this behaviour in the linked FAQ entry.
## Shared GPU Memory (Windows)
!!! tip "Nvidia GPUs with driver 536.40"
This only applies to current Nvidia cards with driver 536.40 or later, released in June 2023.
When the GPU doesn't have enough VRAM for a task, Windows is able to allocate some of its CPU RAM to the GPU. This is much slower than VRAM, but it does allow the system to generate when it otherwise might not have enough VRAM.
When shared GPU memory is used, generation slows down dramatically - but at least it doesn't crash.
If you'd like to opt out of this behavior and instead get an error when you exceed your GPU's VRAM, follow [this guide from Nvidia](https://nvidia.custhelp.com/app/answers/detail/a_id/5490).
Here's how to get the python path required in the linked guide:
- Run `invoke.bat`.
- Select option 2 for developer console.
- At least one python path will be printed. Copy the path that includes your invoke installation directory (typically the first), or use the one-liner shown after this list.
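If the printed paths are ambiguous, you can also print the active interpreter's path directly from the developer console with a generic Python one-liner (this is standard Python, not an Invoke-specific command):

```sh
python -c "import sys; print(sys.executable)"
```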
## Installer cannot find python (Windows)
Ensure that you checked **Add python.exe to PATH** when installing Python. This can be found at the bottom of the Python Installer window. If you already have Python installed, you can re-run the python installer, choose the Modify option and check the box.
- Follow the [Low-VRAM mode guide](./features/low-vram.md) to optimize performance.
- Check that your generations are happening on your GPU (if you have one). Invoke will log what is being used for generation upon startup. If your GPU isn't used, re-install and ensure you select the appropriate GPU option.
- If you are on Windows with an Nvidia GPU, you may have exceeded your GPU's VRAM capacity and are triggering Nvidia's "sysmem fallback". There's a guide to opt out of this behaviour in the [Low-VRAM mode guide](./features/low-vram.md).
## Triton error on startup
This can be safely ignored. Invoke doesn't use Triton, but if you are on Linux and wish to dismiss the error, you can install Triton.
## Updated to 3.4.0 and xformers can’t load C++/CUDA

An issue occurred with your PyTorch update. Follow these steps to fix:

1. Launch your invoke.bat / invoke.sh and select the option to open the developer console
- If you run into an error with `typing_extensions`, re-open the developer console and run: `pip install -U typing-extensions`

Note that v3.4.0 is an old, unsupported version. Please upgrade to the [latest release].

## Install failed and says `pip` is out of date

An out of date `pip` typically won't cause an installation to fail. The cause of the error can likely be found above the message that says `pip` is out of date.

If you saw that warning but the install went well, don't worry about it (but you can update `pip` afterwards if you'd like).

## Unable to Copy on Firefox

Firefox does not allow Invoke to directly access the clipboard by default. As a result, you may be unable to use certain copy functions. You can fix this by configuring Firefox to allow access to write to the clipboard:

- Go to `about:config` and click the Accept button
- Search for `dom.events.asyncClipboard.clipboardItem`
- Set it to `true` by clicking the toggle button
- Restart Firefox
## Replicate image found online
Most example images with prompts that you'll find on the internet have been generated using different software, so you can't expect to get identical results. In order to reproduce an image, you need to replicate the exact settings and processing steps, including (but not limited to) the model, the positive and negative prompts, the seed, the sampler, the exact image size, any upscaling steps, etc.
## OSErrors on Windows while installing dependencies
During a zip file installation or an update, installation stops with an error like this:
- Paste your access token when prompted and press Enter. You won't see anything when you paste it.
- Type `n` if prompted about git credentials.
If you get an error, try the command again - maybe the token didn't paste correctly.
Once your token is set, start Invoke and try downloading the model again. The installer will automatically use the access token.
If the install still fails, you may not have access to the model.
## Stable Diffusion XL generation fails after trying to load UNet
InvokeAI is working in other respects, but when trying to generate
images with Stable Diffusion XL you get a "Server Error". The text log
in the launch window contains this log line above several more lines of
error messages:
`INFO --> Loading model:D:\LONG\PATH\TO\MODEL, type sdxl:main:unet`
This failure mode occurs when there is a network glitch during
downloading the very large SDXL model.
To address this, first go to the Model Manager and delete the
Stable-Diffusion-XL-base-1.X model. Then, click the HuggingFace tab,
paste the Repo ID stabilityai/stable-diffusion-xl-base-1.0 and install
the model.
## Package dependency conflicts during installation or update
If you have previously installed InvokeAI or another Stable Diffusion
package, the installer may occasionally pick up outdated libraries and
either the installer or `invoke` will fail with complaints about
library conflicts.
To resolve this, re-install the application as described above.
## Invalid configuration file
Everything seems to install ok, you get a `ValidationError` when starting up the app.
@@ -154,64 +58,9 @@ This is caused by an invalid setting in the `invokeai.yaml` configuration file.
Check the [configuration docs] for more detail about the settings and how to specify them.
## `ModuleNotFoundError: No module named 'controlnet_aux'`
## Out of Memory Errors
`controlnet_aux` is a dependency of Invoke and appears to have been packaged or distributed strangely. Sometimes, it doesn't install correctly. This is outside our control.
If you encounter this error, the solution is to remove the package from the `pip` cache and re-run the Invoke installer so a fresh, working version of `controlnet_aux` can be downloaded and installed:
- Run the Invoke launcher
- Choose the developer console option
- Run this command: `pip cache remove controlnet_aux`
- Close the terminal window
- Download and run the [installer][latest release], selecting your current install location
## Out of Memory Issues
The models are large, VRAM is expensive, and you may find yourself
faced with Out of Memory errors when generating images. Here are some
tips to reduce the problem:
!!! info "Optimizing for GPU VRAM"
=== "4GB VRAM GPU"
This should be adequate for 512x512 pixel images using Stable Diffusion 1.5
and derived models, provided that you do not use the NSFW checker. It won't be loaded unless you go into the UI settings and turn it on.
If you are on a CUDA-enabled GPU, we will automatically use xformers or torch-sdp to reduce VRAM requirements, though you can explicitly configure this. See the [configuration docs].
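If you do want to configure the attention implementation explicitly, there is a setting for it in `invokeai.yaml`; the setting name and value here are assumptions based on the configuration docs, so double-check them there:

```yaml
# Assumed setting - check the configuration docs for the supported values.
attention_type: torch-sdp
```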
=== "6GB VRAM GPU"
This is a border case. Using the SD 1.5 series you should be able to
generate images up to 640x640 with the NSFW checker enabled, and up to
1024x1024 with it disabled.
If you run into persistent memory issues there are a series of
environment variables that you can set before launching InvokeAI that
alter how the PyTorch machine learning library manages memory. See
<https://pytorch.org/docs/stable/notes/cuda.html#memory-management> for
a list of these tweaks.
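For example, one commonly documented tweak is the allocator's `max_split_size_mb` option, set via an environment variable before launching (the value here is illustrative):

```sh
export PYTORCH_CUDA_ALLOC_CONF="max_split_size_mb:128"
```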
=== "12GB VRAM GPU"
This should be sufficient to generate larger images up to about 1280x1280.
## Checkpoint Models Load Slowly or Use Too Much RAM
The difference between diffusers models (a folder containing multiple
subfolders) and checkpoint models (a file ending with .safetensors or
.ckpt) is that InvokeAI is able to load diffusers models into memory
incrementally, while checkpoint models must be loaded all at
once. With very large models, or systems with limited RAM, you may
experience slowdowns and other memory-related issues when loading
checkpoint models.
To solve this, go to the Model Manager tab (the cube), select the
checkpoint model that's giving you trouble, and press the "Convert"
button in the upper right of your browser window. This will convert the
checkpoint into a diffusers model, after which loading should be
faster and less memory-intensive.
The models are large, VRAM is expensive, and you may find yourself faced with Out of Memory errors when generating images. Follow our [Low-VRAM mode guide](./features/low-vram.md) to configure Invoke to prevent these.
## Memory Leak (Linux)
@@ -253,8 +102,6 @@ Note the differences between memory allocated as chunks in an arena vs. memory a
As of v5.6.0, Invoke has a low-VRAM mode. It works on systems with dedicated GPUs (Nvidia GPUs on Windows/Linux and AMD GPUs on Linux).
This allows you to generate even if your GPU doesn't have enough VRAM to hold full models. Most users should be able to run even the beefiest models - like the ~24GB unquantised FLUX dev model.
## Enabling Low-VRAM mode
To enable Low-VRAM mode, add this line to your `invokeai.yaml` configuration file, then restart Invoke:
```yaml
enable_partial_loading: true
```
**Windows users should also [disable the Nvidia sysmem fallback](#disabling-nvidia-sysmem-fallback-windows-only)**.
It is possible to fine-tune the settings for best performance or if you still get out-of-memory errors (OOMs).
!!! tip "How to find `invokeai.yaml`"
The `invokeai.yaml` configuration file lives in your install directory. To access it, run the **Invoke Community Edition** launcher and click the install location. This will open your install directory in a file explorer window.
You'll see `invokeai.yaml` there and can edit it with any text editor. After making changes, restart Invoke.
If you don't see `invokeai.yaml`, launch Invoke once. It will create the file on its first startup.
## Details and fine-tuning
Low-VRAM mode involves 4 features, each of which can be configured or fine-tuned:
- Partial model loading (`enable_partial_loading`)
- PyTorch CUDA allocator config (`pytorch_cuda_alloc_conf`)
- Dynamic RAM and VRAM cache sizes (`max_cache_ram_gb`, `max_cache_vram_gb`)
- Working memory (`device_working_mem_gb`)
- Keeping a RAM weight copy (`keep_ram_copy_of_weights`)
Read on to learn about these features and understand how to fine-tune them for your system and use-cases.
### Partial model loading
Invoke's partial model loading works by streaming model "layers" between RAM and VRAM as they are needed.
When an operation needs layers that are not in VRAM, but there isn't enough room to load them, inactive layers are offloaded to RAM to make room.
#### Enabling partial model loading
As described above, you can enable partial model loading by adding this line to `invokeai.yaml`:
```yaml
enable_partial_loading: true
```
### PyTorch CUDA allocator config
The PyTorch CUDA allocator's behavior can be configured using the `pytorch_cuda_alloc_conf` config. Tuning the allocator configuration can help to reduce the peak reserved VRAM. The optimal configuration is dependent on many factors (e.g. device type, VRAM, CUDA driver version, etc.), but switching from PyTorch's native allocator to using CUDA's built-in allocator works well on many systems. To try this, add the following line to your `invokeai.yaml` file:
```yaml
pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"
```
A more complete explanation of the available configuration options is [here](https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf).
### Dynamic RAM and VRAM cache sizes
Loading models from disk is slow and can be a major bottleneck for performance. Invoke uses two model caches - RAM and VRAM - to reduce loading from disk to a minimum.
By default, Invoke manages these caches' sizes dynamically for best performance.
#### Fine-tuning cache sizes
Prior to v5.6.0, the cache sizes were static, and for best performance, many users needed to manually fine-tune the `ram` and `vram` settings in `invokeai.yaml`.
As of v5.6.0, the caches are dynamically sized. The `ram` and `vram` settings are no longer used, and new settings are added to configure the cache.
**Most users will not need to fine-tune the cache sizes.**
But, if your GPU has enough VRAM to hold models fully, you might get a perf boost by manually setting the cache sizes in `invokeai.yaml`:
```yaml
# The default max cache RAM size is logged on InvokeAI startup. It is determined based on your system RAM / VRAM.
# You can override the default value by setting `max_cache_ram_gb`.
# Increasing `max_cache_ram_gb` will increase the amount of RAM used to cache inactive models, resulting in faster model
# reloads for the cached models.
# As an example, if your system has 32GB of RAM and no other heavy processes, setting the `max_cache_ram_gb` to 28GB
# might be a good value to achieve aggressive model caching.
max_cache_ram_gb: 28
# The default max cache VRAM size is adjusted dynamically based on the amount of available VRAM (taking into
# consideration the VRAM used by other processes).
# You can override the default value by setting `max_cache_vram_gb`.
# CAUTION: Most users should not manually set this value. See warning below.
max_cache_vram_gb: 16
```
!!! warning "Max safe value for `max_cache_vram_gb`"
Most users should not manually configure the `max_cache_vram_gb`. This configuration value takes precedence over the `device_working_mem_gb` and any operations that explicitly reserve additional working memory (e.g. VAE decode). As such, manually configuring it increases the likelihood of encountering out-of-memory errors.
For users who wish to configure `max_cache_vram_gb`, the max safe value can be determined by subtracting `device_working_mem_gb` from your GPU's VRAM. As described below, the default for `device_working_mem_gb` is 3GB.
For example, if you have a 12GB GPU, the max safe value for `max_cache_vram_gb` is `12GB - 3GB = 9GB`.
If you had increased `device_working_mem_gb` to 4GB, then the max safe value for `max_cache_vram_gb` is `12GB - 4GB = 8GB`.
Most users who override `max_cache_vram_gb` are doing so because they wish to use significantly less VRAM, and should be setting `max_cache_vram_gb` to a value significantly less than the 'max safe value'.
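Putting those numbers into `invokeai.yaml`, the 12GB example with the default working memory would look like this (values are illustrative, not recommendations):

```yaml
# 12GB GPU: 12GB total - 3GB working memory = 9GB max safe VRAM cache.
device_working_mem_gb: 3
max_cache_vram_gb: 9
```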
### Working memory
Invoke cannot use _all_ of your VRAM for model caching and loading. It requires some VRAM to use as working memory for various operations.
Invoke reserves 3GB VRAM as working memory by default, which is enough for most use-cases. However, it is possible to fine-tune this setting if you still get OOMs.
#### Fine-tuning working memory
You can increase the working memory size in `invokeai.yaml` to prevent OOMs:
```yaml
# The default is 3GB - bump it up to 4GB to prevent OOMs.
device_working_mem_gb: 4
```
!!! tip "Operations may request more working memory"
For some operations, we can determine VRAM requirements in advance and allocate additional working memory to prevent OOMs.
VAE decoding is one such operation. This operation converts the generation process's output into an image. For large image outputs, this might use more than the default working memory size of 3GB.
During this decoding step, Invoke calculates how much VRAM will be required to decode and requests that much VRAM from the model manager. If the amount exceeds the working memory size, the model manager will offload cached model layers from VRAM until there's enough VRAM to decode.
Once decoding completes, the model manager "reclaims" the extra VRAM allocated as working memory for future model loading operations.
### Keeping a RAM weight copy
Invoke has the option of keeping a RAM copy of all model weights, even when they are loaded onto the GPU. This optimization is _on_ by default, and enables faster model switching and LoRA patching. Disabling this feature will reduce the average RAM load while running Invoke (peak RAM likely won't change), at the cost of slower model switching and LoRA patching. If you have limited RAM, you can disable this optimization:
```yaml
# Set to false to reduce the average RAM usage at the cost of slower model switching and LoRA patching.
keep_ram_copy_of_weights: false
```

## Disabling Nvidia sysmem fallback (Windows only)
On Windows, Nvidia GPUs are able to use system RAM when their VRAM fills up via **sysmem fallback**. While it sounds like a good idea on the surface, in practice it causes massive slowdowns during generation.
It is strongly suggested to disable this feature:
- Open the **NVIDIA Control Panel** app.
- Expand **3D Settings** on the left panel.
- Click **Manage 3D Settings** in the left panel.
- Find **CUDA - Sysmem Fallback Policy** in the right panel and set it to **Prefer No Sysmem Fallback**.
If the sysmem fallback feature sounds familiar, that's because Invoke's partial model loading strategy is conceptually very similar - use VRAM when there's room, else fall back to RAM.
Unfortunately, the Nvidia implementation is not optimized for applications like Invoke and does more harm than good.
## Troubleshooting
### Windows page file
Invoke has high virtual memory (a.k.a. 'committed memory') requirements. This can cause issues on Windows if the page file size limits are hit. (See this issue for the technical details on why this happens: https://github.com/invoke-ai/InvokeAI/issues/7563).
If you run out of page file space, InvokeAI may crash. Often, these crashes will happen with one of the following errors:
- InvokeAI exits with Windows error code `3221225477`
- InvokeAI crashes without an error, but `eventvwr.msc` reveals an error with code `0xc0000005` (the hex equivalent of `3221225477`)
If you are running out of page file space, try the following solutions:
- Make sure that you have sufficient disk space for the page file to grow. Watch your disk usage as Invoke runs. If it climbs near 100% leading up to the crash, then this is very likely the source of the issue. Clear out some disk space to resolve the issue.
- Make sure that your page file is set to "System managed size" (this is the default) rather than a custom size. Under the "System managed size" policy, the page file will grow dynamically as needed.
We recommend using the Invoke Launcher to install and update Invoke. It's a desktop application for Windows, macOS and Linux. It takes care of a lot of nitty gritty details for you.
Follow the [quick start guide](./quick_start.md) to get started.
If you want to use Invoke locally, you should probably use the [launcher](./quick_start.md).
If you want to contribute to Invoke or run the app on the latest dev branch, instead follow the [dev environment](../contributing/dev-environment.md) guide.
InvokeAI is distributed as a python package on PyPI, installable with `pip`. There are a few things that are handled by the launcher that you'll need to manage manually, described in this guide.
## Requirements
@@ -16,43 +16,39 @@ Before you start, go through the [installation requirements](./requirements.md).
## Walkthrough
1. Create a directory to contain your InvokeAI library, configuration files, and models. This is known as the "runtime" or "root" directory, and typically lives in your home directory under the name `invokeai`.
We'll use [`uv`](https://github.com/astral-sh/uv) to install python and create a virtual environment, then install the `invokeai` package. `uv` is a modern, very fast alternative to `pip`.
The following commands vary depending on the version of Invoke being installed and the system onto which it is being installed.
1. Install `uv` as described in its [docs](https://docs.astral.sh/uv/getting-started/installation/#standalone-installer). We suggest using the standalone installer method.
Run `uv --version` to confirm that `uv` is installed and working. After installation, you may need to restart your terminal to get access to `uv`.
2. Create a directory for your installation, typically in your home directory (e.g. `~/invokeai` or `$Home/invokeai`):
=== "Linux/macOS"
```bash
mkdir ~/invokeai
cd ~/invokeai
```
=== "Windows (PowerShell)"
```bash
mkdir $Home/invokeai
```
1. Enter the root directory and create a virtual Python environment within it named `.venv`.
!!! warning "Virtual Environment Location"
While you may create the virtual environment anywhere in the file system, we recommend that you create it within the root directory as shown here. This allows the application to automatically detect its data directories.
If you choose a different location for the venv, then you _must_ set the `INVOKEAI_ROOT` environment variable or specify the root directory using the `--root` CLI arg.
=== "Linux/macOS"
```bash
cd ~/invokeai
python3 -m venv .venv --prompt InvokeAI
```
=== "Windows (PowerShell)"
```bash
cd $Home/invokeai
python3 -m venv .venv --prompt InvokeAI
```
1. Activate the new environment:
3. Create a virtual environment in that directory:
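The command itself is not shown here; a plausible form using `uv` is below - the flags are assumptions, so check the `uv` docs for the exact options you want:

```sh
uv venv --relocatable --python 3.11 --python-preference only-managed .venv
```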
This command creates a portable virtual environment at `.venv` complete with a portable python 3.11. It doesn't matter if your system has no python installed, or has a different version - `uv` will handle everything.
4. Activate the virtual environment:
=== "Linux/macOS"
@@ -60,41 +56,48 @@ Before you start, go through the [installation requirements](./requirements.md).
source .venv/bin/activate
```
=== "Windows"
=== "Windows (PowerShell)"
```ps
.venv\Scripts\activate
```
!!! info "Permissions Error (Windows)"
5. Choose a version to install. Review the [GitHub releases page](https://github.com/invoke-ai/InvokeAI/releases).
If you get a permissions error at this point, run this command and try again.
6. Determine the package specifier to use when installing. This is a performance optimization.
1. Install the InvokeAI Package. The base command is `pip install InvokeAI --use-pep517`, but you may need to change this depending on your system and the desired features.
If you determined you needed to use a `PyPI` index URL in the previous step, you'll need to add `--index=<INDEX_URL>` like this:
- You may need to provide an [extra index URL](https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-extra-index-url). Select your platform configuration using [this tool on the PyTorch website](https://pytorch.org/get-started/locally/). Copy the `--extra-index-url` string from this and append it to your install command.
- If you have a CUDA GPU and want to install with `xformers`, you need to add an option to the package name. Note that `xformers` is not strictly necessary. PyTorch includes an implementation of the SDP attention algorithm with similar performance for most GPUs.
```bash
pip install "InvokeAI[xformers]" --use-pep517
```
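As an illustration of adding the index URL mentioned above, the command might look like this - the package spec and URL are placeholders, so use whatever the previous steps gave you:

```sh
uv pip install invokeai --index=https://download.pytorch.org/whl/cu124
```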
1. Deactivate and reactivate your venv so that the invokeai-specific commands become available in the environment:
9. Deactivate and reactivate your venv so that the invokeai-specific commands become available in the environment:
=== "Linux/macOS"
@@ -102,17 +105,31 @@ Before you start, go through the [installation requirements](./requirements.md).
deactivate && source .venv/bin/activate
```
=== "Windows"
=== "Windows (PowerShell)"
```ps
deactivate
.venv\Scripts\activate
```
1. Run the application:
10. Run the application, specifying the directory you created earlier as the root directory:
Run `invokeai-web` to start the UI. You must activate the virtual environment before running the app.
=== "Linux/macOS"
!!! warning
```bash
invokeai-web --root ~/invokeai
```
If the virtual environment is _not_ inside the root directory, then you _must_ specify the path to the root directory with `--root \path\to\invokeai` or the `INVOKEAI_ROOT` environment variable.
=== "Windows (PowerShell)"
```bash
invokeai-web --root $Home/invokeai
```
## Headless Install and Launch Scripts
If you run Invoke on a headless server, you might want to install and run Invoke on the command line.
We do not plan to maintain scripts to do this moving forward, instead focusing our dev resources on the GUI [launcher](../installation/quick_start.md).
You can create your own scripts for this by copying the handful of commands in this guide. `uv`'s [`pip` interface docs](https://docs.astral.sh/uv/reference/cli/#uv-pip-install) may be useful.
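As a rough sketch (the paths, python version and lack of extras are assumptions - adapt them to your setup), such a script might look like:

```sh
#!/usr/bin/env bash
# Minimal headless install-and-launch sketch based on the commands in this guide.
set -euo pipefail

INVOKEAI_ROOT="$HOME/invokeai"
mkdir -p "$INVOKEAI_ROOT"
cd "$INVOKEAI_ROOT"

# Create a venv with uv and activate it.
uv venv --python 3.11 .venv
source .venv/bin/activate

# Install the invokeai package (add an index URL here if your platform needs one).
uv pip install invokeai

# Start the server, pointing it at the root directory.
invokeai-web --root "$INVOKEAI_ROOT"
```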
Welcome to Invoke! Follow these steps to install, update, and get started creating.
## Step 1: System Requirements
Invoke runs on Windows 10+, macOS 14+ and Linux (Ubuntu 20.04+ is well-tested).
Hardware requirements vary significantly depending on model and image output size. The requirements below are rough guidelines.
- All Apple Silicon (M1, M2, etc) Macs work, but 16GB+ memory is recommended.
- AMD GPUs are supported on Linux only. The VRAM requirements are the same as Nvidia GPUs.
!!! info "Hardware Requirements (Windows/Linux)"
=== "SD1.5 - 512×512"
- GPU: Nvidia 10xx series or later, 4GB+ VRAM.
- Memory: At least 8GB RAM.
- Disk: 10GB for base installation plus 30GB for models.
=== "SDXL - 1024×1024"
- GPU: Nvidia 20xx series or later, 8GB+ VRAM.
- Memory: At least 16GB RAM.
- Disk: 10GB for base installation plus 100GB for models.
=== "FLUX - 1024×1024"
- GPU: Nvidia 20xx series or later, 10GB+ VRAM.
- Memory: At least 32GB RAM.
- Disk: 10GB for base installation plus 200GB for models.
More detail on system requirements can be found [here](./requirements.md).
## Step 2: Download
Download the launcher for your operating system:
- [Download for Windows](https://download.invoke.ai/Invoke%20Community%20Edition.exe)
- [Download for macOS](https://download.invoke.ai/Invoke%20Community%20Edition.dmg)
- [Download for Linux](https://download.invoke.ai/Invoke%20Community%20Edition.AppImage)
## Step 3: Install or Update
Run the launcher you just downloaded, click **Install** and follow the instructions to get set up.
If you have an existing Invoke installation, you can select it and let the launcher manage the install. You'll be able to update or launch the installation.
!!! warning "Problem running the launcher on macOS"
macOS may not allow you to run the launcher. We are working to resolve this by signing the launcher executable. Until that is done, you can either use the [legacy scripts](./legacy_scripts.md) to install, or manually flag the launcher as safe:
- Open the **Invoke-Installer-mac-arm64.dmg** file.
- Drag the launcher to **Applications**.
- Open a terminal.
- Run `xattr -d 'com.apple.quarantine' /Applications/Invoke\ Community\ Edition.app`.
You should now be able to run the launcher.
## Step 4: Launch
Once installed, click **Finish**, then **Launch** to start Invoke.
The very first run after an installation or update will take a few extra moments to get ready.
!!! tip "Server Mode"
The launcher runs Invoke as a desktop application. You can enable **Server Mode** in the launcher's settings to disable this and instead access the UI through your web browser.
## Step 5: Install Models
With Invoke started up, you'll need to install some models.
The quickest way to get started is to install a **Starter Model** bundle. If you already have a model collection, Invoke can use it.
!!! info "Install Models"
=== "Install a Starter Model bundle"
1. Go to the **Models** tab.
2. Click **Starter Models** on the right.
3. Click one of the bundles to install its models. Refer to the [system requirements](#step-1-system-requirements) if you're unsure which model architecture will work for your system.
=== "Use my model collection"
1. Go to the **Models** tab.
2. Click **Scan Folder** on the right.
3. Paste the path to your models collection and click **Scan Folder**.
4. With **In-place install** enabled, Invoke will leave the model files where they are. If you disable this, **Invoke will move the models into its own folders**.
You’re now ready to start creating!
## Step 6: Learn the Basics
We recommend watching our [Getting Started Playlist](https://www.youtube.com/playlist?list=PLvWK1Kc8iXGrQy8r9TYg6QdUuJ5MMx-ZO). It covers essential features and workflows, including:
- Generating your first image.
- Using control layers and reference guides.
- Refining images with advanced workflows.
## Troubleshooting
If installation fails, retrying the install in Repair Mode may fix it. There's a checkbox to enable this on the Review step of the install flow.
If that doesn't fix it, [clearing the `uv` cache](https://docs.astral.sh/uv/reference/cli/#uv-cache-clean) might do the trick:
- Open and start the dev console (button at the bottom-left of the launcher).
- Run `uv cache clean`.
- Retry the installation. Enable Repair Mode for good measure.
If you are still unable to install, try installing to a different location and see if that works.
If you still have problems, ask for help on the Invoke [discord](https://discord.gg/ZmtBAhwWhy).
## Other Installation Methods
- You can install the Invoke application as a python package. See our [manual install](./manual.md) docs.
- You can run Invoke with docker. See our [docker install](./docker.md) docs.
- You can still use our legacy scripts to install and run Invoke. See the [legacy scripts](./legacy_scripts.md) docs.
Invoke runs on Windows 10+, macOS 14+ and Linux (Ubuntu 20.04+ is well-tested).
!!! warning "Problematic Nvidia GPUs"
## Hardware
We do not recommend these GPUs. They cannot operate with half precision, but have insufficient VRAM to generate 512x512 images at full precision.
Hardware requirements vary significantly depending on model and image output size.
- NVIDIA 10xx series cards such as the 1080 TI
- GTX 1650 series cards
- GTX 1660 series cards
The requirements below are rough guidelines for best performance. GPUs with less VRAM typically still work, if a bit slower. Follow the [Low-VRAM mode guide](./features/low-vram.md) to optimize performance.
Invoke runs best with a dedicated GPU, but will fall back to running on CPU, albeit much slower. You'll need a beefier GPU for SDXL.
- All Apple Silicon (M1, M2, etc) Macs work, but 16GB+ memory is recommended.
- AMD GPUs are supported on Linux only. The VRAM requirements are the same as Nvidia GPUs.
!!! example "Stable Diffusion 1.5"
!!! info "Hardware Requirements (Windows/Linux)"
=== "Nvidia"
=== "SD1.5 - 512×512"
```
Any GPU with at least 4GB VRAM.
```
- GPU: Nvidia 10xx series or later, 4GB+ VRAM.
- Memory: At least 8GB RAM.
- Disk: 10GB for base installation plus 30GB for models.
=== "AMD"
=== "SDXL - 1024×1024"
```
Any GPU with at least 4GB VRAM. Linux only.
```
- GPU: Nvidia 20xx series or later, 8GB+ VRAM.
- Memory: At least 16GB RAM.
- Disk: 10GB for base installation plus 100GB for models.
=== "Mac"
=== "FLUX - 1024×1024"
```
Any Apple Silicon Mac with at least 8GB memory.
```
!!! example "Stable Diffusion XL"
=== "Nvidia"
```
Any GPU with at least 8GB VRAM.
```
=== "AMD"
```
Any GPU with at least 16GB VRAM. Linux only.
```
=== "Mac"
```
Any Apple Silicon Mac with at least 16GB memory.
```
## RAM
At least 12GB of RAM.
## Disk
SSDs will, of course, offer the best performance.
The base application disk usage depends on the torch backend.
!!! example "Disk"
=== "Nvidia (CUDA)"
```
~6.5GB
```
=== "AMD (ROCm)"
```
~12GB
```
=== "Mac (MPS)"
```
~3.5GB
```
You'll need to set aside some space for images, depending on how much you generate. A couple GB is enough to get started.
You'll need a good chunk of space for models. Even if you only install the most popular models and the usual support models (ControlNet, IP Adapter ,etc), you will quickly hit 50GB of models.
- GPU: Nvidia 20xx series or later, 10GB+ VRAM.
- Memory: At least 32GB RAM.
- Disk: 10GB for base installation plus 200GB for models.
!!! info "`tmpfs` on Linux"
@@ -92,26 +37,32 @@ You'll need a good chunk of space for models. Even if you only install the most
## Python
!!! tip "The launcher installs python for you"
You don't need to do this if you are installing with the [Invoke Launcher](./quick_start.md).
Invoke requires python 3.10 or 3.11. If you don't already have one of these versions installed, we suggest installing 3.11, as it will be supported for longer.
Check that your system has an up-to-date Python installed by running `python3 --version` in the terminal (Linux, macOS) or cmd/powershell (Windows).
!!! info "Installing Python"

    === "Windows"

        - Install python 3.11 with [an official installer].
        - The installer includes an option to add python to your PATH. Be sure to enable this. If you missed it, re-run the installer, choose to modify an existing installation, and tick that checkbox.
        - You may need to install [Microsoft Visual C++ Redistributable].

    === "macOS"

        - Install python 3.11 with [an official installer].
        - If model installs fail with a certificate error, you may need to run this command (changing the python version to match what you have installed): `/Applications/Python\ 3.10/Install\ Certificates.command`
        - If you haven't already, you will need to install the XCode CLI Tools by running `xcode-select --install` in a terminal.

    === "Linux"

        - Installing python varies depending on your system. On Ubuntu, you can use the [deadsnakes PPA](https://launchpad.net/~deadsnakes/+archive/ubuntu/ppa). Alternatively, follow the [linux install instructions], being sure to install python 3.11.
        - You'll need to install `libglib2.0-0` and `libgl1-mesa-glx` for OpenCV to work. For example, on a Debian system: `sudo apt update && sudo apt install -y libglib2.0-0 libgl1-mesa-glx`
## Drivers
@@ -175,7 +126,4 @@ An alternative to installing ROCm locally is to use a [ROCm docker container] to
**Description:** A set of custom nodes for InvokeAI to create cross-view or parallel-view stereograms. Stereograms are 2D images that, when viewed properly, reveal a 3D scene. Check out [r/crossview](https://www.reddit.com/r/CrossView/) for tutorials.
from __future__ import annotations

import inspect
import re
import sys
import warnings
from abc import ABC, abstractmethod
from enum import Enum
logger = InvokeAILogger.get_logger()

CUSTOM_NODE_PACK_SUFFIX = "__invokeai-custom-node"


class InvalidVersionError(ValueError):
    pass
class Classification(str, Enum, metaclass=MetaEnum):
    """
    - `Prototype`: The invocation is not yet stable and may be removed from the application at any time. Workflows built around this invocation may break, and we are *not* committed to supporting this invocation.
    - `Deprecated`: The invocation is deprecated and may be removed in a future version.
    - `Internal`: The invocation is not intended for use by end-users. It may be changed or removed at any time, but is exposed for users to play with.
    - `Special`: The invocation is a special case and does not fit into any of the other classifications.
    """

    Stable = "stable"
    Prototype = "prototype"
    Deprecated = "deprecated"
    Internal = "internal"
    Special = "special"
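Because the enum subclasses `str`, its members compare equal to their raw string values and serialize cleanly. The stand-alone sketch below illustrates that behaviour; the `MetaEnum` shown here is an assumed stand-in that only adds a safe `in` check, not necessarily the project's actual metaclass:

```py
# Minimal sketch, independent of InvokeAI: a str-valued Enum with a metaclass
# that makes `"special" in Classification` a safe membership test.
from enum import Enum, EnumMeta


class MetaEnum(EnumMeta):
    def __contains__(cls, item: object) -> bool:
        try:
            cls(item)
        except ValueError:
            return False
        return True


class Classification(str, Enum, metaclass=MetaEnum):
    Stable = "stable"
    Prototype = "prototype"
    Deprecated = "deprecated"
    Internal = "internal"
    Special = "special"


assert Classification.Special == "special"  # str subclass: compares equal to the raw value
assert "special" in Classification          # metaclass: membership check never raises
assert Classification("special") is Classification.Special
print("classification checks passed")
```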
class UIConfigBase(BaseModel):
class BaseInvocation(ABC, BaseModel):
    """Gets a pydantic TypeAdapter for the union of all invocation types."""

    def __init__(self, message: str = "This class should never be executed or instantiated directly."):
        super().__init__(message)
        pass
class BaseBatchInvocation(BaseInvocation):
    batch_group_id: BATCH_GROUP_IDS = InputField(
        default="None",
        description="The ID of this batch node's group. If provided, all batch nodes with the same ID will be 'zipped' before execution, and all nodes' collections must be of the same size.",
        input=Input.Direct,
        title="Batch Group",
    )

    def __init__(self):
        raise NotExecutableNodeError()
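To make the `batch_group_id` description concrete, here is a rough, stand-alone sketch of the zipping behaviour it describes. It is not InvokeAI's actual batch expansion code, and the collection names are made up:

```py
# Rough sketch: collections that share a batch group advance together (zip),
# so they must be the same length; an ungrouped collection is combined with
# the zipped pairs as a cross product, one execution per combination.
from itertools import product

images = ["a.png", "b.png", "c.png"]   # batch group "group_1"
prompts = ["cat", "dog", "fox"]        # batch group "group_1" -> zipped with images
seeds = [1, 2]                         # no group -> crossed with the zipped pairs

assert len(images) == len(prompts), "zipped collections must be the same size"

zipped = list(zip(images, prompts))    # 3 pairs
runs = list(product(zipped, seeds))    # 3 pairs x 2 seeds = 6 executions

for (image, prompt), seed in runs:
    print(f"run: image={image} prompt={prompt} seed={seed}")
```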
@invocation(
    "image_batch",
    title="Image Batch",
    tags=["primitives", "image", "batch", "special"],
    category="primitives",
    version="1.0.0",
    classification=Classification.Special,
)
class ImageBatchInvocation(BaseBatchInvocation):
    """Create a batched generation, where the workflow is executed once for each image in the batch."""
class FieldDescriptions:
    lora_weight = "The weight at which the LoRA is applied to each model"
    compel_prompt = "Prompt to be parsed by Compel to create a conditioning tensor"
    raw_prompt = "Raw prompt text (no parsing)"
    freeu_b1 = "Scaling factor for stage 1 to amplify the contributions of backbone features."
    freeu_b2 = "Scaling factor for stage 2 to amplify the contributions of backbone features."
    instantx_control_mode = "The control mode for InstantX ControlNet union models. Ignored for other ControlNet models. The standard mapping is: canny (0), tile (1), depth (2), blur (3), pose (4), gray (5), low quality (6). Negative values will be treated as 'None'."
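The `instantx_control_mode` description encodes a name-to-integer mapping. A small hypothetical helper (not part of InvokeAI) makes the documented convention explicit, including the "negative means None" rule:

```py
# Hypothetical helper reflecting the documented mapping; any negative value is
# treated as "no control mode".
from typing import Optional

INSTANTX_CONTROL_MODES = {
    "canny": 0,
    "tile": 1,
    "depth": 2,
    "blur": 3,
    "pose": 4,
    "gray": 5,
    "low quality": 6,
}


def control_mode_to_int(mode: Optional[str]) -> int:
    """Return the integer code for a mode name, or -1 to signal 'None'."""
    if mode is None:
        return -1
    return INSTANTX_CONTROL_MODES[mode]


assert control_mode_to_int("depth") == 2
assert control_mode_to_int(None) == -1
```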
board_name: Optional[str] = Field(default=None, description="The board's new name.", max_length=300)
cover_image_name: Optional[str] = Field(default=None, description="The name of the board's new cover image.")
archived: Optional[bool] = Field(default=None, description="Whether or not the board is archived")
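For context, the new `max_length=300` constraint means pydantic now rejects overly long board names at validation time. A minimal sketch, assuming pydantic v2 and a hypothetical `BoardChanges` model name:

```py
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class BoardChanges(BaseModel):
    board_name: Optional[str] = Field(default=None, description="The board's new name.", max_length=300)


BoardChanges(board_name="My board")        # fine
try:
    BoardChanges(board_name="x" * 301)     # 301 chars exceeds max_length=300
except ValidationError as err:
    print(err.errors()[0]["type"])         # -> "string_too_long"
```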