When we delete images, boards, or do any other board mutation, we need
to invalidate numerous query caches and related internal frontend state.
This gets complicated very quickly.
We can drastically reduce the complexity by having the backend return
some more information when we make these mutations.
For example, when deleting a list of images by name, we can return a
list of deleted image name and affected boards. The frontend can use
this information to determine which queries to invalidate with far less
tedium.
This will also enable the more efficient storage of images (e.g. in the
gallery selection). Previously, we had to store the entire image DTO
object, else we wouldn't be able to figure out which queries to
invalidate. But now that the backend tells us exactly what images/boards
have changed, we can just store image names in frontend state. This
amounts to a substantial improvement in DX and reduction in frontend
complexity.
When the invocation cache is used, we might skip all progress images. This can prevent auto-switch-on-first-progress from working, as we don't get any of those events.
It's much easier to only support auto-switch on complete.
This appears to be a bug in Chakra UI v2 - use of a fallback component makes the ref passed to an image end up undefined. Had to remove the skeleton loader fallback component.
In #7724 we made a number of perf optimisations related to enqueuing. One of these optimisations included moving the enqueue logic - including expensive prep work and db writes - to a separate thread.
At the same time manual DB locking was abandoned in favor of WAL mode.
Finally, we set `check_same_thread=False` to allow multiple threads to access the connection at a given time.
I think this may be the cause of #7950:
- We start an enqueue in a thread (running in bg)
- We dequeue
- Dequeue pulls a partially-written queue item from DB and we get the errors in the linked issue
To be honest, I don't understand enough about SQLite to confidently say that this kind of race condition is actually possible. But:
- The error started popping up around the time we made this change.
- I have reviewed the logic from enqueue to dequeue very carefully _many_ times over the past month or so, and I am confident that the error is only possible if we are getting unexpectedly `NULL` values from the DB.
- The DB schema includes `NOT NULL` constraints for the column that is apparently returning `NULL`.
- Therefore, without some kind of race condition or schema issue, the error should not be possible.
- The `enqueue_batch` call is the only place I can find where we have the possibility of a race condition due to async logic. Everywhere else, all DB interaction for the queue is synchronous, as far as I can tell.
This change retains the perf benefits by running the heavy enqueue prep logic in a separate thread, but moves back to the main thread for the DB write. It also uses an explicit transaction for the write.
Will just have to wait and see if this fixes the issue.
This reduces peak memory usage at a negligible cost. Queue items typically take on the order of seconds, making the time cost of a GC essentially free.
Not a great idea on a hotter code path though.
We've long suspected there is a memory leak in Invoke, but that may not be true. What looks like a memory leak may in fact be the expected behaviour for our allocation patterns.
We observe ~20 to ~30 MB increase in memory usage per session executed. I did some prolonged tests, where I measured the process's RSS in bytes while doing 200 SDXL generations. I found that it eventually leveled off at around 100 generations, at which point memory usage had climbed by ~900MB from its starting point.
I used tracemalloc to diff the allocations of single session executions and found that we are allocating ~20MB or so per session in `ModelPatcher.apply_ti()`.
In `ModelPatcher.apply_ti()` we add tokens to the tokenizer when handling TIs. The added tokens should be scoped to only the current invocation, but there is no simple way to remove the tokens afterwards.
As a workaround for this, we clone the tokenizer, add the TI tokens to the clone, and use the clone to when running compel. Afterwards, this cloned tokenizer is discarded.
The tokenizer uses ~20MB of memory, and it has referrers/referents to other compel stuff. This is what is causing the observed increases in memory per session!
We'd expect these objects to be GC'd but python doesn't do it immediately. After creating the cond tensors, we quickly move on to denoising. So there isn't any time for the GC to happen to free up its existing memory arenas/blocks to reuse them. Instead, python needs to request more memory from the OS.
We can improve the situation by immediately calling `del` on the tokenizer clone and related objects. In fact, we already had some code in the compel nodes to `del` some of these objects, but not all.
Adding the `del`s vastly improves things. We hit peak RSS in half the sessions (~50 or less) and it's now ~100MB more than starting value. There is still a gradual increase in memory usage until we level off.
* build: prevent `opencv-python` from being installed
Fixes this error: `AttributeError: module 'cv2.ximgproc' has no attribute 'thinning'`
`opencv-contrib-python` supersedes `opencv-python`, providing the same API + additional features. The two packages should not be installed at the same time to avoid conflicts and/or errors.
The `invisible-watermark` package requires `opencv-python`, but we require the contrib variant.
This change updates `pyproject.toml` to prevent `opencv-python` from ever being installed using a `uv` features called dependency overrides.
* feat(ui): data viewer supports disabling wrap
* feat(api): list _all_ pkgs in app deps endpoint
* chore(ui): typegen
* feat(ui): update about modal to display new full deps list
* chore: uv lock
When a layer is initialized, we do not yet know its bbox, so we cannot fit the stage view to the layer. We have to wait for the bbox calculation to finish. Previously, we had no way to wait unti lthat bbox calculation was complete to take an action.
For example, this means we could not fit the layers to the stage immediately after creating a new layer, bc we don't know the dimensions of the layer yet.
This callback lets us do that. When creating a new canvas from an image, we now...
- Register a bbox update callback to fit the layers to stage
- Layer is created
- Canvas initializes the layer's entity adapter module (layer's width and height are set to zero at this point)
- Canvas calculates the bbox
- Bbox is updated (width and height are now correct)
- Callback is ran, fitting layer to stage
Also change import order to ensure CLI args are handled correctly. Had to do this bc importing `InvocationRegistry` before parsing args resulted in the `--root` CLI arg being ignored.
Add `heuristic_resize_fast`, which does the same thing as `heuristic_resize`, except it's about 20x faster.
This is achieved by using opencv for the binary edge handling isntead of python, and checking only 100k pixels to determine what kind of image we are working with.
Besides being much faster, it results in cleaner lines for resized binary canny edge maps, and has results in fewer misidentified segmentation maps.
Tested against normal images, binary canny edge maps, grayscale HED edge maps, segmentation maps, and normal images.
Tested resizing up and down for each.
Besides the new utility function, I needed to swap the `opencv-python` dep for `opencv-contrib-python`, which includes `cv2.ximgproc.thinning`. This function accounts for a good chunk of the perf improvement.
Upstream bug in `transformers` breaks use of `AutoModelForMaskGeneration` class to load SAM models
Simple fix - directly load the model with `SamModel` class instead.
See upstream issue https://github.com/huggingface/transformers/issues/38228
## Summary
- Fallback to new classification API if legacy probe fails
- Method to read model metadata
- Created `StrippedModelOnDisk` class for testing
- Test to verify only a single config `matches` with a model
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
For example:
```py
my_field: Literal["foo", "bar"] | None = InputField(default=None)
```
Previously, this would cause a field parsing error and prevent the app from loading.
Two fixes:
- This type annotation and resultant schema are now parsed correctly
- Error handling added to template building logic to prevent the hang at startup when an error does occur
Major cleanup of RelatedModels.tsx for improved readability, structure, and maintainability.
Dried out repetitive logic
Consolidated model type sorting into reusable helpers
Added disallowed model type relationships to prevent broken connections (e.g. VAE ↔ LoRA)
- Aware this introduces a new constraint—open to feedback (see PR comment)
Some naming and types may still need refinement; happy to revisit
Adds full support for managing model-to-model relationships in the UI and backend.
Introduces RelatedModels subpanel for linking and unlinking models in model management.
- Adds REST API routes for adding, removing, and retrieving model relationships.
- New database migration: creates model_relationships table for bidirectional links.
- New service layer (model_relationships) for relationship management.
- Updated frontend: Related models float to top of LoRA/Main grouped model comboboxes for quick access.
- Added 'Show Only Related' toggle badge to MainModelPicker filter bar
**Amended commit to remove changes to ParamMainModelSelect.tsx and MainModelPicker.tsx to avoid conflict with upstream deletion/ rewrite**
## Summary
- Modify stats reset to be on a per session basis, rather than a "full
reset", to allow for parallel session execution
- Add "aider" to gitignore
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Currently translated at 67.1% (1279 of 1904 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 64.9% (1231 of 1895 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 60.2% (1141 of 1895 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 56.7% (1075 of 1895 strings)
Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1896 of 1896 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1895 of 1895 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1886 of 1886 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Currently translated at 98.8% (1883 of 1904 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1882 of 1903 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1881 of 1902 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1878 of 1899 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1874 of 1895 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1873 of 1895 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1864 of 1886 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
When we do our field type overrides to allow invocations to be instantiated without all required fields, we were not modifying the annotation of the field but did set the default value of the field to `None`.
This results in an error when doing a ser/de round trip. Here's what we end up doing:
```py
from pydantic import BaseModel, Field
class MyModel(BaseModel):
foo: str = Field(default=None)
```
And here is a simple round-trip, which should not error but which does:
```py
MyModel(**MyModel().model_dump())
# ValidationError: 1 validation error for MyModel
# foo
# Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
# For further information visit https://errors.pydantic.dev/2.11/v/string_type
```
To fix this, we now check every incoming field and update its annotation to match its default value. In other words, when we override the default field value to `None`, we make its type annotation `<original type> | None`.
This prevents the error during deserialization.
This slightly alters the schema for all invocations and outputs - the values of all fields without default values are now typed as `<original type> | None`, reflecting the overrides.
This means the autogenerated types for fields have also changed for fields without defaults:
```ts
// Old
image?: components["schemas"]["ImageField"];
// New
image?: components["schemas"]["ImageField"] | null;
```
This does not break anything on the frontend.
* support for custom error toast components, starting with usage limit
* add support for all usage limits
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
* display credit column in queue list if shouldShowCredits is true
* change apiModels feature to chatGPT4oModels feature
* empty
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
When I followed the Contribute Node documentation, I encountered an import error.
This commit fixes the error, which will help reduce debugging time for all future contributors.
* add GPTimage1 as allowed base model
* fix for non-disabled inpaint layers
* lots of boilerplate for adding gpt-image base model and disabling things along with imagen
* handle gpt-image dimensions
* build graph for gpt-image
* lint
* feat(ui): make chatgpt model naming consistent
* feat(ui): graph builder naming
* feat(ui): disable img2img for imagen3
* feat(ui): more naming
* feat(ui): support presigned url prefetch
* feat(ui): disable neg prompt for chatgpt
* docs(ui): update docstring
* feat(ui): fix graph building issues for chatgpt
* fix(ui): node ids for chatgpt/imagen
* chore(ui): typegen
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
If provided, `<NavigateToModelManagerButton />` will render, even if `disabledTabs` includes "models". If provided, `<NavigateToModelManagerButton />` will run the callback instead of switching tabs within the studio.
The button's tooltip is now just "Manage Models" and its icon is the same as the model manager tab's icon ([CUBE!](https://www.youtube.com/watch?v=4aGDCE6Nrz0)).
There is a subtle change in behaviour with the new model probe API.
Previously, checks for model types was done in a specific order. For example, we did all main model checks before LoRA checks.
With the new API, the order of checks has changed. Check ordering is as follows:
- New API checks are run first, then legacy API checks.
- New API checks categorized by their speed. When we run new API checks, we sort them from fastest to slowest, and run them in that order. This is a performance optimization.
Currently, LoRA and LLaVA models are the only model types with the new API. Checks for them are thus run first.
LoRA checks involve checking the state dict for presence of keys with specific prefixes. We expect these keys to only exist in LoRAs.
It turns out that main models may have some of these keys.
For example, this model has keys that match the LoRA prefix `lora_te_`: https://civitai.com/models/134442/helloyoung25d
Under the old probe, we'd do the main model checks first and correctly identify this as a main model. But with the new setup, we do the LoRA check first, and those pass. So we import this model as a LoRA.
Thankfully, the old probe still exists. For now, the new probe is fully disabled. It was only called in one spot.
I've also added the example affected model as a test case for the model probe. Right now, this causes the test to fail, and I've marked the test as xfail. CI will pass.
Once we enable the new API again, the xfail will pass, and CI will fail, and we'll be reminded to update the test.
In the previous commit, the LLaVA model was updated to support partial loading.
In this commit, the SigLIP model is updated in the same way.
This model is used for FLUX Redux. It's <4GB and only ever run in isolation, so it won't benefit from partial loading for the vast majority of users. Regardless, I think it is best if we make _all_ models work with partial loading.
PS: I also fixed the initial load dtype issue, described in the prev commit. It's probably a non-issue for this model, but we may as well fix it.
The model manager has two types of model cache entries:
- `CachedModelOnlyFullLoad`: The model may only ever be loaded and unloaded as a single object.
- `CachedModelWithPartialLoad`: The model may be partially loaded and unloaded.
Partial loaded is enabled by overwriting certain torch layer classes, adding the ability to autocast the layer to a device on-the-fly. See `CustomLinear` for an example.
So, to take advantage of partial loading and be cached as a `CachedModelWithPartialLoad`, the model must inherit from `torch.nn.Module`.
The LLaVA classes provided by `transformers` do inherit from `torch.nn.Module`, but we wrap those classes in a separate class called `LlavaOnevisionModel`. The wrapper encapsulate both the LLaVA model and its "processor" - a lightweight class that prepares model inputs like text and images.
While it is more elegant to encapsulate both model and processor classes in a single entity, this prevents the model cache from enabling partial loading for the chunky vLLM model.
Fixing this involved a few changes.
- Update the `LlavaOnevisionModelLoader` class to operate on the vLLM model directly, instead the `LlavaOnevisionModel` wrapper class.
- Instantiate the processor directly in the node. The processor is lightweight and does its business on the CPU. We don't need to worry about caching in the model manager.
- Remove caching support code from the `LlavaOnevisionModel` wrapper class. It's not needed, because we do not cache this class. The class now only handles running the models provided to it.
- Rename `LlavaOnevisionModel` to `LlavaOnevisionPipeline` to better represent its purpose.
These changes have a bonus effect of fixing an OOM crash when initially loading the models. This was most apparent when loading LLaVA 7B, which is pretty chunky.
The initial load is onto CPU RAM. In the old version of the loaders, we ignored the loader's target dtype for the initial load. Instead, we loaded the model at `transformers`'s "default" dtype of fp32.
LLaVA 7B is fp16 and weighs ~17GB. Loading as fp32 means we need double that amount (~34GB) of CPU RAM. Many users only have 32GB RAM, so this causes a _CPU_ OOM - which is a hard crash of the whole process.
With the updated loaders, the initial load logic now uses the target dtype for the initial load. LLaVA now needs the expected ~17GB RAM for its initial load.
PS: If we didn't make the accompanying partial loading changes, we still could have solved this OOM. We'd just need to pass the initial load dtype to the wrapper class and have it load on that dtype. But we may as well fix both issues.
PPS: There are other models whose model classes are wrappers around a torch module class, and thus cannot be partially loaded. However, these models are typically fairly small and/or are run only on their own, so they don't benefit as much from partial loading. It's the really big models (like LLaVA 7B) that benefit most from the partial loading.
Currently translated at 56.6% (1069 of 1887 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 50.8% (960 of 1887 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 48.4% (912 of 1882 strings)
Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
I am at loss as the to cause of this bug. The styles that I needed to change to fix it haven't been changed in a couple months. But these do seem to fix it.
Closes#7910
This query can have potentially large responses. Keeping them around for 24 hours essentially a hardcoded memory leak. Use the default for RTKQ of 60 seconds.
When users generate on the canvas or upscaling tabs, we parse prompts through dynamic prompts before invoking. Whenever the prompt or other settings change, we run dynamic prompts.
Previously, we used a redux listener to react to changes to dynamic prompts' dependent state, keeping the processed dynamic prompts synced. For example, when the user changed the prompt field, we re-processed the dynamic prompts.
This requires that all redux actions that change the dependent state be added to the listener matcher. It's easy to forget actions, though, which can result in the dynamic prompts state being stale.
For example, when resetting canvas state, we dispatch an action that resets the whole params slice, but this wasn't in the matcher. As a result, when resetting canvas, the dynamic prompts aren't updated. If the user then clicks Invoke (with an empty prompt), the last dynamic prompts state will be used.
For example:
- Generate w/ prompt "frog", get frog
- Click new canvas session
- Generate without any prompt, still get frog
To resolve this, the logic that keeps the dynamic prompts synced is moved from the listener to a hook. The way the logic is triggered is improved - it's now triggered in a useEffect, which is run when the dependent state changes. This way, it doesn't matter _how_ the dependent state changes - the changes will always be "seen", and the dynamic prompts will update.
Add `useCanvasIsBusySafe()` hook. This is like `useCanvasIsBusy()`, but when the canvas is not initialized, it gracefully falls back to false instead of raising.
Because app tabs are lazy-loaded, the canvas is not initialized until the user visits that tab. If the page loads up on the workflows tab, the canvas will be uninitialized until the user clicks on it.
This graceful fallback behaviour allows actions like sending an image to canvas to work even when the canvas is not yet initialized. These actions are exposed in the image context menu, and previously were hidden when the canvas was not initialized. We can now show these actions and use them even when the canvas is uninitialized.
- Add `useCanvasIsBusySafe()` hook
- Use the new hook in the image context menu for send to canvas actions
- Do not use `<CanvasManagerProviderGate />` in the image context menu (this was hiding the actions when canvas was uninitialized)
When calling `ctx.drawImage()`, if the image to be drawn has a width of height of 0, the call will raise.
In this change, I have carefully reviewed the call hierarchy for all of our own code that calls this method and ensured that each call has error handling.
Well, with one exception - I'm not sure how to handle errors in `invokeai/frontend/web/src/common/hooks/useClientSideUpload.ts`. But this should never be an issue in that hook - it's a Canvas problem.
Currently translated at 100.0% (1873 of 1873 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1871 of 1871 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.2% (1857 of 1871 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1840 of 1840 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Whether a workflow is published or not shouldn't be something stored on the client. It's properly server-side state.
This change removes the `is_published` flag from redux and updates all references to the flag to use the getWorkflow query.
It also updates the socket event listener that handles session complete events. When a validation run completes, we invalidate the tags for the getWorkflow query. We need to do a bit of juggling to avoid a race condition (documented in the code). Works well though.
Previously, we maintained an `isTouched` flag in redux state to indicate if a workflow had unsaved changes. We manually updated this whenever we changed something on the workflow.
This was tedious and error-prone. It also didn't handle undo/redo, so if you made a change to a node and undid it, we'd still think the workflow had unsaved changes.
Moving forward, we use a simpler and more robust strategy by hashing the server's version of the workflow and comparing it to the client's version of the workflow.
The hashing uses `stable-hash`, which is both fast and, well, stable. Most importantly, the ordering of keys in hashed objects does not change the resultant hash.
- Remove `isTouched` state entirely.
- Extract the logic that builds the "preview" workflow object from redux state into its own hook. This "preview" workflow is what we send to the server when saving a workflow. This "preview" workflow is effectively the client version of the workflow.
- Add `useDoesWorkflowHaveUnsavedChanges()` hook, which compares the hash of the client workflow and server workflow (if it exists).
- Add `useIsWorkflowUntouched()` hook, which compares the hash of the client workflow and the initial workflow that you get when you click new workflow.
- Remove `reactflow` workaround in the nodes slice undo/redo filter. When we set the nodes state while loading a workflow, `reactflow` emits a nodes size/placement change event. This triggered up our `isTouched` flag logic and marked the workflow as unsaved right from the get-go. With the new strategy to track touched status, this workaround can be removed.
- Update all logic that tracked the old `isTouched` flag to use the new hooks.
Previously, the workflow form's root element id was random. Every time we reset the workflow editor, the root id changed. This makes it difficult to check if the workflow editor is untouched (in its default state).
Now that root element's id is simply "root". I can't imagine any way that this would break anything.
This allows it to pull in sentencepiece on its own. In 0.10.0, it didn't have this package listed as a dependency, but in recent releases it does. So we are able to remove sentencepiece as an explicit dep.
The fixes in this module monkeypatched `torch` to resolve some issues with FP16 on macOS. These issues have long since been resolved.
Included in the now-removed fixes is `CustomSlicedAttentionProcessor`, which is intended to reduce memory requirements for MPS. This overrides `diffusers`' own `SlicedAttentionProcessor`.
Unfortunately, `attention_type: sliced` produces hot garbage with the fixes and black images without the fixes. So this class appears to now be a moot point.
Regardless, SDPA is supported on MPS and very efficient, so sliced attention is largely obsolete.
In https://github.com/pydantic/pydantic/pull/10029, pydantic made an improvement to its generated JSON schemas (OpenAPI schemas). The previous and new generated schemas both meet the schema spec.
When we parse the OpenAPI schema to generate node templates, we use some typeguard to narrow schema components from generic OpenAPI schema objects to a node field schema objects. The narrower node field schema objects contain extra data.
For example, they contain a `field_kind` attribute that indicates it the field is an input field or output field. These extra attributes are not part of the OpenAPI spec (but the spec allows does allow for this extra data).
This typeguard relied on a pydantic implementation detail. This was changed in the linked pydantic PR, which released with v2.9.0. With the change, our typeguard rejects input field schema objects, causing parsing to fail with errors/warnings like `Unhandled input property` in the JS console.
In the UI, this causes many fields - mostly model fields - to not show up in the workflow editor.
The fix for this is very simple - instead of relying on an implementation detail for the typeguard, we can check if the incoming schema object has any of our invoke-specific extra attributes. Specifically, we now look for the presence of the `field_kind` attribute on the incoming schema object. If it is present, we know we are dealing with an invocation input field and can parse it appropriately.
In `ObjectSerializerDisk`, we use `torch.load` to load serialized objects from disk. With torch 2.6.0, torch defaults to `weights_only=True`. As a result, torch will raise when attempting to deserialize anything with an unrecognized class.
For example, our `ConditioningFieldData` class is untrusted. When we load conditioning from disk, we will get a runtime error.
Torch provides a method to add trusted classes to an allowlist. This change adds an arg to `ObjectSerializerDisk` to add a list of safe globals to the allowlist and uses it for both `ObjectSerializerDisk` instances.
Note: My first attempt inferred the class from the generic type arg that `ObjectSerializerDisk` accepts, and added that to the allowlist. Unfortunately, this doesn't work.
For example, `ConditioningFieldData` has a `conditionings` attribute that may be one some other untrusted classes representing model-specific conditioning data. So, even if we allowlist `ConditioningFieldData`, loading will fail when torch deserializes the `conditionings` attribute.
This is a squash of a lot of scattered commits that became very difficult to clean up and make individually. Sorry.
Besides the new UI, there are a number of notable changes:
- Publishing logic is disabled in OSS by default. To enable it, provided a `disabledFeatures` prop _without_ "publishWorkflow".
- Enqueuing a workflow is no longer handled in a redux listener. It was hard to track the state of the enqueue logic in the listener. It is now in a hook. I did not migrate the canvas and upscaling tabs - their enqueue logic is still in the listener.
- When queueing a validation run, the new `useEnqueueWorkflows()` hook will update the payload with the required data for the run.
- Some logic is added to the socket event listeners to handle workflow publish runs completing.
- The workflow library side nav has a new "published" view. It is hidden when the "publishWorkflow" feature is disabled.
- I've added `Safe` and `OrThrow` versions of some workflows hooks. These hooks typically retrieve some data from redux. For example, a node. The `Safe` hooks return the node or null if it cannot be found, while the `OrThrow` hooks return the node or raise if it cannot be found. The `OrThrow` hooks should be used within one of the gate components. These components use the `Safe` hooks and render a fallback if e.g. the node isn't found. This change is required for some of the publish flow UI.
- Add support for locking the workflow editor. When locked, you can pan and zoom but that's it. Currently, it is only locked during publish flow and if a published workflow is opened.
This message is logged _every_ time we retrieve a list of models if there is an invalid model. Previously it logged the _whole_ row which can be a lot of data. Truncate the row to 64 characters to reduce log pollution.
Currently translated at 98.8% (1818 of 1840 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1816 of 1840 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1816 of 1839 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Previously, reactflow appears to have handled an edge case when using its `applyChanges` utility. If a change was provided without an item, it would skip that change. For example, an "add edge" change that somehow passed `null` as the edge, instead of a valid edge.
In our workflow loading and validation logic, invalid edges were removed from the array using `delete edges[i]`. This left "holes" in the array of edges. We then asked `reactflow` to add these edges to state. When it encountered one of the "holes", it skipped over it.
In a recent release (unsure which, somewhere between the latest v11 and ~v12.4) this seems to have changed. It no longer skips over the "holes" and instead trusts the data. This can cause a couple issues:
- Error when loading the workflow if `reactflow` attempt to do anything with the nonexistent edge.
- If somehow the workflow makes it into state with "holes" in the array of edges, all sorts of other stuff breaks when our code does anything with the nonexistent edge.
Two-part fix:
- Update the invalid edge handling to not use `delete edges[i]`. Instead, as we check each edge, we add invalid ones to a set. Then, after all the checks are finished, filter out the invalid edges. The resultant edges array has no holes.
- Simplify the logic around setting nodes and edges in redux. Previously we were using `reactflow`'s `applyChanges` utils, but this does literally nothing except take extra CPU cycles. We can simply set the loaded nodes and edges directly in redux. Perhaps we were using `applyChanges` because it addressed the "holes" issue? Not sure. But we don't need it now.
Closes#7868
## Summary
`timm` below 1.0.0 prevents llava models from working (broken in
transformers). but `controlnet-aux` pins `timm` to an earlier version
because otherwise it was breaking the ZoeDepth controlnet.
we don't use ZoeDepth (replaced by depthAnything), and downgrading
controlnet-aux seems to be acceptable.
more context here:
https://github.com/huggingface/controlnet_aux/issues/106https://github.com/huggingface/controlnet_aux/pull/101
Note that this results in some warnings on startup, stemming from
controlnet-aux:

we can probably silence the warnings as a separate enhancement
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
- Port LoRA to new classification API
- Add 2 additional tests cases (ControlLora and Flux Diffusers LoRA)
- Moved `ModelOnDisk` to its own module
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Before FLUX Fill was merged, we didn't do any checks for the model variant. We always returned "normal".
To determine if a model is a FLUX Fill model, we need to check the state dict for a specific key. Initially, this logic was too strict and rejected quantized FLUX models. This issue was resolved, but it turns out there is another failure mode - some fine-tunes use a different key.
This change further reduces the strictness, handling the alternate key and also falling back to "normal" if we don't see either key. This effectively restores the previous probing behaviour for all FLUX models.
Closes#7856Closes#7859
The polynomial fit isn't perfect and we end up with alpha values of 1 instead of 0 when applying the mask. This in turn causes issues on canvas where outputs aren't 100% transparent and individual layer bbox calculations are incorrect.
Lots of squashed experimentation heh:
ci: manually specify python version in tests
ci: whoops typo in ruff cmds
ci: specify python versions for uv python install
ci: install python verbosely
ci: try forcing python preference?
ci: try forcing python preference a different way?
ci: try in a venv?
ci: it works, but try without venv
ci: oh maybe we need --preview?
ci: poking it with a stick
ci: it works, add summary to pytest output
ci: fix pytest output
experiment: simulate test failure
Revert "experiment: simulate test failure"
This reverts commit b99ca512f6e61a2a04a1c0636d44018c11019954.
ci: just use default pytest output
cI: attempt again to use uv to install python
cI: attempt again again to use uv to install python
Revert "cI: attempt again again to use uv to install python"
This reverts commit 3cba861c90738081caeeb3eca97b60656ab63929.
Revert "cI: attempt again to use uv to install python"
This reverts commit b30f2277041dc999ed514f6c594c6d6a78f5c810.
## Summary
- Extend `ModelOnDisk` with caching, type hints, default args
- Fail early if there is an error classifying a config
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR moves type definitions out of `config.py` into a new
`taxonomy.py` module.
The goal is to reduce clutter in `config.py`, and to resolve circular
import issues by isolating these types in a dedicated module with
(almost) no internal dependencies.
Because so many places import these definitions, these changes touch 73
files.
Additional changes:
- Removed star imports using "removestar" tool
- Added the commit to `.git-blame-ignore-revs` to avoid noise in git
blame history
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
The top-level `invokeai` package may have an obscured origin due to the way editible installs work, but it's much more likely that this module is from a specific file.
## Summary
This test imports all modules in the invokeai package and fails if there
are any exceptions.
Existing issues are excluded to avoid blocking main.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
- Port LLaVA model config to new classification API
- Add 2 test cases (stripped LLaVA models variants to git-lfs)
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
In #7780 we added FLUX Fill support, and needed the probe to be able to distinguish between "normal" FLUX models and FLUX Fill models.
Logic was added to the probe to check a particular state dict key (input channels), which should be 384 for FLUX Fill and 64 for other FLUX models.
The new logic was stricter and instead of falling back on the "normal" variant, it raised when an unexpected value for input channels was detected.
This caused failures to probe for BNB-NF4 quantized FLUX Dev/Schnell, which apparently only have 1 input channel.
After checking a variety of FLUX models, I loosened the strictness of the variant probing logic to only special-case the new FLUX Fill model, and otherwise fall back to returning the "normal" variant. This better matches the old behaviour and fixes the import errors.
Closes#7822
Currently translated at 100.0% (1827 of 1827 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1826 of 1826 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1825 of 1825 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Previously we used erode/dilate and a Gaussian blur to expand and fade the edges of Canvas masks. The implementation a number of problems:
- Erode/dilate kernel sizes were not calculated correctly, and extra iterations were run to compensate. The result is the blur size, which should have been pixels, was very inaccurate and unreliable.
- What we want is to add a "soft bleed" - like a drop shadow with no offset - starting from the edge of the mask, extending out by however many pixels. But Gaussian blur does not do this. The blurred area starts _inside_ the mask and extends outside it. So it kinda blurs inwards and outwards. We compensated for this by expanding the mask.
- Using a Gaussian blur can cause banding artifacts. Gaussian blur doesn't have a "size" or "radius" parameter in the sense that you think it should. It's a convolution matrix and there are _no non-zero values in the result_. This means that, far away from the mask, once compositing completes, we have some values that are very close to zero but not quite zero. These values are quantized by HTML Canvas, resulting in banding artifacts where you'd expect the blur to have faded to 0% alpha. At least, that is my understanding of why the banding artifacts occur.
The new node uses a better strategy to expand the mask and add the fade out effect:
- Calculate the distance from each white pixel to the nearest black pixel.
- Normalize this distance by dividing by the fade size in px, then clip the values to 0 - 1. The result represents the distance of each white pixel to its nearest black pixel as a percentage of the fade size. At this point, it is a linear distribution.
- Create a polynomial to describe the fade's intensity so that we can have a smooth transition from the masked region (black) to unmasked (white). There are some magic numbers here, deterined experimentally.
- Evaluate the polynomial over the normalized distances, so we now have a matrix representing the fade intensity for every pixel
- Convert this matrix back to uint8 and apply it to the mask
This works soooo much better than the previous method. Not only does it fix the banding issues, but when we enable "output only generated regions", we get a much smaller image. Will add images to the PR to clarify.
## Summary
- Integrate Git LFS to our automated Python tests in CI
- Add stripped model files with git-lfs
- `README.md` instructions to install and configure git-lfs
- Unrelated change (skip hashing to make unit test run faster)
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
**Problem**
We want to have automated tests for model classification/probing, but
model files are too large to include in the source.
**Proposed Solution**
Classification/probing only requires metadata (key names, tensor
shapes), not weights.
This PR introduces "stripped" models - lightweight versions that retains
only essential metadata.
- Added script to strip models
- Added stripped models to automated tests
**Model size before and after "stripping":**
```
LLaVA Onevision Qwen2 0.5b-ov-hf before: 1.8 GB, after: 11.6 MB
text_encoder before: 246.1 MB, after: 35.6 kB
llava-onevision-qwen2-7b-si-hf before: 16.1 GB, after: 11.7 MB
RealESRGAN_x2plus.pth before: 67.1 MB, after: 143.0 kB
IP Adapter SD1 before: 2.5 GB, after: 94.9 kB
Hard Edge Detection (canny) before: 722.6 MB, after: 63.6 kB
Lineart before: 722.6 MB, after: 63.6 kB
Segmentation Map before: 722.6 MB, after: 63.6 kB
EasyNegative before: 24.7 kB, after: 151 Bytes
Face Reference (IP Adapter Plus Face) before: 98.2 MB, after: 13.7 kB
Standard Reference (IP Adapter) before: 44.6 MB, after: 6.0 kB
shinkai_makoto_offset before: 151.1 MB, after: 160.0 kB
thickline_fp16 before: 151.1 MB, after: 160.0 kB
Alien Style before: 228.5 MB, after: 582.6 kB
Noodles Style before: 228.5 MB, after: 582.6 kB
Juggernaut XL v9 before: 6.9 GB, after: 3.7 MB
dreamshaper-8 before: 168.9 MB, after: 1.6 MB
```
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
The _goal_ of this PR is to make it easier to add an new config type.
This _scope_ of this PR is to integrate the API and does not include
adding new configs (outside tests) or porting existing ones.
One of the glaring issues of the existing *legacy probe* is that the
logic for each type is spread across multiple classes and intertwined
with the other configs. This means that adding a new config type (or
modifying an existing one) is complex and error prone.
This PR attempts to remedy this by providing a new API for adding
configs that:
- Is backwards compatible with the existing probe.
- Encapsulates fields and logic in a single class, keeping things
self-contained and easy to modify safely.
Below is a minimal toy example illustrating the proposed new structure:
```python
class MinimalConfigExample(ModelConfigBase):
type: ModelType = ModelType.Main
format: ModelFormat = ModelFormat.Checkpoint
fun_quote: str
@classmethod
def matches(cls, mod: ModelOnDisk) -> bool:
return mod.path.suffix == ".json"
@classmethod
def parse(cls, mod: ModelOnDisk) -> dict[str, Any]:
with open(mod.path, "r") as f:
contents = json.load(f)
return {
"fun_quote": contents["quote"],
"base": BaseModelType.Any,
}
```
To create a new config type, one needs to inherit from `ModelConfigBase`
and implement its interface.
The code falls back to the legacy model probe for existing models using
the old API.
This allows us to incrementally port the configs one by one.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
In #7688 we optimized queuing preparation logic. This inadvertently broke retrying queue items.
Previously, a `NamedTuple` was used to store the values to insert in the DB when enqueuing. This handy class provides an API similar to a dataclass, where you can instantiate it with kwargs in any order. The resultant tuple re-orders the kwargs to match the order in the class definition.
For example, consider this `NamedTuple`:
```py
class SessionQueueValueToInsert(NamedTuple):
foo: str
bar: str
```
When instantiating it, no matter the order of the kwargs, if you make a normal tuple out of it, the tuple values are in the same order as in the class definition:
```
t1 = SessionQueueValueToInsert(foo="foo", bar="bar")
print(tuple(t1)) # -> ('foo', 'bar')
t2 = SessionQueueValueToInsert(bar="bar", foo="foo")
print(tuple(t2)) # -> ('foo', 'bar')
```
So, in the old code, when we used the `NamedTuple`, it implicitly normalized the order of the values we insert into the DB.
In the retry logic, the values of the tuple were not ordered correctly, but the use of `NamedTuple` had secretly fixed the order for us.
In the linked PR, `NamedTuple` was dropped for a normal tuple, after profiling showed `NamedTuple` to be meaningfully slower than a normal tuple.
The implicit order normalization behaviour wasn't understood, and the order wasn't fixed when changin the retry logic to use a normal tuple instead of `NamedTuple`. This results in a bug where we incorrectly create queue items in the DB. For example, we stored the `destination` in the `field_values` column.
When such an incorrectly-created queue item is dequeued, it fails pydantic validation and causes what appears to be an endless loop of errors.
The only user-facing solution is to add this line to `invokeai.yaml` and restart the app:
```yaml
clear_queue_on_startup: true
```
On next startup, the queue is forcibly cleared before the error loop is triggered. Then the user should remove this line so their queue is persisted across app launches per usual.
The solution is simple - fix the ordering of the tuple. I also added a type annotation and comment to the tuple type alias definition.
Note: The endless error loop, as a general problem, will take some thinking to fix. The queue service methods to cancel and fail a queue item still retrieve it and parse it. And the list queue items methods parse the queue items. Bit of a catch 22, maybe the solution is to simply delete totally borked queue items and log an error.
Currently translated at 98.7% (1800 of 1822 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1798 of 1820 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1796 of 1818 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
There is now a single entrypoint for loading a workflow - `useLoadWorkflowWithDialog`.
The hook:
Handles loading workflows from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow.
It returns a function that:
Loads a workflow from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow. The workflow will be loaded immediately if there are no unsaved changes. On success, error or completion, the corresponding callback will be called.
WHEW
- Replace `get_counts` method with `get_tag_counts_with_filter` which gets the counts for a list of tags, filtering by a list of selected tags
- Update `get_many` logic to apply tag filtering with AND logic, to match the new `get_tag_counts_with_filter` method
- Update workflow library router
User facing:
When a FLUX main model is selected, users may now add Regional Reference Image layers.
When switching between FLUX Redux and FLUX IP Adapter, the settings will change to match the model type. (IP Adapter has weight, begin/end step, but Redux does not.) The image will be retained when switching between the two.
Otherwise it works the same way as IP Adapter - both in Global and Regional Reference Image layers.
---
Internal state handling:
Slightly awkward, but it was easiest to make FLUX Redux a second type of IP Adapter in redux state.
Global and regional reference images still have a single `ipAdapter` field, but it can have a type of `ip_adapter` or `flux_redux`.
Ideally, this field is called `config` or `settings` or something, but we are past that point. We _could_ do a migration to rename it, but I don't think it's worth the effort.
---
Other changes:
- Updated canvas layer validators to handle FLUX Redux.
- Updated model list loading logic to un-set FLUX Redux models in Canvas if they are not in the list (e.g. if the user deletes the model in the main app).
- Updated graph builders - new `addFLUXRedux` util & updated `addRegions` util.
- Updated the `buildModelsHook` util to return a hook that accepts a filter callback. This handles a discrepancy: FLUX IP Adapter does not support regional guidance, but FLUX Redux does. The Regional Guidance settings provide the filter to filter out FLUX IP Adapter models from the combined list of IP Adapter ahd Redux models.
This follows the same pattern for IP Adapter w/ its CLIP Vision model. The SigLIP model is unlikely to ever change and we don't want to force the user to select it anywhere. Hardcoding it is safe and makes the UX much nicer.
The alternative is a model dropdown that will likely only ever have one valid choice in it.
- We don't need to copy the init file. Just crawl the custom nodes dir for modules and import them all. Dunno why I didn't do this initially.
- Pass the logger in as an arg. There was a race condition where if we got the logger directly in the load_custom_nodes function, the config would not have been loaded fully yet and we'd end up with the wrong custom nodes path!
- Remove permissions-setting logic, I do not believe it is relevant for custom nodes
- Minor cleanup of the utility
There's a pydantic thing that causes the graphs to fail validation erroneously. Details in the comments - not a high priority to fix but we should figure it out someday.
This method simply sets the `opened_at` attribute to the current time.
Previously `opened_at` was set when calling `get`, but that is not correct. We `get` workflows often, even when not opening them. So this needs to be a separate thing
Get the counts of workflows for the given tags and/or categories. Made a separate method bc get_many will deserialize all matching workflows, which is unnecessary for this use case.
This big chungus reworks and simplifies much of the logic around loading and saving workflows. It also makes some minor changes to how store the current workflow and determine if it is a draft, user workflow or default workflow.
---
The lower-level hooks to save a workflow have been revised:
- `useSaveLibraryWorkflow`: Saves a user or project workflow that has had changes made to it.
- `useCreateNewWorkflow`: Saves a workflow as a new entity.
A new higher-level hook `useSaveOrSaveAsWorkflow` is intended to be used by components. It returns a single function that:
- Constructs the workflow payload to be sent to the server
- Checks if the workflow is an existing user workflow. If so, it immediately saves (updates) that workflow.
- If it's not an existing user workflow, it opens the save as dialog so the user can choose a name for it and create a new workflow. This occurs for both draft workflows and loaded default workflows.
---
The logic to build the current redux state into a workflow - either to be saved as JSON, to update an existing user workflow, or save as - was a bit convoluted.
Changes to redux state triggered a debounced function to build the workflow, setting it in a global nanostores atom. Then, all of the functions that consumed the "built workflow" referenced this atom.
Now, this logic is strictly imperative. When a consumer wants to save a workflow, we build it on the spot. This removes a layer of indirection.
The logic is in the `useBuildWorkflowFast` hook.
---
The logic for loading a workflow is also revised. Previously, it happened in an RTK listener. You'd need to dispatch an action to load a workflow, and wouldn't know if it succeeded or not (though the listener would make a toast if the load failed).
This is now done in a callback, outside redux middleware. The callback is returned from the `useLoadWorkflow` hook.
---
Previously, we stripped the id from default workflows when loading them. Then, when saving the workflow, we built a workflow object from redux state and hit the API with it.
This has two issues:
- It relies on redux state never having an ID set when a default workflow is loaded. If we somehow ended up with a default workflow's ID in redux, when we go to save the workflow, we'd get and error or it wouldn't work, because you cannot save a default workflow. You can only save-as it.
- We do not know the default workflow from which the current workflow was loaded. And be cause we don't know the default workflow, we cannot show a thumbnail image.
The responsibilities have been shifted around a bit.
Now, when we load a workflow, we load it as-is. The default workflow IDs are saved in redux state. We can render the thumbnail, and if the user goes to save the workflow, we detect that it is a default workflow and save-as it.
---
In `App.tsx`, the long list of modals are moved into their own "isolator" component to ensure any re-renders there do not affect the rest of the app.
---
The save-workflow-as modal is restructured to be a bit simpler. Still works the same. On commercial, "save to project" will be enabled by default.
---
The workflow JSON tab uses a debounced version of "buildWorkflow" to build the workflow as JSON.
---
`buildWorkflowFast` is updated to deep-copy its _whole_ output, preventing issues where field types could accidentally get mutated. I don't think this has ever happened but we may as well be safe.
---
Fixed an issue where the edit button in the workflow list didn't open the workflow in edit mode.
It's only by misunderstanding the pydantic API that this field was is typed as optional. Workflows must _always_ have a category, and indeed they do.
Fixing this allows the generated types in the frontend to be easier to work with..
There was a bit of wonk with default workflows. On every app startup, we wiped them all out and recreated them with new IDs. This is a quick-and-dirty way to ensure default workflows are always in sync.
Unfortunately, it also means default workflows are newly-created entities on every app load. Any thumbnails associated to them will be lost (bc they have new IDs), and `updated_at` doesn't work.
This changes makes default workflows stable entities.
The workflows we bundle in the python package in JSON format are still the source of truth for default workflows, but the startup logic that syncs them to the user DB is a bit smarter.
- All bundled workflows have an ID. It is prefixed with "default_" for clarity.
- Any default workflows in the user's DB that are not in the bundled default workflows are deleted from the DB.
- Any bundled default workflows that are not in the user's DB are added to the DB.
- If a default workflow in the user's DB does not match the content of its corresponding bundled workflow, it is updated in the DB.
The end result is that default workflows are still kept in sync for the user, but they don't change their identity.
We may now add thumbnails to default workflows, and sorting by `updated_at` is now meaningful.
## Summary
Upgrade ruff version to 0.9.9 and format existing code.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
This allows our tests to run in an isolated environment. For tests taht implicitly depend on import behaviour, this can prevent side-effects.
The function should only be used for tests.
by adding a layer with all the pytorch dependencies that don't change
most of the time.
## Summary
Every time the [`main` docker
images](https://github.com/invoke-ai/InvokeAI/pkgs/container/invokeai)
rebuild and I pull `main-cuda`, it gets another 3+ GB, which seems like
about a zillion times too much since most things don't change from one
commit on `main` to the next.
This is an attempt to follow the guidance in [Using uv in Docker:
Intermediate
Layers](https://docs.astral.sh/uv/guides/integration/docker/#intermediate-layers)
so there's one layer that installs all the dependencies—including
PyTorch with its bundled nvidia libraries—_before_ the project's own
frequently-changing files are copied in to the image.
## Related Issues / Discussions
- [Improved docker layer cache with
uv](https://discord.com/channels/1020123559063990373/1329975172022927370)
- [astral: Can `uv pip install` torch, but not `uv sync`
it](https://discord.com/channels/1039017663004942429/1329986610770612347)
## QA Instructions
Hopefully the CI system building the docker images is sufficient.
But there is one change to `pyproject.toml` related to xformers, so it'd
be worth checking that `python -m xformers.info` still says it has
triton on the platforms that expect it.
## Merge Plan
I don't expect this to be a disruptive merge.
(An earlier revision of this PR moved the venv, but I've reverted that
change at ebr's recommendation.)
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
SQLite cursors are meant to be lightweight and not reused. For whatever reason, we reuse one per service for the entire app lifecycle.
This can cause issues where a cursor is used twice at the same time in different transactions.
This experiment makes the session queue use a fresh cursor for each method, hopefully fixing the issue.
This allows tags to be invalidated while mutations are executing, resolving an issue in this situation:
- A long-running mutation starts.
- A tag is invalidated; for example, user edits a board name, and the boards list query tag is invalidated.
- The boards list query isn't fired, and the board name isn't updated.
- The long-running mutation finishes.
- Finally, the boards list query fires and the board name is updated.
This is the "delayed" behaviour. The "immediately" behaviour has the fires requests from tag invalidation immediately, without waiting for all mutations to finish.
It may cause extra network requests and stale data if we are mutating a lot of things very quickly. I don't think it will be an issue in practice and the improved responsiveness will be a net benefit.
Rely on WAL mode and the busy timeout.
Also changed:
- Remove extraneous rollbacks when we were only doing a `SELECT`
- Remove try/catch blocks that were made extraneous when removing the extraneous rollbacks
This allows for read and write concurrency without using a global mutex. Operations may still fail they take longer than the busy timeout (5s).
If we get a database lock error after waiting 5s for an operation, we have a problem. So, I think it's actually better to use a busy timeout instead of a global mutex.
Alternatively, we could add a timeout to the global mutex.
Fixes an issue where fields like control weight on ControlNet nodes and image on IP Adapter nodes didn't render.
These are "single or collection" fields. They accept a single input object, or collection. They are supposed to render the UI input for a single object.
In a7a71ca935 a performance optimisation for a hot code-path inadvertently broke this.
The determination of which UI component to render for a given field was done using a type guard function for the field's template. Previously, this used a zod schema to parse the template. This is very slow, especially when the template was not the expected type.
The optimization changed the type guards to check the field name (aka its type, integer, image, etc) and cardinality directly, without any zod parsing.
It's much faster, but subtly changed the behaviour because it was a bit stricter. For some fields, it rejected "single or collection" cardinalities when it should have accepted them.
When these fields - like the aforementioned Control Weight and Image - were being rendered, none of the type guards passed and they rendered nothing.
The fix here updates the type guard functions to support multiple cardinalities. So now, when we go to render a "single or collection" field, we will render the "single" input component as it should be.
## Summary
This PR adds a `pytorch_cuda_alloc_conf` config flag to control the
torch memory allocator behavior.
- `pytorch_cuda_alloc_conf` defaults to `None`, preserving the current
behavior.
- The configuration options are explained here:
https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf.
Tuning this configuration can reduce peak reserved VRAM and improve
performance.
- Setting `pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"` in
`invokeai.yaml` is expected to work well on many systems. This is a good
first step for those looking to tune this config. (We may make this the
default in the future.)
- The optimal configuration seems to be dependent on a number of factors
such as device version, VRAM, CUDA kernel version, etc. For now, users
will have to experiment with this config to see if it hurts or helps on
their systems. In most cases, I expect it to help.
### Memory Tests
```
VAE decode memory usage comparison:
- SDXL, fp16, 1024x1024:
- `cudaMallocAsync`: allocated=2593 MB, reserved=3200 MB
- `native`: allocated=2595 MB, reserved=4418 MB
- SDXL, fp32, 1024x1024:
- `cudaMallocAsync`: allocated=3982 MB, reserved=5536 MB
- `native`: allocated=3982 MB, reserved=7276 MB
- SDXL, fp32, 1536x1536:
- `cudaMallocAsync`: allocated=8643 MB, reserved=12032 MB
- `native`: allocated=8643 MB, reserved=15900 MB
```
## Related Issues / Discussions
N/A
## QA Instructions
- [x] Performance tests with `pytorch_cuda_alloc_conf` unset.
- [x] Performance tests with `pytorch_cuda_alloc_conf:
"backend:cudaMallocAsync"`.
## Merge Plan
- [x] Merge #7668 first and change target branch to `main`
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
Prior to this PR, most of the app setup was being done in `api_app.py`
at import time. This PR cleans this up, by:
- Splitting app setup into more modular functions
- Narrower responsibility for the `api_app.py` file - it just
initializes the `FastAPI` app
The main motivation for this changes is to make it easier to support an
upcoming torch configuration feature that requires more careful ordering
of app initialization steps.
## Related Issues / Discussions
N/A
## QA Instructions
- [x] Launch the app via invokeai-web.py and smoke test it.
- [ ] Launch the app via the installer and smoke test it.
- [x] Test that generate_openapi_schema.py produces the same result
before and after the change.
- [x] No regression in unit tests that directly interact with the app.
(test_images.py)
## Merge Plan
- [x] Check to see if there are any commercial implications to modifying
the app entrypoint.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
On the Canvas tab, when we made the network request to enqueue a batch, we were immediately resetting the request. This effectively disabled RTKQ's tracking of the request - including the loading state.
As a result, when you click the Invoke button on the Canvas tab, it didn't show a spinner, and it was not clear that anything was happening.
The solution is simple - just await the enqueue request before resetting the tracking, same as we already did on the workflows and upscaling tabs.
I also added some extra logging messages for enqueuing, so we get the same JS console logs for each tab on success or failure.
Currently translated at 40.3% (727 of 1801 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 37.7% (680 of 1801 strings)
Co-authored-by: Hiroto N <hironow365@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Previously, custom node loading occurred _during module imports_. A consequence of this is that when a custom node import fails (e.g. its type clobbers an existing node), the app fails to start up.
In fact, any time we import basically anything from the app, we trigger custom node imports! Not good.
This logic is now in its own function, called as the API app starts up.
If a custom node load fails for any reason, it no longer prevents the app from starting up.
One other bonus we get from this is that we can now ensure custom nodes are loaded _after_ core nodes.
Any clobbering that may occur while loading custom nodes is now guaranteed to be a custom node clobbering a core node's type - and not the other way round.
When deleting a board w/ images, the image usage checking logic was not checking image collection fields. This could result in a nonexistent image lingering in a node.
We already handle single image fields correctly, it's only the image collection fields taht were affected.
Found another place where we deepcopy a dict, but it is safe to mutate.
Restructured the prep logic a bit to support this. Updated tests to use the new structure.
- Avoid pydantic models when dict manipulation works
- Avoid extraneous deep copies when we can safely mutate
- Avoid NamedTuple construct and its overhead
- Fix tests to use altered function signatures
- Remove extraneous populate_graph function
The method and route now supports:
- "none" as a board ID, sentinel value for uncategorized
- Optionally specify image categories
- Optionally specify is_intermediate
This fixes the broken readiness checks introduced in the previous commit.
To support async batch generators, all of the validation of the generators needs to be async. This is problematic because a lot of the validation logic was in redux selectors, which are necessarily synchronous.
To resolve this, the readiness checks and related logic are restructured to be run async in response to redux state changes via `useEffect` (another option is to directly subscribe to redux store). These async functions then set some react state. The checks are debounced to prevent thrashing the UI.
See #7580 for more context about this issue.
Other changes:
- Fix a minor issue where empty collections were also checked against their min and max sizes, and errors were shown for all the checks. If a collection is empty, we don't need to do the min/max checks. If a collection is empty, we skip the other min/max checks and do not report those errors to the user.
- When a field is connected, do not attempt to check its value. This fixes an issue where collection fields with a connection could erroneously appear to be invalid.
- Improved error messages for batch nodes.
Board fields in the workflow editor now default to using the auto-add board by default.
**This is a change in behaviour - previously, we defaulted to no board (i.e. Uncategorized).**
There is some translation needed between the UI field values for a board and what the graph expects.
A "BoardField" is an object in the shape of `{board_id: string}`.
Valid board field values in the graph:
- undefined
- a BoardField
Value UI values and their mapping to the graph values:
- 'none' -> undefined
- 'auto' -> BoardField for the auto-add board, or if the auto-add board is Uncategorized, undefined
- undefined -> undefined (this is a fallback case with the new logic)
- a BoardField -> the same BoardField
Currently translated at 98.9% (1737 of 1755 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1735 of 1753 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1726 of 1749 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
All but the core `vitest` package were updated recently. Tests still ran but the test UI dashboard didn't. After updating, all tests still run, seems fine.
Also tested building in app and package mode.
- Support transparency w/ color picker. To do this, we need to hide the bg layer before sampling. In testing, this has a negligible performance impact.
- Add an RGBA value readout next to the color picker ring.
Unfortunately I couldn't reliably reproduce the issue, so I'm not 100% sure this fixes it. But I think there is a race condition that results in `updateCompositingRectSize` erroneously seeing the layer has no objects and skipping the update.
To address this, the compositing rect fill/size/pos are all now force-updated when the fill/objects are changed. Theoretically it should be impossible for the issue to occur now.
- Fix an issue where the cursor disappeared when selecting a non-renderable entity. For example, when selecting a reference image layer and certain tools, the cursor would disappear.
- Ensure color picker works no matter what layer types are selected.
The logic for showing/hiding the cursor needed to be rearranged a bit for this fix.
Retrying a queue item means cloning it, resetting all execution-related state. Retried queue items reference the item they were retried from by id. This relationship is not enforced by any DB constraints.
- Add `retried_from_item_id` to `session_queue` table in DB in a migration.
- Add `retry_items_by_id` method to session queue service. Accepts a list of queue item IDs and clones them (minus execution state). Returns a list of retried items. Items that are not in a canceled or failed state are skipped.
- Add `retry_items_by_id` HTTP endpoint that maps 1-to-1 to the queue service method.
- Add `queue_items_retried` event, which includes the list of retried items.
- Optimize component and hook structure for input fields to reduce rerenders of component tree
- Remove memoization on some selectors where it serves no purpose (bc the object will have a stable identity until it changes, at which point we need to re-render anyways)
- Shift the connection error selector logic around to rely more on the stable identity of pending connection objects
including just invokeai/version seems sufficient to appease uv sync here. including everything else would invalidate the cache we're trying to establish.
- Simplify and de-insane-ify component structure, hooks, selectors, etc.
- Some perf improvements by using data attributes for styling instead of dynamic CSS-in-JS.
- Add field notes and start of linear view config, got blocked when I ran into deeper layout issues that made it very difficult to handle field configs. So those are WIP in this commit.
Currently translated at 98.9% (1735 of 1753 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.9% (1731 of 1749 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1726 of 1749 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 99.2% (1695 of 1708 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.2% (1692 of 1705 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.2% (1691 of 1704 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
## Summary
This PR adds support for the FLUX LoRA model format produced by
OneTrainer.
Specifically, this PR adds:
- Support for DoRA patches
- Support for patch models that modify the FLUX T5 encoder
- Probing / loading support for OneTrainer models
## Known limitations
- DoRA patches cannot currently be applied to base weights that are
quantized with `bitsandbytes`. The DoRA algorithm requires accessing the
original model weight in order to compute the patch diff, and the
bitsandbytes quantization layers make this difficult. DoRA patches can
be applied to non-quantized and GGUF-quantized layers without issue.
- This PR results in a slight speed regression for a very particular
inference combination: quantized base model + LoRA with diffusers keys
(i.e. uses the `MergedLayerPatch`). Now that more LoRA formats are using
the `MergedLayerPatch`, it was becoming too much work to maintain this
optimization. Regression from ~1.7 it/s to ~1.4 it/s.
## Future Notes
- We may want to consider dropping support for bitsandbytes
quantization. It is very difficult to maintain compatibility for across
features like partial-loading and LoRA patching.
- At a future time, we should refactor the LoRA parsing logic to be more
generalized rather than handling each format independently.
- There are some redundant device casts and dequantizations in
`autocast_linear_forward_sidecar_patches(...)` (and its sub-calls).
Optimizing this is left for future work.
## Related Issues / Discussions
- This PR should address a handful of the LoRAs reported in
https://github.com/invoke-ai/InvokeAI/issues/7131 (specifically, most of
the `envy*` LoRAs).
- This PR should address the example in
https://github.com/invoke-ai/InvokeAI/issues/6912 (though the intended
effect of that LoRA is not totally clear, so its hard to verify with
full confidence).
## QA Instructions
OneTrainer test models:
-
https://civitai.com/models/844821/envy-flux-dark-watercolor-01?modelVersionId=945159
(DoRA, transformer only)
-
https://civitai.com/models/836757/envy-flux-digital-brush-01?modelVersionId=936167
(hada, transformer only)
- ball_flux from https://github.com/invoke-ai/InvokeAI/issues/6912
(DoRA, transformer/clip/t5)
The following tests were repeated with each of the OneTrainer test
models:
- [x] Test with non-quantized base model
- [x] Test with GGUF-quantized base model
- [x] Test with BnB-quantized base model
- [x] Test with non-quantized base model that is partially-loaded onto
the GPU
Other regression test:
- [x] Test some SD1 LoRAs
- [x] Test some SDXL LoRAs
- [x] Test a variety of existing FLUX LoRA formats
- [x] Test a FLUX Control LoRA on all base model quantization formats.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR fixes an issue with mask dimension consistency. Prior to this
change, the following workflow would fail with `tuple out of range`
error:
<img width="1072" alt="image"
src="https://github.com/user-attachments/assets/d0a9e658-1d64-4db4-adee-973bbdaca745"
/>
### Before this PR
Dimension compatibility for invocations that take a mask input:
- `ApplyMaskTensorToImageInvocation`: 2 or 3
- `MaskTensorToImageInvocation`: 2 or 3
- `InvertTensorMaskInvocation`: 3
Mask dimension for invocations that produce a MaskOutput:
- `RectangleMaskInvocation`: 3
- `AlphaMaskToTensorInvocation`: 3
- `InvertTensorMaskInvocation`: 3
- `ImageMaskToTensorInvocation`: 3
- `SegmentAnythingInvocation`: 2
### After this PR (changes in bold)
Dimension compatibility for invocations that take a mask input:
- `ApplyMaskTensorToImageInvocation`: 2 or 3
- `MaskTensorToImageInvocation`: 2 or 3
- `InvertTensorMaskInvocation`: **2 or 3** <----------------
Mask dimension for invocations that produce a MaskOutput:
- `RectangleMaskInvocation`: 3
- `AlphaMaskToTensorInvocation`: 3
- `InvertTensorMaskInvocation`: 3
- `ImageMaskToTensorInvocation`: 3
- `SegmentAnythingInvocation`: **3** <-------------------
## QA Instructions
I tested the workflow in the PR description and this workflow:
<img width="872" alt="image"
src="https://github.com/user-attachments/assets/20496860-ce81-47c0-a46a-a611b73faa22"
/>
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Currently translated at 100.0% (1697 of 1697 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.2% (1684 of 1697 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.7% (1676 of 1681 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.3% (1670 of 1681 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.5% (1658 of 1666 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1652 of 1652 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Dynamic prompts string generators can cause an infinite feedback loop when added to the linear view.
The root cause is how these generators handle "resolving" their collections. They hit the dynamic prompts HTTP API within the view component to get the prompts, then set the batch node's internal state with those values.
When the same generator is rendered in both the node editor view and linear view and the timing is just right, that state update causes an infinite feedback loop between the two components as they respond to the state updates from the other component.
The other generators never store the generated values in the batch node's internal state. The values are "resolved" just-in-time as they are needed.
To fix this, the batch value "resolver" utilities could be made async and hit the API. But there's a problem - the resolver utilities are used within the "are we ready to invoke? are there any problems with the current settings?" redux selectors, which are strictly synchronous. To fix that, we can refactor that "are we ready to invoke?" logic to not use redux selectors, so the whole thing could be async.
It's not a big change but I'm not going to spend time on it at the moment.
So, until I address this, the dynamic prompts generators are disabled.
- Add JS Mersenne Twister implementation dependency to use as seeded PRNG. This is not a cryptographically secure algorithm.
- Add nullish seed field to float and integer random generators.
- Add UI to control the seed.
- When seed is not set, behaviour is unchanged - the values are randomized when you Invoke. When seed is set, the random distribution is deterministic depending on the seed. In this case, we can display the values to the user.
Unfortunately we cannot do strict floats or ints.
The batch data models don't specify the value types, it instead relies on pydantic parsing. JSON doesn't differentiate between float and int, so a float `1.0` gets parsed as `1` in python.
As a result, we _must_ accept mixed floats and ints for BatchDatum.items.
Tests and validation updated to handle this.
Maybe we should update the BatchDatum model to have a `type` field? Then we could parse as float or int, depending on the inputs...
## Summary
This PR revises the logic for calculating the model cache RAM limit. See
the code for thorough documentation of the change.
The updated logic is more conservative in the amount of RAM that it will
use. This will likely be a better default for more users. Of course,
users can still choose to set a more aggressive limit by overriding the
logic with `max_cache_ram_gb`.
## Related Issues / Discussions
- Should help with https://github.com/invoke-ai/InvokeAI/issues/7563
## QA Instructions
Exercise all heuristics:
- [x] Heuristic 1
- [x] Heuristic 2
- [x] Heuristic 3
- [x] Heuristic 4
## Merge Plan
- [x] Merge https://github.com/invoke-ai/InvokeAI/pull/7565 first and
update the target branch
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR adds a `keep_ram_copy_of_weights` config option the default (and
legacy) behavior is `true`. The tradeoffs for this setting are as
follows:
- `keep_ram_copy_of_weights: true`: Faster model switching and LoRA
patching.
- `keep_ram_copy_of_weights: false`: Lower average RAM load (may not
help significantly with peak RAM).
## Related Issues / Discussions
- Helps with https://github.com/invoke-ai/InvokeAI/issues/7563
- The Low-VRAM docs are updated to include this feature in
https://github.com/invoke-ai/InvokeAI/pull/7566
## QA Instructions
- Test with `enable_partial_load: false` and `keep_ram_copy_of_weights:
false`.
- [x] RAM usage when model is loaded is reduced.
- [x] Model loading / unloading works as expected.
- [x] LoRA patching still works.
- Test with `enable_partial_load: false` and `keep_ram_copy_of_weights:
true`.
- [x] Behavior should be unchanged.
- Test with `enable_partial_load: true` and `keep_ram_copy_of_weights:
false`.
- [x] RAM usage when model is loaded is reduced.
- [x] Model loading / unloading works as expected.
- [x] LoRA patching still works.
- Test with `enable_partial_load: true` and `keep_ram_copy_of_weights:
true`.
- [x] Behavior should be unchanged.
- [x] Smoke test CPU-only and MPS with default configs.
## Merge Plan
- [x] Merge https://github.com/invoke-ai/InvokeAI/pull/7564 first and
change target branch.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
Prior to this change, there were several cases where we initialized the
weights of a FLUX model before loading its state dict (and, to make
things worse, in some cases the weights were in float32). This PR fixes
a handful of these cases. (I think I found all instances for the FLUX
family of models.)
## Related Issues / Discussions
- Helps with https://github.com/invoke-ai/InvokeAI/issues/7563
## QA Instructions
I tested that that model loading still works and that there is no
virtual memory reservation on model initialization for the following
models:
- [x] FLUX VAE
- [x] Full T5 Encoder
- [x] Full FLUX checkpoint
- [x] GGUF FLUX checkpoint
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Previously, when previewing a filter on a layer with some transparency or a filter that changes the alpha, the preview was rendered on top of the layer. The preview blended with the layer, which isn't right.
In this change, the layer is hidden during the preview, and when the filter finishes (having been applied or canceled - the two possible paths), the layer is shown.
Technically, we are hiding and showing the layer's object renderer's konva group, which contains the layer's "real" data.
Another small change was made to prevent a flash of empty layer, by waiting to destroy a previous filter preview image until the new preview image is ready to display.
Due to the limited floating point precision, and konva's `scale` properties, it is possible for the relative rect of an object to have non-integer coordinates and dimensions.
When we go to rasterize and otherwise export images, the HTML canvas API truncates these numbers.
So, we can end up with situations where the relative width and height of a layer are very close to the "real" value, but slightly off.
For example, width and height might be 512px, but the relative rect is calculated to be something like 512.000000003 or 511.9999999997.
In the first case, the truncation results in 512x512 for the dimensions - which is correct. But in the second case, it results in 511x511!
One place where this causes issues is the image action `New Canvas from image -> As Raster Layer (resize)`. For certain input image sizes, this results in an incorrectly resized image. For example, a 1496x1946 input image is resized to 511x511 pixels when the bbox is 512x512.
To fix this, we can round both coords and dimensions of rects when rasterizing.
I've thought through the implications and done some testing. I believe this change will not cause any regressions and only fix edge cases. But, it's possible that something was inadvertently relying on the old behavior.
There's a bug where preset image tooltips get stuck open in the list.
After much fiddling, debugging, and review of upstream dependencies, I have determined that this is bug in Chakra-UI v2.
Specifically, it appears to be a race condition related to the Tooltip component's internal use of the `useDisclosure` hook to manage tooltip open state, and the react render cycle.
Unfortunately, Chakra v2 is no longer being updated, and it's a pain in the butt to vendor and fix that component given its dependencies. Not 100% sure I could easily fix it, anyways.
Fortunately, there is a workaround - reduce the tooltip openDelay to 0ms. I prefer the current 500ms delay but I think it's preferable to have too-quick tooltips than too-sticky tooltips...
## Summary
Changes:
- Deprecate `ram` and `vram` configs. If these are set in invokeai.yaml,
they will be ignored.
- Create new `max_cache_ram_gb` and `max_cache_vram_gb` configs with the
same definitions as the old configs.
The main motivation of this change is to make the migration path
smoother for users who had previously added `ram` /`vram` to their
config files. Now, these users will be automatically migrated into the
new dynamic limit behavior (which is better in most cases). These users
will have to manually re-add `max_cache_ram_gb` and `max_cache_vram_gb`
to their configs if they wish to go back to specifying manual limits.
## Related Issues / Discussions
See the release notes for RC v5.6.0rc1 for the old migration behavior
that we are trying to improve:
https://github.com/invoke-ai/InvokeAI/releases/tag/v5.6.0rc1
## QA Instructions
- [x] Test that if `ram` or `vram` are present in a user's
`invokeai.yaml`, these values are ignored.
- [x] Test that `max_cache_ram_gb` and `max_cache_vram_gb` are applied,
if set.
## Merge Plan
- Don't forget to update the RC release notes accordingly.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR contains a bugfix for an edge case with model unloading (from
VRAM to RAM). Thanks to @JPPhoto for finding it.
The bug was triggered under the following conditions:
- A GGML-quantized model is loaded in VRAM
- We run a Spandrel image-to-image invocation (which is wrapped in a
`torch.inference_mode()` context manager.
- The model cache attempts to unload the GGML-quantized model from VRAM
to RAM.
- Doing this inside of the `torch.inference_mode()` cm results in the
following error:
```
[2025-01-07 15:48:17,744]::[InvokeAI]::ERROR --> Error while invoking session 98a07259-0c03-4111-a8d8-107041cb86f9, invocation d8daa90b-7e4c-4fc4-807c-50ba9be1a4ed (spandrel_image_to_image): Cannot set version_counter for inference tensor
[2025-01-07 15:48:17,744]::[InvokeAI]::ERROR --> Traceback (most recent call last):
File "/home/ryan/src/InvokeAI/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
output = invocation.invoke_internal(context=context, services=self._services)
File "/home/ryan/src/InvokeAI/invokeai/app/invocations/baseinvocation.py", line 300, in invoke_internal
output = self.invoke(context)
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ryan/src/InvokeAI/invokeai/app/invocations/spandrel_image_to_image.py", line 167, in invoke
with context.models.load(self.image_to_image_model) as spandrel_model:
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/load_base.py", line 60, in __enter__
self._cache.lock(self._cache_record, None)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 224, in lock
self._load_locked_model(cache_entry, working_mem_bytes)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 272, in _load_locked_model
vram_bytes_freed = self._offload_unlocked_models(model_vram_needed, working_mem_bytes)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 458, in _offload_unlocked_models
cache_entry_bytes_freed = self._move_model_to_ram(cache_entry, vram_bytes_to_free)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/model_cache.py", line 330, in _move_model_to_ram
return cache_entry.cached_model.partial_unload_from_vram(
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ryan/src/InvokeAI/invokeai/backend/model_manager/load/model_cache/cached_model/cached_model_with_partial_load.py", line 182, in partial_unload_from_vram
cur_state_dict = self._model.state_dict()
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1939, in state_dict
module.state_dict(destination=destination, prefix=prefix + name + '.', keep_vars=keep_vars)
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1936, in state_dict
self._save_to_state_dict(destination, prefix, keep_vars)
File "/home/ryan/.pyenv/versions/3.10.14/envs/InvokeAI_3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1843, in _save_to_state_dict
destination[prefix + name] = param if keep_vars else param.detach()
RuntimeError: Cannot set version_counter for inference tensor
```
### Explanation
From the `torch.inference_mode()` docs:
> Code run under this mode gets better performance by disabling view
tracking and version counter bumps.
Disabling version counter bumps results in the aforementioned error when
saving `GGMLTensor`s to a state_dict.
This incompatibility between `GGMLTensors` and `torch.inference_mode()`
is likely caused by the custom tensor type implementation. There may
very well be a way to get these to cooperate, but for now it is much
simpler to remove the `torch.inference_mode()` contexts.
Note that there are several other uses of `torch.inference_mode()` in
the Invoke codebase, but they are all tight wrappers around the
inference forward pass and do not contain the model load/unload process.
## Related Issues / Discussions
Original discussion:
https://discord.com/channels/1020123559063990373/1149506274971631688/1326180753159094303
## QA Instructions
Find a sequence of operations that triggers the condition. For me, this
was:
- Reserve VRAM in a separate process so that there was ~12GB left.
- Fresh start of Invoke
- Run FLUX inference with a GGML 8K model
- Run Spandrel upscaling
Tests:
- [x] Confirmed that I can reproduce the error and that it is no longer
hit after the change
- [x] Confirm that there is no speed regression from switching from
`torch.inference_mode()` to `torch.no_grad()`.
- Before: `50.354s`, After: `51.536s`
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Currently translated at 16.5% (273 of 1645 strings)
translationBot(ui): update translation (Polish)
Currently translated at 15.4% (254 of 1645 strings)
translationBot(ui): update translation (Polish)
Currently translated at 10.8% (178 of 1645 strings)
Co-authored-by: Nik Nikovsky <zejdzztegomaila@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/pl/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1649 of 1649 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1645 of 1645 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1645 of 1645 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1645 of 1645 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
## Summary
This PR enables RAM/VRAM cache size limits to be determined dynamically
based on availability.
**Config Changes**
This PR modifies the app configs in the following ways:
- A new `device_working_mem_gb` config was added. This is the amount of
non-model working memory to keep available on the execution device (i.e.
GPU) when using dynamic cache limits. It default to 3GB.
- The `ram` and `vram` configs now default to `None`. If these configs
are set, they will take precedence over the dynamic limits. **Note: Some
users may have previously overriden the `ram` and `vram` values in their
`invokeai.yaml`. They will need to remove these configs to enable the
new dynamic limit feature.**
**Working Memory**
In addition to the new `device_working_mem_gb` config described above,
memory-intensive operations can estimate the amount of working memory
that they will need and request it from the model cache. This is
currently applied to the VAE decoding step for all models. In the
future, we may apply this to other operations as we work out which ops
tend to exceed the default working memory reservation.
**Mitigations for https://github.com/invoke-ai/InvokeAI/issues/7513**
This PR includes some mitigations for the issue described in
https://github.com/invoke-ai/InvokeAI/issues/7513. Without these
mitigations, it would occur with higher frequency when dynamic RAM
limits are used and the RAM is close to maxed-out.
## Limitations / Future Work
- Only _models_ can be offloaded to RAM to conserve VRAM. I.e. if VAE
decoding requires more working VRAM than available, the best we can do
is keep the full model on the CPU, but we will still hit an OOM error.
In the future, we could detect this ahead of time and switch to running
inference on the CPU for those ops.
- There is often a non-negligible amount of VRAM 'reserved' by the torch
CUDA allocator, but not used by any allocated tensors. We may be able to
tune the torch CUDA allocator to work better for our use case.
Reference:
https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf
- There may be some ops that require high working memory that haven't
been updated to request extra memory yet. We will update these as we
uncover them.
- If a model is 'locked' in VRAM, it won't be partially unloaded if a
later model load requests extra working memory. This should be uncommon,
but I can think of cases where it would matter.
## Related Issues / Discussions
- #7492
- #7494
- #7500
- #7505
## QA Instructions
Run a variety of models near the cache limits to ensure that model
switching works properly for the following configurations:
- [x] CUDA, `enable_partial_loading=true`, all other configs default
(i.e. dynamic memory limits)
- [x] CUDA, `enable_partial_loading=true`, CPU and CUDA memory reserved
in another process so there is limited RAM/VRAM remaining, all other
configs default (i.e. dynamic memory limits)
- [x] CUDA, `enable_partial_loading=false`, all other configs default
(i.e. dynamic memory limits)
- [x] CUDA, ram/vram limits set (these should take precedence over the
dynamic limits)
- [x] MPS, all other default (i.e. dynamic memory limits)
- [x] CPU, all other default (i.e. dynamic memory limits)
## Merge Plan
- [x] Merge #7505 first and change target branch to main
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR adds support for partial loading of models onto the GPU. This
enables models to run with much lower peak VRAM requirements (e.g. full
FLUX dev with 8GB of VRAM).
The partial loading feature is enabled behind a new config flag:
`enable_partial_loading=True`. This flag defaults to `False`.
**Note about performance:**
The `ram` and `vram` config limits are still applied when
`enable_partial_loading=True` is set. This can result in significant
slowdowns compared to the 'old' behaviour. Consider the case where the
VRAM limit is set to `vram=0.75` (GB) and we are trying to run an 8GB
model. When `enable_partial_loading=False`, we attempt to load the
entire model into VRAM, and if it fits (no OOM error) then it will run
at full speed. When `enable_partial_loading=True`, since we have the
option to partially load the model we will only load 0.75 GB into VRAM
and leave the remaining 7.25 GB in RAM. This will cause inference to be
much slower than before. To workaround this, it is important that your
`ram` and `vram` configs are carefully tuned. In a future PR, we will
add the ability to dynamically set the RAM/VRAM limits based on the
available memory / VRAM.
## Related Issues / Discussions
- #7492
- #7494
- #7500
## QA Instructions
Tests with `enable_partial_loading=True`, `vram=2`, on CUDA device:
For all tests, we expect model memory to stay below 2 GB. Peak working
memory will be higher.
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests
Tests with `enable_partial_loading=True`, and hack to force all models
to load 10%, on CUDA device:
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests
Tests with `enable_partial_loading=False`, `vram=30`:
We expect no change in behaviour when `enable_partial_loading=False`.
- [x] SD1 inference
- [x] SDXL inference
- [x] FLUX non-quantized inference
- [x] FLUX GGML-quantized inference
- [x] FLUX BnB quantized inference
- [x] Variety of ControlNet / IP-Adapter / LoRA smoke tests
Other platforms:
- [x] No change in behavior on MPS, even if
`enable_partial_loading=True`.
- [x] No change in behavior on CPU-only systems, even if
`enable_partial_loading=True`.
## Merge Plan
- [x] Merge #7500 first, and change the target branch to main
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This is an unplanned fix between PR3 and PR4 in the sequence of partial
loading (i.e. low-VRAM) PRs. This PR restores the 'Current Workaround'
documented in https://github.com/invoke-ai/InvokeAI/issues/7513. In
other words, to work around a flaw in the model cache API, this fix
allows models to be loaded into VRAM _even if_ they have been dropped
from the RAM cache.
This PR also adds an info log each time that this workaround is hit. In
a future PR (#7509), we will eliminate the places in the application
code that are capable of triggering this condition.
## Related Issues / Discussions
- #7492
- #7494
- #7500
- https://github.com/invoke-ai/InvokeAI/issues/7513
## QA Instructions
- Set RAM cache limit to a small value. E.g. `ram: 4`
- Run FLUX text-to-image with the full T5 encoder, which exceeds 4GB.
This will trigger the error condition.
- Before the fix, this test configuration would cause a `KeyError`.
After the fix, we should see an info-level log explaining that the
condition was hit, but that generation should continue successfully.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Previously, we didn't differentiate between model install errors for different types of model install sources, resulting in a buggy UX:
- If a HF model install failed, but it was a HF URL install and not a repo id install, the link to the HF model page was incorrect.
- If a non-HF URL install (e.g. civitai) failed, we treated it as a HF URL install. In this case, if the user's HF token was invalid or unset, we directed the user to set it. If the HF token was valid, we displayed an empty red toast. If it's not a HF URL install, then of course neither of these are correct.
Also, the logic for handling the toasts was a bit complicated.
This change does a few things:
- Consolidate the model install error toasts into one place - the socket.io event handler for the model install error event. There is no more global state for the toasts and there are no hooks managing them.
- Handling the different cases for errors, including all combinations of HF/non-HF and unauthorized/forbidden/unknown.
This is required to fix an issue with the MM UI's error handling.
Previously, we only included the model source as a string. That could be an arbitrary URL, file path or HF repo id, but the frontend has no parsing logic to differentiate between these different model sources.
Without access to the type of model source, it is difficult to determine how the user should proceed. For example, if it's HF URL with an HTTP unauthorized error, we should direct the user to log in to HF. But if it's a civitai URL with the same error, we should not direct the user to HF.
There are a variety of related edge cases.
With this change, the full `ModelSource` object is included in each model install event, including error events.
I had to fix some circular import issues, hence the import changes to files other than `events_common.py`.
## Summary
This PR is the third in a sequence of PRs working towards support for
partial loading of models onto the compute device (for low-VRAM
operation). This PR updates the LoRA patching code so that the following
features can cooperate fully:
- Partial loading of weights onto the GPU
- Quantized layers / weights
- Model patches (e.g. LoRA)
Note that this PR does not yet enable partial loading. It adds support
in the model patching code so that partial loading can be enabled in a
future PR.
## Technical Design Decisions
The layer patching logic has been integrated into the custom layers (via
`CustomModuleMixin`) rather than keeping it in a separate set of wrapper
layers, as before. This has the following advantages:
- It makes it easier to calculate the modified weights on the fly and
then reuse the normal forward() logic.
- In the future, it makes it possible to pass original parameters that
have been cast to the device down to the LoRA calculation without having
to re-cast (but the current implementation hasn't fully taken advantage
of this yet).
## Know Limitations
1. I haven't fully solved device management for patch types that require
the original layer value to calculate the patch. These aren't very
common, and are not compatible with some quantized layers, so leaving
this for future if there's demand.
2. There is a small speed regression for models that have CPU
bottlenecks. This seems to be caused by slightly slower method
resolution on the custom layers sub-classes. The regression does not
show up on larger models, like FLUX, that are almost entirely
GPU-limited. I think this small regression is tolerable, but if we
decide that it's not, then the slowdown can easily be reclaimed by
optimizing other CPU operations (e.g. if we only sent every 2nd progress
image, we'd see a much more significant speedup).
## Related Issues / Discussions
- https://github.com/invoke-ai/InvokeAI/pull/7492
- https://github.com/invoke-ai/InvokeAI/pull/7494
## QA Instructions
Speed tests:
- Vanilla SD1 speed regression
- Before: 3.156s (8.78 it/s)
- After: 3.54s (8.35 it/s)
- Vanilla SDXL speed regression
- Before: 6.23s (4.46 it/s)
- After: 6.45s (4.31 it/s)
- Vanilla FLUX speed regression
- Before: 12.02s (2.27 it/s)
- After: 11.91s (2.29 it/s)
LoRA tests with default configuration:
- [x] SD1: A handful of LoRA variants
- [x] SDXL: A handful of LoRA variants
- [x] flux non-quantized: multiple lora variants
- [x] flux bnb-quantized: multiple lora variants
- [x] flux ggml-quantized: muliple lora variants
- [x] flux non-quantized: FLUX control LoRA
- [x] flux bnb-quantized: FLUX control LoRA
- [x] flux ggml-quantized: FLUX control LoRA
LoRA tests with sidecar patching forced:
- [x] SD1: A handful of LoRA variants
- [x] SDXL: A handful of LoRA variants
- [x] flux non-quantized: multiple lora variants
- [x] flux bnb-quantized: multiple lora variants
- [x] flux ggml-quantized: muliple lora variants
- [x] flux non-quantized: FLUX control LoRA
- [x] flux bnb-quantized: FLUX control LoRA
- [x] flux ggml-quantized: FLUX control LoRA
Other:
- [x] Smoke testing of IP-Adapter, ControlNet
All tests repeated on:
- [x] cuda
- [x] cpu (only test SD1, because larger models are prohibitively slow)
- [x] mps (skipped FLUX tests, because my Mac doesn't have enough memory
to run them in a reasonable amount of time)
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR adds utilities to support partial loading of models from CPU to
GPU. The new utilities are not yet being used by the ModelCache, so
there should be no functional behavior changes in this PR.
Detailed changes:
- Add autocast modules that are designed to wrap common
`torch.nn.Module`s and enable them to run with automatic device casting.
E.g. a linear layer on the CPU can be executed with an input tensor on
the GPU by streaming the weights to the GPU at runtime.
- Add unit tests for the aforementioned autocast modules to verify that
they work for all supported quantization formats (GGUF, BnB NF4, BnB
LLM.int8()).
- Add `CachedModelWithPartialLoad` and `CachedModelOnlyFullLoad` classes
to manage partial loading at the model level.
## Alternative Implementations
Several options were explored for supporting inference on
partially-loaded models. The pros/cons of the explored options are
summarized here for reference. In the end, wrapper modules were selected
as the best overall solution for our use case.
Option 1: Re-implement the .forward() methods of modules to add support
for device conversions
- This is the option implemented in this PR.
- This approach is the most manual of the three, but as a result offers
the broadest compatibility with unusual model types. It is manual in
that we have to explicitly add support for all module types that we wish
to support. Fortunately, the list of foundational module types is
relatively small (e.g. the current set of implemented layers covers all
but 0.04 MB of the full FLUX model.).
Option 2: Implement a custom Tensor type that casts tensors to a
`target_device` each time the tensor is used
- This approach has the nice property that it is injected at the tensor
level, and the model does not need to be modified in any way.
- One challenge with this approach is handling interactions with other
custom tensor types (e.g. GGMLTensor). This problem is solvable, but
definitely introduces a layer of complexity. (There are likely to also
be some similar issues with interactions with the BnB quantization, but
I didn't get as far as testing BnB.)
Option 3: Override the `__torch_function__` dispatch calls globally and
cast all params to the execution device.
- This approach is nice and simple: just apply a global context manager
and all operations will happen on the compute device regardless of the
device of the participating tensors.
- Challenges:
- Overriding the `__torch_function__` dispatch calls introduces some
overhead even if the tensors are already on the correct device.
- It is difficult to manage the autocasting context manager. E.g. it is
tempting to apply it to the model's `.forward(...)` method, but we use
some models with non-standard entrypoints. And we don't want to end up
with nested autocasting context managers.
- BnB applies quantization side effects when a param is moved to the GPU
- this interacts in unexpected ways with a global context manager.
## QA Instructions
Most of the changes in this PR should not impact active code, and thus
should not cause any changes to behavior. The main risks come from
bumping the bitsandbytes dependency and some minor modifications to the
bitsandbytes quantization code.
- [x] Regression test bitsandbytes NF4 quantization
- [x] Regression test bitsandbytes LLM.int8() quantization
- [x] Regression test on MacOS (to ensure that there are no lingering
bitsandbytes import errors)
I also tested the new utilities for inference on full models in another
branch to validate that there were not major issues. This functionality
will be tested more thoroughly in a future PR.
## Merge Plan
- [x] #7492 should be merged first so that the target branch can be
updated to main.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR tidies up the model cache code in preparation for further
refactoring to support partial loading of models onto the GPU. **These
code changes should not change the functional behavior in any way.**
Changes:
- Remove the `ModelCacheBase` class. `ModelCache` is the only
implementation, so there is no benefit to the separate abstract class.
- Split `CacheRecord` and `CacheStats` out into their own files.
- Remove the `ModelLocker` class. This extra layer of indirection was
not providing any benefit. Locking is now done directly with the
`ModelCache`.
- Tidy up relative imports that were contributing to circular import
issues.
- Pull the 'submodel' concern out of the `ModelCache`. The `ModelCache`
should not need to be aware of the model manager submodel system.
- Delete unused properties from the `ModelCache` (e.g.
`.lazy_offloading`, `.storage_device`, etc.)
## QA Instructions
I ran smoke tests with a variety of SD1, SDXL and FLUX models. No change
to behavior is expected.
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Uvicorn's logging is rather verbose. This change adds a `log_level_network` config setting to independently control uvicorn's log outputs. The setting defaults to warning.
The change hides the helpful startup message that says the host and port we are running on.
For example: `Uvicorn running on http://0.0.0.0:9090 (Press CTRL+C to quit`
The ASGI lifespan handler is updated to log an equivalent message on startup, regardless of log level settings.
Besides being helpful, the launcher relies on a message like this to launch the app. So, previously, if the user set their log level to anything above info (e.g. warning or error), the launcher would fail to open the app. This change prevents that edge case.
Currently translated at 100.0% (1644 of 1644 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1643 of 1643 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1643 of 1643 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
## Summary
This PR refactors the LoRA handling code to enable the use of FLUX
control LoRAs on top of quantized transformers.
Changes:
- Renamed a bunch of the model patching utilities to reflect that they
are not LoRA-specific
- Improved the unit test coverage.
- Refactored the handling of 'sidecar' patch layers to make them work
with more layer patch types. (This was necessary to get FLUX control
LoRAs working on top of quantized models.)
- Removed `ONNXModelPatcher`. It is out-of-date and hasn't been used in
a while.
## QA Instructions
I completed the following tests.
**These should be repeated after changing the target branch to main.**
**Due to the large surface area of this PR, reviewers should do
regression tests on a range of LoRA formats. There is a risk of
regression on a specific format that was missed during the
refactoring.**
- [x] FLUX Control LoRA + full FLUX transformer
- [x] FLUX Control LoRA + BnB NF4 quantized transformer
- [x] FLUX Control LoRA + GGUF quantized transformer
- [x] FLUX Control LoRA + non-control LoRA + full FLUX transformer
- [x] FLUX Contro LoRA + non-control LoRA + BnB quantized transformer
- [x] FLUX Control LoRA + non-control LoRA + GGUF quantized transformer
- Test the following cases for regression:
- [x] Misc SD1/SDXL LoRA variants (LoRA, LoKr, IA3)
- [x] FLUX, non-quantized, variety of LoRA formats
- [x] FLUX, quantized, variety of LoRA formats
## Merge Plan
**_Don't merge this PR yet._**
Merge plan:
1. First merge brandon/flux-tools-loras into main
2. Change the target branch of this PR to main
3. Review / test / merge this PR
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
- Ensure the currently-rasterizing adapter is reset to `null` on success or failure of a rasterization operation. In case of failure, this prevents the UI from getting stuck with a disabled Invoke button and tooltip message "Canvas is busy (rasterizing)".
- Log the error if there is one.
## Summary
https://github.com/invoke-ai/InvokeAI/issues/7422
As reported in the above ticket, a recent FLUX performance improvement
caused a regression on MacOS. This PR reverts the offending part of the
change.
## Related Issues / Discussions
- Closes#7422
- Original perf improvement:
https://github.com/invoke-ai/InvokeAI/pull/7399
## QA Instructions
I don't have a Mac capable of running this test, so trusting the report
in #7422 that this fixes the problem.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
The `is` operator compares references, not values. Thanks to a wonderfully unintuitive quirk of python, `is` works on integers from `-5` to `256`, inclusive.
Whenever integers in this range are used for a value, internally python returns a reference to a stable object in memory. When integers outside this range are used as a value, python creates a new object in memory for that integer.
See `PyLong_FromLong` documentation here: https://docs.python.org/3/c-api/long.html
Tying this back to our session processor, we were using `is` to compare the queue item ids for equality. Our queue item ids start at 0, and each queue item created increments this by one. So this comparison works only for the first 256 queue items on the machine.
Starting with the 257th queue item, the comparison starts returning `False`, and cancelation gets weird.
Easy fix - use `!=` instead of `is not`.
The "adding to" text indicates if images are going to the gallery or staging area. This info is relevant only to the canvas tab, but was displayed on Upscaling and Workflows tabs. Removed it from those tabs.
A redux selector is used to get the "default" IP Adapter. The selector uses the model list query result to select an IP Adapter model to be preset by default.
The selector is memoized, so if we mutate the returned default IP Adapter state, it mutates the result of the selector for all consumers.
For example, the `image` property of the default IP Adapter selector result is `null`. When we set the `image` property of the selector result while creating an IP Adapter, this does not trigger the selector to recompute its result. We end up setting the image for the selector result directly, and all other consumers now have that same image set.
Solution - we need to clone the selector result everywhere it is used. This was missed in a few spots, causing the issue.
It was easy to misunderstand the empty state for a regional guidance reference image. There was no label, so it seemed like it was the whole region that was empty.
This small change adds the "Reference Image" heading to the empty state, so it's clear that the empty state messaging refers to this reference image, not the whole regional guidance layer.
## Summary
This PR adds support for regional prompting with FLUX.
### Example 1
Global prompt: `An architecture rendering of the reception area of a
corporate office with modern decor.`
<img width="1386" alt="image"
src="https://github.com/user-attachments/assets/c8169bdb-49a9-44bc-bd9e-58d98e09094b">

## QA Instructions
- [x] Test that there is no slowdown in the base case with a single
global prompt.
- [x] Test image fully covered by regional masks.
- [x] Test image covered by region masks with small gaps.
- [x] Test region masks with large unmasked ‘background’ regions
- [x] Test region masks with significant overlap
- [x] Test multiple global prompts.
- [x] Test no global prompt.
- [x] Test regional negative prompts (It runs... but results are not
great. Needs more tuning to be useful.)
- Test compatibility with:
- [x] ControlNet
- [x] LoRA
- [x] IP-Adapter
## Remaining TODO
- [x] Disable the following UI features for FLUX prompt regions:
negative prompts, reference images, auto-negative.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
These helpers consolidate layer validation checks. For example, checking that the layer has content drawn, is compatible with the selected main model, has valid reference images, etc.
There's a technical challenge with outputting these values directly. `ImageField` does not store them, so the batch's `ImageField` collection does not have width and height for each image.
In order to set up the batch and pass along width and height for each image, we'd need to make a network request for each image when the user clicks Invoke. It would often be cached, but this will eventually create a scaling issue and poor user experience.
As a very simple workaround, users can output the batch image output into an `Image Primitive` node to access the width and height.
This change is implemented by adding some simple special handling when parsing the output fields for the `image_batch` node.
I'll keep this situation in mind when extending the batching system to other field types.
- Split up logic to determine reason why the user cannot invoke for each tab.
- Fix issue where the workflows tab would show reasons related to canvas/upscale tab. The tooltip now only shows information relevant to the current tab.
- Add calculation for batch size to the queue count prediction.
- Use a constant for the enqueue mutation's fixed cache key, instead of a string. Just some typo protection.
Currently translated at 42.3% (672 of 1588 strings)
translationBot(ui): update translation (Spanish)
Currently translated at 28.0% (445 of 1588 strings)
Co-authored-by: gallegonovato <fran-carro@hotmail.es>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/es/
Translation: InvokeAI/Web UI
- Add special handling for `ImageBatchInvocation`
- Add input component for image collections, supporting multi-image upload and dnd
- Minor rework of some hooks for accessing node data
The canvas react components pass canvas entity identifiers around, then redux selectors are used to access that entity. This is good for perf - entity states may rapidly change. Passing only the identifiers allows components and other logic to have more granular state updates.
Unfortunately, this design opens the possibility for for an entity identifier to point to an entity that does not exist.
To get around this, I had created a redux selector `selectEntityOrThrow` for canvas entities. As the name implies, it throws if the entity is not found.
While it prevents components/hooks from needing to deal with missing entities, it results in mysterious errors if an entity is missing. Without sourcemaps, it's very difficult to determine what component or hook couldn't find the entity.
Refactoring the app to not depend on this behaviour is tricky. We could pass the entity state around directly as a prop or via context, but as mentioned, this could cause performance issues with rapidly changing entities.
As a workaround, I've made two changes:
- `<CanvasEntityStateGate/>` is a component that takes an entity identifier, returning its children if the entity state exists, or null if not. This component is wraps every usage of `selectEntityOrThrow`. Theoretically, this should prevent the entity not found errors.
- Add a `caller: string` arg to `selectEntityOrThrow`. This string is now added to the error message when the assertion fails, so we can more easily track the source of the errors.
In the future we can work out a way to not use this throwing selector and retain perf. The app has changed quite a bit since that selector was created - so we may not have to worry about perf at all.
When we added more progress events during generation, we indirectly broke the logic that controls when the progress bar throbs.
Co-authored-by: Mary Hipp Rogers <maryhipp@gmail.com>
Currently translated at 33.6% (533 of 1583 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 30.3% (481 of 1583 strings)
Co-authored-by: Gohsuke Shimada <ghoskay@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Currently translated at 17.6% (278 of 1575 strings)
translationBot(ui): update translation (Spanish)
Currently translated at 17.3% (274 of 1575 strings)
Co-authored-by: gallegonovato <fran-carro@hotmail.es>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/es/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1581 of 1581 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1576 of 1576 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1575 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 85.0% (1340 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 78.7% (1240 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 73.1% (1152 of 1575 strings)
translationBot(ui): update translation (English)
Currently translated at 99.9% (1574 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 57.9% (913 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 37.0% (584 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 3.2% (51 of 1575 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 3.2% (51 of 1575 strings)
Co-authored-by: Linos <tt250208@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/en/
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Currently translated at 79.9% (1266 of 1583 strings)
translationBot(ui): update translation (Chinese (Simplified Han script))
Currently translated at 74.4% (1171 of 1573 strings)
Co-authored-by: aidawanglion <youjayjeel@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/zh_Hans/
Translation: InvokeAI/Web UI
Currently translated at 99.6% (1569 of 1575 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.4% (1567 of 1575 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.4% (1565 of 1573 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Turns out a gallery image's `imageDTO` object can actually be a different object by reference. I thought this was not possible thanks to how we have a quasi-normalized cache.
Need to check against image name instead of reference equality when deciding whether or not to use the single image or the gallery selection for the dnd payload.
Rework uploadImage and uploadImages helpers and the RTK listener, ensuring gallery view isn't changed unexpectedly and preventing extraneous toasts.
Fix staging area save to gallery button to essentially make a copy of the image, instead of changing its intermediate status.
- New name: "Output only Generated Regions"
- New default: true (this was the intention, but at some point the behaviour of the setting was inverted without the default being changed)
The styling in gallery for selected vs hovered was very similar, leading users to think that the hovered image was also selected.
Reducing the borders for hovered images to a single pixel makes it easier to distinguish between selected and hovered.
- Tweak layout/styling of alerts for consistent spacing
- Add percentage to message if it has percentage
- Only show events if the destination is canvas (so workflows events are hidden for example)
- Pass in the `UtilInterface` to the `ModelsInterface` so we can call the simple `signal_progress` method instead of the complicated `emit_invocation_progress` method.
- Only emit load events when starting to load - not after.
- Add more detail to the messages, like submodel type
## Summary
Add support for SD3 image-to-image and inpainting. Similar to FLUX, the
implementation supports fractional denoise_start/denoise_end for more
fine-grained denoise strength control, and a gradient mask adjustment
schedule for smoother inpainting seams.
## Example
Workflow
<img width="1016" alt="image"
src="https://github.com/user-attachments/assets/ee598d77-be80-4ca7-9355-c3cbefa2ef43">
Result

## QA Instructions
- [x] Regression test of text-to-image
- [x] Test image-to-image without mask
- [x] Test that adjusting denoising_start allows fine-grained control of
amount of change in image-to-image
- [x] Test inpainting with mask
- [x] Smoke test SD1, SDXL, FLUX image-to-image to make sure there was
no regression with the frontend changes.
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
The Flux VAE, like many VAEs, is broken if run using float16 inputs
returning black images due to NaNs
This will fix the issue by forcing the VAE to run in bfloat16 or float32
were compatible
## Related Issues / Discussions
Fix for issue https://github.com/invoke-ai/InvokeAI/issues/7208
## QA Instructions
Tested on MacOS, VAE works with float16 in the invoke.yaml and left to
default.
I also briefly forced it down the float32 route to check that to.
Needs testing on CUDA / ROCm
## Merge Plan
It should be a straight forward merge,
When an unsupported model architecture is selected, show that warning only, without the extra warnings (i.e. no "missing tile controlnet" warning)
Update Invoke tooltip warnings accordingly
Closes#7239Closes#7177
- Add `withToast` flag to `uploadImage` util
- Skip the toast if this is not set
- Use the flag to disable toasts when canvas does internal image-uploading stuff that should be invisible to user
We don't need a "dnd" image system. We need a "image action" system. We need to execute specific flows with images from various "origins":
- internal dnd e.g. from gallery
- external dnd e.g. user drags an image file into the browser
- direct file upload e.g. user clicks an upload button
- some other internal app button e.g. a context menu
The actions are now generalized to better support these various use-cases.
## Summary
Nodes to support SD3.5 txt2img generations
* adds SD3.5 to starter models
* adds default workflow for SD3.5 txt2img
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
In a8de6406c5 a change was made to many menus in an effort to improve performance. The menus were made to be lazy, so that they are mounted only while open.
This causes unexpected behaviour when there is some logic in the menu that may need to execute after the user selects a menu item.
In this case, when you click to load a workflow from file, the file picker opens but then the menuitem unmounts, taking the input element and all uploading logic with it. When you select a file, nothing happens because we've nuked the handlers by unmounting everything.
Easy fix - un-lazy-fy the menu.
Closes#7240
The validation on this node causes graph validation to valid. It must be validated _after_ instantiation.
Also, it was a bit too strict. The only case we explicitly do not handle is when both bboxes and points are provided. It's acceptable if neither are provided.
Closes#7248
When filtering, we use a listener to trigger processing the image whenever a filter setting changes. For example, if the user changes from canny to depth, and auto-process is enabled, we re-process the layer with new filter settings.
The filterer has a method to reset its ephemeral state. This includes the filter settings, so resetting the ephemeral state is expected to trigger processing of the filter.
When we exit filtering, we reset the ephemeral state before resetting everything else, like the listeners.
This can cause problem when we exit filtering. The sequence:
- Start filtering a layer.
- Auto-process the filter in response to starting the filter process.
- Change the filter settings.
- Auto-process the filter in response to the changed settings.
- Apply the filter.
- Exit filtering, first by resetting the ephemeral state.
- Auto-process the filter in response to the reset settings.*
- Finish exiting, including unsubscribing from listeners.
*Whoops! That last auto-process has now borked the layer's rendering by processing a filter when we shouldn't be processing a filter.
We need to first unsubscribe from listeners, so we don't react to that change to the filter settings and erroneously process the layer.
Also, add a check to the `processImmediate` method to prevent processing if that method is accidentally called without first starting the filterer.
The same issue could affect the segmenyanything module - same fixes are implemented there.
The root issue is the compositing cache. When we save the canvas to gallery, we need to first composite raster layers together and then upload the image.
The compositor makes extensive use of caching to reduce the number of images created and improve performance. There are two "layers" of caching:
1. Caching the composite canvas element, which is used both for uploading the canvas and for generation mode analysis.
2. Caching the uploaded composite canvas element as an image.
The combination of these caches allows for the various processes that require composite canvases to do minimal work.
But this causes a problem in this situation, because the user expects a new image to be uploaded when they click save to gallery.
For example, suppose we have already composited and uploaded the raster layer state for use in a generation. Then, we ask the compositor to save the canvas to gallery.
The compositor sees that we are requesting an image for the current canvas state, and instead of recompositing and uploading the image again, it just returns the cached image.
In this case, no image is uploaded and it the button does nothing.
We need to be able to opt out of the caching at some level, for certain actions. A `forceUpload` arg is added to the compositor's high-level `getCompositeImageDTO` method to do this.
When true, we ignore the uppermost caching layer (the uploaded image layer), but still use the lower caching layer (the canvas element layer). So we don't recompute the canvas element, but we do upload it as a new image to the server.
Previously, we cleared the canvas progress image when the canvas had no active generations. This allowed for a brief flash of canvas state between the last progress image for a given generation, and when the output image for that generation rendered. Here's the sequence:
- Progress images are received and rendered
- Generation completes - no active canvas generations
- Clear the progress image -> canvas layers visible unexpectedly, creating an awkward jarring change
- Generation output image is rendered -> output image overlaid on canvas layers
In 83538c4b2b I attempted to fix this by only clearing the progress image while we were not staging.
This isn't quite right, though. We are often staging with no active generations - for example, you have a few images completed and are waiting to choose one.
In this situation, if you cancel a pending generation, the logic to clear the progress image doesn't fire because it sees staging is in progress.
What we really need is:
- Staging area module clears the progress image once it has rendered an output image.
- Progress image module clears the progress image when a generation is canceled or failed, in which case there will be no output image.
To do this, we can add an event listener to the progress image module to listen for queue item status changes, and when we get a cancelation or failure, clear the progress image.
pip's dependency resolution doesn't take into account transitive
dependencies when choosing package versions for download.
Even though `torch=~2.4.1` is required by `diffusers`, pip will
download 2.5.0 and higher, but only install 2.4.1.
Pinning torch to <2.5.0 prevents this behaviour.
## Summary
This change mimics the unet padding strategy to align T2I featuremaps
with the latents during denoising. It also slightly adjusts the crop and
scale logic so that the control will match the input image without
shifting when it needs to pad.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
Image generated at 1032x1024

Image generated at 1080x1040 to prove feature alignment.

Edge artifacts on the bottom and right are a result of SDXL's unet
padding, and t2i influence will be cut off in those regions.
## Merge Plan
Contingent on #7205
Currently the Canvas UI prevents users from generating non-64
resolutions while t2i adapter layers are active. Will leave this as a
draft until fixing that.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
Previously we maintained an `isInteractable` flag, which was derived from these layer flags:
- Locked/unlocked
- Enabled/disabled
- Layer's type visible/hidden
When a layer was not interactable, we blocked all layer actions.
After comparing to the behaviour in Affinity and considering user feedback, I've loosened these restrictions while maintaining safety. First, some definitions.
There two kinds of layer actions - mutating actions and non-mutating actions.
- Mutating actions are drawing on the layer, cropping it, filtering it, converting it, etc. Anything that changes the layer.
- Non-mutating actions are copying the layer, saving the layer to gallery, etc. Anything that _uses_ the layer.
Then, there are two broad canvas states - busy and not busy. "Busy" means the canvas is actively filtering, staging, compositing layers together, etc - something that is "single-threaded" by nature.
And here are the revised restrictions:
- When canvas is busy, you cannot initiate any layer actions.
- When the canvas is not busy, and the layer is locked, you initiate any mutating actions.
- When the canvas is not busy and the layer is not locked, you can initiate any layer action.
Besides safely giving users more freedom, it also fixes an issue where the context menu for a layer was disabled if it was not the selected layer.
- Add method to force a rebuild of the pydantic type adapter for the union of invocations, which is used to validate graphs.
- Update the xfail'd test.
Had missed several of these, which means we were invalidating caches far too often. For example, when you changed a RG prompt, we were invalidating the cached canvas for that entity, even though changing the prompt doesn't affect the canvas at all.
Previously, merge visible deleted all other visible layers. This is not how affinity works, I should have confirmed before making it work like this in the first place.Ï
`CanvasCompositorModule` had a fairly inflexible API, only supporting compositing all raster layers or inpaint masks.
The API has been generalized work with a list of canvas entities. This enables `Merge Down` and `Merge Selected` functionality (though `Merge Selected` is not part of this set of changes).
Let the parent module adopt the filtered/segemented image instead of destroying it and making the parent re-create it, which results in a brief flash of the parent layer's original objects before the new image is rendered.
We were scaling the unscaled image and mask down before doing the paste-back, but this adds an extraneous step & image output.
We can do the paste-back first, then scale to output size after. So instead of 2 resizes before the paste-back, we have 1 resize after.
The end result is the same.
- Restore dedicated `Apply` buttons
- Remove icons from the buttons, too much noise when the words are short and clear
- Update loading state to show a spinner next to the `Process` button instead of on _every_ button
A blue button is begging to be clicked, but clicking it will do nothing. Instead, we should communicate that no action is needed by disabling the button when the default settings are already in use.
Using `&&` will result in false negatives for settings where a falsy value might be valid. For example, any setting for which 0 is a valid number. To be on the safe side, just use an explicit null check on all values.
We use an in-memory cache for PIL images to reduce I/O. If a node mutates the image in any way, the cached image object is also updated (but the on-disk image file is not).
We've lucked out that this hasn't caused major issues in the past (well, maybe it has but we didn't understand them?) mainly because of a happy accident. When you call `context.images.get_pil` in a node, if you provide an image mode (e.g. `mode="RGB"`), we call `convert` on the image. This returns a copy. The node can do whatever it wants to that copy and nothing breaks.
However, when mode is not specified, we return the image directly. This is where we get in trouble - nodes that load the image like this, and then mutate the image, update the cache. Other nodes that reference that same image will now get the mutated version of it.
The fix is super simple - we make sure to return only copies from `get_pil`.
- Use a hash of the last processed points instead of a `hasProcessed` flag to determine whether or not we should re-process a given set of points.
- Store point coords in state instead of pulling them out of the konva node positions. This makes moving a point a more explicit action in code.
- Add a `roundCoord` util to round the x and y values of a coordinate.
- Ensure we always re-process when $points changes.
Realized we are doing a lot of event listening even when segmenting is not occuring. I don't think this will have a meaningful performance impact, but it makes sense to remove these listeners when not in use.
Fix an issue where if the input image is transparent in a region to be masked, that transparent region ends up opaque black. Need to respect the input image transparency by applying the mask to the alpha channel only.
Each version of torch is only available for specific versions of CUDA and ROCm.
The Invoke installer and dockerfile try to install torch 2.4.1 with ROCm 5.6
support, which does not exist. As a result, the installation falls back to the
default CUDA version so AMD GPUs aren't detected. This commits fixes that by
bumping the ROCm version to 6.1, as suggested by the PyTorch documentation. [1]
The specified CUDA version of 12.4 is still correct according to [1] so it does
need to be changed.
Closes#7006Closes#7146
[1]: https://pytorch.org/get-started/previous-versions/#v241
## Summary
This PR adds support for the XLabs IP-Adapter
(https://huggingface.co/XLabs-AI/flux-ip-adapter) in workflows. Linear
UI integration is coming in a follow-up PR. The XLabs IP-Adapter can be
installed in the Starter Models tab.
Usage tips:
- Use a `cfg_scale` value of 2.0 to 4.0
- Start with an IP-Adatper weight of ~0.6 and adjust from there.
- Set `cfg_scale_start_step = 1`
- Set `cfg_scale_end_step` to roughly the halfway point (it's
unnecessary to apply CFG to all steps, and this will improve processing
time).
Sample workflow:
<img width="976" alt="image"
src="https://github.com/user-attachments/assets/4627b459-7e5a-4703-80e7-f7575c5fce19">
Result:

## Related Issues / Discussions
Prerequisite: https://github.com/invoke-ai/InvokeAI/pull/7152
## Remaining TODO:
- [ ] Update default workflows.
## QA Instructions
- [x] Test basic happy path
- [x] Test with multiple IP-Adapters (it runs, but results aren't great)
- [ ] ~Test with multiple images to a single IP-Adapter~ (this is not
supported for now)
- [ ] Test automatic runtime installation of CLIP-L, CLIP-H, and CLIP-G
image encoder models if they are not already installed.
- [ ] Test starter model installation of the XLabs FLUX IP-Adapter
- [ ] Test SD and SDXL IP-Adapters for regression.
- [ ] Check peak memory utilization.
## Merge Plan
- [ ] Merge #7152
- [ ] Change target branch to main
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Add support for Classifier-Free Guidance with FLUX.
- Using CFG doubles the time for the denoising process. Running both the
positive and negative conditioning in a single batch is left for future
work, because most users are already VRAM-constrained (this would
probably be faster at the cost of higher peak VRAM).
- Negative text conditioning is optional and only required if `cfg_scale
!= 1.0`
- CFG is skipped if `cfg_scale == 1.0` (i.e. no compute overhead in this
case)
- `cfg_scale_start_step` and `cfg_scale_end_step` can be used to easily
control the range of steps that CFG is applied for.
- CFG is a prerequisite for IP-Adapter support.
## Example
Positive Caption: `Professional photography of a luxury hotel in the
Nevada desert`
CFG: 1.0

Positive Caption: `Professional photography of a luxury hotel in the
Nevada desert`
Negative Caption: `Swimming pool`
CFG: 2.0
Same seed

## QA Instructions
- [ ] Test interactions with ControlNet
- [ ] Verify that peak RAM/VRAM utilization has not increased
significantly
- [ ] Test that CFG is skipped when cfg_scale == 1.0
- [ ] Test that negative text conditioning can be omitted when cfg_scale
== 1.0
- [ ] Test that a clear error message is returned when negative text
conditioning is omitted when cfg_scale != 1.0
- [ ] Test that the negative text prompt gets applied when cfg_scale
>1.0
- [ ] Test that a collection of cfg_scale values can be provided for
per-step control.
- [ ] Test that `cfg_scale_start_step` and `cfg_scale_end_step` control
the range of steps that CFG is applied
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
Introduce two-stage logging configuration and overrides for enabled status, log level and log namespaces.
The first stage in `<InvokeAIUI />`, before we set up redux (and therefore before we have access to the user's configured logging setup). In this stage, we use the overrides or default values.
The second stage is in `<App />`, after we set up redux, via `useSyncLoggingConfig`. In this stage, we use the overrides or the user's configured logging setup. This hook also handles pushing changes made by the user into localstorage.
Other changes:
- Extract logging config to util function
- Remove the `useEffect` from `SettingsModal` that was changing the logging settings
- Remove extraneous log effects from `useLogger`
- Export new `LoggingOverrides` type
While troubleshooting an issue with this middleware, I found the inclusion of the nextState and diff to be very noisy. It's now a function that accepts some options to configure the output, and returns the middleware.
We can use the drop overlay component directly for this, without needing to add it as a `noop` dnd target.
Other changes:
- The `label` prop is now used to conditionally render the label - every drop target provides its own label, so this doesn't break anything.
- Add `withBackdrop` prop to control whether we apply the dimmed drop target effect.
Instead of providing a duration to the upload action, we close the toast imperatively in the `imageUploaded` listener using a timeout. 3s after the last upload toast, we close it.
This handles the case when we are uploading multiple images and don't want the toast to close til it's all finished.
Currently translated at 98.7% (1476 of 1494 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1476 of 1493 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1474 of 1491 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
To trigger the edge case:
- Have an empty layer and non-empty layer
- Select the non-empty layer
- Refresh the page
- Select to the empty layer without doing any other action
- You may be unable to draw on the layer
- Zoom in/out slightly
- You can now draw on it
The problem was not syncing visibility when a layer is selected, leaving the layer hidden. This indirectly disabled interactions.
The fix is to listen for changes to the layer's selected status and sync visibility when that changes.
We were:
- Incrementing `addedControlNets` or `addedT2IAdapters`
- Attempting to add it, but maybe failing and skipping
Need to swap the order of operations to prevent misreporting of added cnet/t2i.
I don't think this would ever actually cause problems.
## Summary
Add support for FLUX ControlNet models (XLabs and InstantX).
## QA Instructions
- [x] SD1 and SDXL ControlNets, since the ModelLoaderRegistry calls were
changed.
- [x] Single Xlabs controlnet
- [x] Single InstantX union controlnet
- [x] Single InstantX controlnet
- [x] Single Shakker Labs Union controlnet
- [x] Multiple controlnets
- [x] Weight, start, end params all work as expected
- [x] Can be used with image-to-image and inpainting.
- [x] Clear error message if no VAE is passed when using InstantX
controlnet.
- [x] Install InstantX ControlNet in diffusers format from HF repo
(`InstantX/FLUX.1-dev-Controlnet-Union`)
- [x] Test all FLUX ControlNet starter models
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
This replicates the img2img flow:
- Reset the canvas
- Resize the bbox to the image's aspect ratio at the optimal size for the selected model
- Add the image as a raster layer
- Resizes the layer to fit the bbox using the 'fill' strategy
After this completes, the user can immediately click Invoke and it will do img2img.
If an entity needs to do something after init, it can use this system. For example, if a layer should be transformed immediately after initializing, it can use an init callback.
This feature involves a certain amount of extra work to ensure stroke and fill with partial opacity render correctly together. However, none of our shapes actually use that combination of attributes, so we can disable this for a minor perf boost.
Instead of pulling the preview canvas from the konva internals, use the canvas created for bbox calculations as the preview canvas.
This doesn't change perf characteristics, because we were already creating this canvas. It just means we don't need to dip into the konva internals.
It fixes an issue where the layer preview didn't update or show when a layer is disabled or otherwise hidden.
- When resetting workflows, retain the current mode state
- Remove the useEffect that reacted to the `isCleanEditor` flag to prevent getting menu getting locked open
This could be triggered by transforming a layer, undoing, then transforming again. The simple fix is to ignore the rasterization cache for all transforms.
There's a Konva bug where `pointerenter` & `pointerleave` events aren't fired correctly on the stage.
In 87fdea4cc6 I made a change that surfaced this bug, breaking touch and Apple Pencil interactions, because the cursor position doesn't get updated.
Simple fix - ensure we update the cursor on `pointerdown` events, even though we shouldn't need to.
Will make a bug report upstream
- Set an empty title to prevent browsers from showing "Please match the requested format." when hovering the number input
- Fix issue w/ `z-index` that prevented the popover button from being clicked while the input was focused
Currently translated at 98.7% (1461 of 1479 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1460 of 1479 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1458 of 1479 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1459 of 1477 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1453 of 1471 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
* add tooltips for images/assets tabs
* add icon by board name that can be used to activate editable
* update getting started text
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Currently translated at 98.7% (1453 of 1471 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1453 of 1471 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1452 of 1471 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
- Reverts the `onClick -> onPointerUp` changes, which fixed Apple Pencil interactions of buttons with tooltips but broke things in other subtle ways.
- Adds a default `openDelay` on tooltips of 500ms. This is another way to fix Apple Pencil interactions, and according to some searching online, is the best practice for tooltips anyways. The default behaviour should be for there to be a delay, and only in specific circumstances should there be no delay. So we'll see how this is received.
The color picker take some time to sample the color from the canvas state. This could cause a race condition where the cursor position changes between the time sampling starts, resulting in the picker showing the wrong color. Sometimes it picks up the color picker tool preview!
To resolve this, the color picker's color syncing is now throttled to once per animation frame. Besides fixing the incorrect color issue, it improves the perf substantially by reducing number of samples we take.
- Record both absolute and relative positions
- Use simpler method to get relative position
- Generalize getColorUnderCursor to be getColorAtCoordinate
We just changed all buttons to use `onPointerUp` events to fix Apple Pencil behaviour. This, plus the specific DOM layout of boards, resulted in the `onPointerUp` being triggered on a board before the drop triggered.
The app saw this as selecting the board, which then reset the gallery selection to the first image in the board. By the time you drop, the gallery selection had reset.
DOM layout slightly altered to work around this.
Currently translated at 45.4% (668 of 1470 strings)
translationBot(ui): update translation (French)
Currently translated at 33.1% (488 of 1470 strings)
translationBot(ui): update translation (French)
Currently translated at 32.5% (479 of 1470 strings)
translationBot(ui): update translation (French)
Currently translated at 30.7% (449 of 1458 strings)
translationBot(ui): update translation (French)
Currently translated at 30.2% (442 of 1460 strings)
Co-authored-by: Thomas Bolteau <thomas.bolteau50@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/fr/
Translation: InvokeAI/Web UI
## Summary
#6890 bumped torch, which caused an incompatibility with xformers when
installing with `pip install ".[xformers]"`. This PR bumps xformers.
## QA Instructions
I ran some smoke tests to confirm that generating with xformers still
works.
In my tests on an A100, there is a performance regression after bumping
xformers (2.7 it/s vs 3.2 it/s). I think it is ok to ignore this for
A100s, since users should be using torch-sdp, which is much faster (4.3
it/s). But, we should test for regression on older cards where xformers
is still recommended.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
A new "session" just means to reset most settings to default values, excluding model.
There are a few things that need to be reset:
- Parameters state, except for models and things dependent on model selection (like VAE precision)
- Canvas state, except for the `modelBase`, which is dependent on the model selection
- Canvas staging area state
- LoRAs state
- HRF state
- Style presets state
We also select the canvas tab.
For new gallery sessions, we:
- Open the image viewer
- Set the right panel tab to `gallery`
And for new canvas sessions, we:
- Close the image viewer
- Set the right panel tab to `layers`
Currently translated at 24.1% (351 of 1452 strings)
translationBot(ui): update translation (French)
Currently translated at 17.9% (261 of 1452 strings)
translationBot(ui): update translation (French)
Currently translated at 17.8% (259 of 1452 strings)
translationBot(ui): update translation (French)
Currently translated at 17.5% (255 of 1452 strings)
translationBot(ui): update translation (French)
Currently translated at 10.3% (150 of 1452 strings)
Co-authored-by: Thomas Bolteau <thomas.bolteau50@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/fr/
Translation: InvokeAI/Web UI
Currently translated at 62.0% (901 of 1452 strings)
translationBot(ui): update translation (German)
Currently translated at 56.4% (819 of 1452 strings)
translationBot(ui): update translation (German)
Currently translated at 53.8% (782 of 1452 strings)
Co-authored-by: Ettore Atalan <atalanttore@googlemail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
Currently translated at 56.4% (819 of 1452 strings)
translationBot(ui): update translation (German)
Currently translated at 53.8% (782 of 1452 strings)
translationBot(ui): update translation (German)
Currently translated at 45.3% (658 of 1451 strings)
Co-authored-by: B N <berndnieschalk@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
## Summary
This PR add support for FLUX LoRA models in kohya format with `lora_te1`
layers (i.e. CLIP LoRA layers). Previously, only transformer LoRA layers
were supported.
Example LoRA model in this format:
https://huggingface.co/cocktailpeanut/optimus
### Example
Prompt: `optimus is playing tennis in a tennis court`
Seed: 0
Without LoRA:

With LoRA:

## QA Instructions
I tested the following:
- [x] The optimus LoRA (with CLIP layers) can be applied.
- [x] FLUX LoRAs without CLIP layers still work
- [x] Loading the optimus LoRA, but applying it to the transformer
_only_ produces a different result. I.e. verified that patching the CLIP
layers is doing _something_. Ironically, the results seem better without
applying the CLIP layers. The CLIP layers seem to pull in more
background concepts. Regardless, it works.
- [x] The optimus LoRA can be applied via the Linear UI, and the output
matches results from manually constructing the workflow graph.
- [x] FLUX LoRAs without CLIP layers still work via the Linear UI.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
This pops up every now and then and I could never figure it out. A user figured it out in #6936. The cause is appending a query string to the app URL.
For example:
```sh
http://127.0.0.1:9090/?__theme=dark
```
The query string breaking the static file serving, which prevents our translations from loading correctly. Instead of the JSON translations, FastAPI sends the index HTML page. The UI then errors when attempting to parse the translation JSON.
The query string ?__theme=dark is used by Gradio to force dark mode. I believe the users with this issue are doing the same thing the user in #6936 did (just change the port number on an existing bookmark) or their browser history/bookmark includes the query string.
Though this is technically a user-caused problem (we cannot prevent the user from using a malformed URL), we can work around it. When query string is used on the root path, we can redirect the browser to the root path without the query string.
This is done via very simple middleware.
Closes#6696Closes#6817Closes#6828Closes#6936Closes#6983
`usePanel` started panels with a `minSize` and `defaultSize` of 0, which means collapsed. This causes panels to load as collapsed on the very first app load. Then, in the layout effect, we see the panel as collapsed and skip setting it to the correct size.
Reviewing the library's API, `minSize` and `defaultSize` should not be lower than 1. Thankfully, setting this to 1 also prevents the issue described above.
- `minSize` and `defaultSize` start at 1
- Return a sentinel value when converting percentages to pixels, if the panel's container has no size. When that happens, we should not update the `minSize` or `defaultSize`.
- Split observer callback into its own function, so that the exact same logic can be used on the first run of hte effect.
- Update prop names and docstrings to accurately reflect that the numerical values are in pixels
* restore send-to functionality
* lint
* feat(ui): add getImageMetadata helper
* feat(ui): updated usePreselectedImage logic
* fix(ui): race condition when creating & initializing canvas entity adapters
There was a race condition when the canvas was reset as it was initializing. This could occur when the "use preselected image" functionality was triggered.
It was possible to get an error (non-app-breaking) when attempting to initialize an entity:
1. Canvas initializes
2. Canvas starts creating and initializing all entities (this happens in `CanvasEntityRendererModule.render`)
3. Canvas is reset before that process finishes, clearing state
4. The method call from 2) attempts to initialize an entity that has been deleted from state and fails
Changes to fix this:
- Split `CanvasEntityRendererModule.render` into individual methods for each entity type, each with their own store subscription
- Do not `await` initialization after creating the entity adapter classes - let them initialize in the background
So the `render` method now completes very fast - quick enough that we don't run into this race condition.
It's possible that something will change in the future, and this race condition will come back. In that case, we could use mutexes in `CanvasEntityRendererModule` to prevent the failure condition. It's a bit more complicated to do that so I'm skipping it for now.
* feat(ui): export workflow library is open atom
* feat(ui): export image viewer atom
* tidy(ui): organise style presets menu state
* feat(ui): consolidate studio init actions
* build(ui): export type StudioInitAction
* feat(ui): add getStylePreset helper
* feat(ui): add toasts to useStudioInitAction
* tidy(ui): comment & minor cleanup for useStudioInitAction
* chore(ui): lint
* only show version when local
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
Simplify the handle component and use the provided data attributes to style the handles correctly.
Fixes a styling issue where you if you hover at the T-junction between two handles, only one brightens up.
This unused logic was unnecessarily complicating the hook. It also inadvertently made the default panel size arg a percentage value even if it was actually a pixel value.
Cleaned up a couple other little bits.
Only change the selection array when its contents have changed. This prevents unnecessary re-renders.
For example, if the selection is currently `[image1]` and we set it again to `[image1]`, while the array contains the same objects, it is a new array. This will trigger unncessary re-renders.
Selecting a board selects the image, and then we were selecting it again afterwards. So we programmatically select the newly generated image twice.
This can cause a race condition if the user changes image selection between when the two programmatic image selection actions. Their selection will be quickly overridden by the second programmatic selection action.
I broke this in dfac0292f4 due to misunderstanding of what the upscale model actually was. I thought it was a main model but actually its a spandrel model.
There's a situation in which the enqueue response comes after the graph actually executes. This was unexpected when I first wrote the logic. I suppose it has to do with the async endpoint handling.
- Update canvas slice's to track the current base model architecture instead of just the optimal dimension. This lets us derive both optimal dimension _and_ grid size for the currently selected model.
- Update all bbox size utilities to use derived grid size instead of hardcoded values of 8 or 64
- Review every damned instance of the number 8 in the whole frontend and update the ones that need to use the grid size
- Update the invoke button blocking logic to check against scaled bbox size, unless scaling is disabled.
- Update the invoke button blocking to say if it's width or height that is invalid and if its bbox or scaled, for both FLUX and the T2I adapter constraints
- Use consistent logic for all model type handlers
- Fix bug where we could select invalid upscaling models (not sure how this hadn't caused problems...)
- Add logging for each action
- Only reset models when there is a change to be made - skip dispatching actions when there would be no change made to state
Previously the setting was `showOnlyRasterLayersWhileStaging`. This has been renamed to `isolatedStagingPreview`. Works the same.
Also added `isolatedFilteringPreview` an `isolatedTransformingPreview`. These work the same way, but they isolate the current selected layer. There are toggles in the canvas settings popover _and_ the filter/transform popups (same setting).
We need to ensure the getQueueCountsByDestination query is sync'd, invalidating its tags as queue items complete. Unfortunately it's 2 extra network requests per queue item.
Also clean up some jank w/ the handling of accepting staging images - there was this no-op action & a listener for it... should just be a simple callback.
Both the vanilla and autoscale invocations report progress while processing each tile.
The autoscale version, which may run the spandrel model multiple times, also includes the current iteration.
Each of these was a bit off:
- The SD callback started at `-1` and ended at `i`. Combined w/ the weird math on the previous `calc_percentage` util, this caused the progress bar to never finish.
- The MultiDiffusion callback had the same problems as SD.
- The FLUX callback didn't emit a pre-denoising step 0 image. It also reported total_steps as 1 higher than the actual step count.
Each of these now emit the expected events to the frontend:
- The initial latents at 0%
- Progress at each step, ending at 100%
- Update the step callback methods in the invocation API to use the new signal_progress API
- Copy and update the `calc_percentage`, reducing special handling for step and total_steps - a followup commit will fix callers of the step callbacks
## Summary
This PR makes some improvements to the FLUX image-to-image and
inpainting behaviours.
Changes:
- Expand inpainting region at a cutoff timestep. This improves seam
coherence around inpainting regions.
- Add Trajectory Guidance to improve the ability to control how much an
image gets modified during image-to-image/inpainting (see the code for a
more technical explanation - it's well-documented).
## `trajectory_guidance_strength` Usage
- The `trajectory_guidance_strength` param has been added to the `FLUX
Denoise` invocation.
- `trajectory_guidance_strength` defaults to `0.0` and should be in the
range [0, 1].
- `trajectory_guidance_strength = 0.0` has no effect on the denoising
process.
- `trajectory_guidance_strength = 1.0` will guide strongly towards the
original image.
## FLUX image-to-image usage tips
- As always, prompt matters a lot.
- If you are trying to making minor perturbations to an image, use
vanilla image-to-image by setting the `denoising_start` param.
- If you are trying to make significant changes to an image, using
trajectory guidance will give more control than using vanilla
image-to-image. Set `denoising_start=0.0` and adjust
`trajectory_guidance_strength` to control the amount of change in the
image.
- The 'transition point' where the image changes the most as you adjust
`trajectory_guidance_strength` or `denoise_start` varies depending on
the noise. So, set a fixed noise seed, then tune those params.
## QA Instructions
- [x] Vanilla image-to-image - No change in output
- [x] Vanilla inpainting - No change in output
- [x] Vanilla outpainting - No change in output
- Trajectory Guidance image-to-image
- [x] TGS = 0.0 is identical to Vanilla case
- [x] TGS = 1.0 guides close to the original image
- Not as close as I'd like, but it's not broken.
- [x] Smooth transition as TGS varies
- [x] Smoke test: TGS with denoise_start > 0.0
- TG inpainting
- [x] TGS = 0.0 is identical to Vanilla case
- [x] TGS = 1.0 guides close to the original image
- Not as close as I'd like, but it's not broken
- [x] Smooth transition as TGS varies
- [x] Smoke test: TGS with denoise_start > 0.0
- TG outpainting
- [x] TGS = 0.0 is identical to Vanilla case
- [x] Smoke test TGS outpainting
- [x] Smoke test FLUX text-to-image
- [x] Preview images look ok for all of above.
## Known issues (will be addressed in follow-up PRs)
- The current TGS scale biases towards creating more change than desired
in the image. More tuning of the TG change schedule is required.
- TGS does not work very well for outpainting right now. This _might_ be
solvable, but more likely we'll just want to discourage it in the Linear
UI.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
If a FLUX dev model is selected, show icon and popover telling user
about its license for commercial use
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
This PR attempts to fix a flaky FLUX LoRA unit test.
Example test failure:
https://github.com/invoke-ai/InvokeAI/actions/runs/10958325913/job/30428299328?pr=6898
The failure _seems_ to be caused by a numerical precision error, but I
haven't been able to reproduce it locally. I have reduced the tolerance
of the offending comparison, and am pretty confident that this will
solve the issue.
## QA Instructions
No QA necessary.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
This can be used for nodes that Invoke uses internally. Internal nodes do not have API stability guarantees. For example, they may change if the needs of the linear UI change.
Two main changes:
- Add `runGraphAndReturnImageOutput` to `CanvasStateApiModule`. This method is a safe and convenient abstraction to execute a graph and retrieve the image output of one of its nodes. It supports cancellation (via an AbortSignal) and timeout.
- Update filters to build whole graphs, as opposed to nodes.
These changes allow:
- Filter execution is resilient, with all error cases handled (afaik)
- `CanvasEntityFilterer` class is much simpler
- Stuck or long-running filters may be canceled
- Filters may be arbitrarily complex - so long as there is one node that outputs an image, the filter will just work
- Rename util to `getImageDTOSafe`
- Update API to accept the same options as RTKQ's `initiate`
- Add `getImageDTO`; while `getImageDTOSafe` returns null if the image is not found, the new util throws
- Update usage of `getImageDTOSafe`
* wip
* more updates for new user experience
* pull whats new out
* use loading state
* lint
* fix(ui): translation missing period
* feat(ui): create icon component for invoke logo
* feat(ui): tweaked invoke logo colors
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
## Summary
There was an issue w/ the calculation causing an infinite loop but the
fixed algorithm from #6887 wasn't correct bc it doesn't take into
account the grid gap correctly. This then breaks arrow key navigation.
- Restore the previous calculation
- Bail out if the gallery elements don't have any width, which causes
the infinite loop - this part was missed when copying the logic from
GalleryImageGrid
## Related Issues / Discussions
n/a
## QA Instructions
shouldn't freeze
## Merge Plan
n/a
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
There was an issue w/ the calculation causing an infinite loop but the fixed algorithm wasn't correct bc it doesn't take into account the grid gap correctly. This then breaks arrow key navigation.
- Restore the previous calculation
- Bail out if the gallery elements don't have any width, which causes the infinite loop - this part was missed when copying the logic from GalleryImageGrid
- Allow `uploadImage` util to accept `metadata` to embed in the image
- Update compositor to support `metadata` field when uploading rasterized composite layer
- Add async zod refiner to `zImageWithDims` which fetches the image as part of validation
- Add `zServerValidatedModelIdentifierField`, a zod-refined version of `zModelIdentifierField` which fetches the model as part of validation
- Add `zCanvasMetadata` zod schema, which contains only canvas entities - no bbox, and no `isHidden` flags
- Renamed "Send to Canvas" -> "New Layer from Image"
- Added "New Canvas from Image"
This clarifies the purpose of the menu items and gives tablet users a way to easily add images tot he canvas.
Also update the verbiage for the alerts:
- "Sending to Canvas" -> "Staging Generations on Canvas"
- "Sending to Gallery" -> "Sending Generations to Gallery"
- Add buttons to zoom in/out
- Update hotkeys for fit & 100% to match affinity (e.g. ctrl+0, ctrl+1)
- Add hotkeys for 200%, 400%, 800%
- Update tooltips
This mirrors affinity/photoshop's default `d` hotkey, which sets the fg/bg to white/black. We don't have a concept of "background color", and white is more useful for control images, so it sets to white.
- Rework hotkey data to include the keys for each hotkey action.
- Add wrapper for `useHotkeys` that accepts a hotkey category and id. Automatically selects the key from the hotkey data.
- Add handling for macOS (cmd vs ctrl, option vs alt).
- Redo all hotkey descriptions, deleting nonexistant ones.
- Some `esc` hotkeys that just close whatever you are currently in are omitted due to their relative simplicity and intuitiveness.
This was caused by allowing the stage to be set to fractional coordinates. For example, the stage might be positioned at `x: 142.22255, y: 488.79`.
When positioned like this, the canvas will be slightly misaligned with its native pixel grid. The browser does its best, but this causes tiny scaling artifacts throughout the image. It's most noticeable where there is a sharp contrast.
This behaviour was introduced while troubleshooting an issue with degraded quality when saving canvas to gallery. Turned out the stage position was unrelated to that issue, but I didn't realize that the change would cause this other type of problem.
The fix is super simple - ensure we floor stage coords when setting the manually. Konva never sets the position to fractional coordinates itself. For example, while dragging the stage, Konva sets the stage coordiantes itself, and they are always integers.
## Summary
This PR adds support for FLUX LoRA models on both quantized and
non-quantized base models.
Supported formats:
- diffusers
- kohya
Full changelist:
- Consolidated LoRA handling code in `invokeai/backend/lora`
- Add support for FLUX kohya and FLUX diffusers LoRA model loading
- Add ability to either patch LoRAs or run as a sidecar model (the
latter enables LoRAs to be applied to a wide range of quantized models).
## QA Instructions
Note to reviewers: I tested everything in this checklist. Feel free to
re-verify any of this, but also test any LoRAs that you have. There are
many small LoRA format variations, and there's a risk of breaking one of
them with this change.
FLUX LoRA
- [x] Import / probe of kohya FLUX LoRA
(https://civitai.com/models/159333/pokemon-trainer-sprite-pixelart?modelVersionId=779247)
- [x] Import / probe of Diffusers FLUX LoRA
(https://civitai.com/models/200255/hands-xl-sd-15-flux1-dev?modelVersionId=781855)
- [x] kohya with non-quantized base model
- [x] kohya with quantized base model (should roughly match the
non-quantized case)
- [x] diffusers with non-quantized base model
- [x] diffusers with quantized base model (should roughly match the
non-quantized case)
- [x] Sidecar LoRA patching speed (<0.1secs after model is loaded)
- [x] Stacking multiple fused LoRA models (i.e. on top on non-quantized
model)
- [x] Stacking multiple sidecar LoRA models (i.e. on top of quantized
model)
Regression Tests
- [x] SD1.5 LoRA (check output, speed and memory)
- [x] SDXL LoRA (check output, speed and memory)
- [x] `USE_MODULAR_DENOISE=1` smoke test with LoRA
Test for output regression with the following LoRA formats:
- [x] LoRA
- [x] LoHA
- [x] LoKr
- [x] IA3
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- Canvas manages its own progress socket event listeners and progress event data.
- Remove cancellations listener jank.
- Dip into low-level redux subscription API to watch for queue status changes, clearing the last "global" progress event when the queue has nothing in progress. Could also do this in a useEffect I guess.
- Had to shuffle some things around to prevent circular imports, so there are a lot of tiny changes here.
- Remove queue front button. Hold shift while clicking `Invoke` button to queue front.
- Restore queue menu actions w/ the reclaimed space.
- Simplify queue interaction hooks.
The lineart model often outputs a lot of almost-black noise. SD1.5 ControlNets seem to be OK with this, but SDXL ControlNets are not - they need a cleaner map. 12 was experimentally determined to be a good threshold, eliminating all the noise while keeping the actual edges. Other approaches to thresholding may be better, for example stretching the contrast or removing noise.
I tried:
- Simple thresholding (as implemented here) - works fine.
- Adaptive thresholding - doesn't work, because the thresholding is done in the context of small blocks, while we want thresholding in the context of the whole image.
- Gamma adjustment - alters the white values too much. Hard to tuen.
- Contrast stretching, with and without pre-simple-thresholding - this allows us to treshold out the noise, then stretch everything above the threshold down to almost-zero. So you have a smoother gradient of lightness near zero. It works but it also stretches contrast near white down a bit, which is probably undesired.
In the end, simple thresholding works fine and is very simple.
The HTML Canvas context has an `imageSmoothingEnabled` property which defaults to `true`. This causes the browser canvas API to, well, apply image smoothing - everything gets antialiased when drawn.
This is, of course, problematic when our goal is to be pixel-perfect. When the same image is drawn multiple times, we get progressive image degradation.
In `CanvasEntityObjectRenderer.cloneObjectGroup()`, where we use Konva's `Node.cache()` method to create a canvas from the entity's objects. Here, we were not setting `imageSmoothingEnabled` to false. This method is used very often by the compositor and we end up feeding back antialiased versions of the image data back into the canvas or generation backend.
Disabling smoothing here appears to fix the issue. I've also disabled image smoothing everywhere else we interact with a canvas rendering context.
The checkerboard background was rendered as a separate DOM element that stretched to fill the canvas container.
While the canvas width and height are always integers, this background element could have non-integer dimensions, depending on panel sizes.As a result, it could be slightly larger than the canvas, introducing a fine border around the canvas.
This is purely a visual issue, but it's very noticeable when you use the bbox overlay. It also can be noticed with masks that extend beyond the edge of the visible canvas.
- Refactor the checkerboard background to be rendered by the canvas instead of as a DOM element, resolving the issue.
- Add a helper method to get the scaled rect of the stage, updating a few places where we need such a rect.
- Rename `CanvasStageModule.getScaledPixels` method to `unscale`, clarifying its purpose.
Track various canvas states:
- Filtering an entity
- Transforming an entity
- Rasterizing an entity
- Compositing
- Busy (derived from all of the above)
Also track individual entity states:
- Locked
- Disabled
- All of type are hidden
- Has objects
- Interactable (derived from all of the above)
These states then gate various actions. For example:
- Cannot invoke while the canvas is busy.
- Cannot transform, filter, duplicate, or delete when the canvas is busy.
Tool interaction restrictions are not yet implemented.
## Summary
This PR splits the lora.py monolith into separate files. The main
motivation for doing this in a standalone PR is to make the diffs more
interpretable in the [upcoming
changes](https://github.com/invoke-ai/InvokeAI/compare/main...ryan/flux-lora-sidecar)
to support LoRAs for FLUX.
This PR does not make any functional changes - it just moves files
around and changes import paths.
## QA Instructions
I smoke tested generation with LoRA, LoHA, LoKr, and IA3.
## Merge Plan
No special instructions. Merge on approval.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- Add backcompat for cnet model default settings
- Default filter selection based on model type
- Updated UI components to use new filter nodes
- Added handling for failed filter executions, preventing filter from getting stuck in case it failed for some reason
- New translations for all filters & fields
Use a generic to narrow the `type` field from `string` to a literal. Now you can do e.g. `adapter.type === 'control_layer_adapter'` and TS narrows the type.
Similar to the existing node, but without any resizing. The backend logic was consolidated and modified so that it the model loading can be managed by the model manager.
The ONNX Runtime `InferenceSession` class was added to the `AnyModel` union to satisfy the type checker.
Similar to the existing node, but without any resizing and with a revised model loading API that uses the model manager.
All code related to the invocation now lives in the Invoke repo.
Similar to the existing node, but without any resizing and with a revised model loading API that uses the model manager.
All code related to the invocation now lives in the Invoke repo. Unfortunately, this includes a whole git repo for EfficientNet. I believe we could use the package `timm` instead of this, but it's beyond me.
Similar to the existing node, but without any resizing and with a revised model loading API that uses the model manager.
All code related to the invocation now lives in the Invoke repo.
Similar to the existing node, but without any resizing and with a revised model loading API that uses the model manager.
All code related to the invocation now lives in the Invoke repo.
So far, this includes:
- Save Canvas to Gallery
- Save Bbox to Gallery
- Send Bbox to Regional IP Adapter
- Send Bbox to Global IP Adapter
- Send Bbox to Control Layer
- Send Bbox to Raster Layer
To prevent losing all ephemeral canvas stage when switching tabs, we will refrain from destroying the canvas manager instance when its tab unmounts, and use the existing canvas manager instance on mount, if there is one.
One small change required in `CanvasStageModule` - a `setContainer` method to update the konva stage DOM element.
- Add `reset` functionality
- Rename badly named `autoPreviewFilter` to `autoProcessFilter`
- Do not process filter when starting, unless `autoProcessFilter` is enabled
This includes some fixes for the composite number input component's local value handling, resolving an infinite recursion problem when an invalid value is set.
Snap can be any of off, 8px or 64px.
The snap is used when moving and transforming entities.
When transforming and locking aspect ratio, the snap is ignored entirely, because we'd change the aspect ratio if we forced the snap.
Otherwise, if we are not locking aspect ratio (e.g. the user is holding shift), we snap the transform anchors to the grid.
Realized we can use listener middleware to respond to _actions_, as opposed to using the redux store subscription to respond to _state changes_... This might simplify some things.
Using this pattern here.
Only hiccup - there's a TS issue preventing this from being added to the state api module. The `addListener` method has an overloaded type signature and TS cannot extract the overloaded arg type using `Parameters<T>`. As a result, if we try to wrap this, we end up with a broken TS signature for the wrapper method.
There's a race condition where we sometimes get progress events from canceled queue items, depending on the timing of the cancellation request and last event or two from the queue item.
I can't imagine how to resolve this except by tracking all cancellations and ignoring events for cancelled items, which is implemented in this change.
- Add selectors to get the default control adapter and ip adapter with model, preferring controlnet over t2i adapter for model
- Add hooks to add each entity type, using the defaults
- Add hooks to add prompts/ip adapters to a regional guidance layer
- Use the defaults in other places where we add control layers or ip adapters (e.g. dnd-triggered entity creation)
- Each entity gets its own `CanvasEntityFilterer`
- Add auto-preview feature to filter, debounced by 1000ms leading + trailing
- Fix flash when preview updates
When resetting the canvas or staging area, we don't want to cancel generations that are going to the gallery - only those going to the canvas.
Thus the method should not cancel by origin, but instead cancel by destination.
Update the queue method and route.
Use the min of each pixel's alpha value and lightness for the output alpha. This prevents artifacts when using the transparency effect, especially with non-black pixels with low alpha.
- Rely on redux + reselect more
- Remove all nanostores that simply "mirrored" redux state in favor of direct subscriptions to redux store
- Add abstractions for creating redux subs and running selectors
- Add `initialize` method to CanvasModuleBase, for post-instantiation tasks
- Reduce local caching of state in modules to a minimum
Big cleanup. Makes these classes easier to implement, lots of comments and docstrings to clarify how it all works.
- Add default implementations for `destroy`, `repr` and `getLoggingContext`
- Tidy individual module configs
- Update `CanvasManager.buildLogger` to accept a canvas module as the arg
- Add `CanvasManager.buildPath`
TBH not sure exactly why this broke. Fixed by rollback back the use of a render prop in favor of global state. Also revised the API of `useBoolean` and `buildUseBoolean`.
- Canvas generation mode is replace with a boolean `sendToCanvas` flag. When off, images generated on the canvas go to the gallery. When on, they get added to the staging area.
- When an image result is received, if its destination is the canvas, staging is automatically started.
- Updated queue list to show the destination column.
- Added `IconSwitch` component to represent binary choices, used for the new `sendToCanvas` flag and image viewer toggle.
- Remove the queue actions menu in `QueueControls`. Move the queue count badge to the cancel button.
- Redo layout of `QueueControls` to prevent duplicate queue count badges.
- Fix issue where gallery and options panels could show thru transparent regions of queue tab.
- Disable panel hotkeys when on mm/queue tabs.
The frontend needs to know where queue items came from (i.e. which tab), and where results are going to (i.e. send images to gallery or canvas). The `origin` column is not quite enough to represent this cleanly.
A `destination` column provides the frontend what it needs to handle incoming generations.
This hook forcibly updates _all_ portals with `data-hidden=true` when the modal opens - then reverts it when the modal closes. It's intended to help screen readers. Unfortunately, this absolutely tanks performance because we have many portals. React needs to do alot of layout calculations (not re-renders).
IMO this behaviour is a bug in chakra. The modals which generated the portals are hidden by default, so this data attr should really be set by default. Dunno why it isn't.
Previously this badge, floating over the queue menu button next to the invoke button, was rendered within the existing layout. When I initially positioned it, the app layout interfered - it would extend into an area reserved for a flex gap, which cut off the badge.
As a (bad) workaround, I had shifted the whole app down a few pixels to make room for it. What I should have done is what I've done in this commit - render the badge in a portal to take it out of the layout so we don't need that extra vertical padding.
Sleekified some styling a bit too.
The canvas size was dynamic based on the container div's size. When the div was hidden (e.g. when selecting another tab), the container's effective size is 0. This resulted in the preview image canvas being drawn at a scale of 0.
Fixed by using an absolute size for the canvas container.
- Add lock toggle
- Tweak lock and enabled styles
- Update entity list action bar w/ delete & delete all
- Move add layer menu to action bar
- Adjust opacity slider style
- Throttle pushing to history for actions of the same type, starting with 1000ms throttle.
- History has a limit of 64 items, same as workflow editor
- Add clear history button
- Fix an issue where entity transformers would reset the entity state when the entity is fully transparent, resetting the redo stack. This could happen when you undo to the starting state of a layer
I learned that the inline selector syntax recreates the selector function on every render:
```ts
const val = useAppSelector((s) => s.slice.val)
```
Not good! Better is to create a selector outside the function and use it. Doing that for all selectors now, most of the way through now. Feels snappier.
Things like `$lastCursorPos` are now created within the canvas drawing classes. Consumers in react access them via `useCanvasManager`.
For example:
```tsx
const canvasManager = useCanvasManager();
const lastCursorPos = useStore(canvasManager.stateApi.$lastCursorPos);
```
Previously, canvas actions specific to an entity type only needed the id of that entity type. This allowed you to pass in the id of an entity of the wrong type.
All actions for a specific entity now take a full entity identifier, and the entity identifier type can be narrowed.
`selectEntity` and `selectEntityOrThrow` now need a full entity identifier, and narrow their return values to a specific entity type _if_ the entity identifier is narrowed.
The types for canvas entities are updated with optional type parameters for this purpose.
All reducers, actions and components have been updated.
While we lose the benefit of the caches persisting across reloads, this is a much simpler way to handle things. If we need a persistent cache, we can explore it in the future.
- use `stable-hash` to generate stable, non-crypto hashes for cache entries, instead of using deep object comparisons
- use an object to store image name caches
Sequence of events causing the race condition:
- Enqueue batch
- Invalidate `SessionQueueStatus` tag
- Request updated queue status via HTTP - batch still processing at this point
- Batch completes
- Event emitted saying so
- Optimistically update the queue status cache, it is correct
- HTTP request makes it back and overwrites the optimistic update, indicating the batch is still in progress
FIxed by not invalidating the cache.
Download events and invocation status events (including progress images) are very frequent. There's no real need for these to pass through redux. Handling them outside redux is a significant performance win - far fewer store subscription calls, far fewer trips through middleware.
All event handling is moved outside middleware. Cleanup of unused actions and listeners to follow.
- create a context for entity identifiers, massively simplifying UI for each entity int he list
- consolidate common redux actions
- remove now-unused code
The origin is an optional field indicating the queue item's origin. For example, "canvas" when the queue item originated from the canvas or "workflows" when the queue item originated from the workflows tab. If omitted, we assume the queue item originated from the API directly.
- Add migration to add the nullable column to the `session_queue` table.
- Update relevant event payloads with the new field.
- Add `cancel_by_origin` method to `session_queue` service and corresponding route. This is required for the canvas to bail out early when staging images.
- Add `origin` to both `SessionQueueItem` and `Batch` - it needs to be provided initially via the batch and then passed onto the queue item.
-
Instead of chaining konva `find` and `findOne` methods, all konva nodes are added to a mapping object. Finding and manipulating them is much simpler.
Done for regions and layers, wip for control adapters.
Subscribe to redux store directly, skipping all the react overhead.
With react in dev mode, a typical frame while using the brush tool on almost-empty canvas is reduced from ~7.5ms to ~3.5ms. All things considered, this still feels slow, but it's a massive improvement.
- Create separate object types for brush and eraser lines, instead of a single type that has a `tool` field.
- Create new object type for rect shapes.
- Add logic to schemas to migrate old object types to new.
- Update renderers & reducers.
Currently translated at 98.2% (1350 of 1374 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.2% (1350 of 1374 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.2% (1350 of 1374 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1349 of 1370 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1348 of 1369 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
* [MM] add API routes for getting & setting MM cache sizes, and retrieving MM stats
* Update invokeai/app/api/routers/model_manager.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* code cleanup after @ryand review
* Update invokeai/app/api/routers/model_manager.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* fix merge conflicts; tested and working
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
## Summary
This PR adds support for Image-to-Image and inpainting workflows with
the FLUX model.
Full changelog:
- Split out `FLUX VAE Encode` and `FLUX VAE Decode` nodes
- Renamed `FLUX Text-to-Image` node to `FLUX Denoise` (since it now
supports image-to-image too). This is a workflow-breaking change.
- Added support for FLUX image-to-image via the `Latents` param on the
FLUX denoising node.
- Added support for FLUX masked inpainting via the `Denoise Mask` param
on the FLUX denoising node.
- Added "Denoise Start" and "Denoise End" params to the "FLUX Denoise"
node.
- Updated the "FLUX Text to Image" default workflow.
- Added a "FLUX Image to Image" default workflow.
### Example
FLUX inpainting workflow
<img width="1282" alt="image"
src="https://github.com/user-attachments/assets/86fc1170-e620-4412-8fd8-e119f875fc2e">
Input image

Mask

Output image

### Callouts for reviewers:
- I renamed FLUXTextToImageInvocation -> FLUXDenoisingInvocation. This
is, of course, a breaking change. It feels like the right move and now
is the right time to do it. Any objection?
- I added new `FLUX VAE Encode` and `FLUX VAE Decode` nodes.
Alternatively, I could have tried to match these names to the
corresponding SD nodes (e.g. `FLUX Image to Latents`, `FLUX Latents to
Image`). Personally, I prefer the current names, but want to hear other
opinions.
### Usage notes:
- With the default dev timestep scheduler, the image structure is
largely determined in the first ~3 steps. A consequence of this is that
the denoise_start parameter provides limited 'granularity' of control.
This will likely be improved in the future as we add more scheduler
options. In the meantime, you will likely want to use small values for
`denoise_start` (e.g. 0.03) to start denoising on step ~1-4 out of ~30.
- Currently, there is no 'noise' parameter on the `FLUX Denoise` node,
so the `denoise_end` parameter has limited utility. This will be added
in the future.
## QA Instructions
Test the following workflows:
- [x] Vanilla FLUX text-to-image behaviour is unchanged
- [x] Image-to-image with FLUX dev, no mask
- [x] Image-to-image with FLUX dev, with mask
- [x] Image-to-image with FLUX schnell, no mask (smoke test, not
expected to work well)
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
Allocates the specified amount of VRAM, or allocates enough VRAM such that you have the specified amount of VRAM free.
Useful to simulate an environment with a specific amount of VRAM.
## Summary
This PR contains several improvements to memory management for FLUX
workflows.
It is now possible to achieve better FLUX model caching performance, but
this still requires users to manually configure their `ram`/`vram`
settings. E.g. a `vram` setting of 16.0 should allow for all quantized
FLUX models to be kept in memory on the GPU.
Changes:
- Check the size of a model on disk and free the requisite space in the
model cache before loading it. (This behaviour existed previously, but
was removed in https://github.com/invoke-ai/InvokeAI/pull/6072/files.
The removal did not seem to be intentional).
- Removed the hack to free 24GB of space in the cache before loading the
FLUX model.
- Split the T5 embedding and CLIP embedding steps into separate
functions so that the two models don't both have to be held in RAM at
the same time.
- Fix a bug in `InvokeLinear8bitLt` that was causing some tensors to be
left on the GPU when the model was offloaded to the CPU. (This class is
getting very messy due to the non-standard state_dict handling in
`bnb.nn.Linear8bitLt`. )
- Tidy up some dtype handling in FluxTextToImageInvocation to avoid
situations where we hold references to two copies of the same tensor
unnecessarily.
- (minor) Misc cleanup of ModelCache: improve docs and remove unused
vars.
Future:
We should revisit our default ram/vram configs. The current defaults are
very conservative, and users could see major performance improvements
from tuning these values.
## QA Instructions
I tested the FLUX workflow with the following configurations and
verified that the cache hit rates and memory usage matched the expected
behaviour:
- `ram = 16` and `vram = 16`
- `ram = 16` and `vram = 1`
- `ram = 1` and `vram = 1`
Note that the changes in this PR are not isolated to FLUX. Since we now
check the size of models on disk, we may see slight changes in model
cache offload patterns for other models as well.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
- Add selectedStylePreset to app parameters
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
These two scripts are broken and can cause data loss. Remove them.
They are not in the launcher script, but _are_ available to users in the terminal/file browser.
Hopefully, when we removing them here, `pip` will delete them on next installation of the package...
The root cause was the active style preset not being reset when it was deleted, or no longer present in the list of style presets.
- Add extra reducer to `stylePresetSlice` to reset the active preset if it is deleted or otherwise unavailable
- Update the dynamic prompts listener to trigger on delete/update/list of style presets
When invoke.sh is executed using a symlink with a working directory outside of InvokeAI's root directory, it will fail.
invoke.sh attempts to cd into the correct directory at the start of the script, but will cd into the directory of the symlink instead. This commit fixes that.
## Summary
Adds option to download all prompt templates to a CSV
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
added a base prop for selectedWorkflow to allow loading a workflow on
launch
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
can test by loading InvokeAIUI with a selectedWorkflow prop of the
workflow ID
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- Enforce name is present and not an empty string
- Provide empty string as default for positive and negative prompt
- Add `positive_prompt` as validation alias for `prompt` field
- Strip whitespace automatically
- Create `TypeAdapter` to validate the whole list in one go
Currently translated at 98.5% (1336 of 1355 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1302 of 1321 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1302 of 1320 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
## Summary
Adds prompt templates to the UI. Demo video is attached.
* added default prompt templates to seed database on startup (these
cannot be edited or deleted by users via the UI)
* can create fresh prompt template, create from an image in gallery that
has prompt metadata, or copy an existing prompt template and modify
* if a template is active, can view what your prompt will be invoked as
by switching to "view mode"
https://github.com/user-attachments/assets/32d84e0c-b04c-48da-bae5-aa6eb685d209
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
Around the time we (I) implemented pydantic events, I noticed a short pause between progress images every 4 or 5 steps when generating with SDXL. It didn't happen with SD1.5, but I did notice that with SD1.5, we'd get 4 or 5 progress events simultaneously. I'd expect one event every ~25ms, matching my it/s with SD1.5. Mysterious!
Digging in, I found an issue is related to our use of a synchronous queue for events. When the event queue is empty, we must call `asyncio.sleep` before checking again. We were sleeping for 100ms.
Said another way, every time we clear the event queue, we have to wait 100ms before another event can be dispatched, even if it is put on the queue immediately after we start waiting. In practice, this means our events get buffered into batches, dispatched once every 100ms.
This explains why I was getting batches of 4 or 5 SD1.5 progress events at once, but not the intermittent SDXL delay.
But this 100ms wait has another effect when the events are put on the queue in intervals that don't perfectly line up with the 100ms wait. This is most noticeable when the time between events is >100ms, and can add up to 100ms delay before the event is dispatched.
For example, say the queue is empty and we start a 100ms wait. Then, immediately after - like 0.01ms later - we push an event on to the queue. We still need to wait another 99.9ms before that event will be dispatched. That's the SDXL delay.
The easy fix is to reduce the sleep to something like 0.01 seconds, but this feels kinda dirty. Can't we just wait on the queue and dispatch every event immediately? Not with the normal synchronous queue - but we can with `asyncio.Queue`.
I switched the events queue to use `asyncio.Queue` (as seen in this commit), which lets us asynchronous wait on the queue in a loop.
Unfortunately, I ran into another issue - events now felt like their timing was inconsistent, but in a different way than with the 100ms sleep. The time between pushing events on the queue and dispatching them was not consistently ~0ms as I'd expect - it was highly variable from ~0ms up to ~100ms.
This is resolved by passing the asyncio loop directly into the events service and using its methods to create the task and interact with the queue. I don't fully understand why this resolved the issue, because either way we are interacting with the same event loop (as shown by `asyncio.get_running_loop()`). I suppose there's some scheduling magic happening.
There's a FastAPI bug that results in the OpenAPI spec outputting the same operation id for each operation when specifying multiple HTTP methods.
- Discussion: https://github.com/tiangolo/fastapi/discussions/8449
- Pending PR to fix: https://github.com/tiangolo/fastapi/pull/10694
In our case, we have a `get_image_full` endpoint that handles GET and HEAD.
This results in an invalid OpenAPI schema. A workaround is to use two route decorators for the operation handler. This works as expected - HEAD requests get the header, and GET requests get the resource. And the OpenAPI schema is valid.
- Updated the previous DepthAnything manual implementation to use the
`transformers` implementation instead. So we can get upstream features.
- Plugged in the DepthAnything models to be handled by Invoke's Model
Manager.
- `small_v2` model will use DepthAnythingV2. This has been added as a
new model option and is now also the default in the Linear UI.

# Merge
Review and merge.
Currently translated at 98.6% (1303 of 1321 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1302 of 1320 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1294 of 1312 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
There was a problem w/ this release on windows and the builds were pulled from pypi. When installing invoke on windows, pip attempts to build from source, but most (all?) systems won't have the prerequisites for this and installs fail.
This also affects GH actions.
The simple fix is to exclude version 3.9.1 from our deps.
For more information, see https://github.com/matplotlib/matplotlib/issues/28551
## Summary
This PR enables Grounded SAM workflows
(https://arxiv.org/pdf/2401.14159) via the following:
- `GroundingDinoInvocation` for running a Grounding DINO model.
- `SegmentAnythingModelInvocation` for running a SAM model.
- `MaskTensorToImageInvocation` for convenient visualization.
Other notes:
- Uses the transformers implementation of Grounding DINO and SAM.
- The new models are treated as 'utility models' meaning that they are
not visible in the Models tab, and are downloaded automatically the
first time that they are used.
<img width="874" alt="image"
src="https://github.com/user-attachments/assets/1cbaa97d-0e27-4943-86b1-dc7327ba8675">
## Example
Input image

Prompt: "wheels", all other configs default
Result:

## Related Issues / Discussions
Thanks to @blessedcoolant for the initial draft here:
https://github.com/invoke-ai/InvokeAI/pull/6678
## QA Instructions
Manual tests:
- [ ] Test that default settings work well.
- [ ] Test with / without apply_polygon_refinement
- [ ] Test mask_filter options
- [ ] Test detection_threshold values
- [ ] Test RGB input image
- [ ] Test RGBA input image
- [ ] Test grayscale input image
- [ ] Smoke test that an empty mask is returned when 0 objects are
detected
- [ ] Test on CPU
- [ ] Test on MPS (Works on Mac OS, but had to force both models to run
on CPU instead of MPS)
Performance:
- Peak GPU memory utilization with both Grounding DINO and SAM models
loaded is ~4.5GB. (The models do not need to be loaded at the same time,
so could be offloaded by the MM if needed.)
- On an RTX4090, with the models already cached, node execution takes
~0.6 secs.
- On my CPU, with the models cached, node execution takes ~10secs.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
- we want a way to load the studio while being directed to a specific
tab, introduced a destination prop to achieve that
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Code for lora patching from #6577.
Additionally made it the way, that lora can patch not only `weight`, but
also `bias`, because saw some loras which doing it.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
## Merge Plan
Replace old lora patcher with new after review done.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Gradient mask node outputs mask tensor with values in range [-1, 1],
which unexpected range for mask.
It handled in denoise node the way it translates to [0, 2] mask, which
looks even more wrongly)
From discussion with @dunkeroni I understand him as he thought that
negative values will be treated same as 0, so clamping values not change
intended node logic.
## Related Issues / Discussions
#6643
## QA Instructions
\-
## Merge Plan
\-
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Add karras variants of `deis`, `unipc`, `kdpm2` and `kdpm_2_a`
schedulers.
Also added `dpmpp_3` schedulers, but `dpmpp_3s` currently bugged, so
added only 3m:
https://github.com/huggingface/diffusers/issues/9007
## Related Issues / Discussions
\-
## QA Instructions
\-
## Merge Plan
~@psychedelicious We need to decide what to do with schedulers order, as
it looks a bit broken:~

## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Code for inpainting and inpaint models handling from
https://github.com/invoke-ai/InvokeAI/pull/6577.
Separated in 2 extensions as discussed briefly before, so wait for
discussion about such implementation.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
Try and compare outputs between backends in cases:
- Normal generation on inpaint model
- Inpainting on inpaint model
- Inpainting on normal model
## Merge Plan
Nope.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
We were checking the selected and auto-add board ids against the query cache to see if they still exist. If not, we reset.
This only works if the query cache is updated by the time we do the check - race condition!
We already have the board id from the query args, so there's no need to check the query cache - just compare the deleted board ID directly.
Previously this file's several listeners were all in a single one and I had adapted/split its logic up a bit wonkily, introducing these problems.
The logic was incorrect in two ways:
1. We only ran the logic if we _enable_ showing archived boards. It should be run we we _disable_ showing archived boards.
2. If we couldn't find the selected board in the query cache, we didn't do the reset. This is wrong - if the board isn't in the query cache, we _should_ do the reset. This inverted logic makes more sense before the fix for issue 1.
## Summary
T2I Adapter code from #6577.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
## Merge Plan
Nope.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Seamless code from #6577.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
## Merge Plan
Nope.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
The model edit UI's composition allows for the model edit form to be instantiated before the model's config has been received. This results in the form having no values - all the fields are blank instead of populated by the model config.
Part of the fix is to pass the model config around directly instead of relying on _all_ components to fetch the model directly.
I also fixed a crapload of performance issues related to improper use of redux selectors.
Problems this was causing:
- Deleting an edge was a copy of another edge deletes both edges
- Deleting a node that was a copy-with-edges of another node deletes its edges and it's original edges, leaving what I will call "ghost noodles" behind
Previously you could spam the next/prev buttons and really thrash the server. Throttled to 500ms, which feels like a happy medium between responsive and not-thrash-y.
- Autofocus on popover open
- Autoselect number on popover open
- Enter works to change page when input is focused
- Esc works to close popover when input is focused
It was possible to clear the search term while a debounced setSearchTerm is still pending. This resulted in the gallery getting out of sync w/ the search term.
To fix this, we need to lift the state up a bit and cancel any pending debounced setSearchTerm calls when closing the search or clearing the search term box.
`spandrel_image_to_image` now just runs the model with no changes.
`spandrel_image_to_image_autoscale` runs the model repeatedly until the desired scale is reached. previously, `spandrel_image_to_image` did this.
* [MM2] replace untyped config dict passed to install_model with typed ModelRecordChanges
- adjusted frontend to work with new schema
- used this facility to assign "starter model" names and descriptions to the installed
models.
* documentation fix
* [MM2] replace untyped config dict passed to install_model with typed ModelRecordChanges
- adjusted frontend to work with new schema
- used this facility to assign "starter model" names and descriptions to the installed
models.
* documentation fix
* remove v9 pnpm lockfile
* [MM2] replace untyped config dict passed to install_model with typed ModelRecordChanges
- adjusted frontend to work with new schema
- used this facility to assign "starter model" names and descriptions to the installed
models.
* [MM2] replace untyped config dict passed to install_model with typed ModelRecordChanges
- adjusted frontend to work with new schema
- used this facility to assign "starter model" names and descriptions to the installed
models.
* remove v9 pnpm lockfile
* regenerate schema.ts
* prettified
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
## Summary
Update Simple Upscale Button to work with spandrel models, add
UpscaleWarning when models aren't available, clean up ESRGAN logic
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
ControlNet code from #6577.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
## Merge Plan
Merge #6641 firstly, to be able see output difference properly.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Rescale CFG code from #6577.
## Related Issues / Discussions
#6606https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
~~Note: for some reasons there slightly different output from run to
run, but I able sometimes to get same output on main and this branch.~~
Fix presented in #6641.
## Merge Plan
~~Nope.~~ Merge #6641 firstly, to be able see output difference
properly.
If you think that there should be some kind of tests - feel free to add.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
- currently the total for uncategorized images is not updating when
moving and deleting images, this will update that count when making
those actions
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Fix function call that we forgot to update in #6606
## QA Instructions
Run a TiledMultiDiffusionDenoiseLatents invocation and make sure it
doesn't crash.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
Base code of new modular backend from #6577.
Contains normal generation and regional prompts support.
Also preview extension included to test if extensions logic works.
## Related Issues / Discussions
https://invokeai.notion.site/Modular-Stable-Diffusion-Backend-Design-Document-e8952daab5d5472faecdc4a72d377b0d
## QA Instructions
Run with and without set `USE_MODULAR_DENOISE` environment.
Currently only normal and regional conditionings supported, so just
generate some images and compare with main output.
## Merge Plan
Discuss a bit more about injection point names?
As if for example in future unet will be overridable, current
`pre_unet`/`post_unet` assumes to name override as `unet` what feels a
bit odd.
Also `apply_cfg` - future implementation could ignore/not use cfg, so in
this case `combine_noise_predictions`/`combine_noise` seems more
suitable.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
This PR adds some spandrel upscale models to the starter model list.
In the future we may also want to add:
- Some DAT models
(https://drive.google.com/drive/folders/1iBdf_-LVZuz_PAbFtuxSKd_11RL1YKxM)
## QA Instructions
I installed the starter models via the model manager UI, and tested that
I could use them in a workflow.
## Merge Plan
- [ ] Merge the preceding Spandrel PRs first, then change the target
branch to `main`.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
Add tiling to the `SpandrelImageToImageInvocation` node so that it can
process large images.
Tiling enables this node to run on effectively any input image
dimension. Of course, the computation time increases quadratically with
the image dimension.
Some profiling results on an RTX4090:
- Input 1024x1024, 4x upscale, 4x UltraSharp ESRGAN: `13 secs`, `<4 GB
VRAM`
- Input 4096x4096, 4x upscale, 4x UltraSharop ESRGAN: `46 secs`, `<4 GB
VRAM`
- Input 4096x4096, 2x upscale, SwinIR: `165 secs`, `<5 GB VRAM`
A lot of the time is spent PNG encoding the final image:
- PNG encoding of a 16384x16384 image takes `83secs @
pil_compress_level=7`, `24secs @ pil_compress_level=1`
Callout: If we want to start building workflows that pass large images
between nodes, we are going to have to find a way to avoid the PNG
encode/decode roundtrip that we are currently doing. As is, we will be
incurring a huge penalty for every node that receives/produces a large
image.
## QA Instructions
- [x] Tested with tiling up to 4096x4096 -> 16384x16384.
- [x] Test on images with an alpha channel (the alpha channel is
dropped).
- [x] Test on images with odd dimension.
- [x] Test no tiling (`tile_size=0`)
## Merge Plan
- [x] Merge #6556 first, and change the target branch to `main`.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
- Add support for all
[spandrel](https://github.com/chaiNNer-org/spandrel) image-to-image
models - this is a collection of many popular super-resolution models
(e.g. ESRGAN, Real-ESRGAN, SwinIR, DAT, etc.)
Examples of supported models:
- DAT:
https://drive.google.com/drive/folders/1iBdf_-LVZuz_PAbFtuxSKd_11RL1YKxM
- SwinIR: https://github.com/JingyunLiang/SwinIR/releases
- Any ESRGAN / Real-ESRGAN model
## Related Issues
Closes#6394
## QA Instructions
- [x] Test that unsupported models still fail the probe (i.e. no false
positive spandrel models)
- [x] Test adding a few non-spandrel model types
- [x] Test adding a handful of spandrel model types: ESRGAN,
Real-ESRGAN, SwinIR, DAT
- [x] Verify model size estimation for the model cache
- [x] Test using the spandrel models in a practical image upscaling
workflow
## Merge Plan
- [x] Get approval from @brandonrising and @maryhipp before merging -
this PR has commercial implications.
- [x] Merge #6571 and change the target branch to `main`
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
In #6490 we enabled non-blocking torch device transfers throughout the model manager's memory management code. When using this torch feature, torch attempts to wait until the tensor transfer has completed before allowing any access to the tensor. Theoretically, that should make this a safe feature to use.
This provides a small performance improvement but causes race conditions in some situations. Specific platforms/systems are affected, and complicated data dependencies can make this unsafe.
- Intermittent black images on MPS devices - reported on discord and #6545, fixed with special handling in #6549.
- Intermittent OOMs and black images on a P4000 GPU on Windows - reported in #6613, fixed in this commit.
On my system, I haven't experience any issues with generation, but targeted testing of non-blocking ops did expose a race condition when moving tensors from CUDA to CPU.
One workaround is to use torch streams with manual sync points. Our application logic is complicated enough that this would be a lot of work and feels ripe for edge cases and missed spots.
Much safer is to fully revert non-locking - which is what this change does.
This issue is caused by a race condition. When a large image is served to the client, it is done using a streaming `FileResponse`. This concurrently serves the image straight from disk. The file is kept open by FastAPI until the image is fully served.
When a user deletes an image before the file is done serving, the delete fails because the file is still held by FastAPI.
To reproduce the issue:
- Create a very large image (8k reliably creates the issue).
- Create a smaller image, so that the first image in the gallery is not the large image.
- Refresh the app. The small image should be selected.
- Select the large image and immediately delete it. You have to be fast, to delete it before it finishes loading.
- In the terminal, we expect to see an error saying `Failed to delete image file`, and the image does not disappear from the UI.
- After a short wait, once the image has fully loaded, try deleting it again. We expect this to work.
The workaround is to instead serve the image from memory.
Loading the image to memory is very fast, so there is only a tiny window in which we could create the race condition, but it technically could still occur, because FastAPI is asynchronous and handles requests concurrently.
Once we load the image into memory, deletions of that image will work. Then we return a normal `Response` object with the image bytes. This is essentially what `FileResponse` does - except it uses `anyio.open_file`, which is async.
The tradeoff is that the server thread is blocked while opening the file. I think this is a fair tradeoff.
A future enhancement could be to implement soft deletion of images (db is already set up for this), and then clean up deleted image files on startup/shutdown. We could move back to using the async `FileResponse` for best responsiveness in the server without any risk of race conditions.
For some reason, I started getting this indefinite hang when the app checks if port 9090 is available. After some fiddling around, I found that adding a timeout resolves the issue.
I confirmed that the util still works by starting the app on 9090, then starting a second instance. The second instance correctly saw 9090 in use and moved to 9091.
## Summary
This PR changes the handling of invalid model configs in the DB to log a
warning rather than crashing the app.
This change is being made in preparation for some upcoming new model
additions. Previously, if a user rolled back from an app version that
added a new model type, the app would not launch until the DB was fixed.
This PR changes this behaviour to allow rollbacks of this type (with
warnings).
**Keep in mind that this change is only helpful to users _rolling back
to a version that has this fix_. I.e. it offers no help in the first
version that includes it.**
## QA Instructions
1. Run the Spandrel model branch, which adds a new model type
https://github.com/invoke-ai/InvokeAI/pull/6556.
2. Add a spandrel model via the model manager.
3. Rollback to main. The app will crash on launch due to the invalid
spandrel model config.
4. Checkout this branch. The app should now run with warnings about the
invalid model config.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
Currently translated at 100.0% (1282 of 1282 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1280 of 1280 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1275 of 1275 strings)
translationBot(ui): update translation (Russian)
Currently translated at 100.0% (1273 of 1273 strings)
Co-authored-by: Васянатор <ilabulanov339@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ru/
Translation: InvokeAI/Web UI
Currently translated at 98.2% (1260 of 1282 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1260 of 1280 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1255 of 1275 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1253 of 1273 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.4% (1245 of 1265 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
- Refine layout
- Update colors - more minimal, fewer shaded boxes
- Add indicator for search icons showing a search term is entered
- Handle new `projectName` and `projectUrl` ui props
## Summary
Update Boards UI in the gallery and adds support for creating and
displaying private boards
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
Can view private boards by setting config.allowPrivateBoards to true
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
## Summary
Demote error log to warning for models treated as having size 0.
## Related Issues / Discussions
Closes#6587
I looked into handling ESRGAN model sizes properly. They load a
state_dict with a bit of an unusual nested-dict structure. Rather than
figure out how to accurately calculate their size, we can just wait for
https://github.com/invoke-ai/InvokeAI/pull/6556. ESRGAN model size
handling should work properly when loaded through that pathway.
## QA Instructions
Loaded an ESRGAN model, and confirmed that the warning log is at the
warning level.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
This commit corrects a broken link on line 16 that was pointing to the latest release but causing a 404 error (page not found) when clicked. The issue was identified as a trailing dot at the end of the URL, which has now been removed. This ensures users can access the intended latest release page.
## Summary
This PR tweaks the wording of the PR template QA instructions with the
goals of:
1. Make it more clear that PR authors are responsible for testing their
PRs.
2. Encouraging sufficient detail in the test descriptions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
Delete an unused duplicate libc_util.py file. The active version is at
`invokeai/backend/model_manager/libc_util.py`
## QA Instructions
I ran a smoke test to confirm that memory snapshotting still works.
## Merge Plan
- [x] Change target branch to `main` before merging.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
## Summary
This PR migrates all relative imports to absolute imports, and adds a
ruff check to enforce this going forward.
The justification for this change is here:
https://github.com/invoke-ai/InvokeAI/issues/6575
## QA Instructions
Smoke test all common workflows. Most of the relative -> absolute
conversions could be completed automatically, so the risk is relatively
low.
## Merge Plan
As with any far-reaching change like this, it is likely to cause some
merge conflicts with some in-flight branches. Unfortunately, there's no
way around this, but let me know if you can think of in-flight work that
will be significantly disrupted by this.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_ N/A
- [x] _Documentation added / updated (if applicable)_ N/A
## Summary
This PR fixes a regression that caused the following models to be
treated as having size 0 in the model cache: `(TextualInversionModelRaw,
IPAdapter, LoRAModelRaw)`.
Changes:
- Call the correct model size calculation for all supported model types.
- Log an error message if an unexpected model type is loaded, to prevent
similar regressions in the future.
## QA Instructions
I tested the following features and verified that no models fell back to
using a size of 0 unexpectedly:
- Test-to-image
- Textual Inversion
- LoRA
- IP-Adapter
- ControlNet
(All tested with both SD1.5 and SDXL.)
I compared the model cache switching behavior before and after this
change with a large number of LoRAs (10). Since LoRAs are small compared
to the main models, the changes in behaviour are minimal. Nonetheless,
it makes sense to get this in for correctness. And it might make a
difference for some usage patterns with limited RAM.
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- For single image deletion, select the image in the same slot as the deleted image
- For multiple image deletion, empty selection
- On list images, if no images are currently selected, select the first image
## Summary
- This PR exposes a `tile_size` field on `ImageToLatentsInvocation` and
`LatentsToImageInvocation`.
- Setting `tile_size = 0` preserves the default behaviour.
- This feature is primarily intended to support upscaling workflows that
require VAE encoding/decoding high resolution images. In the future, we
may want to expose the tile size as a global application config, but
that's a separate conversation.
- As a general rule, larger tile sizes produce better results at the
cost of higher memory usage.
### Example:
Original (5472x5472)

VAE roundtrip with 512x512 tiles (note the discoloration)

VAE roundtrip with 1024x1024 tiles (some discoloration still present,
but less severe than at 512x512)

## Related Issues / Discussions
Related: #6144
## QA Instructions
- [x] Test image generation via the Linear tab
- [x] Test VAE roundtrip with tiling disabled
- [x] Test VAE roundtrip with tiling and tile_size = 0
- [x] Test VAE roundtrip with tiling and tile_size > 0
## Merge Plan
No special instructions.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
The selection logic is a bit complicated. We have image selection and pagination, both of which can be triggered using the mouse or hotkeys. We have viewer image selection and comparison image selection, which is determined by the alt key.
This change ties the room together with these behaviours:
- Changing the page using pagination buttons never changes the selection.
- Changing the selected image using arrows may change the page, if the arrow key pressed would select an image off the current page.
- `right` on the last image of the current page goes to the next page
- `down` on the last row of images goes to the next page
- `left` on the first image of the current page goes to the previous page
- `up` on the first row of images goes to the previous page
- If `alt` is held when using arrow keys, we change the page, but we only change the comparison image selection.
- When using arrow keys, if the page has changed since the last image was selected, the selection is reset to the first image on the page.
- The next/previous buttons on the image viewer do the same thing as `left` and `right` without `alt`.
- When clicking an image in the gallery:
- If no modifier keys are held, the image is exclusively selected.
- If `ctrl` or `meta` are held, the image's selection status is toggled.
- If `shift` is held, all images from the last-selected image to the image are selected. If there are no images on the current page, the selection is unchanged.
- If `alt` is held, the image is set as the compare image.
- `ctrl+a` and `meta+a` add the current page to the selection.
The logic for gallery navigation and selection is now pretty hairy. It's spread across 3 hooks, a listener, redux slice, components.
When we next make changes to this part of the app, we should consider consolidating some of the related logic. Probably most of it can go into a single listener and make it much simpler to grok.
Don't like this UI (even though I suggested it). No need to prevent the user from interacting with the search term field during fetching. Let's figure out a nicer way to present this in a followup.
## Summary
Python 3.11 has a wonderfully devious breaking change where _sometimes_
using enum classes that inherit from `str` or `int` do not work the same
way as they do in 3.10 when used within string formatting/interpolation.
This breaks the new gallery sort queries. The fix is to use
`order_dir.value` instead of `order_dir` in the query.
This was not an issue during development because the feature was
developed w/ python 3.10.
## Related Issues / Discussions
Thanks to @JPPhoto for reporting and troubleshooting:
https://discord.com/channels/1020123559063990373/1149513625321603162/1256211815982039173
## QA Instructions
JP's fancy python 3.11 system should work on this PR.
## Merge Plan
n/a
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
If the currently selected or auto-add board is archived or deleted, we should reset them. There are some edge cases taht weren't handled in the previous implementation.
All handling of this logic is moved to the (renamed) listener.
Before this change, if you attempt to create an image that with a nonexistent board, we'd get an unhandled error when adding the image to a board. The record would be created, but file not, due to the structure of the code.
With this change, we now log a warning if we have a problem adding the image to the board, but the record and file are still created.
A future improvement would be to create a transaction for this part of the code, preventing some other situation that could result in only the record or only the file beings saved.
* use model_class.load_singlefile() instead of converting; works, but performance is poor
* adjust the convert api - not right just yet
* working, needs sql migrator update
* rename migration_11 before conflict merge with main
* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* implement lightweight version-by-version config migration
* simplified config schema migration code
* associate sdxl config with sdxl VAEs
* remove use of original_config_file in load_single_file()
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
## Summary
We can get black outputs when moving tensors from CPU to MPS. It appears
MPS to CPU is fine. See:
- https://github.com/pytorch/pytorch/issues/107455
-
https://discuss.pytorch.org/t/should-we-set-non-blocking-to-true/38234/28
Changes:
- Add properties for each device on `TorchDevice` as a convenience.
- Add `get_non_blocking` static method on `TorchDevice`. This utility
takes a torch device and returns the flag to be used for non_blocking
when moving a tensor to the device provided.
- Update model patching and caching APIs to use this new utility.
## Related Issues / Discussions
Fixes: #6545
## QA Instructions
For both MPS and CUDA:
- Generate at least 5 images using LoRAs
- Generate at least 5 images using IP Adapters
## Merge Plan
We have pagination merged into `main` but aren't ready for that to be
released.
Once this fix is tested and merged, we will probably want to create a
`v4.2.5post1` branch off the `v4.2.5` tag, cherry-pick the fix and do a
release from the hotfix branch.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_ @RyanJDick @lstein This
feels testable but I'm not sure how.
- [ ] _Documentation added / updated (if applicable)_
We can get black outputs when moving tensors from CPU to MPS. It appears MPS to CPU is fine. See:
- https://github.com/pytorch/pytorch/issues/107455
- https://discuss.pytorch.org/t/should-we-set-non-blocking-to-true/38234/28
Changes:
- Add properties for each device on `TorchDevice` as a convenience.
- Add `get_non_blocking` static method on `TorchDevice`. This utility takes a torch device and returns the flag to be used for non_blocking when moving a tensor to the device provided.
- Update model patching and caching APIs to use this new utility.
Fixes: #6545
We only need to show the totals in the tooltip. Tooltips accpet a component for the tooltip label. The component isn't rendered until the tooltip is triggered.
Move the board total fetching into a tooltip component for the boards. Now we only fire these requests when the user mouses over the board
- Simplify the gallery layout
- Set an initial gallery limit to load _some_ images immediately.
- Refactor the resize observer to use the actual rendered image component to calculate the number of images per row/col. This prevents inaccuracies caused by image padding that could result in the wrong number of images.
- Debounce the limit update to not thrash teh API
- Use absolute positioning trick to ensure the gallery container is always exactly the right size
- Minimum of `imagesPerRow` images loaded at all times
This is one of those unexpected CSS quirks. Flex containers need min-width or min-height for their children to not overflow. Add `minH={0}` to gallery container.
## Summary
https://github.com/invoke-ai/InvokeAI/pull/6522 introduced a change in
behavior in cases where start/end were set such that there are 0
timesteps. This PR reverts that change.
cc @StAlKeR7779
## QA Instructions
Run with euler, 5 steps, start: 0.0, end: 0.05. I ran this test before
#6522, after #6522, and on this branch. This branch restores the
behavior to pre-#6522 i.e. noise is injected even if no denoising steps
are applied.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
* add support for probing and loading SDXL VAE checkpoint files
* broaden regexp probe for SDXL VAEs
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
## Summary
No functional changes, just cleaning some things up as I touch the code.
This PR cleans up the `SilenceWarnings` context manager:
- Fix type errors
- Enable SilenceWarnings to be used as both a context manager and a
decorator
- Remove duplicate implementation
- Check the initial verbosity on `__enter__()` rather than `__init__()`
- Save an indentation level in DenoiseLatents
## QA Instructions
I generated an image to confirm that warnings are still muted.
## Merge Plan
- [x] ⚠️ Merge https://github.com/invoke-ai/InvokeAI/pull/6492 first,
then change the target branch to `main`.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- Fix type errors
- Enable SilenceWarnings to be used as both a context manager and a decorator
- Remove duplicate implementation
- Check the initial verbosity on __enter__() rather than __init__()
## Summary
added route to install huggingface models from model marketplace
<!--A description of the changes in this PR. Include the kind of change
(fix, feature, docs, etc), the "why" and the "how". Screenshots or
videos are useful for frontend changes.-->
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
test by going to
http://localhost:5173/api/v2/models/install/huggingface?source=${hfRepo}
<!--WHEN APPLICABLE: Describe how we can test the changes in this PR.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
When a model install is initiated from outside the client, we now trigger the model manager tab's model install list to update.
- Handle new `model_install_download_started` event
- Handle `model_install_download_complete` event (this event is not new but was never handled)
- Update optimistic updates/cache invalidation logic to efficiently update the model install list
Previously, we used `model_install_download_progress` for both download starting and progressing. When handling this event, we don't know which actual thing it represents.
Add `model_install_download_started` event to explicitly represent a model download started event.
* allow model patcher to optimize away the unpatching step when feasible
* remove lazy_offloading functionality
* allow model patcher to optimize away the unpatching step when feasible
* remove lazy_offloading functionality
* do not save original weights if there is a CPU copy of state dict
* Update invokeai/backend/model_manager/load/load_base.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* documentation fixes requested during penultimate review
* add non-blocking=True parameters to several torch.nn.Module.to() calls, for slight performance increases
* fix ruff errors
* prevent crash on non-cuda-enabled systems
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Kent Keirsey <31807370+hipsterusername@users.noreply.github.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
## Summary
This three two model manager-related methods to the InvocationContext
uniform API. They are accessible via `context.models.*`:
1. **`load_local_model(model_path: Path, loader:
Optional[Callable[[Path], AnyModel]] = None) ->
LoadedModelWithoutConfig`**
*Load the model located at the indicated path.*
This will load a local model (.safetensors, .ckpt or diffusers
directory) into the model manager RAM cache and return its
`LoadedModelWithoutConfig`. If the optional loader argument is provided,
the loader will be invoked to load the model into memory. Otherwise the
method will call `safetensors.torch.load_file()` `torch.load()` (with a
pickle scan), or `from_pretrained()` as appropriate to the path type.
Be aware that the `LoadedModelWithoutConfig` object differs from
`LoadedModel` by having no `config` attribute.
Here is an example of usage:
```
def invoke(self, context: InvocatinContext) -> ImageOutput:
model_path = Path('/opt/models/RealESRGAN_x4plus.pth')
loadnet = context.models.load_local_model(model_path)
with loadnet as loadnet_model:
upscaler = RealESRGAN(loadnet=loadnet_model,...)
```
---
2. **`load_remote_model(source: str | AnyHttpUrl, loader:
Optional[Callable[[Path], AnyModel]] = None) ->
LoadedModelWithoutConfig`**
*Load the model located at the indicated URL or repo_id.*
This is similar to `load_local_model()` but it accepts either a
HugginFace repo_id (as a string), or a URL. The model's file(s) will be
downloaded to `models/.download_cache` and then loaded, returning a
```
def invoke(self, context: InvocatinContext) -> ImageOutput:
model_url = 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth'
loadnet = context.models.load_remote_model(model_url)
with loadnet as loadnet_model:
upscaler = RealESRGAN(loadnet=loadnet_model,...)
```
---
3. **`download_and_cache_model( source: str | AnyHttpUrl, access_token:
Optional[str] = None, timeout: Optional[int] = 0) -> Path`**
Download the model file located at source to the models cache and return
its Path. This will check `models/.download_cache` for the desired model
file and download it from the indicated source if not already present.
The local Path to the downloaded file is then returned.
---
## Other Changes
This PR performs a migration, in which it renames `models/.cache` to
`models/.convert_cache`, and migrates previously-downloaded ESRGAN,
openpose, DepthAnything and Lama inpaint models from the `models/core`
directory into `models/.download_cache`.
There are a number of legacy model files in `models/core`, such as
GFPGAN, which are no longer used. This PR deletes them and tidies up the
`models/core` directory.
## Related Issues / Discussions
I have systematically replaced all the calls to
`download_with_progress_bar()`. This function is no longer used
elsewhere and has been removed.
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
I have added unit tests for the three new calls. You may test that the
`load_and_cache_model()` call is working by running the upscaler within
the web app. On first try, you will see the model file being downloaded
into the models `.cache` directory. On subsequent tries, the model will
either load from RAM (if it hasn't been displaced) or will be loaded
from the filesystem.
<!--WHEN APPLICABLE: Describe how we can test the changes in this PR.-->
## Merge Plan
Squash merge when approved.
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [X] _The PR has a short but descriptive title, suitable for a
changelog_
- [X] _Tests added / updated (if applicable)_
- [X] _Documentation added / updated (if applicable)_
## Summary
I've started working towards a better tiled upscaling implementation. It
is going to require some refactoring of `DenoiseLatentsInvocation`. As a
first step, this PR splits up all of the invocations in latent.py into
their own files. That file had become a bit of a dumping ground - it
should be a bit more manageable to work with now.
This PR just re-organizes the code. There should be no functional
changes.
## QA Instructions
I've done some light smoke testing. I'll do some more before merging.
The main risk is that I missed a broken import, or some other copy-paste
error.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_: N/A
- [x] _Documentation added / updated (if applicable)_: N/A
* allow model patcher to optimize away the unpatching step when feasible
* remove lazy_offloading functionality
* allow model patcher to optimize away the unpatching step when feasible
* remove lazy_offloading functionality
* do not save original weights if there is a CPU copy of state dict
* Update invokeai/backend/model_manager/load/load_base.py
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
* documentation fixes added during penultimate review
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Kent Keirsey <31807370+hipsterusername@users.noreply.github.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
Create intermediary nanostores for values required by the event handlers. This allows the event handlers to be purely imperative, with no reactivity: instead of recreating/setting the handlers when a dependent piece of state changes, we use nanostores' imperative API to access dependent state.
For example, some handlers depend on brush size. If we used the standard declarative `useSelector` API, we'd need to recreate the event handler callback each time the brush size changed. This can be costly.
An intermediate `$brushSize` nanostore is set in a `useLayoutEffect()`, which responds to changes to the redux store. Then, in the event handler, we use the imperative API to access the brush size: `$brushSize.get()`.
This change allows the event handler logic to be shared with the pending canvas v2, and also more easily tested. It's a noticeable perf improvement, too, especially when changing brush size.
Currently translated at 98.5% (1243 of 1261 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1243 of 1261 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1225 of 1243 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1225 of 1243 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
- Pass the seed from `latents_a` to the output latents. Fixed an issue where using `BlendLatentsInvocation` could result in different outputs during denoising even when the alpha or slerp weight was 0.
## Explanation
`LatentsField` has an optional `seed` field. During denoising, if this `seed` field is not present, we **fall back to 0 for the seed**. The seed is used during denoising in a few ways:
1. Initializing the scheduler.
The seed is used in two places in `invokeai/app/invocations/latent.py`.
The `get_scheduler()` utility function has special handling for `DPMSolverSDEScheduler`, which appears to need a seed for deterministic outputs.
`DenoiseLatentsInvocation.init_scheduler()` has special handling for schedulers that accept a generator - the generator needs to be seeded in a particular way. At the time of this commit, these are the Invoke-supported schedulers that need this seed:
- DDIMScheduler
- DDPMScheduler
- DPMSolverMultistepScheduler
- EulerAncestralDiscreteScheduler
- EulerDiscreteScheduler
- KDPM2AncestralDiscreteScheduler
- LCMScheduler
- TCDScheduler
2. Adding noise during inpainting.
If a mask is used for denoising, and we are not using an inpainting model, we add noise to the unmasked area. If, for some reason, we have a mask but no noise, the seed is used to add noise.
I wonder if we should instead assert that if a mask is provided, we also have noise.
This is done in `invokeai/backend/stable_diffusion/diffusers_pipeline.py` in `StableDiffusionGeneratorPipeline.latents_from_embeddings()`.
When we create noise to be used in denoising, we are expected to set `LatentsField.seed` to the seed used to create the noise. This introduces some awkwardness when we manipulate any "latents" that will be used for denoising. We have to pass the seed along for every operation.
If the wrong seed or no seed is passed along, we can get unexpected outputs during denoising. One notable case relates to blending latents (slerping tensors).
If we slerp two noise tensors (`LatentsField`s) _without_ passing along the seed from the source latents, when we denoise with a seed-dependent scheduler*, the schedulers use the fallback seed of 0 and we get the wrong output. This is most obvious when slerping with a weight of 0, in which case we expect the exact same output after denoising.
*It looks like only the DPMSolver* schedulers are affected, but I haven't tested all of them.
Passing the seed along in the output fixes this issue.
This required some minor reworking of of the logic to recall multiple items. I split this into a utility function that includes some special handling for concat.
Closes#6478
When the model in metadata's key no longer exists, fall back to fetching by name, base and type. This was the intention all along but the logic was never put in place.
- Any mypy issues are a misconfiguration of mypy
- Use simple conditionals instead of ternaries
- Consistent & standards-compliant docstring formatting
- Use `dict` instead of `typing.Dict`
It doesn't make sense to allow context menu here, because the context menu will technically be on a div and not an image - there won't be any image options there.
## Summary
Fix some issues with openapi schema generation. See commits for details.
## Related Issues / Discussions
https://discord.com/channels/1020123559063990373/1049495067846524939/1245141831394529352
## QA Instructions
App should work, workflows should work.
## Merge Plan
n/a
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
Some tech debt related to dynamic pydantic schemas for invocations became problematic. Including the invocations and results in the event schemas was breaking pydantic's handling of ref schemas. I don't really understand why - I think it's a pydantic bug in a remote edge case that we are hitting.
After many failed attempts I landed on this implementation, which is actually much tidier than what was in there before.
- Create pydantic-enabled types for `AnyInvocation` and `AnyInvocationOutput` and use these in place of the janky dynamic unions. Actually, they are kinda the same, but better encapsulated. Use these in `Graph`, `GraphExecutionState`, `InvocationEventBase` and `InvocationCompleteEvent`.
- Revise the custom openapi function to work with the new models.
- Split out the custom openapi function to a separate file. Add a `post_transform` callback so consumers can customize the output schema.
- Update makefile scripts.
## Summary
- Updated the documentation for `TextualInversionManager`
- Updated the `self.tokenizer.model_max_length` access to work with the
latest transformers version. Thanks to @skunkworxdark for looking into
this here:
https://github.com/invoke-ai/InvokeAI/issues/6445#issuecomment-2133098342
## Related Issues / Discussions
Closes#6445
## QA Instructions
I tested with `transformers==4.41.1`, and compared the results against a
recent InvokeAI version before updating tranformers - no change, as
expected.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
This is required to get these event fields to deserialize correctly. If omitted, pydantic uses `BaseInvocation`/`BaseInvocationOutput`, which is not correct.
This is similar to the workaround in the `Graph` and `GraphExecutionState` classes where we need to fanagle pydantic with manual validation handling.
Note about the huge diff: I had a different version of pydantic installed at some point, which slightly altered a _ton_ of schema components. This typegen was done on the correct version of pydantic and un-does those alterations, in addition to the intentional changes to event models.
There's no longer any need for session-scoped events now that we have the session queue. Session started/completed/canceled map 1-to-1 to queue item status events, but queue item status events also have an event for failed state.
We can simplify queue and processor handling substantially by removing session events and instead using queue item events.
- Remove the session-scoped events entirely.
- Remove all event handling from session queue. The processor still needs to respond to some events from the queue: `QueueClearedEvent`, `BatchEnqueuedEvent` and `QueueItemStatusChangedEvent`.
- Pass an `is_canceled` callback to the invocation context instead of the cancel event
- Update processor logic to ensure the local instance of the current queue item is synced with the instance in the database. This prevents race conditions and ensures lifecycle callback do not get stale callbacks.
- Update docstrings and comments
- Add `complete_queue_item` method to session queue service as an explicit way to mark a queue item as successfully completed. Previously, the queue listened for session complete events to do this.
Closes#6442
- Restore calculation of step percentage but in the backend instead of client
- Simplify signatures for denoise progress event callbacks
- Clean up `step_callback.py` (types, do not recreate constant matrix on every step, formatting)
We don't need to use the payload schema registry. All our events are dispatched as pydantic models, which are already validated on instantiation.
We do want to add all events to the OpenAPI schema, and we referred to the payload schema registry for this. To get all events, add a simple helper to EventBase. This is functionally identical to using the schema registry.
The model loader emits events. During testing, it doesn't have access to a fully-mocked events service, so the test fails when attempting to call a nonexistent method. There was a check for this previously, but I accidentally removed it. Restored.
- Remove ABCs, they do not work well with pydantic
- Remove the event type classvar - unused
- Remove clever logic to require an event name - we already get validation for this during schema registration.
- Rename event bases to all end in "Base"
Our events handling and implementation has a couple pain points:
- Adding or removing data from event payloads requires changes wherever the events are dispatched from.
- We have no type safety for events and need to rely on string matching and dict access when interacting with events.
- Frontend types for socket events must be manually typed. This has caused several bugs.
`fastapi-events` has a neat feature where you can create a pydantic model as an event payload, give it an `__event_name__` attr, and then dispatch the model directly.
This allows us to eliminate a layer of indirection and some unpleasant complexity:
- Event handler callbacks get type hints for their event payloads, and can use `isinstance` on them if needed.
- Event payload construction is now the responsibility of the event itself (a pydantic model), not the service. Every event model has a `build` class method, encapsulating this logic. The build methods are provided as few args as possible. For example, `InvocationStartedEvent.build()` gets the invocation instance and queue item, and can choose the data it wants to include in the event payload.
- Frontend event types may be autogenerated from the OpenAPI schema. We use the payload registry feature of `fastapi-events` to collect all payload models into one place, making it trivial to keep our schema and frontend types in sync.
This commit moves the backend over to this improved event handling setup.
* avoid copying model back from cuda to cpu
* handle models that don't have state dicts
* add assertions that models need a `device()` method
* do not rely on torch.nn.Module having the device() method
* apply all patches after model is on the execution device
* fix model patching in latents too
* log patched tokenizer
* closes#6375
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Show error toasts on queue item error events instead of invocation error events. This allows errors that occurred outside node execution to be surfaced to the user.
The error description component is updated to show the new error message if available. Commercial handling is retained, but local now uses the same component to display the error message itself.
I had set the cancel event at some point during troubleshooting an unrelated issue. It seemed logical that it should be set there, and didn't seem to break anything. However, this is not correct.
The cancel event should not be set in response to a queue status change event. Doing so can cause a race condition when nodes are executed very quickly.
It's possible that a previously-executed session's queue item status change event is handled after the next session starts executing. The cancel event is set and the session runner sees it aborting the session run early.
In hindsight, it doesn't make sense to set the cancel event here either. It should be set in response to user action, e.g. the user cancelled the session or cleared the queue (which implicitly cancels the current session). These events actually trigger the queue item status changed event, so if we set the cancel event here, we'd be setting it twice per cancellation.
There's a race condition where a canceled session may emit a progress event or two after it's been canceled, and the progress image isn't cleared out.
To resolve this, the system slice tracks canceled session ids. When a progress event comes in, we check the cancellations and skip setting the progress if canceled.
- Add handling for new error columns `error_type`, `error_message`, `error_traceback`.
- Update queue item model to include the new data. The `error_traceback` field has an alias of `error` for backwards compatibility.
- Add `fail_queue_item` method. This was previously handled by `cancel_queue_item`. Splitting this functionality makes failing a queue item a bit more explicit. We also don't need to handle multiple optional error args.
-
We were not handling node preparation errors as node errors before. Here's the explanation, copied from a comment that is no longer required:
---
TODO(psyche): Sessions only support errors on nodes, not on the session itself. When an error occurs outside
node execution, it bubbles up to the processor where it is treated as a queue item error.
Nodes are pydantic models. When we prepare a node in `session.next()`, we set its inputs. This can cause a
pydantic validation error. For example, consider a resize image node which has a constraint on its `width`
input field - it must be greater than zero. During preparation, if the width is set to zero, pydantic will
raise a validation error.
When this happens, it breaks the flow before `invocation` is set. We can't set an error on the invocation
because we didn't get far enough to get it - we don't know its id. Hence, we just set it as a queue item error.
---
This change wraps the node preparation step with exception handling. A new `NodeInputError` exception is raised when there is a validation error. This error has the node (in the state it was in just prior to the error) and an identifier of the input that failed.
This allows us to mark the node that failed preparation as errored, correctly making such errors _node_ errors and not _processor_ errors. It's much easier to diagnose these situations. The error messages look like this:
> Node b5ac87c6-0678-4b8c-96b9-d215aee12175 has invalid incoming input for height
Some of the exception handling logic is cleaned up.
- Use protocol to define callbacks, this allows them to have kwargs
- Shuffle the profiler around a bit
- Move `thread_limit` and `polling_interval` to `__init__`; `start` is called programmatically and will never get these args in practice
- Add `OnNodeError` and `OnNonFatalProcessorError` callbacks
- Move all session/node callbacks to `SessionRunner` - this ensures we dump perf stats before resetting them and generally makes sense to me
- Remove `complete` event from `SessionRunner`, it's essentially the same as `OnAfterRunSession`
- Remove extraneous `next_invocation` block, which would treat a processor error as a node error
- Simplify loops
- Add some callbacks for testing, to be removed before merge
This query is only subscribed-to in the `QueueItemDetail` component - when is rendered only when the user clicks on a queue item in the queue. Invalidating this tag instead of optimistically updating it won't cause any meaningful change to network traffic.
The session is never updated in the queue after it is first enqueued. As a result, the queue detail view in the frontend never never updates and the session itself doesn't show outputs, execution graph, etc.
We need a new method on the queue service to update a queue item's session, then call it before updating the queue item's status.
Queue item status may be updated via a session-type event _or_ queue-type event. Adding the updated session to all these events is a hairy - simpler to just update the session before we do anything that could trigger a queue item status change event:
- Before calling `emit_session_complete` in the processor (handles session error, completed and cancel events and the corresponding queue events)
- Before calling `cancel_queue_item` in the processor (handles another way queue items can be canceled, outside the session execution loop)
When serializing the session, both in the new service method and the `get_queue_item` endpoint, we need to use `exclude_none=True` to prevent unexpected validation errors.
## Summary
TIL if you add `undefined` to a form data object, it gets stringified to
`'undefined'`. Whoops!
## Related Issues / Discussions
n/a
## QA Instructions
n/a
## Merge Plan
n/a
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
Currently translated at 98.5% (1210 of 1228 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1206 of 1224 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (1204 of 1222 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
## Summary
Notes nodes used some overly-strict redux selectors. The selectors are
now more chill. Also fixed an issue where you couldn't edit a notes node
title.
Found another class of error related to the overly strict reducers that
caused errors when loading a workflow that had missing templates. Fixed
this with fallback wrapper component, works like an error boundary when
a template isn't found.
## Related Issues / Discussions
https://discord.com/channels/1020123559063990373/1149506274971631688/1242256425527545949
## QA Instructions
- Add a notes node to a workflow. Edit the notes title.
- Load a workflow that has nodes that aren't installed. Should get a
fallback UI for each missing node.
- Load a workflow that references a node with different inputs than are
in the template - like an old version of a node. Should get a fallback
field warning for both missing templates, or missing inputs.
## Merge Plan
n/a
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
Some asserts were bubbling up in places where they shouldn't have, causing errors when a node has a field without a matching template, or vice-versa.
To resolve this without sacrificing the runtime safety provided by asserts, a `InvocationFieldCheck` component now wraps all field components. This component renders a fallback when a field doesn't exist, so the inner components can safely use the asserts.
At some point, I made a mistake and imported the wrong types to some files for the old v1 and v2 workflow schema migration data.
The relevant zod schemas and inferred types have been restored.
This change doesn't alter runtime behaviour. Only type annotations.
Replace the `isCollection` and `isCollectionOrScalar` flags with a single enum value `cardinality`. Valid values are `SINGLE`, `COLLECTION` and `SINGLE_OR_COLLECTION`.
Why:
- The two flags were mutually exclusive, but this wasn't enforce. You could create a field type that had both `isCollection` and `isCollectionOrScalar` set to true, whuch makes no sense.
- There was no explicit declaration for scalar/single types.
- Checking if a type had only a single flag was tedious.
Thanks to a change a couple months back in which the workflows schema was revised, field types are internal implementation details. Changes to them are non-breaking.
Canvas images are saved by uploading a blob generated from the HTML canvas element. This means the existing metadata handling, inside the graph execution engine, is not available.
To save metadata to canvas images, we need to provide it when uploading that blob.
The upload route now has a `metadata` body param. If this is provided, we use it over any metadata embedded in the image.
Depending on the user behaviour and network conditions, it's possible that we could try to load a workflow before the invocation templates are available.
Fix is simple:
- Use the RTKQ query hook for openAPI schema in App.tsx
- Disable the load workflow buttons until w have templates parsed
Remove our DIY'd reducers, consolidating all node and edge mutations to use `edgesChanged` and `nodesChanged`, which are called by reactflow. This makes the API for manipulating nodes and edges less tangly and error-prone.
We now keep track of the original field type, derived from the python type annotation in addition to the override type provided by `ui_type`.
This makes `ui_type` work more like it sound like it should work - change the UI input component only.
Connection validation is extend to also check the original types. If there is any match between two fields' "final" or original types, we consider the connection valid.This change is backwards-compatible; there is no workflow migration needed.
When clearing the processor config, we shouldn't re-process the image. This logic wasn't handled correctly, but coincidentally the bug didn't cause a user-facing issue.
Without a config, we had a runtime error when trying to build the node for the processor graph and the listener failed.
So while we didn't re-process the image, it was because there was an error, not because the logic was correct.
Fix this by bailing if there is no image or config.
If you change the control model and the new model has the same default processor, we would still re-process the image, even if there was no need to do so.
With this change, if the image and processor config are unchanged, we bail out.
Graph, metadata and workflow all take stringified JSON only. This makes the API consistent and means we don't need to do a round-trip of pydantic parsing when handling this data.
It also prevents a failure mode where an uploaded image's metadata, workflow or graph are old and don't match the current schema.
As before, the frontend does strict validation and parsing when loading these values.
The previous super-minimal implementation had a major issue - the saved workflow didn't take into account batched field values. When generating with multiple iterations or dynamic prompts, the same workflow with the first prompt, seed, etc was stored in each image.
As a result, when the batch results in multiple queue items, only one of the images has the correct workflow - the others are mismatched.
To work around this, we can store the _graph_ in the image metadata (alongside the workflow, if generated via workflow editor). When loading a workflow from an image, we can choose to load the workflow or the graph, preferring the workflow.
Internally, we need to update images router image-saving services. The changes are minimal.
To avoid pydantic errors deserializing the graph, when we extract it from the image, we will leave it as stringified JSON and let the frontend's more sophisticated and flexible parsing handle it. The worklow is also changed to just return stringified JSON, so the API is consistent.
Pending:
- Move model install calls into model manager and create passthrus in invocation_context.
- Consider splitting load_model_from_url() into a call to get the path and a call to load the path.
Invoke is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. Invoke offers an industry leading web-based UI, and serves as the foundation for multiple commercial products.
[Installation and Updates][installation docs] - [Documentation and Tutorials][docs home] - [Bug Reports][github issues] - [Contributing][contributing docs]
| **For users looking for a locally installed, self-hosted and self-managed service** | **For users or teams looking for a cloud-hosted, fully managed service** |
| - Free to use under a commercially-friendly license | - Monthly subscription fee with three different plan levels |
| - Download and install on compatible hardware | - Offers additional benefits, including multi-user support, improved model training, and more |
| - Includes all core studio features: generate, refine, iterate on images, and build workflows | - Hosted in the cloud for easy, secure model access and scalability |
| Quick Start -> [Installation and Updates][installation docs] | More Information -> [www.invoke.com/pricing](https://www.invoke.com/pricing) |
<div align="center">

| [Installation and Updates][installation docs] - [Documentation and Tutorials][docs home] - [Bug Reports][github issues] - [Contributing][contributing docs] |
## Quick Start
# Installation
1. Download and unzip the installer from the bottom of the [latest release][latest release link].
2. Run the installer script.
To get started with Invoke, [Download the Installer](https://www.invoke.com/downloads).
- **Windows**: Double-click on the `install.bat` script.
- **macOS**: Open a Terminal window, drag the file `install.sh` from Finder into the Terminal, and press enter.
- **Linux**: Run `install.sh`.
For detailed step by step instructions, or for instructions on manual/docker installations, visit our documentation on [Installation and Updates][installation docs]
3. When prompted, enter a location for the install and select your GPU type.
4. Once the install finishes, find the directory you selected during install. The default location is `C:\Users\Username\invokeai` for Windows or `~/invokeai` for Linux/macOS.
5. Run the launcher script (`invoke.bat` for Windows, `invoke.sh` for macOS and Linux) the same way you ran the installer script in step 2.
6. Select option 1 to start the application. Once it starts up, open your browser and go to <http://localhost:9090>.
7. Open the model manager tab to install a starter model and then you'll be ready to generate.
More detail, including hardware requirements and manual install instructions, are available in the [installation documentation][installation docs].
## Troubleshooting, FAQ and Support
@@ -66,7 +66,7 @@ Invoke features an organized gallery system for easily storing, accessing, and r
### Other features
- Support for both ckpt and diffusers models
- SD1.5, SD2.0, and SDXL support
- SD1.5, SD2.0, SDXL, and FLUX support
- Upscaling Tools
- Embedding Manager & Support
- Model Manager & Support
@@ -87,15 +87,15 @@ Invoke is a combined effort of [passionate and talented people from across the w
All commands should be run within the `docker` directory: `cd docker`
First things first:
## Quickstart :rocket:
- Ensure that Docker can use your [NVIDIA][nvidia docker docs] or [AMD][amd docker docs] GPU.
- This document assumes a Linux system, but should work similarly under Windows with WSL2.
- We don't recommend running Invoke in Docker on macOS at this time. It works, but very slowly.
On a known working Linux+Docker+CUDA (Nvidia) system, execute `./run.sh` in this directory. It will take a few minutes - depending on your internet speed - to install the core models. Once the application starts up, open `http://localhost:9090` in your browser to Invoke!
## Quickstart
For more configuration options (using an AMD GPU, custom root directory location, etc): read on.
No `docker compose`, no persistence, single command, using the official images:
## Detailed setup
**CUDA (NVIDIA GPU):**
```bash
docker run --runtime=nvidia --gpus=all --publish 9090:9090 ghcr.io/invoke-ai/invokeai
```
**ROCm (AMD GPU):**
```bash
docker run --device /dev/kfd --device /dev/dri --publish 9090:9090 ghcr.io/invoke-ai/invokeai:main-rocm
```
Open `http://localhost:9090` in your browser once the container finishes booting, install some models, and generate away!
### Data persistence
To persist your generated images and downloaded models outside of the container, add a `--volume/-v` flag to the above command, e.g.:
```bash
docker run --volume /some/local/path:/invokeai {...etc...}
```
`/some/local/path/invokeai` will contain all your data.
It can *usually* be reused between different installs of Invoke. Tread with caution and read the release notes!
## Customize the container
The included `run.sh` script is a convenience wrapper around `docker compose`. It can be helpful for passing additional build arguments to `docker compose`. Alternatively, the familiar `docker compose` commands work just as well.
```bash
cd docker
cp .env.sample .env
# edit .env to your liking if you need to; it is well commented.
./run.sh
```
It will take a few minutes to build the image the first time. Once the application starts up, open `http://localhost:9090` in your browser to invoke!
>[!TIP]
>When using the `run.sh` script, the container will continue running after Ctrl+C. To shut it down, use the `docker compose down` command.
## Docker setup in detail
#### Linux
1. Ensure builkit is enabled in the Docker daemon settings (`/etc/docker/daemon.json`)
1. Ensure buildkit is enabled in the Docker daemon settings (`/etc/docker/daemon.json`)
2. Install the `docker compose` plugin using your package manager, or follow a [tutorial](https://docs.docker.com/compose/install/linux/#install-using-the-repository).
- The deprecated `docker-compose` (hyphenated) CLI continues to work for now.
- The deprecated `docker-compose` (hyphenated) CLI probably won't work. Update to a recent version.
3. Ensure docker daemon is able to access the GPU.
-You may need to install [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
> You'll be better off installing Invoke directly on your system, because Docker can not use the GPU on macOS.
If you are still reading:
1. Ensure Docker has at least 16GB RAM
2. Enable VirtioFS for file sharing
3. Enable `docker compose` V2 support
This is done via Docker Desktop preferences
This is done via Docker Desktop preferences.
### Configure Invoke environment
### Configure the Invoke Environment
1. Make a copy of `.env.sample` and name it `.env` (`cp .env.sample .env` (Mac/Linux) or `copy example.env .env` (Windows)). Make changes as necessary. Set `INVOKEAI_ROOT` to an absolute path to:
a. the desired location of the InvokeAI runtime directory, or
b. an existing, v3.0.0 compatible runtime directory.
1. Make a copy of `.env.sample` and name it `.env` (`cp .env.sample .env` (Mac/Linux) or `copy example.env .env` (Windows)). Make changes as necessary. Set `INVOKEAI_ROOT` to an absolute path to the desired location of the InvokeAI runtime directory. It may be an existing directory from a previous installation (post 4.0.0).
1. Execute `run.sh`
The image will be built automatically if needed.
The runtime directory (holding models and outputs) will be created in the location specified by `INVOKEAI_ROOT`. The default location is `~/invokeai`. The runtime directory will be populated with the base configs and models necessary to start generating.
The runtime directory (holding models and outputs) will be created in the location specified by `INVOKEAI_ROOT`. The default location is `~/invokeai`. Navigate to the Model Manager tab and install some models before generating.
### Use a GPU
@@ -43,9 +90,9 @@ The runtime directory (holding models and outputs) will be created in the locati
- WSL2 is *required* for Windows.
- only `x86_64` architecture is supported.
The Docker daemon on the system must be already set up to use the GPU. In case of Linux, this involves installing `nvidia-docker-runtime` and configuring the `nvidia` runtime as default. Steps will be different for AMD. Please see Docker documentation for the most up-to-date instructions for using your GPU with Docker.
The Docker daemon on the system must be already set up to use the GPU. In case of Linux, this involves installing `nvidia-docker-runtime` and configuring the `nvidia` runtime as default. Steps will be different for AMD. Please see Docker/NVIDIA/AMD documentation for the most up-to-date instructions for using your GPU with Docker.
To use an AMD GPU, set `GPU_DRIVER=rocm` in your `.env` file.
To use an AMD GPU, set `GPU_DRIVER=rocm` in your `.env` file before running `./run.sh`.
## Customize
@@ -59,30 +106,12 @@ Values are optional, but setting `INVOKEAI_ROOT` is highly recommended. The defa
INVOKEAI_ROOT=/Volumes/WorkDrive/invokeai
HUGGINGFACE_TOKEN=the_actual_token
CONTAINER_UID=1000
GPU_DRIVER=nvidia
GPU_DRIVER=cuda
```
Any environment variables supported by InvokeAI can be set here - please see the [Configuration docs](https://invoke-ai.github.io/InvokeAI/features/CONFIGURATION/) for further detail.
Any environment variables supported by InvokeAI can be set here. See the [Configuration docs](https://invoke-ai.github.io/InvokeAI/features/CONFIGURATION/) for further detail.
## Even Moar Customizing!
---
See the `docker-compose.yml` file. The `command` instruction can be uncommented and used to run arbitrary startup commands. Some examples below.
### Reconfigure the runtime directory
Can be used to download additional models from the supported model list
In conjunction with `INVOKEAI_ROOT` can be also used to initialize a runtime directory
This release (along with the post1 and post2 follow-on releases) expands support for additional LoRA and LyCORIS models, upgrades diffusers versions, and fixes a few bugs.
### LoRA and LyCORIS Support Improvement
A number of LoRA/LyCORIS fine-tune files (those which alter the text encoder as well as the unet model) were not having the desired effect in InvokeAI. This bug has now been fixed. Full documentation of LoRA support is available at InvokeAI LoRA Support.
Previously, InvokeAI did not distinguish between LoRA/LyCORIS models based on Stable Diffusion v1.5 vs those based on v2.0 and 2.1, leading to a crash when an incompatible model was loaded. This has now been fixed. In addition, the web pulldown menus for LoRA and Textual Inversion selection have been enhanced to show only those files that are compatible with the currently-selected Stable Diffusion model.
Support for the newer LoKR LyCORIS files has been added.
### Library Updates and Speed/Reproducibility Advancements
The major enhancement in this version is that NVIDIA users no longer need to decide between speed and reproducibility. Previously, if you activated the Xformers library, you would see improvements in speed and memory usage, but multiple images generated with the same seed and other parameters would be slightly different from each other. This is no longer the case. Relative to 2.3.5 you will see improved performance when running without Xformers, and even better performance when Xformers is activated. In both cases, images generated with the same settings will be identical.
Here are the new library versions:
Library Version
Torch 2.0.0
Diffusers 0.16.1
Xformers 0.0.19
Compel 1.1.5
Other Improvements
### Performance Improvements
When a model is loaded for the first time, InvokeAI calculates its checksum for incorporation into the PNG metadata. This process could take up to a minute on network-mounted disks and WSL mounts. This release noticeably speeds up the process.
### Bug Fixes
The "import models from directory" and "import from URL" functionality in the console-based model installer has now been fixed.
When running the WebUI, we have reduced the number of times that InvokeAI reaches out to HuggingFace to fetch the list of embeddable Textual Inversion models. We have also caught and fixed a problem with the updater not correctly detecting when another instance of the updater is running
## v2.3.4 <small>(7 April 2023)</small>
What's New in 2.3.4
This features release adds support for LoRA (Low-Rank Adaptation) and LyCORIS (Lora beYond Conventional) models, as well as some minor bug fixes.
### LoRA and LyCORIS Support
LoRA files contain fine-tuning weights that enable particular styles, subjects or concepts to be applied to generated images. LyCORIS files are an extended variant of LoRA. InvokeAI supports the most common LoRA/LyCORIS format, which ends in the suffix .safetensors. You will find numerous LoRA and LyCORIS models for download at Civitai, and a small but growing number at Hugging Face. Full documentation of LoRA support is available at InvokeAI LoRA Support.( Pre-release note: this page will only be available after release)
To use LoRA/LyCORIS models in InvokeAI:
Download the .safetensors files of your choice and place in /path/to/invokeai/loras. This directory was not present in earlier version of InvokeAI but will be created for you the first time you run the command-line or web client. You can also create the directory manually.
Add withLora(lora-file,weight) to your prompts. The weight is optional and will default to 1.0. A few examples, assuming that a LoRA file named loras/sushi.safetensors is present:
family sitting at dinner table eating sushi withLora(sushi,0.9)
family sitting at dinner table eating sushi withLora(sushi, 0.75)
family sitting at dinner table eating sushi withLora(sushi)
Multiple withLora() prompt fragments are allowed. The weight can be arbitrarily large, but the useful range is roughly 0.5 to 1.0. Higher weights make the LoRA's influence stronger. Negative weights are also allowed, which can lead to some interesting effects.
Generate as you usually would! If you find that the image is too "crisp" try reducing the overall CFG value or reducing individual LoRA weights. As is the case with all fine-tunes, you'll get the best results when running the LoRA on top of the model similar to, or identical with, the one that was used during the LoRA's training. Don't try to load a SD 1.x-trained LoRA into a SD 2.x model, and vice versa. This will trigger a non-fatal error message and generation will not proceed.
You can change the location of the loras directory by passing the --lora_directory option to `invokeai.
### New WebUI LoRA and Textual Inversion Buttons
This version adds two new web interface buttons for inserting LoRA and Textual Inversion triggers into the prompt as shown in the screenshot below.
Clicking on one or the other of the buttons will bring up a menu of available LoRA/LyCORIS or Textual Inversion trigger terms. Select a menu item to insert the properly-formatted withLora() or <textual-inversion> prompt fragment into the positive prompt. The number in parentheses indicates the number of trigger terms currently in the prompt. You may click the button again and deselect the LoRA or trigger to remove it from the prompt, or simply edit the prompt directly.
Currently terms are inserted into the positive prompt textbox only. However, some textual inversion embeddings are designed to be used with negative prompts. To move a textual inversion trigger into the negative prompt, simply cut and paste it.
By default the Textual Inversion menu only shows locally installed models found at startup time in /path/to/invokeai/embeddings. However, InvokeAI has the ability to dynamically download and install additional Textual Inversion embeddings from the HuggingFace Concepts Library. You may choose to display the most popular of these (with five or more likes) in the Textual Inversion menu by going to Settings and turning on "Show Textual Inversions from HF Concepts Library." When this option is activated, the locally-installed TI embeddings will be shown first, followed by uninstalled terms from Hugging Face. See The Hugging Face Concepts Library and Importing Textual Inversion files for more information.
### Minor features and fixes
This release changes model switching behavior so that the command-line and Web UIs save the last model used and restore it the next time they are launched. It also improves the behavior of the installer so that the pip utility is kept up to date.
### Known Bugs in 2.3.4
These are known bugs in the release.
The Ancestral DPMSolverMultistepScheduler (k_dpmpp_2a) sampler is not yet implemented for diffusers models and will disappear from the WebUI Sampler menu when a diffusers model is selected.
Windows Defender will sometimes raise Trojan or backdoor alerts for the codeformer.pth face restoration model, as well as the CIDAS/clipseg and runwayml/stable-diffusion-v1.5 models. These are false positives and can be safely ignored. InvokeAI performs a malware scan on all models as they are loaded. For additional security, you should use safetensors models whenever they are available.
## v2.3.3 <small>(28 March 2023)</small>
This is a bugfix and minor feature release.
### Bugfixes
Since version 2.3.2 the following bugs have been fixed:
Bugs
When using legacy checkpoints with an external VAE, the VAE file is now scanned for malware prior to loading. Previously only the main model weights file was scanned.
Textual inversion will select an appropriate batchsize based on whether xformers is active, and will default to xformers enabled if the library is detected.
The batch script log file names have been fixed to be compatible with Windows.
Occasional corruption of the .next_prefix file (which stores the next output file name in sequence) on Windows systems is now detected and corrected.
Support loading of legacy config files that have no personalization (textual inversion) section.
An infinite loop when opening the developer's console from within the invoke.sh script has been corrected.
Documentation fixes, including a recipe for detecting and fixing problems with the AMD GPU ROCm driver.
Enhancements
It is now possible to load and run several community-contributed SD-2.0 based models, including the often-requested "Illuminati" model.
The "NegativePrompts" embedding file, and others like it, can now be loaded by placing it in the InvokeAI embeddings directory.
If no --model is specified at launch time, InvokeAI will remember the last model used and restore it the next time it is launched.
On Linux systems, the invoke.sh launcher now uses a prettier console-based interface. To take advantage of it, install the dialog package using your package manager (e.g. sudo apt install dialog).
When loading legacy models (safetensors/ckpt) you can specify a custom config file and/or a VAE by placing like-named files in the same directory as the model following this example:
my-favorite-model.ckpt
my-favorite-model.yaml
my-favorite-model.vae.pt # or my-favorite-model.vae.safetensors
### Known Bugs in 2.3.3
These are known bugs in the release.
The Ancestral DPMSolverMultistepScheduler (k_dpmpp_2a) sampler is not yet implemented for diffusers models and will disappear from the WebUI Sampler menu when a diffusers model is selected.
Windows Defender will sometimes raise Trojan or backdoor alerts for the codeformer.pth face restoration model, as well as the CIDAS/clipseg and runwayml/stable-diffusion-v1.5 models. These are false positives and can be safely ignored. InvokeAI performs a malware scan on all models as they are loaded. For additional security, you should use safetensors models whenever they are available.
## v2.3.2 <small>(11 March 2023)</small>
This is a bugfix and minor feature release.
### Bugfixes
Since version 2.3.1 the following bugs have been fixed:
Black images appearing for potential NSFW images when generating with legacy checkpoint models and both --no-nsfw_checker and --ckpt_convert turned on.
Black images appearing when generating from models fine-tuned on Stable-Diffusion-2-1-base. When importing V2-derived models, you may be asked to select whether the model was derived from a "base" model (512 pixels) or the 768-pixel SD-2.1 model.
The "Use All" button was not restoring the Hi-Res Fix setting on the WebUI
When using the model installer console app, models failed to import correctly when importing from directories with spaces in their names. A similar issue with the output directory was also fixed.
Crashes that occurred during model merging.
Restore previous naming of Stable Diffusion base and 768 models.
Upgraded to latest versions of diffusers, transformers, safetensors and accelerate libraries upstream. We hope that this will fix the assertion NDArray > 2**32 issue that MacOS users have had when generating images larger than 768x768 pixels. Please report back.
As part of the upgrade to diffusers, the location of the diffusers-based models has changed from models/diffusers to models/hub. When you launch InvokeAI for the first time, it will prompt you to OK a one-time move. This should be quick and harmless, but if you have modified your models/diffusers directory in some way, for example using symlinks, you may wish to cancel the migration and make appropriate adjustments.
New "Invokeai-batch" script
### Invoke AI Batch
2.3.2 introduces a new command-line only script called invokeai-batch that can be used to generate hundreds of images from prompts and settings that vary systematically. This can be used to try the same prompt across multiple combinations of models, steps, CFG settings and so forth. It also allows you to template prompts and generate a combinatorial list like:
a shack in the mountains, photograph
a shack in the mountains, watercolor
a shack in the mountains, oil painting
a chalet in the mountains, photograph
a chalet in the mountains, watercolor
a chalet in the mountains, oil painting
a shack in the desert, photograph
...
If you have a system with multiple GPUs, or a single GPU with lots of VRAM, you can parallelize generation across the combinatorial set, reducing wait times and using your system's resources efficiently (make sure you have good GPU cooling).
To try invokeai-batch out. Launch the "developer's console" using the invoke launcher script, or activate the invokeai virtual environment manually. From the console, give the command invokeai-batch --help in order to learn how the script works and create your first template file for dynamic prompt generation.
### Known Bugs in 2.3.2
These are known bugs in the release.
The Ancestral DPMSolverMultistepScheduler (k_dpmpp_2a) sampler is not yet implemented for diffusers models and will disappear from the WebUI Sampler menu when a diffusers model is selected.
Windows Defender will sometimes raise a Trojan alert for the codeformer.pth face restoration model. As far as we have been able to determine, this is a false positive and can be safely whitelisted.
## v2.3.1 <small>(22 February 2023)</small>
This is primarily a bugfix release, but it does provide several new features that will improve the user experience.
### Enhanced support for model management
InvokeAI now makes it convenient to add, remove and modify models. You can individually import models that are stored on your local system, scan an entire folder and its subfolders for models and import them automatically, and even directly import models from the internet by providing their download URLs. You also have the option of designating a local folder to scan for new models each time InvokeAI is restarted.
There are three ways of accessing the model management features:
From the WebUI, click on the cube to the right of the model selection menu. This will bring up a form that allows you to import models individually from your local disk or scan a directory for models to import.
Using the Model Installer App
Choose option (5) download and install models from the invoke launcher script to start a new console-based application for model management. You can use this to select from a curated set of starter models, or import checkpoint, safetensors, and diffusers models from a local disk or the internet. The example below shows importing two checkpoint URLs from popular SD sites and a HuggingFace diffusers model using its Repository ID. It also shows how to designate a folder to be scanned at startup time for new models to import.
Command-line users can start this app using the command invokeai-model-install.
Using the Command Line Client (CLI)
The !install_model and !convert_model commands have been enhanced to allow entering of URLs and local directories to scan and import. The first command installs .ckpt and .safetensors files as-is. The second one converts them into the faster diffusers format before installation.
Internally InvokeAI is able to probe the contents of a .ckpt or .safetensors file to distinguish among v1.x, v2.x and inpainting models. This means that you do not need to include "inpaint" in your model names to use an inpainting model. Note that Stable Diffusion v2.x models will be autoconverted into a diffusers model the first time you use it.
Please see INSTALLING MODELS for more information on model management.
### An Improved Installer Experience
The installer now launches a console-based UI for setting and changing commonly-used startup options:
After selecting the desired options, the installer installs several support models needed by InvokeAI's face reconstruction and upscaling features and then launches the interface for selecting and installing models shown earlier. At any time, you can edit the startup options by launching invoke.sh/invoke.bat and entering option (6) change InvokeAI startup options
Command-line users can launch the new configure app using invokeai-configure.
This release also comes with a renewed updater. To do an update without going through a whole reinstallation, launch invoke.sh or invoke.bat and choose option (9) update InvokeAI . This will bring you to a screen that prompts you to update to the latest released version, to the most current development version, or any released or unreleased version you choose by selecting the tag or branch of the desired version.
Command-line users can run this interface by typing invokeai-configure
### Image Symmetry Options
There are now features to generate horizontal and vertical symmetry during generation. The way these work is to wait until a selected step in the generation process and then to turn on a mirror image effect. In addition to generating some cool images, you can also use this to make side-by-side comparisons of how an image will look with more or fewer steps. Access this option from the WebUI by selecting Symmetry from the image generation settings, or within the CLI by using the options --h_symmetry_time_pct and --v_symmetry_time_pct (these can be abbreviated to --h_sym and --v_sym like all other options).
### A New Unified Canvas Look
This release introduces a beta version of the WebUI Unified Canvas. To try it out, open up the settings dialogue in the WebUI (gear icon) and select Use Canvas Beta Layout:
Refresh the screen and go to to Unified Canvas (left side of screen, third icon from the top). The new layout is designed to provide more space to work in and to keep the image controls close to the image itself:
Model conversion and merging within the WebUI
The WebUI now has an intuitive interface for model merging, as well as for permanent conversion of models from legacy .ckpt/.safetensors formats into diffusers format. These options are also available directly from the invoke.sh/invoke.bat scripts.
An easier way to contribute translations to the WebUI
We have migrated our translation efforts to Weblate, a FOSS translation product. Maintaining the growing project's translations is now far simpler for the maintainers and community. Please review our brief translation guide for more information on how to contribute.
Numerous internal bugfixes and performance issues
### Bug Fixes
This releases quashes multiple bugs that were reported in 2.3.0. Major internal changes include upgrading to diffusers 0.13.0, and using the compel library for prompt parsing. See Detailed Change Log for a detailed list of bugs caught and squished.
Summary of InvokeAI command line scripts (all accessible via the launcher menu)
Command Description
invokeai Command line interface
invokeai --web Web interface
invokeai-model-install Model installer with console forms-based front end
invokeai-ti --gui Textual inversion, with a console forms-based front end
invokeai-merge --gui Model merging, with a console forms-based front end
invokeai-configure Startup configuration; can also be used to reinstall support models
invokeai-update InvokeAI software updater
### Known Bugs in 2.3.1
These are known bugs in the release.
MacOS users generating 768x768 pixel images or greater using diffusers models may experience a hard crash with assertion NDArray > 2**32 This appears to be an issu...
## v2.3.0 <small>(15 January 2023)</small>
**Transition to diffusers
Version 2.3 provides support for both the traditional `.ckpt` weight
checkpoint files as well as the HuggingFace `diffusers` format. This
introduces several changes you should know about.
1. The models.yaml format has been updated. There are now two
different type of configuration stanza. The traditional ckpt
one will look like this, with a `format` of `ckpt` and a
`weights` field that points to the absolute or ROOTDIR-relative
location of the ckpt file.
```
inpainting-1.5:
description: RunwayML SD 1.5 model optimized for inpainting (4.27 GB)
1. On CUDA systems, the 768 pixel stable-diffusion-2.0 and
stable-diffusion-2.1 models can only be run as `diffusers` models
when the `xformer` library is installed and configured. Without
`xformers`, InvokeAI returns black images.
2. Inpainting and outpainting have regressed in quality.
Both these issues are being actively worked on.
## v2.2.4 <small>(11 December 2022)</small>
**the `invokeai` directory**
Previously there were two directories to worry about, the directory that
contained the InvokeAI source code and the launcher scripts, and the `invokeai`
directory that contained the models files, embeddings, configuration and
outputs. With the 2.2.4 release, this dual system is done away with, and
everything, including the `invoke.bat` and `invoke.sh` launcher scripts, now
live in a directory named `invokeai`. By default this directory is located in
your home directory (e.g. `\Users\yourname` on Windows), but you can select
where it goes at install time.
After installation, you can delete the install directory (the one that the zip
file creates when it unpacks). Do **not** delete or move the `invokeai`
directory!
**Initialization file `invokeai/invokeai.init`**
You can place frequently-used startup options in this file, such as the default
number of steps or your preferred sampler. To keep everything in one place, this
file has now been moved into the `invokeai` directory and is named
`invokeai.init`.
**To update from Version 2.2.3**
The easiest route is to download and unpack one of the 2.2.4 installer files.
When it asks you for the location of the `invokeai` runtime directory, respond
with the path to the directory that contains your 2.2.3 `invokeai`. That is, if
`invokeai` lives at `C:\Users\fred\invokeai`, then answer with `C:\Users\fred`
and answer "Y" when asked if you want to reuse the directory.
The `update.sh` (`update.bat`) script that came with the 2.2.3 source installer
does not know about the new directory layout and won't be fully functional.
**To update to 2.2.5 (and beyond) there's now an update path**
As they become available, you can update to more recent versions of InvokeAI
using an `update.sh` (`update.bat`) script located in the `invokeai` directory.
Running it without any arguments will install the most recent version of
InvokeAI. Alternatively, you can get set releases by running the `update.sh`
script with an argument in the command shell. This syntax accepts the path to
the desired release's zip file, which you can find by clicking on the green
"Code" button on this repository's home page.
**Other 2.2.4 Improvements**
- Fix InvokeAI GUI initialization by @addianto in #1687
- fix link in documentation by @lstein in #1728
- Fix broken link by @ShawnZhong in #1736
- Remove reference to binary installer by @lstein in #1731
- documentation fixes for 2.2.3 by @lstein in #1740
- Modify installer links to point closer to the source installer by @ebr in
#1745
- add documentation warning about 1650/60 cards by @lstein in #1753
- Fix Linux source URL in installation docs by @andybearman in #1756
- Make install instructions discoverable in readme by @damian0815 in #1752
- typo fix by @ofirkris in #1755
- Non-interactive model download (support HUGGINGFACE_TOKEN) by @ebr in #1578
- fix(srcinstall): shell installer - cp scripts instead of linking by @tildebyte
in #1765
- stability and usage improvements to binary & source installers by @lstein in
#1760
- fix off-by-one bug in cross-attention-control by @damian0815 in #1774
- Eventually update APP_VERSION to 2.2.3 by @spezialspezial in #1768
- invoke script cds to its location before running by @lstein in #1805
- Make PaperCut and VoxelArt models load again by @lstein in #1730
- Fix --embedding_directory / --embedding_path not working by @blessedcoolant in
#1817
- Clean up readme by @hipsterusername in #1820
- Optimized Docker build with support for external working directory by @ebr in
#1544
- disable pushing the cloud container by @mauwii in #1831
- Fix docker push github action and expand with additional metadata by @ebr in
#1837
- Fix Broken Link To Notebook by @VedantMadane in #1821
- Account for flat models by @spezialspezial in #1766
- Update invoke.bat.in isolate environment variables by @lynnewu in #1833
- Arch Linux Specific PatchMatch Instructions & fixing conda install on linux by
@SammCheese in #1848
- Make force free GPU memory work in img2img by @addianto in #1844
- New installer by @lstein
## v2.2.3 <small>(2 December 2022)</small>
!!! Note
This point release removes references to the binary installer from the
installation guide. The binary installer is not stable at the current
time. First time users are encouraged to use the "source" installer as
described in [Installing InvokeAI with the Source Installer](installation/deprecated_documentation/INSTALL_SOURCE.md)
With InvokeAI 2.2, this project now provides enthusiasts and professionals a
robust workflow solution for creating AI-generated and human facilitated
compositions. Additional enhancements have been made as well, improving safety,
ease of use, and installation.
Optimized for efficiency, InvokeAI needs only ~3.5GB of VRAM to generate a
512x768 image (and less for smaller images), and is compatible with
Windows/Linux/Mac (M1 & M2).
You can see the [release video](https://youtu.be/hIYBfDtKaus) here, which
introduces the main WebUI enhancement for version 2.2 -
[The Unified Canvas](features/UNIFIED_CANVAS.md). This new workflow is the
biggest enhancement added to the WebUI to date, and unlocks a stunning amount of
potential for users to create and iterate on their creations. The following
sections describe what's new for InvokeAI.
## v2.2.2 <small>(30 November 2022)</small>
!!! note
The binary installer is not ready for prime time. First time users are recommended to install via the "source" installer accessible through the links at the bottom of this page.****
With InvokeAI 2.2, this project now provides enthusiasts and professionals a
robust workflow solution for creating AI-generated and human facilitated
compositions. Additional enhancements have been made as well, improving safety,
ease of use, and installation.
Optimized for efficiency, InvokeAI needs only ~3.5GB of VRAM to generate a
512x768 image (and less for smaller images), and is compatible with
Windows/Linux/Mac (M1 & M2).
You can see the [release video](https://youtu.be/hIYBfDtKaus) here, which
introduces the main WebUI enhancement for version 2.2 -
The app is published in twice, in different build formats.
The Invoke application is published as a python package on [PyPI]. This includes both a source distribution and built distribution (a wheel).
- A [PyPI] distribution. This includes both a source distribution and built distribution (a wheel). Users install with `pip install invokeai`. The updater uses this build.
- An installer on the [InvokeAI Releases Page]. This is a zip file with install scripts and a wheel. This is only used for new installs.
Most users install it with the [Launcher](https://github.com/invoke-ai/launcher/), others with `pip`.
The launcher uses GitHub as the source of truth for available releases.
## Broad Strokes
- Merge all changes and bump the version in the codebase.
- Tag the release commit.
- Wait for the release workflow to complete.
- Approve the PyPI publish jobs.
- Write GH release notes.
## General Prep
Make a developer call-out for PRs to merge. Merge and test things out.
While the release workflow does not include end-to-end tests, it does pause before publishing so you can download and test the final build.
Make a developer call-out for PRs to merge. Merge and test things out. Bump the version by editing `invokeai/version/invokeai_version.py`.
## Release Workflow
The `release.yml` workflow runs a number of jobs to handle code checks, tests, build and publish on PyPI.
It is triggered on **tag push**, when the tag matches `v*`. It doesn't matter if you've prepped a release branch like `release/v3.5.0` or are releasing from `main` - it works the same.
> Because commits are reference-counted, it is safe to create a release branch, tag it, let the workflow run, then delete the branch. So long as the tag exists, that commit will exist.
It is triggered on **tag push**, when the tag matches `v*`.
### Triggering the Workflow
Run `make tag-release` to tag the current commit and kick off the workflow.
Ensure all commits that should be in the release are merged, and you have pulled them locally.
The release may also be dispatched [manually].
Double-check that you have checked out the commit that will represent the release (typically the latest commit on `main`).
Run `make tag-release` to tag the current commit and kick off the workflow. You will be prompted to provide a message - use the version specifier.
If this version's tag already exists for some reason (maybe you had to make a last minute change), the script will overwrite it.
> In case you cannot use the Make target, the release may also be dispatched [manually] via GH.
### Workflow Jobs and Process
The workflow consists of a number of concurrently-run jobs, and two final publish jobs.
The workflow consists of a number of concurrently-run checks and tests, then two final publish jobs.
The publish jobs require manual approval and are only run if the other jobs succeed.
#### `check-version` Job
This job checks that the git ref matches the app version. It matches the ref against the `__version__` variable in `invokeai/version/invokeai_version.py`.
When the workflow is triggered by tag push, the ref is the tag. If the workflow is run manually, the ref is the target selected from the **Use workflow from** dropdown.
This job ensures that the `invokeai` python package version specifier matches the tag for the release. The version specifier is pulled from the `__version__` variable in `invokeai/version/invokeai_version.py`.
This job uses [samuelcolvin/check-python-version].
@@ -43,62 +52,47 @@ This job uses [samuelcolvin/check-python-version].
#### Check and Test Jobs
Next, these jobs run and must pass. They are the same jobs that are run for every PR.
- **`python-tests`**: runs `pytest` on matrix of platforms
- **`python-checks`**: runs `ruff` (format and lint)
- **`frontend-tests`**: runs `vitest`
- **`frontend-checks`**: runs `prettier` (format), `eslint` (lint), `dpdm` (circular refs), `tsc` (static type check) and `knip` (unused imports)
- **`typegen-checks`**: ensures the frontend and backend types are synced
> **TODO** We should add `mypy` or `pyright` to the **`check-python`** job.
#### `build-wheel` Job
> **TODO** We should add an end-to-end test job that generates an image.
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `./scripts/build_wheel.sh` and uploads `dist.zip`, which contains the wheel and unarchived build.
#### `build-installer` Job
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `installer/create_installer.sh` and uploads two artifacts:
- **`dist`**: the python distribution, to be published on PyPI
- **`InvokeAI-installer-${VERSION}.zip`**: the installer to be included in the GitHub release
You don't need to download or test these artifacts.
#### Sanity Check & Smoke Test
At this point, the release workflow pauses as the remaining publish jobs require approval. Time to test the installer.
At this point, the release workflow pauses as the remaining publish jobs require approval.
Because the installer pulls from PyPI, and we haven't published to PyPI yet, you will need to install from the wheel:
It's possible to test the python package before it gets published to PyPI. We've never had problems with it, so it's not necessary to do this.
- Download and unzip `dist.zip` and the installer from the **Summary** tab of the workflow
- Run the installer script using the `--wheel` CLI arg, pointing at the wheel:
But, if you want to be extra-super careful, here's how to test it:
- Install to a temporary directory so you get the new user experience
- Download a model and generate
> The same wheel file is bundled in the installer and in the `dist` artifact, which is uploaded to PyPI. You should end up with the exactly the same installation as if the installer got the wheel from PyPI.
- Download the `dist.zip` build artifact from the `build-wheel` job
- Unzip it and find the wheel file
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/) - but instead of installing from PyPI, install from the wheel
- Test the app
##### Something isn't right
If testing reveals any issues, no worries. Cancel the workflow, which will cancel the pending publish jobs (you didn't approve them prematurely, right?).
Now you can start from the top:
- Fix the issues and PR the fixes per usual
- Get the PR approved and merged per usual
- Switch to `main` and pull in the fixes
- Run `make tag-release` to move the tag to `HEAD` (which has the fixes) and kick off the release workflow again
- Re-do the sanity check
If testing reveals any issues, no worries. Cancel the workflow, which will cancel the pending publish jobs (you didn't approve them prematurely, right?) and start over.
#### PyPI Publish Jobs
The publish jobs will run if any of the previous jobs fail.
The publish jobs will not run if any of the previous jobs fail.
They use [GitHub environments], which are configured as [trusted publishers] on PyPI.
Both jobs require a maintainer to approve them from the workflow's **Summary** tab.
Both jobs require a @hipsterusername or @psychedelicious to approve them from the workflow's **Summary** tab.
- Click the **Review deployments** button
- Select the environment (either `testpypi` or `pypi`)
- Select the environment (either `testpypi` or `pypi` - typically you select both)
- Click **Approve and deploy**
> **If the version already exists on PyPI, the publish jobs will fail.** PyPI only allows a given version to be published once - you cannot change it. If version published on PyPI has a problem, you'll need to "fail forward" by bumping the app version and publishing a followup release.
@@ -113,46 +107,33 @@ If there are no incidents, contact @hipsterusername or @lstein, who have owner a
Publishes the distribution on the [Test PyPI] index, using the `testpypi` GitHub environment.
This job is not required for the production PyPI publish, but included just in case you want to test the PyPI release.
This job is not required for the production PyPI publish, but included just in case you want to test the PyPI release for some reason:
If approved and successful, you could try out the test release like this:
- Approve this publish job without approving the prod publish
- Let it finish
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/), making sure to use the Test PyPI index URL: `https://test.pypi.org/simple/`
- Test the app
#### `publish-pypi` Job
Publishes the distribution on the production PyPI index, using the `pypi` GitHub environment.
## Publish the GitHub Release with installer
It's a good idea to wait to approve and run this job until you have the release notes ready!
Once the release is published to PyPI, it's time to publish the GitHub release.
## Prep and publish the GitHub Release
1. [Draft a new release] on GitHub, choosing the tag that triggered the release.
1. Write the release notes, describing important changes. The **Generate release notes** button automatically inserts the changelog and new contributors, and you can copy/paste the intro from previous releases.
1. Use `scripts/get_external_contributions.py` to get a list of external contributions to shout out in the release notes.
1. Upload the zip file created in **`build`** job into the Assets section of the release notes.
1. Check **Set as a pre-release** if it's a pre-release.
1. Check **Create a discussion for this release**.
1. Publish the release.
1. Announce the release in Discord.
> **TODO** Workflows can create a GitHub release from a template and upload release assets. One popular action to handle this is [ncipollo/release-action]. A future enhancement to the release process could set this up.
## Manual Build
The `build installer` workflow can be dispatched manually. This is useful to test the installer for a given branch or tag.
No checks are run, it just builds.
2. The **Generate release notes** button automatically inserts the changelog and new contributors. Make sure to select the correct tags for this release and the last stable release. GH often selects the wrong tags - do this manually.
3. Write the release notes, describing important changes. Contributions from community members should be shouted out. Use the GH-generated changelog to see all contributors. If there are Weblate translation updates, open that PR and shout out every person who contributed a translation.
4. Check **Set as a pre-release** if it's a pre-release.
5. Approve and wait for the `publish-pypi` job to finish if you haven't already.
6. Publish the GH release.
7. Post the release in Discord in the [releases](https://discord.com/channels/1020123559063990373/1149260708098359327) channel with abbreviated notes. For example:
> It's a pretty big one - Form Builder, Metadata Nodes (thanks @SkunkWorxDark!), and much more.
8. Right click the message in releases and copy the link to it. Then, post that link in the [new-release-discussion](https://discord.com/channels/1020123559063990373/1149506274971631688) channel. For example:
The provided token will be added as a `Bearer` token to the network requests to download the model files. As far as we know, this works for all model marketplaces that require authorization.
!!! tip "HuggingFace Models"
If you get an error when installing a HF model using a URL instead of repo id, you may need to [set up a HF API token](https://huggingface.co/settings/tokens) and add an entry for it under `remote_api_tokens`. Use `huggingface.co` for `url_regex`.
#### Model Hashing
Models are hashed during installation, providing a stable identifier for models across all platforms. Hashing is a one-time operation.
@@ -155,12 +162,12 @@ log_handlers:
locally or to a remote logging machine. `syslog` offers a variety
of configuration options:
```
syslog=/dev/log` - log to the /dev/log device
syslog=localhost` - log to the network logger running on the local machine
syslog=localhost:512` - same as above, but using a non-standard port
@@ -50,7 +50,7 @@ Applications are built on top of the invoke framework. They should construct `in
### Web UI
The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.tiangolo.com/) and [Socket.IO](https://socket.io/). The frontend code is found in `/frontend` and the backend code is found in `/ldm/invoke/app/api_app.py` and `/ldm/invoke/app/api/`. The code is further organized as such:
The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.tiangolo.com/) and [Socket.IO](https://socket.io/). The frontend code is found in `/invokeai/frontend` and the backend code is found in `/invokeai/app/api_app.py` and `/invokeai/app/api/`. The code is further organized as such:
| Component | Description |
| --- | --- |
@@ -62,7 +62,7 @@ The Web UI is built on top of an HTTP API built with [FastAPI](https://fastapi.t
### CLI
The CLI is built automatically from invocation metadata, and also supports invocation piping and auto-linking. Code is available in `/ldm/invoke/app/cli_app.py`.
The CLI is built automatically from invocation metadata, and also supports invocation piping and auto-linking. Code is available in `/invokeai/frontend/cli`.
## Invoke
@@ -70,7 +70,7 @@ The Invoke framework provides the interface to the underlying AI systems and is
### Invoker
The invoker (`/ldm/invoke/app/services/invoker.py`) is the primary interface through which applications interact with the framework. Its primary purpose is to create, manage, and invoke sessions. It also maintains two sets of services:
The invoker (`/invokeai/app/services/invoker.py`) is the primary interface through which applications interact with the framework. Its primary purpose is to create, manage, and invoke sessions. It also maintains two sets of services:
- **invocation services**, which are used by invocations to interact with core functionality.
- **invoker services**, which are used by the invoker to manage sessions and manage the invocation queue.
@@ -82,12 +82,12 @@ The session graph does not support looping. This is left as an application probl
### Invocations
Invocations represent individual units of execution, with inputs and outputs. All invocations are located in `/ldm/invoke/app/invocations`, and are all automatically discovered and made available in the applications. These are the primary way to expose new functionality in Invoke.AI, and the [implementation guide](INVOCATIONS.md) explains how to add new invocations.
Invocations represent individual units of execution, with inputs and outputs. All invocations are located in `/invokeai/app/invocations`, and are all automatically discovered and made available in the applications. These are the primary way to expose new functionality in Invoke.AI, and the [implementation guide](INVOCATIONS.md) explains how to add new invocations.
### Services
Services provide invocations access AI Core functionality and other necessary functionality (e.g. image storage). These are available in `/ldm/invoke/app/services`. As a general rule, new services should provide an interface as an abstract base class, and may provide a lightweight local implementation by default in their module. The goal for all services should be to enable the usage of different implementations (e.g. using cloud storage for image storage), but should not load any module dependencies unless that implementation has been used (i.e. don't import anything that won't be used, especially if it's expensive to import).
Services provide invocations access AI Core functionality and other necessary functionality (e.g. image storage). These are available in `/invokeai/app/services`. As a general rule, new services should provide an interface as an abstract base class, and may provide a lightweight local implementation by default in their module. The goal for all services should be to enable the usage of different implementations (e.g. using cloud storage for image storage), but should not load any module dependencies unless that implementation has been used (i.e. don't import anything that won't be used, especially if it's expensive to import).
## AI Core
The AI Core is represented by the rest of the code base (i.e. the code outside of `/ldm/invoke/app/`).
The AI Core is represented by the rest of the code base (i.e. the code outside of `/invokeai/app/`).
We use [mkdocs](https://www.mkdocs.org) for our documentation with the [material theme](https://squidfunk.github.io/mkdocs-material/). Documentation is written in markdown files under the `./docs` folder and then built into a static website for hosting with GitHub Pages at [invoke-ai.github.io/InvokeAI](https://invoke-ai.github.io/InvokeAI).
To contribute to the documentation you'll need to install the dependencies. Note
the use of `"`.
@@ -50,6 +39,7 @@ and will be required for testing the changes you make to the code.
### Tests
See the [tests documentation](./TESTS.md) for information about running and writing tests.
### Reloading Changes
Experimenting with changes to the Python source code is a drag if you have to re-start the server —
@@ -63,7 +53,6 @@ running server on the fly.
This will allow you to avoid restarting the server (and reloading models) in most cases, but there are some caveats; see
the [jurigged documentation](https://github.com/breuleux/jurigged#caveats) for details.
## Front End
<!--#TODO: get input from blessedcoolant here, for the moment inserted the frontend README via snippets extension.-->
@@ -250,7 +239,7 @@ Consult the
get it set up.
Suggest using VSCode's included settings sync so that your remote dev host has
all the same app settings and extensions automagically.
all the same app settings and extensions automatically.
We use `pytest` to run the backend python tests. (See [pyproject.toml](/pyproject.toml) for the default `pytest` options.)
We use `pytest` to run the backend python tests. (See [pyproject.toml](https://github.com/invoke-ai/InvokeAI/blob/main/pyproject.toml) for the default `pytest` options.)
## Fast vs. Slow
All tests are categorized as either 'fast' (no test annotation) or 'slow' (annotated with the `@pytest.mark.slow` decorator).
@@ -33,7 +33,7 @@ pytest tests -m ""
## Test Organization
All backend tests are in the [`tests/`](/tests/) directory. This directory mirrors the organization of the `invokeai/` directory. For example, tests for `invokeai/model_management/model_manager.py` would be found in `tests/model_management/test_model_manager.py`.
All backend tests are in the [`tests/`](https://github.com/invoke-ai/InvokeAI/tree/main/tests) directory. This directory mirrors the organization of the `invokeai/` directory. For example, tests for `invokeai/model_management/model_manager.py` would be found in `tests/model_management/test_model_manager.py`.
TODO: The above statement is aspirational. A re-organization of legacy tests is required to make it true.
If you are looking to help to with a code contribution, InvokeAI uses several different technologies under the hood: Python (Pydantic, FastAPI, diffusers) and Typescript (React, Redux Toolkit, ChakraUI, Mantine, Konva). Familiarity with StableDiffusion and image generation concepts is helpful, but not essential.
If you are looking to help with a code contribution, InvokeAI uses several different technologies under the hood: Python (Pydantic, FastAPI, diffusers) and Typescript (React, Redux Toolkit, ChakraUI, Mantine, Konva). Familiarity with StableDiffusion and image generation concepts is helpful, but not essential.
## **Get Started**
@@ -12,7 +12,7 @@ To get started, take a look at our [new contributors checklist](newContributorCh
Once you're setup, for more information, you can review the documentation specific to your area of interest:
@@ -20,15 +20,15 @@ Once you're setup, for more information, you can review the documentation specif
If you don't feel ready to make a code contribution yet, no problem! You can also help out in other ways, such as [documentation](documentation.md), [translation](translation.md) or helping support other users and triage issues as they're reported in GitHub.
There are two paths to making a development contribution:
There are two paths to making a development contribution:
1. Choosing an open issue to address. Open issues can be found in the [Issues](https://github.com/invoke-ai/InvokeAI/issues?q=is%3Aissue+is%3Aopen) section of the InvokeAI repository. These are tagged by the issue type (bug, enhancement, etc.) along with the “good first issues” tag denoting if they are suitable for first time contributors.
1. Additional items can be found on our [roadmap](https://github.com/orgs/invoke-ai/projects/7). The roadmap is organized in terms of priority, and contains features of varying size and complexity. If there is an inflight item you’d like to help with, reach out to the contributor assigned to the item to see how you can help.
1. Additional items can be found on our [roadmap](https://github.com/orgs/invoke-ai/projects/7). The roadmap is organized in terms of priority, and contains features of varying size and complexity. If there is an inflight item you’d like to help with, reach out to the contributor assigned to the item to see how you can help.
2. Opening a new issue or feature to add. **Please make sure you have searched through existing issues before creating new ones.**
*Regardless of what you choose, please post in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord before you start development in order to confirm that the issue or feature is aligned with the current direction of the project. We value our contributors time and effort and want to ensure that no one’s time is being misspent.*
## Best Practices:
## Best Practices:
* Keep your pull requests small. Smaller pull requests are more likely to be accepted and merged
* Comments! Commenting your code helps reviewers easily understand your contribution
* Use Python and Typescript’s typing systems, and consider using an editor with [LSP](https://microsoft.github.io/language-server-protocol/) support to streamline development
@@ -38,7 +38,7 @@ There are two paths to making a development contribution:
If you need help, you can ask questions in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord.
For frontend related work, **@psychedelicious** is the best person to reach out to.
For frontend related work, **@psychedelicious** is the best person to reach out to.
For backend related work, please reach out to **@blessedcoolant**, **@lstein**, **@StAlKeR7779** or **@psychedelicious**.
Documentation is an important part of any open source project. It provides a clear and concise way to communicate how the software works, how to use it, and how to troubleshoot issues. Without proper documentation, it can be difficult for users to understand the purpose and functionality of the project.
Documentation is an important part of any open source project. It provides a clear and concise way to communicate how the software works, how to use it, and how to troubleshoot issues. Without proper documentation, it can be difficult for users to understand the purpose and functionality of the project.
## Contributing
All documentation is maintained in the InvokeAI GitHub repository. If you come across documentation that is out of date or incorrect, please submit a pull request with the necessary changes.
All documentation is maintained in our [GitHub repository](https://github.com/invoke-ai/InvokeAI). If you come across documentation that is out of date or incorrect, please submit a pull request with the necessary changes.
When updating or creating documentation, please keep in mind InvokeAI is a tool for everyone, not just those who have familiarity with generative art.
When updating or creating documentation, please keep in mind Invoke is a tool for everyone, not just those who have familiarity with generative art.
## Help & Questions
Please ping @imic or @hipsterusernamein the [Discord](https://discord.com/channels/1020123559063990373/1049495067846524939) if you have any questions.
Please ping @hipsterusernameon [Discord](https://discord.gg/ZmtBAhwWhy) if you have any questions.
If you're a new contributor to InvokeAI or Open Source Projects, this is the guide for you.
If you're a new contributor to InvokeAI or Open Source Projects, this is the guide for you.
## New Contributor Checklist
- [x] Set up your local development environment & fork of InvokAI by following [the steps outlined here](../../installation/020_INSTALL_MANUAL.md#developer-install)
- [x] Set up your local tooling with [this guide](InvokeAI/contributing/LOCAL_DEVELOPMENT/#developing-invokeai-in-vscode). Feel free to skip this step if you already have tooling you're comfortable with.
- [x] Set up your local development environment & fork of InvokAI by following [the steps outlined here](../dev-environment.md)
- [x] Set up your local tooling with [this guide](../LOCAL_DEVELOPMENT.md). Feel free to skip this step if you already have tooling you're comfortable with.
- [x] Familiarize yourself with [Git](https://www.atlassian.com/git) & our project structure by reading through the [development documentation](development.md)
- [x] Join the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord
- [x] Choose an issue to work on! This can be achieved by asking in the #dev-chat channel, tackling a [good first issue](https://github.com/invoke-ai/InvokeAI/contribute) or finding an item on the [roadmap](https://github.com/orgs/invoke-ai/projects/7). If nothing in any of those places catches your eye, feel free to work on something of interest to you!
- [x] Choose an issue to work on! This can be achieved by asking in the #dev-chat channel, tackling a [good first issue](https://github.com/invoke-ai/InvokeAI/contribute) or finding an item on the [roadmap](https://github.com/orgs/invoke-ai/projects/7). If nothing in any of those places catches your eye, feel free to work on something of interest to you!
- [x] Make your first Pull Request with the guide below
- [x] Happy development! Don't be afraid to ask for help - we're happy to help you contribute!
## How do I make a contribution?
Never made an open source contribution before? Wondering how contributions work in our project? Here's a quick rundown!
Before starting these steps, ensure you have your local environment [configured for development](../LOCAL_DEVELOPMENT.md).
1. Find a [good first issue](https://github.com/invoke-ai/InvokeAI/contribute) that you are interested in addressing or a feature that you would like to add. Then, reach out to our team in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord to ensure you are setup for success.
1. Find a [good first issue](https://github.com/invoke-ai/InvokeAI/contribute) that you are interested in addressing or a feature that you would like to add. Then, reach out to our team in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord to ensure you are setup for success.
2. Fork the [InvokeAI](https://github.com/invoke-ai/InvokeAI) repository to your GitHub profile. This means that you will have a copy of the repository under**your-GitHub-username/InvokeAI**.
3. Clone the repository to your local machine using:
If you're unfamiliar with using Git through the commandline, [GitHub Desktop](https://desktop.github.com) is a easy-to-use alternative with a UI. You can do all the same steps listed here, but through the interface.
If you're unfamiliar with using Git through the commandline, [GitHub Desktop](https://desktop.github.com) is a easy-to-use alternative with a UI. You can do all the same steps listed here, but through the interface. 4. Create a new branch for your fix using:
```bash
git checkout -b branch-name-here
```
5. Make the appropriate changes for the issue you are trying to address or the feature that you want to add.
6. Add the file contents of the changed files to the "snapshot" git uses to manage the state of the project, also known as the index:
```bash
git add -A
```
```bash
git add -A
```
7. Store the contents of the index with a descriptive message.
```bash
git commit -m "Insert a short message of the changes made here"
```
```bash
git commit -m "Insert a short message of the changes made here"
```
8. Push the changes to the remote repository using
```bash
git push origin branch-name-here
```
```bash
git push origin branch-name-here
```
9. Submit a pull request to the **main** branch of the InvokeAI repository. If you're not sure how to, [follow this guide](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request)
10. Title the pull request with a short description of the changes made and the issue or bug number associated with your change. For example, you can title an issue like so "Added more log outputting to resolve #1234".
11. In the description of the pull request, explain the changes that you made, any issues you think exist with the pull request you made, and any questions you have for the maintainer. It's OK if your pull request is not perfect (no pull request is), the reviewer will be able to help you fix any problems and improve it!
13. Make changes to the pull request if the reviewer(s) recommend them.
14. Celebrate your success after your pull request is merged!
If you’d like to learn more about contributing to Open Source projects, here is a[Getting Started Guide](https://opensource.com/article/19/7/create-pull-request-github).
If you’d like to learn more about contributing to Open Source projects, here is a[Getting Started Guide](https://opensource.com/article/19/7/create-pull-request-github).
## Best Practices
## Best Practices:
* Keep your pull requests small. Smaller pull requests are more likely to be accepted and merged
* Comments! Commenting your code helps reviewers easily understand your contribution
* Use Python and Typescript’s typing systems, and consider using an editor with [LSP](https://microsoft.github.io/language-server-protocol/) support to streamline development
* Make all communications public. This ensure knowledge is shared with the whole community
- Keep your pull requests small. Smaller pull requests are more likely to be accepted and merged
- Comments! Commenting your code helps reviewers easily understand your contribution
- Use Python and Typescript’s typing systems, and consider using an editor with [LSP](https://microsoft.github.io/language-server-protocol/) support to streamline development
- Make all communications public. This ensure knowledge is shared with the whole community
## **Where can I go for help?**
If you need help, you can ask questions in the [#dev-chat](https://discord.com/channels/1020123559063990373/1049495067846524939) channel of the Discord.
For frontend related work, **@pyschedelicious** is the best person to reach out to.
For frontend related work, **@pyschedelicious** is the best person to reach out to.
For backend related work, please reach out to **@blessedcoolant**, **@lstein**, **@StAlKeR7779** or **@pyschedelicious**.
Tutorials help new & existing users expand their abilty to use InvokeAI to the full extent of our features and services.
Tutorials help new & existing users expand their ability to use InvokeAI to the full extent of our features and services.
Currently, we have a set of tutorials available on our [YouTube channel](https://www.youtube.com/@invokeai), but as InvokeAI continues to evolve with new updates, we want to ensure that we are giving our users the resources they need to succeed.
@@ -8,4 +8,4 @@ Tutorials can be in the form of videos or article walkthroughs on a subject of y
## Contributing
Please reach out to @imic or @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
Please reach out to @imic or @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
To make changes to Invoke's backend, frontend or documentation, you'll need to set up a dev environment.
If you only want to make changes to the docs site, you can skip the frontend dev environment setup as described in the below guide.
If you just want to use Invoke, you should use the [launcher][launcher link].
!!! warning
Invoke uses a SQLite database. When you run the application as a dev install, you accept responsibility for your database. This means making regular backups (especially before pulling) and/or fixing it yourself in the event that a PR introduces a schema change.
If you don't need to persist your db, you can use an ephemeral in-memory database by setting `use_memory_db: true` in your `invokeai.yaml` file. You'll also want to set `scan_models_on_startup: true` so that your models are registered on startup.
## Setup
1. Run through the [requirements][requirements link].
2. [Fork and clone][forking link] the [InvokeAI repo][repo link].
3. This repository uses Git LFS to manage large files. To ensure all assets are downloaded:
- Enable automatic LFS fetching for this repository:
```shell
git config lfs.fetchinclude "*"
```
- Fetch files from LFS (only needs to be done once; subsequent `git pull` will fetch changes automatically):
```
git lfs pull
```
4. Create an directory for user data (images, models, db, etc). This is typically at `~/invokeai`, but if you already have a non-dev install, you may want to create a separate directory for the dev install.
5. Follow the [manual install][manual install link] guide, with some modifications to the install command:
- Use `.` instead of `invokeai` to install from the current directory. You don't need to specify the version.
- Add `-e` after the `install` operation to make this an [editable install][editable install link]. That means your changes to the python code will be reflected when you restart the Invoke server.
- When installing the `invokeai` package, add the `dev`, `test` and `docs` package options to the package specifier. You may or may not need the `xformers` option - follow the manual install guide to figure that out. So, your package specifier will be either `".[dev,test,docs]"` or `".[dev,test,docs,xformers]"`. Note the quotes!
With the modifications made, the install command should look something like this:
6. At this point, you should have Invoke installed, a venv set up and activated, and the server running. But you will see a warning in the terminal that no UI was found. If you go to the URL for the server, you won't get a UI.
This is because the UI build is not distributed with the source code. You need to build it manually. End the running server instance.
If you only want to edit the docs, you can stop here and skip to the **Documentation** section below.
7. Install the frontend dev toolchain:
- [`nodejs`](https://nodejs.org/) (v20+)
- [`pnpm`](https://pnpm.io/8.x/installation) (must be v8 - not v9!)
8. Do a production build of the frontend:
```sh
cd <PATH_TO_INVOKEAI_REPO>/invokeai/frontend/web
pnpm i
pnpm build
```
9. Restart the server and navigate to the URL. You should get a UI. After making changes to the python code, restart the server to see those changes.
## Updating the UI
You'll need to run `pnpm build` every time you pull in new changes.
Another option is to skip the build and instead run the UI in dev mode:
```sh
pnpm dev
```
This starts a vite dev server for the UI at `127.0.0.1:5173`, which you will use instead of `127.0.0.1:9090`.
The dev mode is substantially slower than the production build but may be more convenient if you just need to test things out. It will hot-reload the UI as you make changes to the frontend code. Sometimes the hot-reload doesn't work, and you need to manually refresh the browser tab.
## Documentation
The documentation is built with `mkdocs`. It provides a hot-reload dev server for the docs. Start it with `mkdocs serve`.
@@ -4,52 +4,42 @@ Invoke's UI is made possible by many contributors and open-source libraries. Tha
## Dev environment
### Setup
Follow the [dev environment](../dev-environment.md) guide to get set up. Run the UI using `pnpm dev`.
1. Install [node] and [pnpm].
1. Run `pnpm i` to install all packages.
#### Run in dev mode
1. From `invokeai/frontend/web/`, run `pnpm dev`.
1. From repo root, run `python scripts/invokeai-web.py`.
1. Point your browser to the dev server address, e.g. <http://localhost:5173/>
### Package scripts
## Package scripts
- `dev`: run the frontend in dev mode, enabling hot reloading
- `build`: run all checks (madge, eslint, prettier, tsc) and then build the frontend
- `typegen`: generate types from the OpenAPI schema (see [Type generation])
- `build`: run all checks (dpdm, eslint, prettier, tsc, knip) and then build the frontend
- `lint:dpdm`: check circular dependencies
- `lint:eslint`: check code quality
- `lint:prettier`: check code formatting
- `lint:tsc`: check type issues
- `lint:knip`: check for unused exports or objects (failures here are just suggestions, not hard fails)
- `lint:knip`: check for unused exports or objects
- `lint`: run all checks concurrently
- `fix`: run `eslint` and `prettier`, fixing fixable issues
- `test:ui`: run `vitest` with the fancy web UI
### Type generation
## Type generation
We use [openapi-typescript] to generate types from the app's OpenAPI schema.
We use [openapi-typescript] to generate types from the app's OpenAPI schema. The generated types are committed to the repo in [schema.ts].
The generated types are committed to the repo in [schema.ts].
If you make backend changes, it's important to regenerate the frontend types:
```sh
# from the repo root, start the server
python scripts/invokeai-web.py
# from invokeai/frontend/web/, run the script
pnpm typegen
cd invokeai/frontend/web && python ../../../scripts/generate_openapi_schema.py | pnpm typegen
```
### Localization
On macOS and Linux, you can run `make frontend-typegen` as a shortcut for the above snippet.
## Localization
We use [i18next] for localization, but translation to languages other than English happens on our [Weblate] project.
Only the English source strings should be changed on this repo.
Only the English source strings (i.e. `en.json`) should be changed on this repo.
### VSCode
## VSCode
#### Example debugger config
### Example debugger config
```jsonc
{
@@ -66,7 +56,7 @@ Only the English source strings should be changed on this repo.
}
```
#### Remote dev
### Remote dev
We've noticed an intermittent timeout issue with the VSCode remote dev port forwarding.
@@ -82,7 +72,7 @@ Thanks for your interest in contributing to the Invoke Web UI!
Please follow these guidelines when contributing.
### Check in before investing your time
## Check in before investing your time
Please check in before you invest your time on anything besides a trivial fix, in case it conflicts with ongoing work or isn't aligned with the vision for the app.
@@ -90,7 +80,7 @@ If a feature request or issue doesn't already exist for the thing you want to wo
Ping `@psychedelicious` on [discord] in the `#frontend-dev` channel or in the feature request / issue you want to work on - we're happy to chat.
### Code conventions
## Code conventions
- This is a fairly complex app with a deep component tree. Please use memoization (`useCallback`, `useMemo`, `memo`) with enthusiasm.
- If you need to add some global, ephemeral state, please use [nanostores] if possible.
@@ -98,7 +88,7 @@ Ping `@psychedelicious` on [discord] in the `#frontend-dev` channel or in the fe
- Feel free to use `lodash` (via `lodash-es`) to make the intent of your code clear.
- Please add comments describing the "why", not the "how" (unless it is really arcane).
### Commit format
## Commit format
Please use the [conventional commits] spec for the web UI, with a scope of "ui":
@@ -107,27 +97,32 @@ Please use the [conventional commits] spec for the web UI, with a scope of "ui":
- `feat(ui): add some cool new feature`
- `fix(ui): fix some bug`
### Submitting a PR
## Tests
We don't do any UI testing at this time, but consider adding tests for sensitive logic.
We use `vitest`, and tests should be next to the file they are testing. If the logic is in `something.ts`, the tests should be in `something.test.ts`.
In some situations, we may want to test types. For example, if you use `zod` to create a schema that should match a generated type, it's best to add a test to confirm that the types match. Use `tsafe`'s assert for this.
## Submitting a PR
- Ensure your branch is tidy. Use an interactive rebase to clean up the commit history and reword the commit messages if they are not descriptive.
- Run `pnpm lint`. Some issues are auto-fixable with `pnpm fix`.
- Fill out the PR form when creating the PR.
- It doesn't need to be super detailed, but a screenshot or video is nice if you changed something visually.
- If a section isn't relevant, delete it. There are no UI tests at this time.
@@ -117,13 +117,13 @@ Stateless fields do not store their value in the node, so their field instances
"Custom" fields will always be treated as stateless fields.
##### Collection and Scalar Fields
##### Single and Collection Fields
Field types have a name and two flags which may identify it as a **collection** or **collection or scalar** field.
Field types have a name and cardinality property which may identify it as a **SINGLE**, **COLLECTION** or **SINGLE_OR_COLLECTION** field.
If a field is annotated in python as a list, its field type is parsed and flagged as a **collection** type (e.g. `list[int]`).
If it is annotated as a union of a type and list, the type will be flagged as a **collection or scalar** type (e.g. `Union[int, list[int]]`). Fields may not be unions of different types (e.g. `Union[int, list[str]]` and `Union[int, str]` are not allowed).
- If a field is annotated in python as a singular value or class, its field type is parsed as a **SINGLE** type (e.g. `int`, `ImageField`, `str`).
- If a field is annotated in python as a list, its field type is parsed as a **COLLECTION** type (e.g. `list[int]`).
- If it is annotated as a union of a type and list, the type will be parsed as a **SINGLE_OR_COLLECTION** type (e.g. `Union[int, list[int]]`). Fields may not be unions of different types (e.g. `Union[int, list[str]]` and `Union[int, str]` are not allowed).
## Implementation
@@ -173,8 +173,7 @@ Field types are represented as structured objects:
@@ -186,7 +185,7 @@ There are 4 general cases for field type parsing.
When a field is annotated as a primitive values (e.g. `int`, `str`, `float`), the field type parsing is fairly straightforward. The field is represented by a simple OpenAPI **schema object**, which has a `type` property.
We create a field type name from this `type` string (e.g. `string` -> `StringField`).
We create a field type name from this `type` string (e.g. `string` -> `StringField`). The cardinality is `"SINGLE"`.
##### Complex Types
@@ -200,13 +199,13 @@ We need to **dereference** the schema to pull these out. Dereferencing may requi
When a field is annotated as a list of a single type, the schema object has an `items` property. They may be a schema object or reference object and must be parsed to determine the item type.
We use the item type for field type name, adding `isCollection: true` to the field type.
We use the item type for field type name. The cardinality is `"COLLECTION"`.
##### Collection or Scalar Types
##### Single or Collection Types
When a field is annotated as a union of a type and list of that type, the schema object has an `anyOf` property, which holds a list of valid types for the union.
After verifying that the union has two members (a type and list of the same type), we use the type for field type name, adding `isCollectionOrScalar: true` to the field type.
After verifying that the union has two members (a type and list of the same type), we use the type for field type name, with cardinality `"SINGLE_OR_COLLECTION"`.
Invoke AI originated as a project built by the community, and that vision carries forward today as we aim to build the best pro-grade tools available. We work together to incorporate the latest in AI/ML research, making these tools available in over 20 languages to artists and creatives around the world as part of our fully permissive OSS project designed for individual users to self-host and use.
Invoke originated as a project built by the community, and that vision carries forward today as we aim to build the best pro-grade tools available. We work together to incorporate the latest in AI/ML research, making these tools available in over 20 languages to artists and creatives around the world as part of our fully permissive OSS project designed for individual users to self-host and use.
# Methods of Contributing to Invoke AI
Anyone who wishes to contribute to InvokeAI, whether features, bug fixes, code cleanup, testing, code reviews, documentation or translation is very much encouraged to do so.
We welcome contributions, whether features, bug fixes, code cleanup, testing, code reviews, documentation or translation. Please check in with us before diving in to code to ensure your work aligns with our vision.
## Development
If you’d like to help with development, please see our [development guide](contribution_guides/development.md).
If you’d like to help with development, please see our [development guide](contribution_guides/development.md).
**New Contributors:** If you’re unfamiliar with contributing to open source projects, take a look at our [new contributor guide](contribution_guides/newContributorChecklist.md).
## Nodes
If you’d like to add a Node, please see our [nodes contribution guide](../nodes/contributingNodes.md).
## Support and Triaging
Helping support other users in [Discord](https://discord.gg/ZmtBAhwWhy) and on Github are valuable forms of contribution that we greatly appreciate.
We receive many issues and requests for help from users. We're limited in bandwidth relative to our the user base, so providing answers to questions or helping identify causes of issues is very helpful. By doing this, you enable us to spend time on the highest priority work.
Helping support other users in [Discord](https://discord.gg/ZmtBAhwWhy) and on Github are valuable forms of contribution that we greatly appreciate.
We receive many issues and requests for help from users. We're limited in bandwidth relative to our the user base, so providing answers to questions or helping identify causes of issues is very helpful. By doing this, you enable us to spend time on the highest priority work.
## Documentation
If you’d like to help with documentation, please see our [documentation guide](contribution_guides/documentation.md).
## Translation
If you'd like to help with translation, please see our[translation guide](contribution_guides/translation.md).
## Tutorials
Please reach out to @imic or @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
## Tutorials
We hope you enjoy using our software as much as we enjoy creating it, and we hope that some of those of you who are reading this will elect to become part of our contributor community.
Please reach out to @hipsterusername on [Discord](https://discord.gg/ZmtBAhwWhy) to help create tutorials for InvokeAI.
## Contributors
# Contributors
This project is a combined effort of dedicated people from across the world.[Check out the list of all these amazing people](contributors.md). We thank them for their time, hard work and effort.
This project is a combined effort of dedicated people from across the world.[Check out the list of all these amazing people](https://invoke-ai.github.io/InvokeAI/other/CONTRIBUTORS/). We thank them for their time, hard work and effort.
## Code of Conduct
# Code of Conduct
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](https://github.com/invoke-ai/InvokeAI/blob/main/CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
The InvokeAI community is a welcoming place, and we want your help in maintaining that. Please review our [Code of Conduct](../CODE_OF_CONDUCT.md) to learn more - it's essential to maintaining a respectful and inclusive environment.
By making a contribution to this project, you certify that:
@@ -49,12 +50,3 @@ By making a contribution to this project, you certify that:
This disclaimer is not a license and does not grant any rights or permissions. You must obtain necessary permissions and licenses, including from third parties, before contributing to this project.
This disclaimer is provided "as is" without warranty of any kind, whether expressed or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, or non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the contribution or the use or other dealings in the contribution.
# Support
For support, please use this repository's [GitHub Issues](https://github.com/invoke-ai/InvokeAI/issues), or join the [Discord](https://discord.gg/ZmtBAhwWhy).
Original portions of the software are Copyright (c) 2023 by respective contributors.
---
Remember, your contributions help make this project great. We're excited to see what you'll bring to our community!
If the troubleshooting steps on this page don't get you up and running, please either [create an issue] or hop on [discord] for help.
## How to Install
Follow the [Quick Start guide](./installation/quick_start.md) to install Invoke.
## Downloading models and using existing models
The Model Manager tab in the UI provides a few ways to install models, including using your already-downloaded models. You'll see a popup directing you there on first startup. For more information, see the [model install docs].
## Missing models after updating from v3
If you find some models are missing after updating from v3, it's likely they weren't correctly registered before the update and didn't get picked up in the migration.
You can use the `Scan Folder` tab in the Model Manager UI to fix this. The models will either be in the old, now-unused `autoimport` folder, or your `models` folder.
- Find and copy your install's old `autoimport` folder path, install the main install folder.
- Go to the Model Manager and click `Scan Folder`.
- Paste the path and scan.
- IMPORTANT: Uncheck `Inplace install`.
- Click `Install All` to install all found models, or just install the models you want.
Next, find and copy your install's `models` folder path (this could be your custom models folder path, or the `models` folder inside the main install folder).
Follow the same steps to scan and import the missing models.
## Slow generation
- Check the [system requirements] to ensure that your system is capable of generating images.
- Follow the [Low-VRAM mode guide](./features/low-vram.md) to optimize performance.
- Check that your generations are happening on your GPU (if you have one). Invoke will log what is being used for generation upon startup. If your GPU isn't used, re-install to and ensure you select the appropriate GPU option.
- If you are on Windows with an Nvidia GPU, you may have exceeded your GPU's VRAM capacity and are triggering Nvidia's "sysmem fallback". There's a guide to opt out of this behaviour in the [Low-VRAM mode guide](./features/low-vram.md).
## Triton error on startup
This can be safely ignored. Invoke doesn't use Triton, but if you are on Linux and wish to dismiss the error, you can install Triton.
## Unable to Copy on Firefox
Firefox does not allow Invoke to directly access the clipboard by default. As a result, you may be unable to use certain copy functions. You can fix this by configuring Firefox to allow access to write to the clipboard:
- Go to `about:config` and click the Accept button
- Search for `dom.events.asyncClipboard.clipboardItem`
- Set it to `true` by clicking the toggle button
- Restart Firefox
## Replicate image found online
Most example images with prompts that you'll find on the internet have been generated using different software, so you can't expect to get identical results. In order to reproduce an image, you need to replicate the exact settings and processing steps, including (but not limited to) the model, the positive and negative prompts, the seed, the sampler, the exact image size, any upscaling steps, etc.
## Invalid configuration file
Everything seems to install ok, you get a `ValidationError` when starting up the app.
This is caused by an invalid setting in the `invokeai.yaml` configuration file. The error message should tell you what is wrong.
Check the [configuration docs] for more detail about the settings and how to specify them.
## Out of Memory Errors
The models are large, VRAM is expensive, and you may find yourself faced with Out of Memory errors when generating images. Follow our [Low-VRAM mode guide](./features/low-vram.md) to configure Invoke to prevent these.
## Memory Leak (Linux)
If you notice a memory leak, it could be caused to memory fragmentation as models are loaded and/or moved from CPU to GPU.
A workaround is to tune memory allocation with an environment variable:
```bash
# Force blocks >1MB to be allocated with `mmap` so that they are released to the system immediately when they are freed.
MALLOC_MMAP_THRESHOLD_=1048576
```
!!! warning "Speed vs Memory Tradeoff"
Your generations may be slower overall when setting this environment variable.
!!! info "Possibly dependent on `libc` implementation"
It's not known if this issue occurs with other `libc` implementations such as `musl`.
If you encounter this issue and your system uses a different implementation, please try this environment variable and let us know if it fixes the issue.
<h3>Detailed Discussion</h3>
Python (and PyTorch) relies on the memory allocator from the C Standard Library (`libc`). On linux, with the GNU C Standard Library implementation (`glibc`), our memory access patterns have been observed to cause severe memory fragmentation.
This fragmentation results in large amounts of memory that has been freed but can't be released back to the OS. Loading models from disk and moving them between CPU/CUDA seem to be the operations that contribute most to the fragmentation.
This memory fragmentation issue can result in OOM crashes during frequent model switching, even if `ram` (the max RAM cache size) is set to a reasonable value (e.g. a OOM crash with `ram=16` on a system with 32GB of RAM).
This problem may also exist on other OSes, and other `libc` implementations. But, at the time of writing, it has only been investigated on linux with `glibc`.
To better understand how the `glibc` memory allocator works, see these references:
Note the differences between memory allocated as chunks in an arena vs. memory allocated with `mmap`. Under `glibc`'s default configuration, most model tensors get allocated as chunks in an arena making them vulnerable to the problem of fragmentation.
ControlNet is a powerful set of features developed by the open-source
community (notably, Stanford researcher
[**@ilyasviel**](https://github.com/lllyasviel)) that allows you to
apply a secondary neural network model to your image generation
process in Invoke.
With ControlNet, you can get more control over the output of your
image generation, providing you with a way to direct the network
towards generating images that better fit your desired style or
outcome.
ControlNet works by analyzing an input image, pre-processing that
image to identify relevant information that can be interpreted by each
specific ControlNet model, and then inserting that control information
into the generation process. This can be used to adjust the style,
composition, or other aspects of the image to better achieve a
specific result.
#### Installation
InvokeAI provides access to a series of ControlNet models that provide
different effects or styles in your generated images.
To install ControlNet Models:
1. The easiest way to install them is
to use the InvokeAI model installer application. Use the
`invoke.sh`/`invoke.bat` launcher to select item [4] and then navigate
to the CONTROLNETS section. Select the models you wish to install and
press "APPLY CHANGES". You may also enter additional HuggingFace
repo_ids in the "Additional models" textbox.
2. Using the "Add Model" function of the model manager, enter the HuggingFace Repo ID of the ControlNet. The ID is in the format "author/repoName"
_Be aware that some ControlNet models require additional code
functionality in order to work properly, so just installing a
third-party ControlNet model may not have the desired effect._ Please
read and follow the documentation for installing a third party model
not currently included among InvokeAI's default list.
Currently InvokeAI **only** supports 🤗 Diffusers-format ControlNet models. These are
folders that contain the files `config.json` and/or
`diffusion_pytorch_model.safetensors` and
`diffusion_pytorch_model.fp16.safetensors`. The name of the folder is
the name of the model.
🤗 Diffusers-format ControlNet models are available at HuggingFace
(http://huggingface.co) and accessed via their repo IDs (identifiers
in the format "author/modelname").
#### ControlNet Models
The models currently supported include:
**Canny**:
When the Canny model is used in ControlNet, Invoke will attempt to generate images that match the edges detected.
Canny edge detection works by detecting the edges in an image by looking for abrupt changes in intensity. It is known for its ability to detect edges accurately while reducing noise and false edges, and the preprocessor can identify more information by decreasing the thresholds.
**M-LSD**:
M-LSD is another edge detection algorithm used in ControlNet. It stands for Multi-Scale Line Segment Detector.
It detects straight line segments in an image by analyzing the local structure of the image at multiple scales. It can be useful for architectural imagery, or anything where straight-line structural information is needed for the resulting output.
**Lineart**:
The Lineart model in ControlNet generates line drawings from an input image. The resulting pre-processed image is a simplified version of the original, with only the outlines of objects visible.The Lineart model in ControlNet is known for its ability to accurately capture the contours of the objects in an input sketch.
**Lineart Anime**:
A variant of the Lineart model that generates line drawings with a distinct style inspired by anime and manga art styles.
**Depth**:
A model that generates depth maps of images, allowing you to create more realistic 3D models or to simulate depth effects in post-processing.
**Normal Map (BAE):**
A model that generates normal maps from input images, allowing for more realistic lighting effects in 3D rendering.
**Image Segmentation**:
A model that divides input images into segments or regions, each of which corresponds to a different object or part of the image. (More details coming soon)
**QR Code Monster**:
A model that helps generate creative QR codes that still scan. Can also be used to create images with text, logos or shapes within them.
**Openpose**:
The OpenPose control model allows for the identification of the general pose of a character by pre-processing an existing image with a clear human structure. With advanced options, Openpose can also detect the face or hands in the image.
*Note:* The DWPose Processor has replaced the OpenPose processor in Invoke. Workflows and generations that relied on the OpenPose Processor will need to be updated to use the DWPose Processor instead.
**Mediapipe Face**:
The MediaPipe Face identification processor is able to clearly identify facial features in order to capture vivid expressions of human faces.
**Tile**:
The Tile model fills out details in the image to match the image, rather than the prompt. The Tile Model is a versatile tool that offers a range of functionalities. Its primary capabilities can be boiled down to two main behaviors:
- It can reinterpret specific details within an image and create fresh, new elements.
- It has the ability to disregard global instructions if there's a discrepancy between them and the local context or specific parts of the image. In such cases, it uses the local context to guide the process.
The Tile Model can be a powerful tool in your arsenal for enhancing image quality and details. If there are undesirable elements in your images, such as blurriness caused by resizing, this model can effectively eliminate these issues, resulting in cleaner, crisper images. Moreover, it can generate and add refined details to your images, improving their overall quality and appeal.
**Pix2Pix (experimental)**
With Pix2Pix, you can input an image into the controlnet, and then "instruct" the model to change it using your prompt. For example, you can say "Make it winter" to add more wintry elements to a scene.
Each of these models can be adjusted and combined with other ControlNet models to achieve different results, giving you even more control over your image generation process.
### Using ControlNet
To use ControlNet, you can simply select the desired model and adjust both the ControlNet and Pre-processor settings to achieve the desired result. You can also use multiple ControlNet models at the same time, allowing you to achieve even more complex effects or styles in your generated images.
Each ControlNet has two settings that are applied to the ControlNet.
Weight - Strength of the Controlnet model applied to the generation for the section, defined by start/end.
Start/End - 0 represents the start of the generation, 1 represents the end. The Start/end setting controls what steps during the generation process have the ControlNet applied.
Additionally, each ControlNet section can be expanded in order to manipulate settings for the image pre-processor that adjusts your uploaded image before using it in when you Invoke.
## T2I-Adapter
[T2I-Adapter](https://github.com/TencentARC/T2I-Adapter) is a tool similar to ControlNet that allows for control over the generation process by providing control information during the generation process. T2I-Adapter models tend to be smaller and more efficient than ControlNets.
##### Installation
To install T2I-Adapter Models:
1. The easiest way to install models is
to use the InvokeAI model installer application. Use the
`invoke.sh`/`invoke.bat` launcher to select item [5] and then navigate
to the T2I-Adapters section. Select the models you wish to install and
press "APPLY CHANGES". You may also enter additional HuggingFace
repo_ids in the "Additional models" textbox.
2. Using the "Add Model" function of the model manager, enter the HuggingFace Repo ID of the T2I-Adapter. The ID is in the format "author/repoName"
#### Usage
Each T2I Adapter has two settings that are applied.
Weight - Strength of the model applied to the generation for the section, defined by start/end.
Start/End - 0 represents the start of the generation, 1 represents the end. The Start/end setting controls what steps during the generation process have the ControlNet applied.
Additionally, each section can be expanded with the "Show Advanced" button in order to manipulate settings for the image pre-processor that adjusts your uploaded image before using it in during the generation process.
## IP-Adapter
[IP-Adapter](https://ip-adapter.github.io) is a tooling that allows for image prompt capabilities with text-to-image diffusion models. IP-Adapter works by analyzing the given image prompt to extract features, then passing those features to the UNet along with any other conditioning provided.
There are several ways to install IP-Adapter models with an existing InvokeAI installation:
1. Through the command line interface launched from the invoke.sh / invoke.bat scripts, option [4] to download models.
2. Through the Model Manager UI with models from the *Tools* section of [www.models.invoke.ai](https://www.models.invoke.ai). To do this, copy the repo ID from the desired model page, and paste it in the Add Model field of the model manager. **Note** Both the IP-Adapter and the Image Encoder must be installed for IP-Adapter to work. For example, the [SD 1.5 IP-Adapter](https://models.invoke.ai/InvokeAI/ip_adapter_plus_sd15) and [SD1.5 Image Encoder](https://models.invoke.ai/InvokeAI/ip_adapter_sd_image_encoder) must be installed to use IP-Adapter with SD1.5 based models.
3.**Advanced -- Not recommended ** Manually downloading the IP-Adapter and Image Encoder files - Image Encoder folders shouid be placed in the `models\any\clip_vision` folders. IP Adapter Model folders should be placed in the relevant `ip-adapter` folder of relevant base model folder of Invoke root directory. For example, for the SDXL IP-Adapter, files should be added to the `model/sdxl/ip_adapter/` folder.
#### Using IP-Adapter
IP-Adapter can be used by navigating to the *Control Adapters* options and enabling IP-Adapter.
IP-Adapter requires an image to be used as the Image Prompt. It can also be used in conjunction with text prompts, Image-to-Image, Inpainting, Outpainting, ControlNets and LoRAs.
Each IP-Adapter has two settings that are applied to the IP-Adapter:
* Weight - Strength of the IP-Adapter model applied to the generation for the section, defined by start/end
* Start/End - 0 represents the start of the generation, 1 represents the end. The Start/end setting controls what steps during the generation process have the IP-Adapter applied.
With the advances in research, many new capabilities are available to customize the knowledge and understanding of novel concepts not originally contained in the base model.
## LoRAs
Low-Rank Adaptation (LoRA) files are models that customize the output of Stable Diffusion
image generation. Larger than embeddings, but much smaller than full
models, they augment SD with improved understanding of subjects and
artistic styles.
Unlike TI files, LoRAs do not introduce novel vocabulary into the
model's known tokens. Instead, LoRAs augment the model's weights that
are applied to generate imagery. LoRAs may be supplied with a
"trigger" word that they have been explicitly trained on, or may
simply apply their effect without being triggered.
LoRAs are typically stored in .safetensors files, which are the most
secure way to store and transmit these types of weights.
To use these when generating, open the LoRA menu item in the options
panel, select the LoRAs you want to apply and ensure that they have
the appropriate weight recommended by the model provider. Typically,
most LoRAs perform best at a weight of .75-1.
## LCM-LoRAs
Latent Consistency Models (LCMs) allowed a reduced number of steps to be used to generate images with Stable Diffusion. These are created by distilling base models, creating models that only require a small number of steps to generate images. However, LCMs require that any fine-tune of a base model be distilled to be used as an LCM.
LCM-LoRAs are models that provide the benefit of LCMs but are able to be used as LoRAs and applied to any fine tune of a base model. LCM-LoRAs are created by training a small number of adapters, rather than distilling the entire fine-tuned base model. The resulting LoRA can be used the same way as a standard LoRA, but with a greatly reduced step count. This enables SDXL images to be generated up to 10x faster than without the use of LCM-LoRAs.
**Using LCM-LoRAs**
LCM-LoRAs are natively supported in InvokeAI throughout the application. To get started, install any diffusers format LCM-LoRAs using the model manager and select it in the LoRA field.
There are a number parameter differences when using LCM-LoRAs and standard generation:
- When using LCM-LoRAs, the LoRA strength should be lower than if using a standard LoRA, with 0.35 recommended as a starting point.
- The LCM scheduler should be used for generation
- CFG-Scale should be reduced to ~1
- Steps should be reduced in the range of 4-8
Standard LoRAs can also be used alongside LCM-LoRAs, but will also require a lower strength, with 0.45 being recommended as a starting point.
More information can be found here: https://huggingface.co/blog/lcm_lora#fast-inference-with-sdxl-lcm-loras
InvokeAI provides the ability to merge two or three diffusers-type models into a new merged model. The
resulting model will combine characteristics of the original, and can
be used to teach an old model new tricks.
## How to Merge Models
Model Merging can be be done by navigating to the Model Manager and clicking the "Merge Models" tab. From there, you can select the models and settings you want to use to merge th models.
## Settings
* Model Selection: there are three multiple choice fields that
display all the diffusers-style models that InvokeAI knows about.
If you do not see the model you are looking for, then it is probably
a legacy checkpoint model and needs to be converted using the
"Convert" option in the Web-based Model Manager tab.
You must select at least two models to merge. The third can be left
at "None" if you desire.
* Alpha: This is the ratio to use when combining models. It ranges
from 0 to 1. The higher the value, the more weight is given to the
2d and (optionally) 3d models. So if you have two models named "A"
and "B", an alpha value of 0.25 will give you a merged model that is
25% A and 75% B.
* Interpolation Method: This is the method used to combine
weights. The options are "weighted_sum" (the default), "sigmoid",
"inv_sigmoid" and "add_difference". Each produces slightly different
results. When three models are in use, only "add_difference" is
available.
* Save Location: The location you want the merged model to be saved in. Default is in the InvokeAI root folder
* Name for merged model: This is the name for the new model. Please
use InvokeAI conventions - only alphanumeric letters and the
characters ".+-".
* Ignore Mismatches / Force: Not all models are compatible with each other. The merge
script will check for compatibility and refuse to merge ones that
are incompatible. Set this checkbox to try merging anyway.
You may run the merge script by starting the invoke launcher
(`invoke.sh` or `invoke.bat`) and choosing the option (4) for _merge
models_. This will launch a text-based interactive user interface that
prompts you to select the models to merge, how to merge them, and the
merged model name.
Alternatively you may activate InvokeAI's virtual environment from the
command line, and call the script via `merge_models --gui` to open up
a version that has a nice graphical front end. To get the commandline-
only version, omit `--gui`.
The user interface for the text-based interactive script is
straightforward. It shows you a series of setting fields. Use control-N (^N)
to move to the next field, and control-P (^P) to move to the previous
one. You can also use TAB and shift-TAB to move forward and
backward. Once you are in a multiple choice field, use the up and down
cursor arrows to move to your desired selection, and press <SPACE> or
<ENTER> to select it. Change text fields by typing in them, and adjust
scrollbars using the left and right arrow keys.
Once you are happy with your settings, press the OK button. Note that
there may be two pages of settings, depending on the height of your
screen, and the OK button may be on the second page. Advance past the
last field of the first page to get to the second page, and reverse
this to get back.
If the merge runs successfully, it will create a new diffusers model
under the selected name and register it with InvokeAI.
[{ align="right" }](https://colab.research.google.com/github/lstein/stable-diffusion/blob/main/notebooks/Stable_Diffusion_AI_Notebook.ipynb)
Open and follow instructions to use an isolated environment running Dream.
Output Example:

---
## **Invisible Watermark**
In keeping with the principles for responsible AI generation, and to
help AI researchers avoid synthetic images contaminating their
training sets, InvokeAI adds an invisible watermark to each of the
final images it generates. The watermark consists of the text
This sections details the ability to improve faces and upscale images.
## Face Fixing
As of InvokeAI 3.0, the easiest way to improve faces created during image generation is through the Inpainting functionality of the Unified Canvas. Simply add the image containing the faces that you would like to improve to the canvas, mask the face to be improved and run the invocation. For best results, make sure to use an inpainting specific model; these are usually identified by the "-inpainting" term in the model name.
## Upscaling
Open the upscaling dialog by clicking on the "expand" icon located
The default upscaling option is Real-ESRGAN x2 Plus, which will scale your image by a factor of two. This means upscaling a 512x512 image will result in a new 1024x1024 image.
Other options are the x4 upscalers, which will scale your image by a factor of 4.
!!! note
Real-ESRGAN is memory intensive. In order to avoid crashes and memory overloads
during the Stable Diffusion process, these effects are applied after Stable Diffusion has completed
its work.
In single image generations, you will see the output right away but when you are using multiple
iterations, the images will first be generated and then upscaled after that
process is complete. While the image generation is taking place, you will still be able to preview
the base images.
## How to disable
If, for some reason, you do not wish to load the ESRGAN libraries,
you can disable them on the invoke.py command line with the `--no_esrgan` options.
|  |  |  |
Using `+` to increase apricot-ness:
| `a man picking apricots+ from a tree` | `a man picking apricots++ from a tree` | `a man picking apricots+++ from a tree` | `a man picking apricots++++ from a tree` | `a man picking apricots+++++ from a tree` |
|  |  |  |  |  |
You can also change the balance between different parts of a prompt. For
example, below is a `mountain man`:
<figure markdown>

-`("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,1)`
- The existing prompt blending using `:<weight>` will continue to be supported -
`("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,1)`
is equivalent to
`a tall thin man picking apricots:1 a tall thin man picking pears:1` in the
old syntax.
- Attention weights can be nested inside blends.
- Non-normalized blends are supported by passing `no_normalize` as an additional
argument to the blend weights, eg
`("a tall thin man picking apricots", "a tall thin man picking pears").blend(1,-1,no_normalize)`.
very fun to explore local maxima in the feature space, but also easy to
produce garbage output.
See the section below on "Prompt Blending" for more information about how this
works.
### Prompt Conjunction
Join multiple clauses together to create a conjoined prompt. Each clause will be passed to CLIP separately.
For example, the prompt:
```bash
"A mystical valley surround by towering granite cliffs, watercolor, warm"
```
Can be used with .and():
```bash
("A mystical valley", "surround by towering granite cliffs", "watercolor", "warm").and()
```
Each will give you different results - try them out and see what you prefer!
### Escaping parentheses and speech marks
If the model you are using has parentheses () or speech marks "" as part of its
syntax, you will need to "escape" these using a backslash, so that`(my_keyword)`
becomes `\(my_keyword\)`. Otherwise, the prompt parser will attempt to interpret
the parentheses as part of the prompt syntax and it will get confused.
---
## **Prompt Blending**
You may blend together prompts to explore the AI's
latent semantic space and generate interesting (and often surprising!)
variations. The syntax is:
```bash
("prompt #1", "prompt #2").blend(0.25, 0.75)
```
This will tell the sampler to blend 25% of the concept of prompt #1 with 75%
of the concept of prompt #2. It is recommended to keep the sum of the weights to around 1.0, but interesting things might happen if you go outside of this range.
Because you are exploring the "mind" of the AI, the AI's way of mixing two
concepts may not match yours, leading to surprising effects. To illustrate, here
are three images generated using various combinations of blend weights. As
usual, unless you fix the seed, the prompts will give you different results each
time you run them.
Let's examine how this affects image generation results:
```bash
"blue sphere, red cube, hybrid"
```
This example doesn't use blending at all and represents the default way of mixing
It's interesting to see how the AI expressed the concept of "cube" within the sphere. If you look closely, there is depth there, so the enclosing frame is actually a cube.
Now that's interesting. We get an image with a resemblance of a red cube, with a hint of blue shadows which represents a melding of concepts within the AI's "latent space" of semantic representations.
Whoa...! I see blue and red, and if I squint, spheres and cubes.
## Dynamic Prompts
Dynamic Prompts are a powerful feature designed to produce a variety of prompts based on user-defined options. Using a special syntax, you can construct a prompt with multiple possibilities, and the system will automatically generate a series of permutations based on your settings. This is extremely beneficial for ideation, exploring various scenarios, or testing different concepts swiftly and efficiently.
### Structure of a Dynamic Prompt
A Dynamic Prompt comprises of regular text, supplemented with alternatives enclosed within curly braces {} and separated by a vertical bar |. For example: {option1|option2|option3}. The system will then select one of the options to include in the final prompt. This flexible system allows for options to be placed throughout the text as needed.
Furthermore, Dynamic Prompts can designate multiple selections from a single group of options. This feature is triggered by prefixing the options with a numerical value followed by $$. For example, in {2$$option1|option2|option3}, the system will select two distinct options from the set.
### Creating Dynamic Prompts
To create a Dynamic Prompt, follow these steps:
Draft your sentence or phrase, identifying words or phrases with multiple possible options.
Encapsulate the different options within curly braces {}.
Within the braces, separate each option using a vertical bar |.
If you want to include multiple options from a single group, prefix with the desired number and $$.
For instance: A {house|apartment|lodge|cottage} in {summer|winter|autumn|spring} designed in {style1|style2|style3}.
### How Dynamic Prompts Work
Once a Dynamic Prompt is configured, the system generates an array of combinations using the options provided. Each group of options in curly braces is treated independently, with the system selecting one option from each group. For a prefixed set (e.g., 2$$), the system will select two distinct options.
For example, the following prompts could be generated from the above Dynamic Prompt:
A house in summer designed in style1, style2
A lodge in autumn designed in style3, style1
A cottage in winter designed in style2, style3
And many more!
When the `Combinatorial` setting is on, Invoke will disable the "Images" selection, and generate every combination up until the setting for Max Prompts is reached.
When the `Combinatorial` setting is off, Invoke will randomly generate combinations up until the setting for Images has been reached.
### Tips and Tricks for Using Dynamic Prompts
Below are some useful strategies for creating Dynamic Prompts:
Utilize Dynamic Prompts to generate a wide spectrum of prompts, perfect for brainstorming and exploring diverse ideas.
Ensure that the options within a group are contextually relevant to the part of the sentence where they are used. For instance, group building types together, and seasons together.
Apply the 2$$ prefix when you want to incorporate more than one option from a single group. This becomes quite handy when mixing and matching different elements.
Experiment with different quantities for the prefix. For example, 3$$ will select three distinct options.
Be aware of coherence in your prompts. Although the system can generate all possible combinations, not all may semantically make sense. Therefore, carefully choose the options for each group.
Always review and fine-tune the generated prompts as needed. While Dynamic Prompts can help you generate a multitude of combinations, the final polishing and refining remain in your hands.
## SDXL Prompting
Prompting with SDXL is slightly different than prompting with SD1.5 or SD2.1 models - SDXL expects a prompt _and_ a style.
### Prompting
<figure markdown>

</figure>
In the prompt box, enter a positive or negative prompt as you normally would.
For the style box you can enter a style that you want the image to be generated in. You can use styles from this example list, or any other style you wish: anime, photographic, digital art, comic book, fantasy art, analog film, neon punk, isometric, low poly, origami, line art, cinematic, 3d model, pixel art, etc.
### Concatenated Prompts
InvokeAI also has the option to concatenate the prompt and style inputs, by pressing the "link" button in the Positive Prompt box.
This concatenates the prompt & style inputs, and passes the joined prompt and style to the SDXL model.

|  |  |
"Outpainting" means asking the AI to expand the original image beyond its
original borders, making a bigger image that's still based on the original. For
example, extending the above image of your Grandmother in a biker's jacket to
include her wearing jeans (and while we're at it, a motorcycle!)
<figure markdown>

</figure>
When you are using the Unified Canvas, Invoke decides automatically whether to
do Inpainting, Outpainting, ImageToImage, or TextToImage by looking inside the
area enclosed by the Bounding Box. It chooses the appropriate type of generation
based on whether the Bounding Box contains empty (transparent) areas on the Base
layer, or whether it contains colored areas from previous generations (or from
painted brushstrokes) on the Base layer, and/or whether the Mask layer contains
any brushstrokes. See [Generation Methods](#generation-methods) below for more
information.
## Getting Started
To get started with the Unified Canvas, you will want to generate a new base
layer using Txt2Img or importing an initial image. We'll refer to either of
these methods as the "initial image" in the below guide.
From there, you can consider the following techniques to augment your image:
- **New Images**: Move the bounding box to an empty area of the Canvas, type in
your prompt, and Invoke, to generate a new image using the Text to Image
function.
- **Image Correction**: Use the color picker and brush tool to paint corrections
on the image, switch to the Mask layer, and brush a mask over your painted
area to use **Inpainting**. You can also use the **ImageToImage** generation
method to invoke new interpretations of the image.
- **Image Expansion**: Move the bounding box to include a portion of your
initial image, and a portion of transparent/empty pixels, then Invoke using a
prompt that describes what you'd like to see in that area. This will Outpaint
the image. You'll typically find more coherent results if you keep about
50-60% of the original image in the bounding box. Make sure that the Image To
Image Strength slider is set to a high value - you may need to set it higher
than you are used to.
- **New Content on Existing Images**: If you want to add new details or objects
into your image, use the brush tool to paint a sketch of what you'd like to
see on the image, switch to the Mask layer, and brush a mask over your painted
area to use **Inpainting**. If the masked area is small, consider using a
smaller bounding box to take advantage of Invoke's automatic Scaling features,
which can help to produce better details.
- **And more**: There are a number of creative ways to use the Canvas, and the
above are just starting points. We're excited to see what you come up with!
The Canvas can use all generation methods available (Txt2Img, Img2Img,
Inpainting, and Outpainting), and these will be automatically selected and used
based on the current selection area within the Bounding Box.
### Text to Image
If the Bounding Box is placed over an area of Canvas with an **empty Base
Layer**, invoking a new image will use **TextToImage**. This generates an
entirely new image based on your prompt.
### Image to Image
If the Bounding Box is placed over an area of Canvas with an **existing Base
Layer area with no transparent pixels or masks**, invoking a new image will use
**ImageToImage**. This uses the image within the bounding box and your prompt to
interpret a new image. The image will be closer to your original image at lower
Image to Image strengths.
### Inpainting
If the Bounding Box is placed over an area of Canvas with an **existing Base
Layer and any pixels selected using the Mask layer**, invoking a new image will
use **Inpainting**. Inpainting uses the existing colors/forms in the masked area
in order to generate a new image for the masked area only. The unmasked portion
of the image will remain the same. Image to Image strength applies to the
inpainted area.
If you desire something completely different from the original image in your new
generation (i.e., if you want Invoke to ignore existing colors/forms), consider
toggling the Inpaint Replace setting on, and use high values for both Inpaint
Replace and Image To Image Strength.
!!! note
By default, the **Scale Before Processing** option — which
inpaints more coherent details by generating at a larger resolution and then
scaling — is only activated when the Bounding Box is relatively small.
To get the best inpainting results you should therefore resize your Bounding
Box to the smallest area that contains your mask and enough surrounding detail
to help Stable Diffusion understand the context of what you want it to draw.
You should also update your prompt so that it describes _just_ the area within
the Bounding Box.
### Outpainting
If the Bounding Box is placed over an area of Canvas partially filled by an
existing Base Layer area and partially by transparent pixels or masks, invoking
a new image will use **Outpainting**, as well as **Inpainting** any masked
areas.
---
## Advanced Features
Features with non-obvious behavior are detailed below, in order to provide
clarity on the intent and common use cases we expect for utilizing them.
### Toolbar
#### Mask Options
- **Enable Mask** - This flag can be used to Enable or Disable the currently
painted mask. If you have painted a mask, but you don't want it affect the
next invocation, but you _also_ don't want to delete it, then you can set this
option to Disable. When you want the mask back, set this back to Enable.
- **Preserve Masked Area** - When enabled, Preserve Masked Area inverts the
effect of the Mask on the Inpainting process. Pixels in masked areas will be
kept unchanged, and unmasked areas will be regenerated.
#### Creative Tools
- **Brush - Base/Mask Modes** - The Brush tool switches automatically between
different modes of operation for the Base and Mask layers respectively.
- On the Base layer, the brush will directly paint on the Canvas using the
color selected on the Brush Options menu.
- On the Mask layer, the brush will create a new mask. If you're finding the
mask difficult to see over the existing content of the Unified Canvas, you
can change the color it is drawn with using the color selector on the Mask
Options dropdown.
- **Erase Bounding Box** - On the Base layer, erases all pixels within the
Bounding Box.
- **Fill Bounding Box** - On the Base layer, fills all pixels within the
Bounding Box with the currently selected color.
#### Canvas Tools
- **Move Tool** - Allows for manipulation of the Canvas view (by dragging on the
Canvas, outside the bounding box), the Bounding Box (by dragging the edges of
the box), or the Width/Height of the Bounding Box (by dragging one of the 9
directional handles).
- **Reset View** - Click to re-orients the view to the center of the Bounding
Box.
- **Merge Visible** - If your browser is having performance problems drawing the
image in the Unified Canvas, click this to consolidate all of the information
currently being rendered by your browser into a merged copy of the image. This
lowers the resource requirements and should improve performance.
### Compositing / Seam Correction
When doing Inpainting or Outpainting, Invoke needs to merge the pixels generated
by Stable Diffusion into your existing image. This is achieved through compositing - the area around the the boundary between your image and the new generation is
automatically blended to produce a seamless output. In a fully automatic
process, a mask is generated to cover the boundary, and then the area of the boundary is
Inpainted.
Although the default options should work well most of the time, sometimes it can
help to alter the parameters that control the Compositing. A larger blur and
a blur setting have been noted as producing
consistently strong results . Strength of 0.7 is best for reducing hard seams.
- **Mode** - What part of the image will have the the Compositing applied to it.
- **Mask edge** will apply Compositing to the edge of the masked area
- **Mask** will apply Compositing to the entire masked area
- **Unmasked** will apply Compositing to the entire image
- **Steps** - Number of generation steps that will occur during the Coherence Pass, similar to Denoising Steps. Higher step counts will generally have better results.
- **Strength** - How much noise is added for the Coherence Pass, similar to Denoising Strength. A strength of 0 will result in an unchanged image, while a strength of 1 will result in an image with a completely new area as defined by the Mode setting.
- **Blur** - Adjusts the pixel radius of the the mask. A larger blur radius will cause the mask to extend past the visibly masked area, while too small of a blur radius will result in a mask that is smaller than the visibly masked area.
- **Blur Method** - The method of blur applied to the masked area.
### Infill & Scaling
- **Scale Before Processing & W/H**: When generating images with a bounding box
smaller than the optimized W/H of the model (e.g., 512x512 for SD1.5), this
feature first generates at a larger size with the same aspect ratio, and then
scales that image down to fill the selected area. This is particularly useful
when inpainting very small details. Scaling is optional but is enabled by
default.
- **Inpaint Replace**: When Inpainting, the default method is to utilize the
existing RGB values of the Base layer to inform the generation process. If
Inpaint Replace is enabled, noise is generated and blended with the existing
pixels (completely replacing the original RGB values at an Inpaint Replace
value of 1). This can help generate more variation from the pixels on the Base
layers.
- When using Inpaint Replace you should use a higher Image To Image Strength
value, especially at higher Inpaint Replace values
- **Infill Method**: Invoke currently supports two methods for producing RGB
values for use in the Outpainting process: Patchmatch and Tile. We believe
that Patchmatch is the superior method, however we provide support for Tile in
case Patchmatch cannot be installed or is unavailable on your computer.
- **Tile Size**: The Tile method for Outpainting sources small portions of the
original image and randomly place these into the areas being Outpainted. This
value sets the size of those tiles.
## Hot Keys
The Unified Canvas is a tool that excels when you use hotkeys. You can view the
full list of keyboard shortcuts, updated with all new features, by clicking the
Keyboard Shortcuts icon at the top right of the InvokeAI WebUI.
Invoke uses a SQLite database to store image, workflow, model, and execution data.
We take great care to ensure your data is safe, by utilizing transactions and a database migration system.
Even so, when testing an prerelease version of the app, we strongly suggest either backing up your database or using an in-memory database. This ensures any prelease hiccups or databases schema changes will not cause problems for your data.
Even so, when testing a prerelease version of the app, we strongly suggest either backing up your database or using an in-memory database. This ensures any prelease hiccups or databases schema changes will not cause problems for your data.
## Database Backup
@@ -24,12 +22,10 @@ SQLite can run on an in-memory database. Your existing database is untouched whe
This is very useful for testing, as there is no chance of a database change modifying your "physical" database.
To run Invoke with a memory database, edit your `invokeai.yaml` file, and add `use_memory_db: true` to the `Paths:` stanza:
To run Invoke with a memory database, edit your `invokeai.yaml` file and add `use_memory_db: true`:
```yaml
InvokeAI:
Development:
use_memory_db: true
use_memory_db: true
```
Delete this line (or set it to `false`) to use your main database.
@@ -70,7 +70,7 @@ Each image also has a context menu (ctrl+click / right-click).
- ***Use Prompt **** this will load only the image's text prompts into the left-hand control panel
- ***Use Seed **** this will load only the image's Seed into the left-hand control panel
- ***Use All **** this will load all of the image's generation information into the left-hand control panel
- ***Send to Image to Image*** this will put the image into the left-hand panel in the Image to Image tab ana automatically open it
- ***Send to Image to Image*** this will put the image into the left-hand panel in the Image to Image tab and automatically open it
- ***Send to Unified Canvas*** This will (bold)replace whatever is already present(bold) in the Unified Canvas tab with the image and automatically open the tab
- ***Change Board*** this will oipen a small window that will let you move the image to a different board. This is the same as dragging the image to that board's thumbnail.
- ***Star Image*** this will add the image to the board's list of starred images that are always kept at the top of the gallery. This is the same as clicking on the star on the top right-hand side of the image that appears when you hover over the image with the mouse
As of v5.6.0, Invoke has a low-VRAM mode. It works on systems with dedicated GPUs (Nvidia GPUs on Windows/Linux and AMD GPUs on Linux).
This allows you to generate even if your GPU doesn't have enough VRAM to hold full models. Most users should be able to run even the beefiest models - like the ~24GB unquantised FLUX dev model.
## Enabling Low-VRAM mode
To enable Low-VRAM mode, add this line to your `invokeai.yaml` configuration file, then restart Invoke:
```yaml
enable_partial_loading:true
```
**Windows users should also [disable the Nvidia sysmem fallback](#disabling-nvidia-sysmem-fallback-windows-only)**.
It is possible to fine-tune the settings for best performance or if you still get out-of-memory errors (OOMs).
!!! tip "How to find `invokeai.yaml`"
The `invokeai.yaml` configuration file lives in your install directory. To access it, run the **Invoke Community Edition** launcher and click the install location. This will open your install directory in a file explorer window.
You'll see `invokeai.yaml` there and can edit it with any text editor. After making changes, restart Invoke.
If you don't see `invokeai.yaml`, launch Invoke once. It will create the file on its first startup.
## Details and fine-tuning
Low-VRAM mode involves 4 features, each of which can be configured or fine-tuned:
- Partial model loading (`enable_partial_loading`)
- PyTorch CUDA allocator config (`pytorch_cuda_alloc_conf`)
- Dynamic RAM and VRAM cache sizes (`max_cache_ram_gb`, `max_cache_vram_gb`)
- Working memory (`device_working_mem_gb`)
- Keeping a RAM weight copy (`keep_ram_copy_of_weights`)
Read on to learn about these features and understand how to fine-tune them for your system and use-cases.
### Partial model loading
Invoke's partial model loading works by streaming model "layers" between RAM and VRAM as they are needed.
When an operation needs layers that are not in VRAM, but there isn't enough room to load them, inactive layers are offloaded to RAM to make room.
#### Enabling partial model loading
As described above, you can enable partial model loading by adding this line to `invokeai.yaml`:
```yaml
enable_partial_loading:true
```
### PyTorch CUDA allocator config
The PyTorch CUDA allocator's behavior can be configured using the `pytorch_cuda_alloc_conf` config. Tuning the allocator configuration can help to reduce the peak reserved VRAM. The optimal configuration is dependent on many factors (e.g. device type, VRAM, CUDA driver version, etc.), but switching from PyTorch's native allocator to using CUDA's built-in allocator works well on many systems. To try this, add the following line to your `invokeai.yaml` file:
```yaml
pytorch_cuda_alloc_conf:"backend:cudaMallocAsync"
```
A more complete explanation of the available configuration options is [here](https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf).
### Dynamic RAM and VRAM cache sizes
Loading models from disk is slow and can be a major bottleneck for performance. Invoke uses two model caches - RAM and VRAM - to reduce loading from disk to a minimum.
By default, Invoke manages these caches' sizes dynamically for best performance.
#### Fine-tuning cache sizes
Prior to v5.6.0, the cache sizes were static, and for best performance, many users needed to manually fine-tune the `ram` and `vram` settings in `invokeai.yaml`.
As of v5.6.0, the caches are dynamically sized. The `ram` and `vram` settings are no longer used, and new settings are added to configure the cache.
**Most users will not need to fine-tune the cache sizes.**
But, if your GPU has enough VRAM to hold models fully, you might get a perf boost by manually setting the cache sizes in `invokeai.yaml`:
```yaml
# The default max cache RAM size is logged on InvokeAI startup. It is determined based on your system RAM / VRAM.
# You can override the default value by setting `max_cache_ram_gb`.
# Increasing `max_cache_ram_gb` will increase the amount of RAM used to cache inactive models, resulting in faster model
# reloads for the cached models.
# As an example, if your system has 32GB of RAM and no other heavy processes, setting the `max_cache_ram_gb` to 28GB
# might be a good value to achieve aggressive model caching.
max_cache_ram_gb:28
# The default max cache VRAM size is adjusted dynamically based on the amount of available VRAM (taking into
# consideration the VRAM used by other processes).
# You can override the default value by setting `max_cache_vram_gb`.
# CAUTION: Most users should not manually set this value. See warning below.
max_cache_vram_gb:16
```
!!! warning "Max safe value for `max_cache_vram_gb`"
Most users should not manually configure the `max_cache_vram_gb`. This configuration value takes precedence over the `device_working_mem_gb` and any operations that explicitly reserve additional working memory (e.g. VAE decode). As such, manually configuring it increases the likelihood of encountering out-of-memory errors.
For users who wish to configure `max_cache_vram_gb`, the max safe value can be determined by subtracting `device_working_mem_gb` from your GPU's VRAM. As described below, the default for `device_working_mem_gb` is 3GB.
For example, if you have a 12GB GPU, the max safe value for `max_cache_vram_gb` is `12GB - 3GB = 9GB`.
If you had increased `device_working_mem_gb` to 4GB, then the max safe value for `max_cache_vram_gb` is `12GB - 4GB = 8GB`.
Most users who override `max_cache_vram_gb` are doing so because they wish to use significantly less VRAM, and should be setting `max_cache_vram_gb` to a value significantly less than the 'max safe value'.
### Working memory
Invoke cannot use _all_ of your VRAM for model caching and loading. It requires some VRAM to use as working memory for various operations.
Invoke reserves 3GB VRAM as working memory by default, which is enough for most use-cases. However, it is possible to fine-tune this setting if you still get OOMs.
#### Fine-tuning working memory
You can increase the working memory size in `invokeai.yaml` to prevent OOMs:
```yaml
# The default is 3GB - bump it up to 4GB to prevent OOMs.
device_working_mem_gb:4
```
!!! tip "Operations may request more working memory"
For some operations, we can determine VRAM requirements in advance and allocate additional working memory to prevent OOMs.
VAE decoding is one such operation. This operation converts the generation process's output into an image. For large image outputs, this might use more than the default working memory size of 3GB.
During this decoding step, Invoke calculates how much VRAM will be required to decode and requests that much VRAM from the model manager. If the amount exceeds the working memory size, the model manager will offload cached model layers from VRAM until there's enough VRAM to decode.
Once decoding completes, the model manager "reclaims" the extra VRAM allocated as working memory for future model loading operations.
### Keeping a RAM weight copy
Invoke has the option of keeping a RAM copy of all model weights, even when they are loaded onto the GPU. This optimization is _on_ by default, and enables faster model switching and LoRA patching. Disabling this feature will reduce the average RAM load while running Invoke (peak RAM likely won't change), at the cost of slower model switching and LoRA patching. If you have limited RAM, you can disable this optimization:
```yaml
# Set to false to reduce the average RAM usage at the cost of slower model switching and LoRA patching.
On Windows, Nvidia GPUs are able to use system RAM when their VRAM fills up via **sysmem fallback**. While it sounds like a good idea on the surface, in practice it causes massive slowdowns during generation.
It is strongly suggested to disable this feature:
- Open the **NVIDIA Control Panel** app.
- Expand **3D Settings** on the left panel.
- Click **Manage 3D Settings** in the left panel.
- Find **CUDA - Sysmem Fallback Policy** in the right panel and set it to **Prefer No Sysmem Fallback**.
If the sysmem fallback feature sounds familiar, that's because Invoke's partial model loading strategy is conceptually very similar - use VRAM when there's room, else fall back to RAM.
Unfortunately, the Nvidia implementation is not optimized for applications like Invoke and does more harm than good.
## Troubleshooting
### Windows page file
Invoke has high virtual memory (a.k.a. 'committed memory') requirements. This can cause issues on Windows if the page file size limits are hit. (See this issue for the technical details on why this happens: https://github.com/invoke-ai/InvokeAI/issues/7563).
If you run out of page file space, InvokeAI may crash. Often, these crashes will happen with one of the following errors:
- InvokeAI exits with Windows error code `3221225477`
- InvokeAI crashes without an error, but `eventvwr.msc` reveals an error with code `0xc0000005` (the hex equivalent of `3221225477`)
If you are running out of page file space, try the following solutions:
- Make sure that you have sufficient disk space for the page file to grow. Watch your disk usage as Invoke runs. If it climbs near 100% leading up to the crash, then this is very likely the source of the issue. Clear out some disk space to resolve the issue.
- Make sure that your page file is set to "System managed size" (this is the default) rather than a custom size. Under the "System managed size" policy, the page file will grow dynamically as needed.
Many issues can be resolved by re-installing the application. You won't lose any data by re-installing. We suggest downloading the [latest release](https://github.com/invoke-ai/InvokeAI/releases/latest) and using it to re-install the application. Consult the [installer guide](../installation/010_INSTALL_AUTOMATED.md) for more information.
When you run the installer, you'll have an option to select the version to install. If you aren't ready to upgrade, you choose the current version to fix a broken install.
If the troubleshooting steps on this page don't get you up and running, please either [create an issue] or hop on [discord] for help.
## How to Install
You can download the latest installers [here](https://github.com/invoke-ai/InvokeAI/releases).
Note that any releases marked as _pre-release_ are in a beta state. You may experience some issues, but we appreciate your help testing those! For stable/reliable installations, please install the [latest release].
## Downloading models and using existing models
The Model Manager tab in the UI provides a few ways to install models, including using your already-downloaded models. You'll see a popup directing you there on first startup. For more information, see the [model install docs].
## Missing models after updating to v4
If you find some models are missing after updating to v4, it's likely they weren't correctly registered before the update and didn't get picked up in the migration.
You can use the `Scan Folder` tab in the Model Manager UI to fix this. The models will either be in the old, now-unused `autoimport` folder, or your `models` folder.
- Find and copy your install's old `autoimport` folder path, install the main install folder.
- Go to the Model Manager and click `Scan Folder`.
- Paste the path and scan.
- IMPORTANT: Uncheck `Inplace install`.
- Click `Install All` to install all found models, or just install the models you want.
Next, find and copy your install's `models` folder path (this could be your custom models folder path, or the `models` folder inside the main install folder).
Follow the same steps to scan and import the missing models.
## Slow generation
- Check the [system requirements] to ensure that your system is capable of generating images.
- Check the `ram` setting in `invokeai.yaml`. This setting tells Invoke how much of your system RAM can be used to cache models. Having this too high or too low can slow things down. That said, it's generally safest to not set this at all and instead let Invoke manage it.
- Check the `vram` setting in `invokeai.yaml`. This setting tells Invoke how much of your GPU VRAM can be used to cache models. Counter-intuitively, if this setting is too high, Invoke will need to do a lot of shuffling of models as it juggles the VRAM cache and the currently-loaded model. The default value of 0.25 is generally works well for GPUs without 16GB or more VRAM. Even on a 24GB card, the default works well.
- Check that your generations are happening on your GPU (if you have one). InvokeAI will log what is being used for generation upon startup. If your GPU isn't used, re-install to ensure the correct versions of torch get installed.
- If you are on Windows, you may have exceeded your GPU's VRAM capacity and are using slower [shared GPU memory](#shared-gpu-memory-windows). There's a guide to opt out of this behaviour in the linked FAQ entry.
## Shared GPU Memory (Windows)
!!! tip "Nvidia GPUs with driver 536.40"
This only applies to current Nvidia cards with driver 536.40 or later, released in June 2023.
When the GPU doesn't have enough VRAM for a task, Windows is able to allocate some of its CPU RAM to the GPU. This is much slower than VRAM, but it does allow the system to generate when it otherwise might no have enough VRAM.
When shared GPU memory is used, generation slows down dramatically - but at least it doesn't crash.
If you'd like to opt out of this behavior and instead get an error when you exceed your GPU's VRAM, follow [this guide from Nvidia](https://nvidia.custhelp.com/app/answers/detail/a_id/5490).
Here's how to get the python path required in the linked guide:
- Run `invoke.bat`.
- Select option 2 for developer console.
- At least one python path will be printed. Copy the path that includes your invoke installation directory (typically the first).
## Installer cannot find python (Windows)
Ensure that you checked **Add python.exe to PATH** when installing Python. This can be found at the bottom of the Python Installer window. If you already have Python installed, you can re-run the python installer, choose the Modify option and check the box.
## Triton error on startup
This can be safely ignored. InvokeAI doesn't use Triton, but if you are on Linux and wish to dismiss the error, you can install Triton.
## Updated to 3.4.0 and xformers can’t load C++/CUDA
An issue occurred with your PyTorch update. Follow these steps to fix :
1. Launch your invoke.bat / invoke.sh and select the option to open the developer console
- If you run into an error with `typing_extensions`, re-open the developer console and run: `pip install -U typing-extensions`
Note that v3.4.0 is an old, unsupported version. Please upgrade to the [latest release].
## Install failed and says `pip` is out of date
An out of date `pip` typically won't cause an installation to fail. The cause of the error can likely be found above the message that says `pip` is out of date.
If you saw that warning but the install went well, don't worry about it (but you can update `pip` afterwards if you'd like).
## Replicate image found online
Most example images with prompts that you'll find on the internet have been generated using different software, so you can't expect to get identical results. In order to reproduce an image, you need to replicate the exact settings and processing steps, including (but not limited to) the model, the positive and negative prompts, the seed, the sampler, the exact image size, any upscaling steps, etc.
## OSErrors on Windows while installing dependencies
During a zip file installation or an update, installation stops with an error like this:
- Paste your access token when prompted and press Enter. You won't see anything when you paste it.
- Type `n` if prompted about git credentials.
If you get an error, try the command again - maybe the token didn't paste correctly.
Once your token is set, start Invoke and try downloading the model again. The installer will automatically use the access token.
If the install still fails, you may not have access to the model.
## Stable Diffusion XL generation fails after trying to load UNet
InvokeAI is working in other respects, but when trying to generate
images with Stable Diffusion XL you get a "Server Error". The text log
in the launch window contains this log line above several more lines of
error messages:
`INFO --> Loading model:D:\LONG\PATH\TO\MODEL, type sdxl:main:unet`
This failure mode occurs when there is a network glitch during
downloading the very large SDXL model.
To address this, first go to the Model Manager and delete the
Stable-Diffusion-XL-base-1.X model. Then, click the HuggingFace tab,
paste the Repo ID stabilityai/stable-diffusion-xl-base-1.0 and install
the model.
## Package dependency conflicts during installation or update
If you have previously installed InvokeAI or another Stable Diffusion
package, the installer may occasionally pick up outdated libraries and
either the installer or `invoke` will fail with complaints about
library conflicts.
To resolve this, re-install the application as described above.
## Invalid configuration file
Everything seems to install ok, you get a `ValidationError` when starting up the app.
This is caused by an invalid setting in the `invokeai.yaml` configuration file. The error message should tell you what is wrong.
Check the [configuration docs] for more detail about the settings and how to specify them.
## Out of Memory Issues
The models are large, VRAM is expensive, and you may find yourself
faced with Out of Memory errors when generating images. Here are some
tips to reduce the problem:
!!! info "Optimizing for GPU VRAM"
=== "4GB VRAM GPU"
This should be adequate for 512x512 pixel images using Stable Diffusion 1.5
and derived models, provided that you do not use the NSFW checker. It won't be loaded unless you go into the UI settings and turn it on.
If you are on a CUDA-enabled GPU, we will automatically use xformers or torch-sdp to reduce VRAM requirements, though you can explicitly configure this. See the [configuration docs].
=== "6GB VRAM GPU"
This is a border case. Using the SD 1.5 series you should be able to
generate images up to 640x640 with the NSFW checker enabled, and up to
1024x1024 with it disabled.
If you run into persistent memory issues there are a series of
environment variables that you can set before launching InvokeAI that
alter how the PyTorch machine learning library manages memory. See
<https://pytorch.org/docs/stable/notes/cuda.html#memory-management> for
a list of these tweaks.
=== "12GB VRAM GPU"
This should be sufficient to generate larger images up to about 1280x1280.
## Memory Leak (Linux)
If you notice a memory leak, it could be caused to memory fragmentation as models are loaded and/or moved from CPU to GPU.
A workaround is to tune memory allocation with an environment variable:
```bash
# Force blocks >1MB to be allocated with `mmap` so that they are released to the system immediately when they are freed.
MALLOC_MMAP_THRESHOLD_=1048576
```
!!! warning "Speed vs Memory Tradeoff"
Your generations may be slower overall when setting this environment variable.
!!! info "Possibly dependent on `libc` implementation"
It's not known if this issue occurs with other `libc` implementations such as `musl`.
If you encounter this issue and your system uses a different implementation, please try this environment variable and let us know if it fixes the issue.
<h3>Detailed Discussion</h3>
Python (and PyTorch) relies on the memory allocator from the C Standard Library (`libc`). On linux, with the GNU C Standard Library implementation (`glibc`), our memory access patterns have been observed to cause severe memory fragmentation.
This fragmentation results in large amounts of memory that has been freed but can't be released back to the OS. Loading models from disk and moving them between CPU/CUDA seem to be the operations that contribute most to the fragmentation.
This memory fragmentation issue can result in OOM crashes during frequent model switching, even if `ram` (the max RAM cache size) is set to a reasonable value (e.g. a OOM crash with `ram=16` on a system with 32GB of RAM).
This problem may also exist on other OSes, and other `libc` implementations. But, at the time of writing, it has only been investigated on linux with `glibc`.
To better understand how the `glibc` memory allocator works, see these references:
Note the differences between memory allocated as chunks in an arena vs. memory allocated with `mmap`. Under `glibc`'s default configuration, most model tensors get allocated as chunks in an arena making them vulnerable to the problem of fragmentation.
@@ -20,7 +20,7 @@ When you generate an image using text-to-image, multiple steps occur in latent s
4. The VAE decodes the final latent image from latent space into image space.
Image-to-image is a similar process, with only step 1 being different:
1. The input image is encoded from image space into latent space by the VAE. Noise is then added to the input latent image. Denoising Strength dictates how may noise steps are added, and the amount of noise added at each step. A Denoising Strength of 0 means there are 0 steps and no noise added, resulting in an unchanged image, while a Denoising Strength of 1 results in the image being completely replaced with noise and a full set of denoising steps are performance. The process is then the same as steps 2-4 in the text-to-image process.
1. The input image is encoded from image space into latent space by the VAE. Noise is then added to the input latent image. Denoising Strength dictates how many noise steps are added, and the amount of noise added at each step. A Denoising Strength of 0 means there are 0 steps and no noise added, resulting in an unchanged image, while a Denoising Strength of 1 results in the image being completely replaced with noise and a full set of denoising steps are performance. The process is then the same as steps 2-4 in the text-to-image process.
Furthermore, a model provides the CLIP prompt tokenizer, the VAE, and a U-Net (where noise prediction occurs given a prompt and initial noise tensor).
[latest commit to main badge]: https://flat.badgen.net/github/last-commit/invoke-ai/InvokeAI/main?icon=github&color=yellow&label=last%20commit&cache=900
[latest commit to main link]: https://github.com/invoke-ai/InvokeAI/commits/main
<a href="https://github.com/invoke-ai/InvokeAI">InvokeAI</a> is an
implementation of Stable Diffusion, the open source text-to-image and
image-to-image generator. It provides a streamlined process with various new
features and options to aid the image generation process. It runs on Windows,
Mac and Linux machines, and runs on GPU cards with as little as 4 GB of RAM.
<a href="https://github.com/invoke-ai/InvokeAI">Invoke</a> is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. Invoke offers an industry leading web-based UI, and serves as the foundation for multiple commercial products.
- [WebUI Unified Canvas for Img2Img, inpainting and outpainting](features/UNIFIED_CANVAS.md)
If you still have a problem, [create an issue](https://github.com/invoke-ai/InvokeAI/issues) or ask for help on [Discord](https://discord.gg/ZmtBAhwWhy).
<!-- separator -->
## Training
### Image Management
- [Image2Image](features/IMG2IMG.md)
- [Adding custom styles and subjects](features/CONCEPTS.md)
- [Upscaling and Face Reconstruction](features/POSTPROCESS.md)
- [Other Features](features/OTHER.md)
Invoke Training has moved to its own repository, with a dedicated UI for accessing common scripts like Textual Inversion and LoRA training.
**The same packaged installer file can be used for both new installs and updates.**
Using the installer for updates will leave everything you've added since installation, and just update the core libraries used to run Invoke.
Simply use the same path you installed to originally.
Both release and pre-release versions can be installed using the installer. It also supports install through a wheel if needed.
Be sure to review the [installation requirements] and ensure your system has everything it needs to install Invoke.
## Getting the Latest Installer
Download the `InvokeAI-installer-vX.Y.Z.zip` file from the [latest release] page. It is at the bottom of the page, under **Assets**.
After unzipping the installer, you should have a `InvokeAI-Installer` folder with some files inside, including `install.bat` and `install.sh`.
## Running the Installer
!!! tip
Windows users should first double-click the `WinLongPathsEnabled.reg` file to prevent a failed installation due to long file paths.
Double-click the install script:
=== "Windows"
```sh
install.bat
```
=== "Linux/macOS"
```sh
install.sh
```
!!! info "Running the Installer from the commandline"
You can also run the install script from cmd/powershell (Windows) or terminal (Linux/macOS).
!!! warning "Untrusted Publisher (Windows)"
You may get a popup saying the file comes from an `Untrusted Publisher`. Click `More Info` and `Run Anyway` to get past this.
The installation process is simple, with a few prompts:
- Select the version to install. Unless you have a specific reason to install a specific version, select the default (the latest version).
- Select location for the install. Be sure you have enough space in this folder for the base application, as described in the [installation requirements].
- Select a GPU device.
!!! info "Slow Installation"
The installer needs to download several GB of data and install it all. It may appear to get stuck at 99.9% when installing `pytorch` or during a step labeled "Installing collected packages".
If it is stuck for over 10 minutes, something has probably gone wrong and you should close the window and restart.
## Running the Application
Find the install location you selected earlier. Double-click the launcher script to run the app:
=== "Windows"
```sh
invoke.bat
```
=== "Linux/macOS"
```sh
invoke.sh
```
Choose the first option to run the UI. After a series of startup messages, you'll see something like this:
```
Uvicorn running on http://127.0.0.1:9090 (Press CTRL+C to quit)
```
Copy the URL into your browser and you should see the UI.
## First-time Setup
You will need to [install some models] before you can generate.
Check the [configuration docs] for details on configuring the application.
## Updating
Updating is exactly the same as installing - download the latest installer, choose the latest version and off you go.
!!! info "Dependency Resolution Issues"
We've found that pip's dependency resolution can cause issues when upgrading packages. One very common problem was pip "downgrading" torch from CUDA to CPU, but things broke in other novel ways.
The installer doesn't have this kind of problem, so we use it for updating as well.
## Installation Issues
If you have installation issues, please review the [FAQ]. You can also [create an issue] or ask for help on [discord].
InvokeAI is distributed as a python package on PyPI, installable with `pip`. There are a few things that are handled by the installer and launcher that you'll need to manage manually, described in this guide.
### Requirements
Before you start, go through the [installation requirements].
### Installation Walkthrough
1. Create a directory to contain your InvokeAI library, configuration
files, and models. This is known as the "runtime" or "root"
directory, and often lives in your home directory under the name `invokeai`.
We will refer to this directory as `INVOKEAI_ROOT`. For convenience, create an environment variable pointing to the directory.
1. Enter the root (invokeai) directory and create a virtual Python environment within it named `.venv`.
!!! warning "Virtual Environment Location"
While you may create the virtual environment anywhere in the file system, we recommend that you create it within the root directory as shown here. This allows the application to automatically detect its data directories.
If you choose a different location for the venv, then you _must_ set the `INVOKEAI_ROOT` environment variable or specify the root directory using the `--root` CLI arg.
```terminal
cd $INVOKEAI_ROOT
python3 -m venv .venv --prompt InvokeAI
```
1. Activate the new environment:
=== "Linux/macOS"
```bash
source .venv/bin/activate
```
=== "Windows"
```ps
.venv\Scripts\activate
```
!!! info "Permissions Error (Windows)"
If you get a permissions error at this point, run this command and try again
The command-line prompt should change to to show `(InvokeAI)` at the beginning of the prompt.
The following steps should be run while inside the `INVOKEAI_ROOT` directory.
1. Make sure that pip is installed in your virtual environment and up to date:
```bash
python3 -m pip install --upgrade pip
```
1. Install the InvokeAI Package. The base command is `pip install InvokeAI --use-pep517`, but you may need to change this depending on your system and the desired features.
- You may need to provide an [extra index URL]. Select your platform configuration using [this tool on the PyTorch website]. Copy the `--extra-index-url` string from this and append it to your install command.
- If you have a CUDA GPU and want to install with `xformers`, you need to add an option to the package name. Note that `xformers` is not necessary. PyTorch includes an implementation of the SDP attention algorithm with the same performance.
!!! example "Install with `xformers`"
```bash
pip install "InvokeAI[xformers]" --use-pep517
```
1. Deactivate and reactivate your runtime directory so that the invokeai-specific commands become available in the environment:
=== "Linux/macOS"
```bash
deactivate && source .venv/bin/activate
```
=== "Windows"
```ps
deactivate
.venv\Scripts\activate
```
1. Run the application:
Run `invokeai-web` to start the UI. You must activate the virtual environment before running the app.
!!! warning
If the virtual environment is _not_ inside the root directory, then you _must_ specify the path to the root directory with `--root_dir \path\to\invokeai` or the `INVOKEAI_ROOT` environment variable.
We highly recommend to Install InvokeAI locally using [these instructions](INSTALLATION.md),
because Docker containers can not access the GPU on macOS.
!!! warning "AMD GPU Users"
Container support for AMD GPUs has been reported to work by the community, but has not received
extensive testing. Please make sure to set the `GPU_DRIVER=rocm` environment variable (see below), and
use the `build.sh` script to build the image for this to take effect at build time.
!!! tip "Linux and Windows Users"
For optimal performance, configure your Docker daemon to access your machine's GPU.
Docker Desktop on Windows [includes GPU support](https://www.docker.com/blog/wsl-2-gpu-support-for-docker-desktop-on-nvidia-gpus/).
Linux users should install and configure the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
## Why containers?
They provide a flexible, reliable way to build and deploy InvokeAI.
See [Processes](https://12factor.net/processes) under the Twelve-Factor App
methodology for details on why running applications in such a stateless fashion is important.
The container is configured for CUDA by default, but can be built to support AMD GPUs
by setting the `GPU_DRIVER=rocm` environment variable at Docker image build time.
Developers on Apple silicon (M1/M2/M3): You
[can't access your GPU cores from Docker containers](https://github.com/pytorch/pytorch/issues/81224)
and performance is reduced compared with running it directly on macOS but for
development purposes it's fine. Once you're done with development tasks on your
laptop you can build for the target platform and architecture and deploy to
another environment with NVIDIA GPUs on-premises or in the cloud.
## TL;DR
This assumes properly configured Docker on Linux or Windows/WSL2. Read on for detailed customization options.
```bash
# docker compose commands should be run from the `docker` directory
| `INVOKEAI_ROOT` | `~/invokeai` | **Required** - the location of your InvokeAI root directory. It will be created if it does not exist.
| `HUGGING_FACE_HUB_TOKEN` | | InvokeAI will work without it, but some of the integrations with HuggingFace (like downloading from models from private repositories) may not work|
| `GPU_DRIVER` | `cuda` | Optionally change this to `rocm` to build the image for AMD GPUs. NOTE: Use the `build.sh` script to build the image for this to take effect.
</figure>
#### Build the Image
Use the standard `docker compose build` command from within the `docker` directory.
If using an AMD GPU:
a: set the `GPU_DRIVER=rocm` environment variable in `docker-compose.yml` and continue using `docker compose build` as usual, or
b: set `GPU_DRIVER=rocm` in the `.env` file and use the `build.sh` script, provided for convenience
#### Run the Container
Use the standard `docker compose up` command, and generally the `docker compose` [CLI](https://docs.docker.com/compose/reference/) as usual.
Once the container starts up (and configures the InvokeAI root directory if this is a new installation), you can access InvokeAI at [http://localhost:9090](http://localhost:9090)
## Troubleshooting / FAQ
- Q: I am running on Windows under WSL2, and am seeing a "no such file or directory" error.
- A: Your `docker-entrypoint.sh` file likely has Windows (CRLF) as opposed to Unix (LF) line endings,
and you may have cloned this repository before the issue was fixed. To solve this, please change
the line endings in the `docker-entrypoint.sh` file to `LF`. You can do this in VSCode
(`Ctrl+P` and search for "line endings"), or by using the `dos2unix` utility in WSL.
Finally, you may delete `docker-entrypoint.sh` followed by `git pull; git checkout docker/docker-entrypoint.sh`
to reset the file to its most recent version.
For more information on this issue, please see the [Docker Desktop documentation](https://docs.docker.com/desktop/troubleshoot/topics/#avoid-unexpected-syntax-errors-use-unix-style-line-endings-for-files-in-containers)
Before installing, review the [installation requirements] to ensure your system is set up properly.
See the [FAQ] for frequently-encountered installation issues.
If you need more help, join our [discord] or [create an issue].
<h2>Automatic Install & Updates </h2>
✅ The automatic install is the best way to run InvokeAI. Check out the [installation guide] to get started.
⬆️ The same installer is also the best way to update InvokeAI - Simply rerun it for the same folder you installed to.
The installation process simply manages installation for the core libraries & application dependencies that run Invoke.
Any models, images, or other assets in the Invoke root folder won't be affected by the installation process.
<h2>Manual Install</h2>
If you are familiar with python and want more control over the packages that are installed, you can [install InvokeAI manually via PyPI].
Updates are managed by reinstalling the latest version through PyPi.
<h2>Developer Install</h2>
If you want to contribute to InvokeAI, consult the [developer install guide].
<h2>Docker Install</h2>
This method is recommended for those familiar with running Docker containers.
We offer a method for creating Docker containers containing InvokeAI and its dependencies. This method is recommended for individuals with experience with Docker containers and understand the pluses and minuses of a container-based install.
See the [docker installation guide].
<h2>Other Installation Guides</h2>
- [PyPatchMatch](060_INSTALL_PATCHMATCH.md)
- [Installing Models](050_INSTALLING_MODELS.md)
[install InvokeAI manually via PyPI]: 020_INSTALL_MANUAL.md
InvokeAI uses a SQLite database. By running on `main`, you accept responsibility for your database. This
means making regular backups (especially before pulling) and/or fixing it yourself in the event that a
PR introduces a schema change.
If you don't need persistent backend storage, you can use an ephemeral in-memory database by setting
`use_memory_db: true` in your `invokeai.yaml` file. You'll also want to set `scan_models_on_startup: true`
so that your models are registered on startup.
If this is untenable, you should run the application via the official installer or a manual install of the
python package from PyPI. These releases will not break your database.
If you have an interest in how InvokeAI works, or you would like to add features or bugfixes, you are encouraged to install the source code for InvokeAI.
!!! info "Why do I need the frontend toolchain?"
The repo doesn't contain a build of the frontend. You'll be responsible for rebuilding it (or running it in dev mode) to use the app, as described in the [frontend dev toolchain] docs.
<h2> Installation </h2>
1. [Fork and clone] the [InvokeAI repo].
1. Follow the [manual installation] docs to create a new virtual environment for the development install.
- Create a new folder outside the repo root for the installation and create the venv inside that folder.
- When installing the InvokeAI package, add `-e` to the command so you get an [editable install].
1. Install the [frontend dev toolchain] and do a production build of the UI as described.
1. You can now run the app as described in the [manual installation] docs.
As described in the [frontend dev toolchain] docs, you can run the UI using a dev server. If you do this, you won't need to continually rebuild the frontend. Instead, you run the dev server and use the app with the server URL it provides.
[Fork and clone]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo
Docker can not access the GPU on macOS, so your generation speeds will be slow. Use the [launcher](./quick_start.md) instead.
!!! tip "Linux and Windows Users"
Configure Docker to access your machine's GPU.
Docker Desktop on Windows [includes GPU support](https://www.docker.com/blog/wsl-2-gpu-support-for-docker-desktop-on-nvidia-gpus/).
Linux users should follow the [NVIDIA](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) or [AMD](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html) documentation.
## TL;DR
Ensure your Docker setup is able to use your GPU. Then:
```bash
docker run --runtime=nvidia --gpus=all --publish 9090:9090 ghcr.io/invoke-ai/invokeai
```
Once the container starts up, open <http://localhost:9090> in your browser, install some models, and start generating.
## Build-It-Yourself
All the docker materials are located inside the [docker](https://github.com/invoke-ai/InvokeAI/tree/main/docker) directory in the Git repo.
```bash
cd docker
cp .env.sample .env
docker compose up
```
We also ship the `run.sh` convenience script. See the `docker/README.md` file for detailed instructions on how to customize the docker setup to your needs.
| `INVOKEAI_ROOT` | `~/invokeai` | **Required** - the location of your InvokeAI root directory. It will be created if it does not exist. |
| `HUGGING_FACE_HUB_TOKEN` | | InvokeAI will work without it, but some of the integrations with HuggingFace (like downloading from models from private repositories) may not work |
| `GPU_DRIVER` | `cuda` | Optionally change this to `rocm` to build the image for AMD GPUs. NOTE: Use the `build.sh` script to build the image for this to take effect. |
</figure>
#### Build the Image
Use the standard `docker compose build` command from within the `docker` directory.
If using an AMD GPU:
a: set the `GPU_DRIVER=rocm` environment variable in `docker-compose.yml` and continue using `docker compose build` as usual, or
b: set `GPU_DRIVER=rocm` in the `.env` file and use the `build.sh` script, provided for convenience
#### Run the Container
Use the standard `docker compose up` command, and generally the `docker compose` [CLI](https://docs.docker.com/compose/reference/) as usual.
Once the container starts up (and configures the InvokeAI root directory if this is a new installation), you can access InvokeAI at [http://localhost:9090](http://localhost:9090)
## Troubleshooting / FAQ
- Q: I am running on Windows under WSL2, and am seeing a "no such file or directory" error.
- A: Your `docker-entrypoint.sh` might have has Windows (CRLF) line endings, depending how you cloned the repository.
To solve this, change the line endings in the `docker-entrypoint.sh` file to `LF`. You can do this in VSCode
(`Ctrl+P` and search for "line endings"), or by using the `dos2unix` utility in WSL.
Finally, you may delete `docker-entrypoint.sh` followed by `git pull; git checkout docker/docker-entrypoint.sh`
to reset the file to its most recent version.
For more information on this issue, see [Docker Desktop documentation](https://docs.docker.com/desktop/troubleshoot/topics/#avoid-unexpected-syntax-errors-use-unix-style-line-endings-for-files-in-containers)
If you want to use Invoke locally, you should probably use the [launcher](./quick_start.md).
If you want to contribute to Invoke or run the app on the latest dev branch, instead follow the [dev environment](../contributing/dev-environment.md) guide.
InvokeAI is distributed as a python package on PyPI, installable with `pip`. There are a few things that are handled by the launcher that you'll need to manage manually, described in this guide.
## Requirements
Before you start, go through the [installation requirements](./requirements.md).
## Walkthrough
We'll use [`uv`](https://github.com/astral-sh/uv) to install python and create a virtual environment, then install the `invokeai` package. `uv` is a modern, very fast alternative to `pip`.
The following commands vary depending on the version of Invoke being installed and the system onto which it is being installed.
1. Install `uv` as described in its [docs](https://docs.astral.sh/uv/getting-started/installation/#standalone-installer). We suggest using the standalone installer method.
Run `uv --version` to confirm that `uv` is installed and working. After installation, you may need to restart your terminal to get access to `uv`.
2. Create a directory for your installation, typically in your home directory (e.g. `~/invokeai` or `$Home/invokeai`):
=== "Linux/macOS"
```bash
mkdir ~/invokeai
cd ~/invokeai
```
=== "Windows (PowerShell)"
```bash
mkdir $Home/invokeai
cd $Home/invokeai
```
3. Create a virtual environment in that directory:
This command creates a portable virtual environment at `.venv` complete with a portable python 3.12. It doesn't matter if your system has no python installed, or has a different version - `uv` will handle everything.
4. Activate the virtual environment:
=== "Linux/macOS"
```bash
source .venv/bin/activate
```
=== "Windows (PowerShell)"
```ps
.venv\Scripts\activate
```
5. Choose a version to install. Review the [GitHub releases page](https://github.com/invoke-ai/InvokeAI/releases).
6. Determine the package specifier to use when installing. This is a performance optimization.
- If you have an Nvidia 20xx series GPU or older, use `invokeai[xformers]`.
- If you have an Nvidia 30xx series GPU or newer, or do not have an Nvidia GPU, use `invokeai`.
7. Determine the `PyPI` index URL to use for installation, if any. This is necessary to get the right version of torch installed.
=== "Invoke v5.12 and later"
- If you are on Windows or Linux with an Nvidia GPU, use `https://download.pytorch.org/whl/cu128`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
- If you are on Linux with an AMD GPU, use `https://download.pytorch.org/whl/rocm6.2.4`.
- **In all other cases, do not use an index.**
=== "Invoke v5.10.0 to v5.11.0"
- If you are on Windows or Linux with an Nvidia GPU, use `https://download.pytorch.org/whl/cu126`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
- If you are on Linux with an AMD GPU, use `https://download.pytorch.org/whl/rocm6.2.4`.
- **In all other cases, do not use an index.**
=== "Invoke v5.0.0 to v5.9.1"
- If you are on Windows with an Nvidia GPU, use `https://download.pytorch.org/whl/cu124`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
- If you are on Linux with an AMD GPU, use `https://download.pytorch.org/whl/rocm6.1`.
- **In all other cases, do not use an index.**
=== "Invoke v4"
- If you are on Windows with an Nvidia GPU, use `https://download.pytorch.org/whl/cu124`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
- If you are on Linux with an AMD GPU, use `https://download.pytorch.org/whl/rocm5.2`.
- **In all other cases, do not use an index.**
8. Install the `invokeai` package. Substitute the package specifier and version.
9. Deactivate and reactivate your venv so that the invokeai-specific commands become available in the environment:
=== "Linux/macOS"
```bash
deactivate && source .venv/bin/activate
```
=== "Windows (PowerShell)"
```ps
deactivate
.venv\Scripts\activate
```
10. Run the application, specifying the directory you created earlier as the root directory:
=== "Linux/macOS"
```bash
invokeai-web --root ~/invokeai
```
=== "Windows (PowerShell)"
```bash
invokeai-web --root $Home/invokeai
```
## Headless Install and Launch Scripts
If you run Invoke on a headless server, you might want to install and run Invoke on the command line.
We do not plan to maintain scripts to do this moving forward, instead focusing our dev resources on the GUI [launcher](../installation/quick_start.md).
You can create your own scripts for this by copying the handful of commands in this guide. `uv`'s [`pip` interface docs](https://docs.astral.sh/uv/reference/cli/#uv-pip-install) may be useful.
@@ -15,7 +15,7 @@ Today, there are thousands of models, fine tuned to excel at specific styles, ge
- `safetensors`: Single file, like `.ckpt` files. Prevents malware from lurking in a model.
- `diffusers`: Splits the model components into separate files, allowing very fast loading.
InvokeAI supports all three formats. Our backend will convert models to `diffusers` format before running them. This is a transparent process.
InvokeAI supports all three formats.
## Starter Models
@@ -27,7 +27,7 @@ Some models carry license terms that limit their use in commercial applications
## Other Models
You can install other models using the Model Manager. You'll find tabs for the following install methods:
There are a few ways to install other models:
- **URL or Local Path**: Provide the path to a model on your computer, or a direct link to the model. Some sites require you to use an API token to download models, which you can [set up in the config file].
- **HuggingFace**: Paste a HF Repo ID to install it. If there are multiple models in the repo, you'll get a list to choose from. Repo IDs look like this: `XpucT/Deliberate`. There is a copy button on each repo to copy the ID.
@@ -49,4 +49,4 @@ In this situation, you may need to provide some additional information to identi
Add `:v2` to the repo ID and use that when installing the model: `monster-labs/control_v1p_sd15_qrcode_monster:v2`
[set up in the config file]: ../../features/CONFIGURATION#model-marketplace-api-keys
[set up in the config file]: ../configuration.md#model-marketplace-api-keys
PatchMatch is an algorithm used to infill images. It can greatly improve outpainting results. PyPatchMatch is a python wrapper around a C++ implementation of the algorithm.
It uses the image data around the target area as a reference to generate new image data of a similar character and quality.
## Why Use PatchMatch
In the context of image generation, "outpainting" refers to filling in a transparent area using AI-generated image data. But the AI can't generate without some initial data. We need to first fill in the transparent area with _something_.
The first step in "outpainting" then, is to fill in the transparent area with something. Generally, you get better results when that initial infill resembles the rest of the image.
Because PatchMatch generates image data so similar to the rest of the image, it works very well as the first step in outpainting, typically producing better results than other infill methods supported by Invoke (e.g. LaMA, cv2 infill, random tiles).
### Performance Caveat
PatchMatch is CPU-bound, and the amount of time it takes increases proportionally as the infill area increases. While the numbers certainly vary depending on system specs, you can expect a noticeable slowdown once you start infilling areas around 512x512 pixels. 1024x1024 pixels can take several seconds to infill.
## Installation
Unfortunately, installation can be somewhat challenging, as it requires some things that `pip` cannot install for you.
## Windows
You're in luck! On Windows platforms PyPatchMatch will install automatically on
Windows systems with no extra intervention.
## Macintosh
You need to have opencv installed so that pypatchmatch can be built:
```bash
brew install opencv
```
The next time you start `invoke`, after successfully installing opencv, pypatchmatch will be built.
## Linux
Prior to installing PyPatchMatch, you need to take the following steps:
### Debian Based Distros
1. Install the `build-essential` tools:
```sh
sudo apt update
sudo apt install build-essential
```
2. Install `opencv`:
```sh
sudo apt install python3-opencv libopencv-dev
```
3. Activate the environment you use for invokeai, either with `conda` or with a
virtual environment.
4. Install pypatchmatch:
```sh
pip install pypatchmatch
```
5. Confirm that pypatchmatch is installed. At the command-line prompt enter
`python`, and then at the `>>>` line type
`from patchmatch import patch_match`: It should look like the following:
```py
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from patchmatch import patch_match
Compiling and loading c extensions from "/home/lstein/Projects/InvokeAI/.invokeai-env/src/pypatchmatch/patchmatch".
rm -rf build/obj libpatchmatch.so
mkdir: created directory 'build/obj'
mkdir: created directory 'build/obj/csrc/'
[dep] csrc/masked_image.cpp ...
[dep] csrc/nnf.cpp ...
[dep] csrc/inpaint.cpp ...
[dep] csrc/pyinterface.cpp ...
[CC] csrc/pyinterface.cpp ...
[CC] csrc/inpaint.cpp ...
[CC] csrc/nnf.cpp ...
[CC] csrc/masked_image.cpp ...
[link] libpatchmatch.so ...
```
### Arch Based Distros
1. Install the `base-devel` package:
```sh
sudo pacman -Syu
sudo pacman -S --needed base-devel
```
2. Install `opencv`, `blas`, and required dependencies:
```sh
sudo pacman -S opencv blas fmt glew vtk hdf5
```
or for CUDA support
```sh
sudo pacman -S opencv-cuda blas fmt glew vtk hdf5
```
3. Fix the naming of the `opencv` package configuration file:
```sh
cd /usr/lib/pkgconfig/
ln -sf opencv4.pc opencv.pc
```
[**Next, Follow Steps 4-6 from the Debian Section above**](#linux)
Welcome to Invoke! Follow these steps to install, update, and get started creating.
## Step 1: System Requirements
Invoke runs on Windows 10+, macOS 14+ and Linux (Ubuntu 20.04+ is well-tested).
Hardware requirements vary significantly depending on model and image output size. The requirements below are rough guidelines.
- All Apple Silicon (M1, M2, etc) Macs work, but 16GB+ memory is recommended.
- AMD GPUs are supported on Linux only. The VRAM requirements are the same as Nvidia GPUs.
!!! info "Hardware Requirements (Windows/Linux)"
=== "SD1.5 - 512×512"
- GPU: Nvidia 10xx series or later, 4GB+ VRAM.
- Memory: At least 8GB RAM.
- Disk: 10GB for base installation plus 30GB for models.
=== "SDXL - 1024×1024"
- GPU: Nvidia 20xx series or later, 8GB+ VRAM.
- Memory: At least 16GB RAM.
- Disk: 10GB for base installation plus 100GB for models.
=== "FLUX - 1024×1024"
- GPU: Nvidia 20xx series or later, 10GB+ VRAM.
- Memory: At least 32GB RAM.
- Disk: 10GB for base installation plus 200GB for models.
More detail on system requirements can be found [here](./requirements.md).
## Step 2: Download
Download the most launcher for your operating system:
- [Download for Windows](https://download.invoke.ai/Invoke%20Community%20Edition.exe)
- [Download for macOS](https://download.invoke.ai/Invoke%20Community%20Edition.dmg)
- [Download for Linux](https://download.invoke.ai/Invoke%20Community%20Edition.AppImage)
## Step 3: Install or Update
Run the launcher you just downloaded, click **Install** and follow the instructions to get set up.
If you have an existing Invoke installation, you can select it and let the launcher manage the install. You'll be able to update or launch the installation.
!!! warning "Problem running the launcher on macOS"
macOS may not allow you to run the launcher. We are working to resolve this by signing the launcher executable. Until that is done, you can manually flag the launcher as safe:
- Open the **Invoke Community Edition.dmg** file.
- Drag the launcher to **Applications**.
- Open a terminal.
- Run `xattr -d 'com.apple.quarantine' /Applications/Invoke\ Community\ Edition.app`.
You should now be able to run the launcher.
## Step 4: Launch
Once installed, click **Finish**, then **Launch** to start Invoke.
The very first run after an installation or update will take a few extra moments to get ready.
!!! tip "Server Mode"
The launcher runs Invoke as a desktop application. You can enable **Server Mode** in the launcher's settings to disable this and instead access the UI through your web browser.
## Step 5: Install Models
With Invoke started up, you'll need to install some models.
The quickest way to get started is to install a **Starter Model** bundle. If you already have a model collection, Invoke can use it.
!!! info "Install Models"
=== "Install a Starter Model bundle"
1. Go to the **Models** tab.
2. Click **Starter Models** on the right.
3. Click one of the bundles to install its models. Refer to the [system requirements](#step-1-confirm-system-requirements) if you're unsure which model architecture will work for your system.
=== "Use my model collection"
4. Go to the **Models** tab.
5. Click **Scan Folder** on the right.
6. Paste the path to your models collection and click **Scan Folder**.
7. With **In-place install** enabled, Invoke will leave the model files where they are. If you disable this, **Invoke will move the models into its own folders**.
You’re now ready to start creating!
## Step 6: Learn the Basics
We recommend watching our [Getting Started Playlist](https://www.youtube.com/playlist?list=PLvWK1Kc8iXGrQy8r9TYg6QdUuJ5MMx-ZO). It covers essential features and workflows, including:
- Generating your first image.
- Using control layers and reference guides.
- Refining images with advanced workflows.
## Troubleshooting
If installation fails, retrying the install in Repair Mode may fix it. There's a checkbox to enable this on the Review step of the install flow.
If that doesn't fix it, [clearing the `uv` cache](https://docs.astral.sh/uv/reference/cli/#uv-cache-clean) might do the trick:
- Open and start the dev console (button at the bottom-left of the launcher).
- Run `uv cache clean`.
- Retry the installation. Enable Repair Mode for good measure.
If you are still unable to install, try installing to a different location and see if that works.
If you still have problems, ask for help on the Invoke [discord](https://discord.gg/ZmtBAhwWhy).
## Other Installation Methods
- You can install the Invoke application as a python package. See our [manual install](./manual.md) docs.
- You can run Invoke with docker. See our [docker install](./docker.md) docs.
Invoke runs on Windows 10+, macOS 14+ and Linux (Ubuntu 20.04+ is well-tested).
!!! warning "Problematic Nvidia GPUs"
## Hardware
We do not recommend these GPUs. They cannot operate with half precision, but have insufficient VRAM to generate 512x512 images at full precision.
Hardware requirements vary significantly depending on model and image output size.
- NVIDIA 10xx series cards such as the 1080 TI
- GTX 1650 series cards
- GTX 1660 series cards
The requirements below are rough guidelines for best performance. GPUs with less VRAM typically still work, if a bit slower. Follow the [Low-VRAM mode guide](./features/low-vram.md) to optimize performance.
Invoke runs best with a dedicated GPU, but will fall back to running on CPU, albeit much slower. You'll need a beefier GPU for SDXL.
- All Apple Silicon (M1, M2, etc) Macs work, but 16GB+ memory is recommended.
- AMD GPUs are supported on Linux only. The VRAM requirements are the same as Nvidia GPUs.
!!! example "Stable Diffusion 1.5"
!!! info "Hardware Requirements (Windows/Linux)"
=== "Nvidia"
=== "SD1.5 - 512×512"
```
Any GPU with at least 4GB VRAM.
```
- GPU: Nvidia 10xx series or later, 4GB+ VRAM.
- Memory: At least 8GB RAM.
- Disk: 10GB for base installation plus 30GB for models.
=== "AMD"
=== "SDXL - 1024×1024"
```
Any GPU with at least 4GB VRAM. Linux only.
```
- GPU: Nvidia 20xx series or later, 8GB+ VRAM.
- Memory: At least 16GB RAM.
- Disk: 10GB for base installation plus 100GB for models.
=== "Mac"
=== "FLUX - 1024×1024"
```
Any Apple Silicon Mac with at least 8GB memory.
```
!!! example "Stable Diffusion XL"
=== "Nvidia"
```
Any GPU with at least 8GB VRAM.
```
=== "AMD"
```
Any GPU with at least 16GB VRAM. Linux only.
```
=== "Mac"
```
Any Apple Silicon Mac with at least 16GB memory.
```
## RAM
At least 12GB of RAM.
## Disk
SSDs will, of course, offer the best performance.
The base application disk usage depends on the torch backend.
!!! example "Disk"
=== "Nvidia (CUDA)"
```
~6.5GB
```
=== "AMD (ROCm)"
```
~12GB
```
=== "Mac (MPS)"
```
~3.5GB
```
You'll need to set aside some space for images, depending on how much you generate. A couple GB is enough to get started.
You'll need a good chunk of space for models. Even if you only install the most popular models and the usual support models (ControlNet, IP Adapter ,etc), you will quickly hit 50GB of models.
- GPU: Nvidia 20xx series or later, 10GB+ VRAM.
- Memory: At least 32GB RAM.
- Disk: 10GB for base installation plus 200GB for models.
!!! info "`tmpfs` on Linux"
@@ -92,26 +37,32 @@ You'll need a good chunk of space for models. Even if you only install the most
## Python
Invoke requires python 3.10 or 3.11. If you don't already have one of these versions installed, we suggest installing 3.11, as it will be supported for longer.
!!! tip "The launcher installs python for you"
Check that your system has an up-to-date Python installed by running `python --version` in the terminal (Linux, macOS) or cmd/powershell (Windows).
You don't need to do this if you are installing with the [Invoke Launcher](./quick_start.md).
<h3>Installing Python (Windows)</h3>
Invoke requires python 3.10 through 3.12. If you don't already have one of these versions installed, we suggest installing 3.12, as it will be supported for longer.
- Install python 3.11 with [an official installer].
- The installer includes an option to add python to your PATH. Be sure to enable this. If you missed it, re-run the installer, choose to modify an existing installation, and tick that checkbox.
- You may need to install [Microsoft Visual C++ Redistributable].
Check that your system has an up-to-date Python installed by running `python3 --version` in the terminal (Linux, macOS) or cmd/powershell (Windows).
<h3>Installing Python (macOS)</h3>
!!! info "Installing Python"
- Install python 3.11 with [an official installer].
- If model installs fail with a certificate error, you may need to run this command (changing the python version to match what you have installed): `/Applications/Python\ 3.10/Install\ Certificates.command`
- If you haven't already, you will need to install the XCode CLI Tools by running `xcode-select --install` in a terminal.
=== "Windows"
<h3>Installing Python (Linux)</h3>
- Install python with [an official installer].
- The installer includes an option to add python to your PATH. Be sure to enable this. If you missed it, re-run the installer, choose to modify an existing installation, and tick that checkbox.
- You may need to install [Microsoft Visual C++ Redistributable].
- Follow the [linux install instructions], being sure to install python 3.11.
- You'll need to install `libglib2.0-0` and `libgl1-mesa-glx` for OpenCV to work. For example, on a Debian system: `sudo apt update && sudo apt install -y libglib2.0-0 libgl1-mesa-glx`
=== "macOS"
- Install python with [an official installer].
- If model installs fail with a certificate error, you may need to run this command (changing the python version to match what you have installed): `/Applications/Python\ 3.10/Install\ Certificates.command`
- If you haven't already, you will need to install the XCode CLI Tools by running `xcode-select --install` in a terminal.
=== "Linux"
- Installing python varies depending on your system. We recommend [using `uv` to manage your python installation](https://docs.astral.sh/uv/concepts/python-versions/#installing-a-python-version).
- You'll need to install `libglib2.0-0` and `libgl1-mesa-glx` for OpenCV to work. For example, on a Debian system: `sudo apt update && sudo apt install -y libglib2.0-0 libgl1-mesa-glx`
## Drivers
@@ -175,7 +126,4 @@ An alternative to installing ROCm locally is to use a [ROCm docker container] to
@@ -10,9 +10,10 @@ The suggested method is to use `git clone` to clone the repository the node is f
If you'd prefer, you can also just download the whole node folder from the linked repository and add it to the `nodes` folder.
To use a community workflow, download the the `.json` node graph file and load it into Invoke AI via the **Load Workflow** button in the Workflow Editor.
To use a community workflow, download the `.json` node graph file and load it into Invoke AI via the **Load Workflow** button in the Workflow Editor.
- Community Nodes
+ [Anamorphic Tools](#anamorphic-tools)
+ [Adapters-Linked](#adapters-linked-nodes)
+ [Autostereogram](#autostereogram-nodes)
+ [Average Images](#average-images)
@@ -20,8 +21,12 @@ To use a community workflow, download the the `.json` node graph file and load i
**Description:** A single node that can enhance the detail in an image. Increase or decrease details in an image using a guided filter (as opposed to the typical Gaussian blur used by most sharpening filters.) Based on the `Enhance Detail` ComfyUI node from https://github.com/spacepxl/ComfyUI-Image-Filters
**Description:** This node will flip an openpose image horizontally, recoloring it to make sure that it isn't facing the wrong direction. Note that it does not work with openpose hands.
**Description:** This node returns an ideal size to use for the first stage of a Flux image generation pipeline. Generating at the right size helps limit duplication and odd subject placement.
**Description:** Uses Ollama API to expand text prompts for text-to-image generation using local LLMs. Works great for expanding basic prompts into detailed natural language prompts for Flux. Also provides a toggle to unload the LLM model immediately after expanding, to free up VRAM for Invoke to continue the image generation workflow.
**Description:** an extensive suite of auto prompt generation and prompt helper nodes based on extensive logic. Get creative with the best prompt generator in the world.
The main node generates interesting prompts based on a set of parameters. There are also some additional nodes such as Auto Negative Prompt, One Button Artify, Create Prompt Variant and other cool prompt toys to play around with.
@@ -456,7 +531,7 @@ See full docs here: https://github.com/skunkworxdark/Prompt-tools-nodes/edit/mai
### BriaAI Remove Background
**Description**: Implements one click background removal with BriaAI's new version 1.4 model which seems to be be producing better results than any other previous background removal tool.
**Description**: Implements one click background removal with BriaAI's new version 1.4 model which seems to be producing better results than any other previous background removal tool.
**Description:** A set of custom nodes for InvokeAI to create cross-view or parallel-view stereograms. Stereograms are 2D images that, when viewed properly, reveal a 3D scene. Check out [r/crossview](https://www.reddit.com/r/CrossView/) for tutorials.
We've curated some example workflows for you to get started with Workflows in InvokeAI! These can also be found in the Workflow Library, located in the Workflow Editor of Invoke.
To use them, right click on your desired workflow, follow the link to GitHub and click the "⬇" button to download the raw file. You can then use the "Load Workflow" functionality in InvokeAI to load the workflow and start generating images!
If you're interested in finding more workflows, checkout the [#share-your-workflows](https://discord.com/channels/1020123559063990373/1130291608097661000) channel in the InvokeAI Discord.
* [SD1.5 / SD2 Text to Image](https://github.com/invoke-ai/InvokeAI/blob/main/docs/workflows/Text_to_Image.json)
* [SDXL Text to Image](https://github.com/invoke-ai/InvokeAI/blob/main/docs/workflows/SDXL_Text_to_Image.json)
* [SDXL Text to Image with Refiner](https://github.com/invoke-ai/InvokeAI/blob/main/docs/workflows/SDXL_w_Refiner_Text_to_Image.json)
An Node is simply a single operation that takes in inputs and returns
out outputs. Multiple nodes can be linked together to create more
complex functionality. All InvokeAI features are added through nodes.
@@ -13,14 +14,10 @@ Individual nodes are made up of the following:
- Outputs: Edge points on the right side of the node window where you connect to inputs on other nodes.
- Options: Various options which are either manually configured, or overridden by connecting an output from another node to the input.
With nodes, you can can easily extend the image generation capabilities of InvokeAI, and allow you build workflows that suit your needs.
With nodes, you can can easily extend the image generation capabilities of InvokeAI, and allow you build workflows that suit your needs.
You can read more about nodes and the node editor [here](../nodes/NODES.md).
To get started with nodes, take a look at some of our examples for [common workflows](../nodes/exampleWorkflows.md)
You can read more about nodes and the node editor [here](../nodes/NODES.md).
## Downloading New Nodes
To download a new node, visit our list of [Community Nodes](../nodes/communityNodes.md). These are nodes that have been created by the community, for the community.
To download a new node, visit our list of [Community Nodes](../nodes/communityNodes.md). These are nodes that have been created by the community, for the community.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.