Do not use the whole layer as the trigger for histo recalc; use the layer's
canvas cache instead - it more reliably indicates when the layer's pixel
data has changed, and fixes an issue where we could miss the first histo
calc due to a race condition with the async layer bbox calculation.
Added button checks to bbox rect and transformer mousedown/touchstart handlers to only process left clicks. Also added stage dragging check in onBboxDragMove to clear bbox drag state when middle mouse panning is active.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
When middle mouse button is used for canvas panning, the pointerup event was still creating points in the segmentation module. Added button check to onBboxDragEnd handler to only process left clicks.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fixed an issue where bounding boxes could grow exponentially when created at small sizes. The problem occurred because Konva Transformer modifies scaleX/scaleY rather than width/height directly, and the scale values weren't consistently reset after being applied to dimensions.
Changes:
- Ensure scale values are always reset to 1 after applying to dimensions
- Add minimum size constraints to prevent zero/negative dimensions
- Fix scale handling in transformend, dragend, and initial bbox creation
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Revised the Select Object feature to support two input modes:
- Visual mode: Combined points and bounding box input for paired SAM inputs
- Prompt mode: Text-based object selection (unchanged)
Key changes:
- Replaced three input types (points, prompt, bbox) with two (visual, prompt)
- Visual mode supports both point and bbox inputs simultaneously
- Click to add include points, Shift+click for exclude points
- Click and drag to draw bounding box
- Fixed bbox visibility issues when adding points
- Fixed coordinate system issues for proper bbox positioning
- Added proper event handling and interaction controls
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
There was a really confusing aspect of the SAM pipeline classes where
they accepted deeply nested lists of different dimensions (bbox, points,
and labels).
The lengths of the lists are related; each point must have a
corresponding label, and if bboxes are provided with points, they must
be the same length.
I've refactored the backend API to take a single list of SAMInput
objects. This class has a bbox and/or a list of points, making it much
simpler to provide the right shape of inputs.
Internally, the pipeline classes rejigger these input classes to
have the correct nesting.
The Nodes still have an awkward API where you can provide both bboxes
and points of different lengths, so I added a pydantic validator that
enforces correct lengths.
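For illustration, the new input shape looks roughly like this (the exact
field layout and label semantics here are assumptions, not necessarily the
actual implementation):

```py
from pydantic import BaseModel, model_validator

class SAMPoint(BaseModel):
    x: int
    y: int
    label: int  # e.g. include vs exclude; bundling the label with the point removes the parallel-lists problem

class SAMInput(BaseModel):
    bbox: tuple[int, int, int, int] | None = None
    points: list[SAMPoint] | None = None

    @model_validator(mode="after")
    def check_has_input(self):
        # Each input must provide a bbox, points, or both.
        if self.bbox is None and not self.points:
            raise ValueError("SAMInput requires a bbox, points, or both")
        return self
```

The pipeline then accepts a plain list of these objects and handles the
nesting internally.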
Certain items in redux are ephemeral and omitted from persisted slices.
On rehydration, we need to inject these values back into the slice.
But there was an issue that could prevent slice migrations from running
during rehydration.
The migrations look for the `_version` key in state and migrate the
slice accordingly.
The logic that merged in the ephemeral values accidentally _also_ merged
in the `_version` key if it didn't already exist. This happened _before_
migrations are run.
This causes problems for slices that didn't have a `_version` key and
then have one added via migration.
For example, the params slice didn't have a `_version` key until the
previous commit, which added `_version` and changed some other parts of
state in a migration.
On first load of the updated code, we have a catch-22 kinda situation:
- The persisted params slice is the old version. It needs to have both
`_version` and some other data added to it.
- We deserialize the state and then merge in ephemeral values. This
inadvertently also merged in the `_version` key.
- We run the slice migration. It sees there is a `_version` key and
thinks it doesn't need to run. The extra data isn't added to the slice.
The slice is parsed against its zod schema and fails because the new
data is missing.
- Because the parse failed, we treat the user's persisted data as
invalid and overwrite it with initial state, potentially causing data
loss.
The fix is to be more selective when merging in the ephemeral state
before migration - this is now done by checking which keys are on the
persist denylist and only adding those keys.
This tells React that the component is a new instance each time we
change the image, which in turn prevents a flash of the
previously-selected image during image switching and
progress-image-to-output-image-ing.
This has been an issue for a long time. I suspect it wasn't noticed
until now because it's finicky to trigger - you have to click and
release very quickly, without moving the mouse at all.
Must set cross origin whenever we load an image from a URL to prevent a
race condition where the browser caches an image with no CORS, then the
canvas attempts to load it with CORS, resulting in the browser rejecting
the request before it is made.
If incompatible LoRAs are added, prevent Invoking.
The logic to prevent adding incompatible LoRAs to graphs already
existed. This does not fix any generation bugs; just a visual
inconsistency where it looks like Invoke would use an incompatible LoRA.
Gemini 2.5 Flash makes no guarantees about output image sizes. Our
existing logic always rendered staged images on Canvas at the bbox dims
- not the image's physical dimensions. When Gemini returns an image that
doesn't match the bbox, it would get squished.
To rectify this, the canvas staging area renderer is updated to render
its images using their physical dimensions, as opposed to their
configured dimensions (i.e. bbox).
A flag on CanvasObjectImage enables this rendering behaviour.
Then, when saving the image as a layer from staging area, we use the
physical dimensions.
When the bbox and physical dimensions do not match, the bbox is not
touched, so it won't exactly encompass the staged image. No point in
resizing the bbox if the dimensions don't match - the next image could
be a different size, and the sizes might not be valid (it's an external
resource, after all).
- Disable LoRAs instead of deleting them when base model changes
- Update toast message to indicate that we may have _updated_ a model
(prev just said cleared or disabled)
- Do not change ref image models if the new base model doesn't support
them. For example, changing from SDXL to Imagen does not update the ref
image model or alert the user, because Imagen does not support ref
images. Switching from Imagen to FLUX does update the ref image model
and alert the user. Just a bit less noisy.
## Summary
Bump version
## Related Issues / Discussions
n/a
## QA Instructions
n/a
## Merge Plan
This is already released.
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Fixes errors like `AttributeError: module 'cv2.ximgproc' has no
attribute 'thinning'` which occur because there is a conflict between
our own `opencv-contrib-python` dependency and the `invisible-watermark`
library's `opencv-python`.
Determine the "base" step for floats. If no `multipleOf` is provided,
the "base" step is `undefined`, meaning the float can have any number of
decimal places.
The UI library does its own step constraints though and is rounding to 3
decimal places. Probably need to update the logic in the UI library to
have truly arbitrary precision for float fields.
I ran into a race condition where I set a HF token and it was valid, but
somehow this error toast still appeared. The conditional fell through to
an assertion that we never expected to get to, which crashed the UI.
Handled the unexpected case gracefully now.
- Move the estimation logic to utility functions
- Estimate memory _within_ the encode and decode methods, ensuring we
_always_ estimate working memory when running a VAE
Three changes needed to make scrollIntoView and "Locate in Gallery" work
reliably.
1. Use setTimeout to work around race condition with scrollIntoView in
gallery.
It was possible to call scrollIntoView before react-virtuoso was ready.
I think react-virtuoso was initialized but hadn't rendered/measured its
items yet, so when we scroll to e.g. index 742, the items have a zero
height, so it doesn't actually scroll down. Then the items render.
Setting a timeout here defers the scroll until after the next event loop
cycle, by which time we expect react-virtuoso to be ready.
2. Ensure the scrollIntoView effect in gallery triggers any time the
selection is touched by making its dependency the array of selected
images, not just the last selected image name.
The "locate in gallery" functionality works by selecting an image.
There's a reactive effect in the gallery that runs when the last
selected image changes and scrolls it into view.
But if you already have an image selected, selecting it again will not
change the image name bc it is a string primitive. The useEffect ignores
the selection.
So, if you clicked "locate in gallery" on an image that was already
selected, it wouldn't be scrolled into view - even if you had already
scrolled away from it.
To work around this, the effect now uses the whole selection array as
its dependency. Whenever the selection changes, we get a new array,
which triggers the effect.
3. Gallery slice had some checks to avoid creating a new array of
selected image names in state when the selected images didn't change.
For example, if image "abc" was selected, and we selected "abc" again,
instead of creating a new array with the same "abc" image, we bailed
early. IIRC this optimization addressed a rerender issue long ago.
This optimization needs to be removed in order for fix #2 above to work.
We now _want_ a new array whenever selection is set - even if it didn't
actually change.
This feature added a lot of unexpected complexity in graph building /
metadata recall and is unintuitive user experience. 99% of the time, the
style prompt should be exactly the main prompt.
You can still use style prompts in workflows, but in an effort to reduce
complexity in the linear UI, we are removing this rarely-used feature.
When installing a model, the previous, graceful logic would increment a
suffix on the destination path until it found a free path for the model
(roughly as sketched below).
But because model file installation and record creation are not in a
transaction, we could end up moving the file successfully and fail to
create the record:
- User attempts to install an already-installed model
- Attempt to move the downloaded model from download tempdir to
destination path
- The path already exists
- Add `_1` or similar to the path until we find a path that is free
- Move the model
- Create the model record
- FK constraint violation bc we already have a model w/ that name, but
the model file has already been moved into the invokeai dir.
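The "increment a suffix until the path is free" behaviour is roughly this
(an illustrative sketch, not the actual installer code):

```py
from pathlib import Path

def get_free_path(dest: Path) -> Path:
    # e.g. "model.safetensors" -> "model_1.safetensors" -> "model_2.safetensors" ...
    candidate = dest
    counter = 1
    while candidate.exists():
        candidate = dest.with_name(f"{dest.stem}_{counter}{dest.suffix}")
        counter += 1
    return candidate
```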
Closes #8416
Prevents a large spike in VRAM when preparing to denoise w/ multiple ref
images.
There doesn't appear to be any difference in image quality / ref
adherence when concatenating in latent space vs image space, though
images _are_ different.
If the transformer fills up VRAM, then when we VAE encode kontext
latents, we'll need to first offload the transformer (partially, if
partial loading is enabled).
No need to do this - we can encode kontext latents before loading the
transformer to reduce model thrashing.
Tell the model manager that we need some extra working memory for VAE
encoding operations to prevent OOMs.
See previous commit for investigation and determination of the magic
numbers used.
This safety measure is especially relevant now that we have FLUX Kontext
and may be encoding rather large ref images. Without the working memory
estimation we can OOM as we prepare for denoising.
See #8405 for an example of this issue on a very low VRAM system. It's
possible we can have the same issue on any GPU, though - just a matter
of hitting the right combination of models loaded.
This commit includes a task delegated to Claude to investigate our VAE
working memory calculations, along with the investigation results.
See VAE_INVESTIGATION.md for motivation and detail. Everything else is
its output.
Result data includes empirical measurements for all supported model
architectures at a variety of resolutions and fp16/fp32 precision.
Testing conducted on a 4090.
The summarized conclusion is that our working memory estimations for
decoding are spot-on, but encoding also needs some extra working memory.
Empirical measurements suggest it needs ~45% of the amount needed for
decoding.
A followup commit will implement working memory estimations for VAE
encoding with the goal of preventing unexpected OOMs during encode.
Currently translated at 98.6% (2037 of 2065 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (2037 of 2065 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.5% (2036 of 2065 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (2014 of 2042 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
- Do not reset dimensions when resetting generation settings (they are
model-dependent, and we don't change model-dependent settings w/ that
button)
- Do not reset bbox when resetting canvas layers
- Show reset canvas layers button only on canvas tab
- Show reset generation settings button only on canvas or generate tab
Disable these items while staging:
- New Canvas From Image context menu
- Edit image hook & launchpad button
- Generate from Text launchpad button (only while on canvas tab)
- Use a Layout Image launchpad button
When unsafe_disable_picklescan is enabled, instead of erroring on
detections or scan failures, a warning is logged.
A warning is also logged on app startup when this setting is enabled.
The setting is disabled by default and there is no change in behaviour
when disabled.
Implements intelligent spatial tiling that arranges multiple reference
images in a virtual canvas, choosing between horizontal and vertical
placement to maintain a square-like aspect ratio
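A rough sketch of that placement heuristic (assumed logic based on the
description above, not the actual implementation):

```py
def choose_placement(canvas_w: int, canvas_h: int, img_w: int, img_h: int) -> str:
    """Pick the placement that keeps the virtual canvas closest to square."""
    horizontal = (canvas_w + img_w, max(canvas_h, img_h))  # place the new image beside the canvas
    vertical = (max(canvas_w, img_w), canvas_h + img_h)    # place the new image below the canvas

    def squareness(size: tuple[int, int]) -> float:
        w, h = size
        return max(w, h) / min(w, h)  # 1.0 is a perfect square

    return "horizontal" if squareness(horizontal) <= squareness(vertical) else "vertical"
```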
This fixes an issue where gallery's auto-scroll-into-view for selected
images didn't work, and users instead saw an "Unable to find image..."
debug log message in JS console.
1. Fix the run script to properly read the GPU_DRIVER
2. Clone and adjust the ROCm dockerbuild for docker
3. Adjust the docker-compose.yml to use the cloned dockerbuild
It's not clear why we were copying downloaded models to the destination
dir instead of moving them. I cannot find a reason for it, and I am able
to install single-file and diffusers models just fine with the change.
This fixes an issue where model installation requires 2x the model's
size (bc we were copying the model over).
Previously, we used pathlib's `with_suffix()` method to add a
suffix (e.g. ".safetensors") to a model when installing it.
The intention is to add a suffix to the model's name - but that method
actually replaces everything after the final period (the path's existing
suffix).
This can cause different models to be installed under the same name!
For example, the FLUX models all end up with the same name:
- "FLUX.1 schnell.safetensors" -> "FLUX.safetensors"
- "FLUX.1 dev.safetensors" -> "FLUX.safetensors"
The fix is easy - append the suffix using string formatting instead of
using pathlib.
This issue has existed for a long time, but was exacerbated in
075345bffd in which I updated the names of
our starter models, adding ".1" to the FLUX model names. Whoops!
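For illustration, here is the difference in behaviour (using the example
names above):

```py
from pathlib import Path

name = "FLUX.1 schnell"

# pathlib treats everything after the final period as the suffix, so
# "adding" an extension this way clobbers part of the model name:
Path(name).with_suffix(".safetensors").name   # 'FLUX.safetensors'

# Appending via string formatting keeps the full name intact:
Path(f"{name}.safetensors").name              # 'FLUX.1 schnell.safetensors'
```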
## Summary
Move client state persistence from browser to server.
- Add new client state persistence service to handle reading and writing
client state to db & associated router. The API mirrors that of
LocalStorage/IndexedDB where the set/get methods both operate on _keys_.
For example, when we persist the canvas state, we send only the new
canvas state to the backend - not the whole app state.
- The data is very flexibly-typed as a pydantic `JsonValue`. The client
is expected to handle all data parsing/validation (it must do this
anyways, and does this today).
- Change persistence from debounced to throttled at 2 seconds. Maybe
less is OK? Trying to not hammer the server.
- Add new persistence storage driver in client and use it in
redux-remember. It does its best to avoid extraneous persist requests,
caching the last data it persisted and noop-ing if there are no changes.
- Storage driver tracks pending persist actions using ref counts (bc
each slice is persisted independently). If the user navigates away
from the page during a persist request, it will give them the "you may
lose something if you navigate away" alert.
- This "lose something" alert message is not customizable (browser
security reasons).
- The alert is triggered only when the user closes the tab while a
persist network request is mid-flight. It's possible that the user makes
a change and closes the page before we start persisting. In this case,
they will lose the last 2 seconds of data.
- I tried triggering the alert when a persist was waiting to
start, and it felt off.
- Maybe the alert isn't even necessary. Again you'd lose 2s of data at
most - probably a non-issue. IMO, after trying it, a subtle indicator
somewhere on the page is probably less confusing/intrusive.
- Fix an issue where the `redux-remember` enhancer was added _last_ in
the enhancer chain, which prevented us from detecting when a persist has
succeeded. This required a small change to the `unserialize` utility
(used during rehydration) to ensure slices enhanced with `redux-undo`
are set up correctly as they are rehydrated.
- Restructure the redux store code to avoid circular dependencies. I
couldn't figure out how to do this without just smooshing it all into
the main `store.ts` file. Oh well.
Implications:
- Because client state is now on the server, different browsers will
have the same studio state. For example, if I start working on something
in Firefox, if I switch to Chrome, I have the same client state.
- Incognito windows won't do anything bc client state is server-side.
- It takes a bit longer for persistence to happen thanks to the
throttle, but there's now an indicator that tells you your stuff isn't
saved yet.
- Resetting the browser won't fix an issue with your studio state. You
must use `Reset Web UI` to fix it (or otherwise hit the appropriate
endpoint). It may be possible to end up in a Catch-22 where you can't
click the button and get stuck w/ a borked studio - I need to think
through this a bit more, might not be an issue.
- It probably takes a bit longer to start up, since we need to retrieve
client state over network instead of directly with browser APIs.
Other notes:
- We could explore adding an "incognito" mode, enabled via
`invokeai.yaml` setting or maybe in the UI. This would temporarily
disable persistence. Actually, I don't think this really makes sense, bc
all the images would be saved to disk.
- The studio state is stored in a single row in the DB. Currently, a
static row ID is used to force the studio state to be a singleton. It is
_possible_ to support multiple saved states. Might be a solve for app
workspaces.
## Related Issues / Discussions
n/a
## QA Instructions
Try it out. It's pretty straightforward. Error states are the main
things to test - for example, network blips. The new server-side
persistence driver is the only real functional change - everything else
is just kinda shuffling things around to support it.
## Merge Plan
n/a
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
It is accessible in two places:
- The queue actions hamburger menu.
- On the queue tab.
If the clear queue app feature is disabled, it is not shown in either of
those places.
Currently translated at 98.7% (1978 of 2003 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1978 of 2003 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1968 of 1994 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 99.8% (2007 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 92.0% (1851 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 92.0% (1851 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 92.0% (1851 of 2011 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 87.4% (1744 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 87.4% (1744 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 81.0% (1616 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 75.6% (1510 of 1995 strings)
Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Currently translated at 97.9% (1953 of 1994 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1986 of 2011 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1970 of 1995 strings)
translationBot(ui): update translation (Italian)
Currently translated at 97.8% (1910 of 1952 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (2012 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (2012 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.7% (2006 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.7% (2006 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.5% (2002 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.5% (2002 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 97.8% (1968 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 97.8% (1968 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 97.8% (1968 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 97.8% (1968 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 96.4% (1940 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 96.4% (1940 of 2012 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1921 of 1921 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1917 of 1917 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Fix nodes UI: make the node editor's dot background the same as the snap-to-grid size and position
Update to Flow.tsx
Changes the size and offset of the dots background to be the same size as the snap-to-grid, and also fixes the background dot pattern alignment.
Currently, the snapGrid is 25x25 and the default background dot gap is 20x20, so these do not align. This is fixed by making the gap property of the background the same as the snapGrid.
Additionally, there is a bug in the React Flow background code that incorrectly sets the offset to be the centre of the dot pattern with the default offset of 0. To work around this issue, setting the background offset property to the snapGrid size will realign the dot pattern correctly.
I have logged a bug for the React Flow background issue in its repo.
https://github.com/xyflow/xyflow/issues/5405
Update workflowSettingsSlice.ts
Change the default settings for auto layout nodeSpacing and layerSpacing to 30 instead of 32. This will make the x position of auto-laid-out nodes land on the snap-to-grid positions,
because the node width (320) + 30 = 350, which is divisible by the snap-to-grid size of 25.
We intermittently get an error like this:
```
TypeError: Cannot read properties of undefined (reading 'length')
```
This error is caused by a `redux-undo`-enhanced slice being rehydrated
without the extra stuff it adds to the slice to make it undoable (e.g.
an array of `past` states, the `present` state, array of `future`
states, and some other metadata).
`redux-undo` may need to check the length of the past/future arrays as
part of its internal functionality. These keys don't exist so we get the
error. I'm not sure _why_ they don't exist - my understanding of
`redux-undo` is that it should be checking and wrapping the state w/ the
history stuff automatically. Seems to be related to `redux-remember` -
may be a race condition.
The solution is to ensure we wrap rehydrated state for undoable slices
as we rehydrate them. I discovered the solution while troubleshooting
#8314 when the changes therein somehow triggered the issue to start
occurring every time instead of rarely.
* Add auto layout controls using elkjs to node editor
Introduces auto layout functionality for the node editor using elkjs, including a new UI popover for layout options (placement strategy, layering, spacing, direction). Adds related state and actions to workflowSettingsSlice, updates translations, and ensures elkjs is included in optimized dependencies.
* feat(nodes): Improve workflow auto-layout controls and accuracy
- The auto-layout settings panel is updated to use `Select` dropdowns and `NumberInput`
- The layout algorithm now uses the actual rendered dimensions of nodes from the DOM, falling back to estimates only when necessary. This results in a much more accurate and predictable layout.
- The ELKjs library integration is refactored to fix some warnings
* Update useAutoLayout.ts
prettier
* feat(nodes): Improve workflow auto-layout controls and accuracy
- The auto-layout settings panel is updated to use `Select` dropdowns and `NumberInput`
- The layout algorithm now uses the actual rendered dimensions of nodes from the DOM, falling back to estimates only when necessary. This results in a much more accurate and predictable layout.
- The ELKjs library integration is refactored to fix some warnings
* Update useAutoLayout.ts
prettier
* build(ui): import elkjs directly
* updated to use dagrejs for autolayout
updated to use dagrejs - it has fewer layout options but is already included
but this is still WIP as some nodes don't report the height correctly. I am still investigating this...
* Update useAutoLayout.ts
update to fix layout issues
* minor updates
- pretty useAutoLayout.ts
- add missing type import in ViewportControls.tsx
- update pnpm-lock.yaml with elkjs removed
* Update ViewportControls.tsx
pnpm fix
* Fix Frontend check + single node selection fix
Fix Frontend check - remove unused export from workflowSettingsSlice.ts
Update so that if you have a single node selected, it will auto layout all nodes - having a single node selected is common, and this means you don't have to unselect it first.
* feat(ui): misc improvements for autolayout
- Split popover into own component
- Add util functions to get node w/h
- Use magic wand icon for button
- Fix sizing of input components
- Use CompositeNumberInput instead of base chakra number input
- Add zod schemas for string values and use them in the component to
ensure state integrity
* chore(ui): lint
---------
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
- Name it `pickerCompactViewStates` bc it's not exclusive to the model
picker, it is used for all pickers
- Rename redux action to model an event
- Move selector to right file
- Use selector to derive state for individual picker
There was a subtle issue where the progress image wasn't ever cleared,
preventing the context menu from working on staging area preview images.
The staging area preview images were displaying the last progress image
_on top of_ the result image. Because the image elements were so small,
you wouldn't notice that you were looking at a low-res progress image.
Right clicking a progress image gets you no menu.
If you refresh the page or switch tabs, this would fix itself, because
those actions clear out the progress images. The result image would then
be the topmost element, and the context menu works.
Fixing this without introducing a flash of empty space as the progress
image was hidden required a bit of refactoring. We have to wait for the
result image element to load before clearing out the progress.
Result - progress images appear to "resolve" to result images in the
staging area without any blips or jank, and the context menu works after
that happens.
Was running into difficulties reasoning about the logic and couldn't
write tests because it was all in React.
Moved the logic outside React, updated the context, and made it testable.
Simplify the canvas auto-switch logic to not rely on the preview images
loading. This fixes an issue where offscreen preview images didn't get
auto-switched to. Images are now loaded directly.
Fix an issue in certain browsers/builds causing a runtime error.
A zod enum has a .options property, which is an array of all the options
for the enum. This is handy for when you need to derive something from a
zod schema.
In this case, we represented the possible focus regions in the zod enum,
then derived a mapping of region names to sets of target HTML elements.
Why isn't important; suffice it to say, we were using the .options
property for this.
But actually, we were using .options.values(), then calling .reduce() on
that. An array's .values() method returns an _array iterator_. Array
iterators do not have .reduce() methods!
Except, apparently in some environments they do - it depends on the JS
engine and whether or not polyfills for iterator helpers were included
in the build.
Turns out my dev environment - and most user browsers - do provide
.reduce(), so we didn't catch this error. It took a large deployment and
error monitoring to catch it.
I've refactored the code to totally avoid deriving data from zod in this
way.
- Add a context manager to the SqliteDatabase class which abstracts away
creating a transaction, committing it on success and rolling back on
error.
- Use it everywhere. The context manager should be exited before
returning results. No business logic changes should be present.
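A minimal sketch of what such a context manager can look like (the method
name and SQL here are illustrative, not the actual implementation):

```py
import sqlite3
from contextlib import contextmanager

class SqliteDatabase:
    def __init__(self, db_path: str) -> None:
        self._conn = sqlite3.connect(db_path)

    @contextmanager
    def transaction(self):
        """Commit on success, roll back on error."""
        cursor = self._conn.cursor()
        try:
            yield cursor
            self._conn.commit()
        except Exception:
            self._conn.rollback()
            raise

# Usage - the caller no longer manages commit/rollback by hand:
# with db.transaction() as cursor:
#     cursor.execute("UPDATE images SET starred = 1 WHERE image_name = ?", ("example.png",))
```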
- Apparently locales must use hyphens instead of underscores. This must
have been a fairly recent change that we didn't catch. It caused i18n to
throw for Brasilian Portuguese and both Simplified and Traditional
Mandarin. Change the locales to use the right strings.
- Move the theme + locale provider inside of the error boundary. This
allows errors with locales to be caught by the error boundary instead of
hard-crashing the app. The error screen is unstyled if this happens but
at least it has the reset button.
- Add a migration for the system slice to fix existing users' language
selections. For example, if the user had an incorrect language setting
of `zh_CN`, it will be changed to the correct `zh-CN`.
The range-based fetching logic had a subtle bug - it didn't keep track
of what the _current_ visible range is - only the ranges that the user
last scrolled to.
When an image was added to the gallery, the logic saw that the images
had changed, but thought it had already loaded everything it needed to,
so it didn't load the new image.
The updated logic tracks the current visible range separately from the
accumulated scroll ranges to address this issue.
When the user scrolls in the gallery, we are alerted of the new range of
visible images. Then we fetch those specific images.
Previously, each change of range triggered a throttled function to fetch
that range. The throttle timeout was 100ms.
Now, each change of range appends that range to a list of ranges and
triggers the throttled fetch. The timeout is increased to 500ms, but to
compensate, each fetch handles all ranges that had been accumulated
since the last fetch.
The result is far fewer network requests, but each of them gets more
images.
- Smaller staged image previews.
- Move autoswitch buttons to staging area toolbar, remove from settings
popover and the little three-dots menu. Use persisted autoswitch
setting, which is renamed from `defaultAutoSwitch` to
`stagingAreaAutoSwitch`.
- Fix issue with misaligned border radii in staging area preview images.
Required small changes to DndImage and its usage elsewhere.
- Fix issue where staging area toolbar could show up without any
previews in the list.
- Migrate canvas settings slice to use zod schema and inferred types for
its state.
* dont show the option to add a new layer from an image if on the generate tab
* only disable width/height recall if staging AND on the canvas tab
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-Air.lan>
Reverted incomplete change to how queue items are listed. In the future
I think we should redo it to work like the gallery. For now, it is back
the way it was in v5.
When percentage is zero, the progress bar looks the same as it does when
no generation is in progress. Render it as indeterminate (pulsing) when
percentage is zero to indicate that something is happening.
* initializing prompt expansion and putting response in prompt box working for all methods
* properly disable UI and show loading state on prompt box when there is a pending prompt expansion item
* misc wrapup: disable applying prompt templates, dont block textarea resize handle
* update progress to differentiate between prompt expansion and non
* cleanup
* lint
* more cleanup
* add image to background of loading state
* add allowPromptExpansion for front-end gating
* updated readiness text for needing to accept or discard
* fix tsc
* lint
* lint
* refactor(ui): prompt expansion logic
* tidy(ui): remove unnecessary changes
* revert(ui): unused arg on useImageUploadButton
* feat(ui): simplify prompt expansion state
* set pending for dragndrop and context menu
* add readiness logic for generate tab
* missing translation
* update error handling for prompt expansion
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-Air.lan>
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
Ensure disabled tabs are never mounted:
- Add didLoad flag to configSlice, default false
- Always merge in config - even if it is empty
- On first merge, set didLoad to true
- Until didLoad is true, mark _all_ tabs as disabled
This gets around an issue where tabs are all enabled for a brief moment
before the config is loaded.
A bit hacky but it works.
Co-authored-by: kent <kent@invoke.ai>
Revert unnecessary validation changes in multi-diffusion
Fix in python instead of graphbuilder
tidy(ui): remove extraneous comment
The previous logic had a subtle Python bug related to scoping and nested
generators.
Python generators are lazily evaluated - the expressions are stored and
only evaluated when needed (e.g. calling next() or list() on them)
The old logic used a variable `s`, which was continually overwritten as
the generator expressions were created. As a result, the final mappings
all use the _final_ value for `s`.
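A minimal illustration of the late-binding pitfall (a toy example, not the
actual graph code):

```py
gens = []
for s in ["node_a", "node_b"]:
    # The generator expression captures the *variable* s, not its current
    # value - nothing is evaluated at this point.
    gens.append((s, field) for field in ["width", "height"])

# By the time the generators are consumed, s holds its final value, so both
# print the same thing:
for g in gens:
    print(list(g))
# [('node_b', 'width'), ('node_b', 'height')]
# [('node_b', 'width'), ('node_b', 'height')]
```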
Following the consequences of this down the line, we find that collect
nodes can end up with multiple edges from exactly one of their ancestor
nodes, instead of one edge from each ancestor. Notably, it's only the
source _node_id_ that is affected - the source _fields_ have the correct
values.
So the invalid edges will point to a real node and a real field, but the
field exists on a different node.
---
This can result in a number of cryptic problems - including an error about
incompatible field types:
```
InvalidEdgeError: Field types are incompatible
(31758fd5-14a8-4de7-a840-b73ec1a1b94f.value ->
3459c793-41a2-4d82-9204-7df2d6d099ba.item)
```
Here are the conditions that lead to this error:
- The collect node has at least two incoming connections.
- The two incoming connections come from nodes of different types.
- The nodes both output a value of the same type, but the name of the
output field differs between them.
---
This commit uses non-generator logic to build up the mappings, avoiding
the issue entirely. As a bonus, it is much easier to read.
Previously, we used Python's own type introspection utilities to determine
input and output field types. We can use pydantic to get the field types
in a clearer, more direct way.
This improvement also exposed an awkward behaviour in this utility,
where it would return None when a field doesn't exist. I've added a
comment in the code describing the issue, but changing it would require
some significant changes and I don't want to risk breaking anything.
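For example, pydantic exposes each field's resolved annotation directly (a
small sketch; `MyOutput` is an illustrative model, not one from the
codebase):

```py
from pydantic import BaseModel

class MyOutput(BaseModel):
    width: int
    height: int

# Instead of digging through typing introspection helpers, read the
# resolved annotation straight off the model's field definitions:
MyOutput.model_fields["width"].annotation   # <class 'int'>

# Note: a field name that doesn't exist raises KeyError here, whereas the
# utility described above returns None in that case.
```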
* Add Rule of 4 composition guide to canvas settings and rendering
Co-authored-by: kent <kent@invoke.ai>
* Rename Rule of 4 Guide to Rule of Thirds in canvas composition guide
Co-authored-by: kent <kent@invoke.ai>
* Updates to comp guide and naming
* Fix reference
* Update translation keys and organize settings.
* revert to previous canvas manager for conflict
* Re-add composition guide.
* Fix lint
* prettier
* feat(ui): improve markup in canvas settings popover
* feat(ui): use brand colors for canvas rule of thirds guide
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
Enhance LoRA picker to default filter by current base model architecture
## Summary
Fixes new LoRA picker to auto select the architecture filter for the
current model group
## Related Issues / Discussions
N/A
## QA Instructions
Open LoRA menu with any model group selected. The right models should be
filtered.
## Merge Plan
Merge when ready.
## Checklist
- [X] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
When we delete images, boards, or do any other board mutation, we need
to invalidate numerous query caches and related internal frontend state.
This gets complicated very quickly.
We can drastically reduce the complexity by having the backend return
some more information when we make these mutations.
For example, when deleting a list of images by name, we can return a
list of deleted image names and affected boards. The frontend can use
this information to determine which queries to invalidate with far less
tedium.
This will also enable the more efficient storage of images (e.g. in the
gallery selection). Previously, we had to store the entire image DTO
object, else we wouldn't be able to figure out which queries to
invalidate. But now that the backend tells us exactly what images/boards
have changed, we can just store image names in frontend state. This
amounts to a substantial improvement in DX and reduction in frontend
complexity.
When the invocation cache is used, we might skip all progress images. This can prevent auto-switch-on-first-progress from working, as we don't get any of those events.
It's much easier to only support auto-switch on complete.
This appears to be a bug in Chakra UI v2 - use of a fallback component makes the ref passed to an image end up undefined. Had to remove the skeleton loader fallback component.
* add support for flux-kontext models in nodes
* flux kontext in canvas
* add aspect ratio support
* lint
* restore aspect ratio logic
* more linting
* typegen
* fix typegen
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-Air.lan>
## Summary
Support for
[OMI](https://github.com/Open-Model-Initiative/OMI-Model-Standards/tree/main)
LoRAs that use Flux and SDXL as the base model. Automated tests for
config classification. Manually tested (visual inspection) for LoRA
loading and execution.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
In #7724 we made a number of perf optimisations related to enqueuing. One of these optimisations included moving the enqueue logic - including expensive prep work and db writes - to a separate thread.
At the same time manual DB locking was abandoned in favor of WAL mode.
Finally, we set `check_same_thread=False` to allow multiple threads to access the connection at a given time.
I think this may be the cause of #7950:
- We start an enqueue in a thread (running in bg)
- We dequeue
- Dequeue pulls a partially-written queue item from DB and we get the errors in the linked issue
To be honest, I don't understand enough about SQLite to confidently say that this kind of race condition is actually possible. But:
- The error started popping up around the time we made this change.
- I have reviewed the logic from enqueue to dequeue very carefully _many_ times over the past month or so, and I am confident that the error is only possible if we are getting unexpectedly `NULL` values from the DB.
- The DB schema includes `NOT NULL` constraints for the column that is apparently returning `NULL`.
- Therefore, without some kind of race condition or schema issue, the error should not be possible.
- The `enqueue_batch` call is the only place I can find where we have the possibility of a race condition due to async logic. Everywhere else, all DB interaction for the queue is synchronous, as far as I can tell.
This change retains the perf benefits by running the heavy enqueue prep logic in a separate thread, but moves back to the main thread for the DB write. It also uses an explicit transaction for the write.
Will just have to wait and see if this fixes the issue.
This reduces peak memory usage at a negligible cost. Queue items typically take on the order of seconds, making the time cost of a GC essentially free.
Not a great idea on a hotter code path though.
We've long suspected there is a memory leak in Invoke, but that may not be true. What looks like a memory leak may in fact be the expected behaviour for our allocation patterns.
We observe ~20 to ~30 MB increase in memory usage per session executed. I did some prolonged tests, where I measured the process's RSS in bytes while doing 200 SDXL generations. I found that it eventually leveled off at around 100 generations, at which point memory usage had climbed by ~900MB from its starting point.
I used tracemalloc to diff the allocations of single session executions and found that we are allocating ~20MB or so per session in `ModelPatcher.apply_ti()`.
In `ModelPatcher.apply_ti()` we add tokens to the tokenizer when handling TIs. The added tokens should be scoped to only the current invocation, but there is no simple way to remove the tokens afterwards.
As a workaround for this, we clone the tokenizer, add the TI tokens to the clone, and use the clone when running compel. Afterwards, this cloned tokenizer is discarded.
The tokenizer uses ~20MB of memory, and it has referrers/referents to other compel stuff. This is what is causing the observed increases in memory per session!
We'd expect these objects to be GC'd but python doesn't do it immediately. After creating the cond tensors, we quickly move on to denoising, so the GC doesn't get a chance to run and free up its existing memory arenas/blocks for reuse. Instead, python needs to request more memory from the OS.
We can improve the situation by immediately calling `del` on the tokenizer clone and related objects. In fact, we already had some code in the compel nodes to `del` some of these objects, but not all.
Adding the `del`s vastly improves things. We hit peak RSS in half the sessions (~50 or less) and it's now ~100MB more than starting value. There is still a gradual increase in memory usage until we level off.
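For reference, the tracemalloc diffing approach looks roughly like this
(`run_one_session` is a hypothetical placeholder for executing a single
session):

```py
import tracemalloc

def run_one_session() -> None:
    # Placeholder for executing a single session; in the real investigation
    # this would run a full graph execution.
    ...

tracemalloc.start()

run_one_session()
before = tracemalloc.take_snapshot()

run_one_session()
after = tracemalloc.take_snapshot()

# Show where the extra memory between two consecutive sessions was allocated.
for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)
```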
* build: prevent `opencv-python` from being installed
Fixes this error: `AttributeError: module 'cv2.ximgproc' has no attribute 'thinning'`
`opencv-contrib-python` supersedes `opencv-python`, providing the same API + additional features. The two packages should not be installed at the same time to avoid conflicts and/or errors.
The `invisible-watermark` package requires `opencv-python`, but we require the contrib variant.
This change updates `pyproject.toml` to prevent `opencv-python` from ever being installed, using a `uv` feature called dependency overrides.
* feat(ui): data viewer supports disabling wrap
* feat(api): list _all_ pkgs in app deps endpoint
* chore(ui): typegen
* feat(ui): update about modal to display new full deps list
* chore: uv lock
When a layer is initialized, we do not yet know its bbox, so we cannot fit the stage view to the layer. We have to wait for the bbox calculation to finish. Previously, we had no way to wait until that bbox calculation was complete to take an action.
For example, this means we could not fit the layers to the stage immediately after creating a new layer, bc we don't know the dimensions of the layer yet.
This callback lets us do that. When creating a new canvas from an image, we now...
- Register a bbox update callback to fit the layers to stage
- Layer is created
- Canvas initializes the layer's entity adapter module (layer's width and height are set to zero at this point)
- Canvas calculates the bbox
- Bbox is updated (width and height are now correct)
- Callback is run, fitting the layer to the stage
Also change import order to ensure CLI args are handled correctly. Had to do this bc importing `InvocationRegistry` before parsing args resulted in the `--root` CLI arg being ignored.
Add `heuristic_resize_fast`, which does the same thing as `heuristic_resize`, except it's about 20x faster.
This is achieved by using opencv for the binary edge handling instead of python, and checking only 100k pixels to determine what kind of image we are working with.
Besides being much faster, it results in cleaner lines for resized binary canny edge maps, and fewer misidentified segmentation maps.
Tested against normal images, binary canny edge maps, grayscale HED edge maps, and segmentation maps.
Tested resizing up and down for each.
Besides the new utility function, I needed to swap the `opencv-python` dep for `opencv-contrib-python`, which includes `cv2.ximgproc.thinning`. This function accounts for a good chunk of the perf improvement.
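The core ideas - sampling a subset of pixels to classify the image, and using opencv's thinning for binary edge maps - look roughly like this (an illustrative sketch, not the actual implementation; it assumes a grayscale `uint8` image):

```py
import cv2
import numpy as np

def looks_binary(image: np.ndarray, sample_size: int = 100_000) -> bool:
    """Guess whether an image is a binary (e.g. canny) edge map from a sample of pixels."""
    flat = image.reshape(-1)
    idx = np.random.choice(flat.size, size=min(sample_size, flat.size), replace=False)
    sample = flat[idx]
    # A binary edge map contains (almost) only pure black and pure white pixels.
    return bool(np.all((sample <= 5) | (sample >= 250)))

def resize_binary_edge_map(image: np.ndarray, width: int, height: int) -> np.ndarray:
    resized = cv2.resize(image, (width, height), interpolation=cv2.INTER_LINEAR)
    _, binary = cv2.threshold(resized, 127, 255, cv2.THRESH_BINARY)
    # cv2.ximgproc.thinning ships with opencv-contrib-python and keeps the
    # resized edges a single pixel wide.
    return cv2.ximgproc.thinning(binary)
```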
Upstream bug in `transformers` breaks use of `AutoModelForMaskGeneration` class to load SAM models
Simple fix - directly load the model with `SamModel` class instead.
See upstream issue https://github.com/huggingface/transformers/issues/38228
## Summary
- Fallback to new classification API if legacy probe fails
- Method to read model metadata
- Created `StrippedModelOnDisk` class for testing
- Test to verify only a single config `matches` with a model
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
For example:
```py
my_field: Literal["foo", "bar"] | None = InputField(default=None)
```
Previously, this would cause a field parsing error and prevent the app from loading.
Two fixes:
- This type annotation and resultant schema are now parsed correctly
- Error handling added to template building logic to prevent the hang at startup when an error does occur
Major cleanup of RelatedModels.tsx for improved readability, structure, and maintainability.
Dried out repetitive logic
Consolidated model type sorting into reusable helpers
Added disallowed model type relationships to prevent broken connections (e.g. VAE ↔ LoRA)
- Aware this introduces a new constraint—open to feedback (see PR comment)
Some naming and types may still need refinement; happy to revisit
Adds full support for managing model-to-model relationships in the UI and backend.
Introduces RelatedModels subpanel for linking and unlinking models in model management.
- Adds REST API routes for adding, removing, and retrieving model relationships.
- New database migration: creates model_relationships table for bidirectional links.
- New service layer (model_relationships) for relationship management.
- Updated frontend: Related models float to top of LoRA/Main grouped model comboboxes for quick access.
- Added 'Show Only Related' toggle badge to MainModelPicker filter bar
**Amended commit to remove changes to ParamMainModelSelect.tsx and MainModelPicker.tsx to avoid conflict with upstream deletion/ rewrite**
## Summary
- Modify stats reset to be on a per session basis, rather than a "full
reset", to allow for parallel session execution
- Add "aider" to gitignore
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Currently translated at 67.1% (1279 of 1904 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 64.9% (1231 of 1895 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 60.2% (1141 of 1895 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 56.7% (1075 of 1895 strings)
Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
Currently translated at 100.0% (1896 of 1896 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1895 of 1895 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1886 of 1886 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Currently translated at 98.8% (1883 of 1904 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1882 of 1903 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1881 of 1902 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1878 of 1899 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1874 of 1895 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1873 of 1895 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.8% (1864 of 1886 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
When we do our field type overrides to allow invocations to be instantiated without all required fields, we were not modifying the annotation of the field but did set the default value of the field to `None`.
This results in an error when doing a ser/de round trip. Here's what we end up doing:
```py
from pydantic import BaseModel, Field
class MyModel(BaseModel):
foo: str = Field(default=None)
```
And here is a simple round-trip, which should not error but which does:
```py
MyModel(**MyModel().model_dump())
# ValidationError: 1 validation error for MyModel
# foo
# Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
# For further information visit https://errors.pydantic.dev/2.11/v/string_type
```
To fix this, we now check every incoming field and update its annotation to match its default value. In other words, when we override the default field value to `None`, we make its type annotation `<original type> | None`.
This prevents the error during deserialization.
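Roughly the shape of the fix, as an illustrative sketch using `create_model` (the real override logic is more involved):

```py
from typing import Optional
from pydantic import BaseModel, Field, create_model

class MyModel(BaseModel):
    foo: str

# When the default is forced to None, widen the annotation to match.
overridden_fields = {
    name: (Optional[field.annotation], Field(default=None))
    for name, field in MyModel.model_fields.items()
}
RelaxedModel = create_model("RelaxedModel", **overridden_fields)

# The round trip that previously raised a ValidationError now succeeds:
RelaxedModel(**RelaxedModel().model_dump())
```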
This slightly alters the schema for all invocations and outputs - the values of all fields without default values are now typed as `<original type> | None`, reflecting the overrides.
This means the autogenerated types for fields have also changed for fields without defaults:
```ts
// Old
image?: components["schemas"]["ImageField"];
// New
image?: components["schemas"]["ImageField"] | null;
```
This does not break anything on the frontend.
* support for custom error toast components, starting with usage limit
* add support for all usage limits
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
* display credit column in queue list if shouldShowCredits is true
* change apiModels feature to chatGPT4oModels feature
* empty
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
When I followed the Contribute Node documentation, I encountered an import error.
This commit fixes the error, which will help reduce debugging time for all future contributors.
* add GPTimage1 as allowed base model
* fix for non-disabled inpaint layers
* lots of boilerplate for adding gpt-image base model and disabling things along with imagen
* handle gpt-image dimensions
* build graph for gpt-image
* lint
* feat(ui): make chatgpt model naming consistent
* feat(ui): graph builder naming
* feat(ui): disable img2img for imagen3
* feat(ui): more naming
* feat(ui): support presigned url prefetch
* feat(ui): disable neg prompt for chatgpt
* docs(ui): update docstring
* feat(ui): fix graph building issues for chatgpt
* fix(ui): node ids for chatgpt/imagen
* chore(ui): typegen
---------
Co-authored-by: Mary Hipp <maryhipp@Marys-MacBook-Air.local>
Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
If a callback is provided, `<NavigateToModelManagerButton />` will render, even if `disabledTabs` includes "models". When clicked, the button will run the callback instead of switching tabs within the studio.
The button's tooltip is now just "Manage Models" and its icon is the same as the model manager tab's icon ([CUBE!](https://www.youtube.com/watch?v=4aGDCE6Nrz0)).
There is a subtle change in behaviour with the new model probe API.
Previously, checks for model types were done in a specific order. For example, we did all main model checks before LoRA checks.
With the new API, the order of checks has changed. Check ordering is as follows:
- New API checks are run first, then legacy API checks.
- New API checks are categorized by their speed. When we run new API checks, we sort them from fastest to slowest and run them in that order. This is a performance optimization.
Currently, LoRA and LLaVA models are the only model types with the new API. Checks for them are thus run first.
LoRA checks involve checking the state dict for presence of keys with specific prefixes. We expect these keys to only exist in LoRAs.
It turns out that main models may have some of these keys.
For example, this model has keys that match the LoRA prefix `lora_te_`: https://civitai.com/models/134442/helloyoung25d
Under the old probe, we'd do the main model checks first and correctly identify this as a main model. But with the new setup, we do the LoRA check first, and those pass. So we import this model as a LoRA.
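A minimal sketch of the failure mode (the prefixes and keys are illustrative, not the exact probe code):
```py
# A prefix-only LoRA check, and a main-model state dict that happens to bundle
# text-encoder LoRA keys alongside its regular weights.
LORA_KEY_PREFIXES = ("lora_te_", "lora_unet_")  # assumed prefixes for illustration


def looks_like_lora(state_dict_keys: list[str]) -> bool:
    return any(key.startswith(LORA_KEY_PREFIXES) for key in state_dict_keys)


main_model_keys = [
    "model.diffusion_model.input_blocks.0.0.weight",  # ordinary main-model key
    "lora_te_text_model_encoder_layers_0_mlp_fc1.lora_down.weight",  # baked-in LoRA key
]
print(looks_like_lora(main_model_keys))  # -> True, so the model is misclassified as a LoRA
```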
Thankfully, the old probe still exists. For now, the new probe is fully disabled. It was only called in one spot.
I've also added the example affected model as a test case for the model probe. Right now, this causes the test to fail, and I've marked the test as xfail. CI will pass.
Once we enable the new API again, the xfail will pass, and CI will fail, and we'll be reminded to update the test.
In the previous commit, the LLaVA model was updated to support partial loading.
In this commit, the SigLIP model is updated in the same way.
This model is used for FLUX Redux. It's <4GB and only ever run in isolation, so it won't benefit from partial loading for the vast majority of users. Regardless, I think it is best if we make _all_ models work with partial loading.
PS: I also fixed the initial load dtype issue, described in the prev commit. It's probably a non-issue for this model, but we may as well fix it.
The model manager has two types of model cache entries:
- `CachedModelOnlyFullLoad`: The model may only ever be loaded and unloaded as a single object.
- `CachedModelWithPartialLoad`: The model may be partially loaded and unloaded.
Partial loading is enabled by overriding certain torch layer classes, adding the ability to autocast the layer to a device on-the-fly. See `CustomLinear` for an example.
So, to take advantage of partial loading and be cached as a `CachedModelWithPartialLoad`, the model must inherit from `torch.nn.Module`.
The LLaVA classes provided by `transformers` do inherit from `torch.nn.Module`, but we wrap those classes in a separate class called `LlavaOnevisionModel`. The wrapper encapsulates both the LLaVA model and its "processor" - a lightweight class that prepares model inputs like text and images.
While it is more elegant to encapsulate both model and processor classes in a single entity, this prevents the model cache from enabling partial loading for the chunky vLLM model.
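A rough sketch of the selection rule implied here (the cache classes are stubbed for illustration; this is not the model manager's actual code):
```py
import torch.nn as nn


class CachedModelOnlyFullLoad:  # stub for illustration
    def __init__(self, model: object) -> None:
        self.model = model


class CachedModelWithPartialLoad:  # stub for illustration
    def __init__(self, model: nn.Module) -> None:
        self.model = model


def make_cache_entry(model: object):
    # Only a bare torch.nn.Module can take the partial-load path; wrapper objects
    # like the old LlavaOnevisionModel fall back to full load/unload as one unit.
    if isinstance(model, nn.Module):
        return CachedModelWithPartialLoad(model)
    return CachedModelOnlyFullLoad(model)
```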
Fixing this involved a few changes.
- Update the `LlavaOnevisionModelLoader` class to operate on the vLLM model directly, instead of the `LlavaOnevisionModel` wrapper class.
- Instantiate the processor directly in the node. The processor is lightweight and does its business on the CPU. We don't need to worry about caching in the model manager.
- Remove caching support code from the `LlavaOnevisionModel` wrapper class. It's not needed, because we do not cache this class. The class now only handles running the models provided to it.
- Rename `LlavaOnevisionModel` to `LlavaOnevisionPipeline` to better represent its purpose.
These changes have a bonus effect of fixing an OOM crash when initially loading the models. This was most apparent when loading LLaVA 7B, which is pretty chunky.
The initial load is onto CPU RAM. In the old version of the loaders, we ignored the loader's target dtype for the initial load. Instead, we loaded the model at `transformers`'s "default" dtype of fp32.
LLaVA 7B is fp16 and weighs ~17GB. Loading as fp32 means we need double that amount (~34GB) of CPU RAM. Many users only have 32GB RAM, so this causes a _CPU_ OOM - which is a hard crash of the whole process.
With the updated loaders, the initial load logic now uses the target dtype for the initial load. LLaVA now needs the expected ~17GB RAM for its initial load.
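For illustration only (the model id is an example and the real loader code differs), passing the target dtype through at load time looks roughly like this:
```py
import torch
from transformers import LlavaOnevisionForConditionalGeneration

# Loading with the target dtype avoids transformers' fp32 default, so an fp16
# checkpoint needs roughly its on-disk size in CPU RAM rather than double.
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    "llava-hf/llava-onevision-qwen2-7b-ov-hf",  # example model id
    torch_dtype=torch.float16,
)
```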
PS: If we didn't make the accompanying partial loading changes, we still could have solved this OOM. We'd just need to pass the initial load dtype to the wrapper class and have it load on that dtype. But we may as well fix both issues.
PPS: There are other models whose model classes are wrappers around a torch module class, and thus cannot be partially loaded. However, these models are typically fairly small and/or are run only on their own, so they don't benefit as much from partial loading. It's the really big models (like LLaVA 7B) that benefit most from the partial loading.
Currently translated at 56.6% (1069 of 1887 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 50.8% (960 of 1887 strings)
translationBot(ui): update translation (Japanese)
Currently translated at 48.4% (912 of 1882 strings)
Co-authored-by: RyoKoba <kobayashi_ryo@cyberagent.co.jp>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/ja/
Translation: InvokeAI/Web UI
I am at a loss as to the cause of this bug. The styles that I needed to change to fix it haven't been changed in a couple of months. But these changes do seem to fix it.
Closes #7910
This query can have potentially large responses. Keeping them around for 24 hours is essentially a hardcoded memory leak. Use the RTKQ default of 60 seconds.
When users generate on the canvas or upscaling tabs, we parse prompts through dynamic prompts before invoking. Whenever the prompt or other settings change, we run dynamic prompts.
Previously, we used a redux listener to react to changes to dynamic prompts' dependent state, keeping the processed dynamic prompts synced. For example, when the user changed the prompt field, we re-processed the dynamic prompts.
This requires that all redux actions that change the dependent state be added to the listener matcher. It's easy to forget actions, though, which can result in the dynamic prompts state being stale.
For example, when resetting canvas state, we dispatch an action that resets the whole params slice, but this wasn't in the matcher. As a result, when resetting canvas, the dynamic prompts aren't updated. If the user then clicks Invoke (with an empty prompt), the last dynamic prompts state will be used.
For example:
- Generate w/ prompt "frog", get frog
- Click new canvas session
- Generate without any prompt, still get frog
To resolve this, the logic that keeps the dynamic prompts synced is moved from the listener to a hook. The way the logic is triggered is improved - it's now triggered in a useEffect, which is run when the dependent state changes. This way, it doesn't matter _how_ the dependent state changes - the changes will always be "seen", and the dynamic prompts will update.
Add `useCanvasIsBusySafe()` hook. This is like `useCanvasIsBusy()`, but when the canvas is not initialized, it gracefully falls back to false instead of raising.
Because app tabs are lazy-loaded, the canvas is not initialized until the user visits that tab. If the page loads up on the workflows tab, the canvas will be uninitialized until the user clicks on it.
This graceful fallback behaviour allows actions like sending an image to canvas to work even when the canvas is not yet initialized. These actions are exposed in the image context menu, and previously were hidden when the canvas was not initialized. We can now show these actions and use them even when the canvas is uninitialized.
- Add `useCanvasIsBusySafe()` hook
- Use the new hook in the image context menu for send to canvas actions
- Do not use `<CanvasManagerProviderGate />` in the image context menu (this was hiding the actions when canvas was uninitialized)
When calling `ctx.drawImage()`, if the image to be drawn has a width or height of 0, the call will raise.
In this change, I have carefully reviewed the call hierarchy for all of our own code that calls this method and ensured that each call has error handling.
Well, with one exception - I'm not sure how to handle errors in `invokeai/frontend/web/src/common/hooks/useClientSideUpload.ts`. But this should never be an issue in that hook - it's a Canvas problem.
Currently translated at 100.0% (1873 of 1873 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1871 of 1871 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 99.2% (1857 of 1871 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1840 of 1840 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Whether a workflow is published or not shouldn't be something stored on the client. It's properly server-side state.
This change removes the `is_published` flag from redux and updates all references to the flag to use the getWorkflow query.
It also updates the socket event listener that handles session complete events. When a validation run completes, we invalidate the tags for the getWorkflow query. We need to do a bit of juggling to avoid a race condition (documented in the code). Works well though.
Previously, we maintained an `isTouched` flag in redux state to indicate if a workflow had unsaved changes. We manually updated this whenever we changed something on the workflow.
This was tedious and error-prone. It also didn't handle undo/redo, so if you made a change to a node and undid it, we'd still think the workflow had unsaved changes.
Moving forward, we use a simpler and more robust strategy by hashing the server's version of the workflow and comparing it to the client's version of the workflow.
The hashing uses `stable-hash`, which is both fast and, well, stable. Most importantly, the ordering of keys in hashed objects does not change the resultant hash.
- Remove `isTouched` state entirely.
- Extract the logic that builds the "preview" workflow object from redux state into its own hook. This "preview" workflow is what we send to the server when saving a workflow. This "preview" workflow is effectively the client version of the workflow.
- Add `useDoesWorkflowHaveUnsavedChanges()` hook, which compares the hash of the client workflow and server workflow (if it exists).
- Add `useIsWorkflowUntouched()` hook, which compares the hash of the client workflow and the initial workflow that you get when you click new workflow.
- Remove `reactflow` workaround in the nodes slice undo/redo filter. When we set the nodes state while loading a workflow, `reactflow` emits a nodes size/placement change event. This triggered our `isTouched` flag logic and marked the workflow as unsaved right from the get-go. With the new strategy to track touched status, this workaround can be removed.
- Update all logic that tracked the old `isTouched` flag to use the new hooks.
Previously, the workflow form's root element id was random. Every time we reset the workflow editor, the root id changed. This makes it difficult to check if the workflow editor is untouched (in its default state).
Now the root element's id is simply "root". I can't imagine any way that this would break anything.
This allows it to pull in sentencepiece on its own. In 0.10.0, it didn't have this package listed as a dependency, but in recent releases it does. So we are able to remove sentencepiece as an explicit dep.
The fixes in this module monkeypatched `torch` to resolve some issues with FP16 on macOS. These issues have long since been resolved.
Included in the now-removed fixes is `CustomSlicedAttentionProcessor`, which is intended to reduce memory requirements for MPS. This overrides `diffusers`' own `SlicedAttentionProcessor`.
Unfortunately, `attention_type: sliced` produces hot garbage with the fixes and black images without the fixes. So this class appears to now be a moot point.
Regardless, SDPA is supported on MPS and very efficient, so sliced attention is largely obsolete.
In https://github.com/pydantic/pydantic/pull/10029, pydantic made an improvement to its generated JSON schemas (OpenAPI schemas). The previous and new generated schemas both meet the schema spec.
When we parse the OpenAPI schema to generate node templates, we use a typeguard to narrow schema components from generic OpenAPI schema objects to node field schema objects. The narrower node field schema objects contain extra data.
For example, they contain a `field_kind` attribute that indicates whether the field is an input field or an output field. These extra attributes are not part of the OpenAPI spec (but the spec does allow for this extra data).
This typeguard relied on a pydantic implementation detail. This was changed in the linked pydantic PR, which was released in v2.9.0. With the change, our typeguard rejects input field schema objects, causing parsing to fail with errors/warnings like `Unhandled input property` in the JS console.
In the UI, this causes many fields - mostly model fields - to not show up in the workflow editor.
The fix for this is very simple - instead of relying on an implementation detail for the typeguard, we can check if the incoming schema object has any of our invoke-specific extra attributes. Specifically, we now look for the presence of the `field_kind` attribute on the incoming schema object. If it is present, we know we are dealing with an invocation input field and can parse it appropriately.
In `ObjectSerializerDisk`, we use `torch.load` to load serialized objects from disk. With torch 2.6.0, torch defaults to `weights_only=True`. As a result, torch will raise when attempting to deserialize anything with an unrecognized class.
For example, our `ConditioningFieldData` class is untrusted. When we load conditioning from disk, we will get a runtime error.
Torch provides a method to add trusted classes to an allowlist. This change adds an arg to `ObjectSerializerDisk` to add a list of safe globals to the allowlist and uses it for both `ObjectSerializerDisk` instances.
Note: My first attempt inferred the class from the generic type arg that `ObjectSerializerDisk` accepts, and added that to the allowlist. Unfortunately, this doesn't work.
For example, `ConditioningFieldData` has a `conditionings` attribute that may be one of several other untrusted classes representing model-specific conditioning data. So, even if we allowlist `ConditioningFieldData`, loading will fail when torch deserializes the `conditionings` attribute.
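A minimal sketch of the allowlisting (the classes here are placeholders standing in for the real conditioning classes; the file path is hypothetical):
```py
import torch


class ConditioningFieldData: ...  # placeholder for the real class
class SomeModelSpecificConditioningInfo: ...  # placeholder for a nested conditioning class


# Every nested untrusted class must be allowlisted, not just the top-level one,
# before torch.load(..., weights_only=True) will deserialize the object.
torch.serialization.add_safe_globals([ConditioningFieldData, SomeModelSpecificConditioningInfo])
obj = torch.load("conditioning.pt", weights_only=True)  # hypothetical path
```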
This is a squash of a lot of scattered commits that became very difficult to clean up and make individually. Sorry.
Besides the new UI, there are a number of notable changes:
- Publishing logic is disabled in OSS by default. To enable it, provide a `disabledFeatures` prop _without_ "publishWorkflow".
- Enqueuing a workflow is no longer handled in a redux listener. It was hard to track the state of the enqueue logic in the listener. It is now in a hook. I did not migrate the canvas and upscaling tabs - their enqueue logic is still in the listener.
- When queueing a validation run, the new `useEnqueueWorkflows()` hook will update the payload with the required data for the run.
- Some logic is added to the socket event listeners to handle workflow publish runs completing.
- The workflow library side nav has a new "published" view. It is hidden when the "publishWorkflow" feature is disabled.
- I've added `Safe` and `OrThrow` versions of some workflows hooks. These hooks typically retrieve some data from redux. For example, a node. The `Safe` hooks return the node or null if it cannot be found, while the `OrThrow` hooks return the node or raise if it cannot be found. The `OrThrow` hooks should be used within one of the gate components. These components use the `Safe` hooks and render a fallback if e.g. the node isn't found. This change is required for some of the publish flow UI.
- Add support for locking the workflow editor. When locked, you can pan and zoom but that's it. Currently, it is only locked during publish flow and if a published workflow is opened.
This message is logged _every_ time we retrieve a list of models if there is an invalid model. Previously it logged the _whole_ row which can be a lot of data. Truncate the row to 64 characters to reduce log pollution.
Currently translated at 98.8% (1818 of 1840 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.6% (1816 of 1840 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1816 of 1839 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
Previously, reactflow appears to have handled an edge case when using its `applyChanges` utility. If a change was provided without an item, it would skip that change. For example, an "add edge" change that somehow passed `null` as the edge, instead of a valid edge.
In our workflow loading and validation logic, invalid edges were removed from the array using `delete edges[i]`. This left "holes" in the array of edges. We then asked `reactflow` to add these edges to state. When it encountered one of the "holes", it skipped over it.
In a recent release (unsure which, somewhere between the latest v11 and ~v12.4) this seems to have changed. It no longer skips over the "holes" and instead trusts the data. This can cause a couple issues:
- Error when loading the workflow if `reactflow` attempts to do anything with the nonexistent edge.
- If somehow the workflow makes it into state with "holes" in the array of edges, all sorts of other stuff breaks when our code does anything with the nonexistent edge.
Two-part fix:
- Update the invalid edge handling to not use `delete edges[i]`. Instead, as we check each edge, we add invalid ones to a set. Then, after all the checks are finished, filter out the invalid edges. The resultant edges array has no holes.
- Simplify the logic around setting nodes and edges in redux. Previously we were using `reactflow`'s `applyChanges` utils, but this does literally nothing except take extra CPU cycles. We can simply set the loaded nodes and edges directly in redux. Perhaps we were using `applyChanges` because it addressed the "holes" issue? Not sure. But we don't need it now.
Closes #7868
## Summary
`timm` below 1.0.0 prevents LLaVA models from working (broken in transformers), but `controlnet-aux` pins `timm` to an earlier version because otherwise it was breaking the ZoeDepth controlnet.
We don't use ZoeDepth (replaced by DepthAnything), and downgrading `controlnet-aux` seems to be acceptable.
more context here:
https://github.com/huggingface/controlnet_aux/issues/106
https://github.com/huggingface/controlnet_aux/pull/101
Note that this results in some warnings on startup, stemming from
controlnet-aux:

We can probably silence the warnings as a separate enhancement.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
- Port LoRA to new classification API
- Add 2 additional tests cases (ControlLora and Flux Diffusers LoRA)
- Moved `ModelOnDisk` to its own module
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
Before FLUX Fill was merged, we didn't do any checks for the model variant. We always returned "normal".
To determine if a model is a FLUX Fill model, we need to check the state dict for a specific key. Initially, this logic was too strict and rejected quantized FLUX models. This issue was resolved, but it turns out there is another failure mode - some fine-tunes use a different key.
This change further reduces the strictness, handling the alternate key and also falling back to "normal" if we don't see either key. This effectively restores the previous probing behaviour for all FLUX models.
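A hedged sketch of the loosened logic (the key names, channel-count check, and variant labels are assumptions for illustration, not the actual probe identifiers):
```py
FILL_IN_CHANNELS = 384  # FLUX Fill reportedly uses 384 input channels; other FLUX models use 64
CANDIDATE_KEYS = ("img_in.weight", "model.diffusion_model.img_in.weight")  # assumed key names


def probe_flux_variant(state_dict: dict) -> str:
    for key in CANDIDATE_KEYS:
        tensor = state_dict.get(key)
        if tensor is not None and tensor.shape[-1] == FILL_IN_CHANNELS:
            return "fill"  # illustrative label for the FLUX Fill variant
    # Anything else - including quantized checkpoints with unusual shapes or
    # missing keys - falls back to the "normal" variant, like the old probe did.
    return "normal"
```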
Closes #7856
Closes #7859
The polynomial fit isn't perfect and we end up with alpha values of 1 instead of 0 when applying the mask. This in turn causes issues on canvas where outputs aren't 100% transparent and individual layer bbox calculations are incorrect.
Lots of squashed experimentation heh:
ci: manually specify python version in tests
ci: whoops typo in ruff cmds
ci: specify python versions for uv python install
ci: install python verbosely
ci: try forcing python preference?
ci: try forcing python preference a different way?
ci: try in a venv?
ci: it works, but try without venv
ci: oh maybe we need --preview?
ci: poking it with a stick
ci: it works, add summary to pytest output
ci: fix pytest output
experiment: simulate test failure
Revert "experiment: simulate test failure"
This reverts commit b99ca512f6e61a2a04a1c0636d44018c11019954.
ci: just use default pytest output
cI: attempt again to use uv to install python
cI: attempt again again to use uv to install python
Revert "cI: attempt again again to use uv to install python"
This reverts commit 3cba861c90738081caeeb3eca97b60656ab63929.
Revert "cI: attempt again to use uv to install python"
This reverts commit b30f2277041dc999ed514f6c594c6d6a78f5c810.
## Summary
- Extend `ModelOnDisk` with caching, type hints, default args
- Fail early if there is an error classifying a config
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
This PR moves type definitions out of `config.py` into a new
`taxonomy.py` module.
The goal is to reduce clutter in `config.py`, and to resolve circular
import issues by isolating these types in a dedicated module with
(almost) no internal dependencies.
Because so many places import these definitions, these changes touch 73
files.
Additional changes:
- Removed star imports using "removestar" tool
- Added the commit to `.git-blame-ignore-revs` to avoid noise in git
blame history
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
The top-level `invokeai` package may have an obscured origin due to the way editable installs work, but it's much more likely that this module is from a specific file.
## Summary
This test imports all modules in the invokeai package and fails if there
are any exceptions.
Existing issues are excluded to avoid blocking main.
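A minimal sketch of such a test (the exclusion entry is a placeholder for the known failures):
```py
import importlib
import pkgutil

import pytest

import invokeai

KNOWN_FAILURES = {"invokeai.some.legacy.module"}  # placeholder: pre-existing import issues


def iter_module_names():
    for info in pkgutil.walk_packages(invokeai.__path__, prefix="invokeai."):
        yield info.name


@pytest.mark.parametrize("module_name", sorted(set(iter_module_names()) - KNOWN_FAILURES))
def test_module_imports_cleanly(module_name: str) -> None:
    importlib.import_module(module_name)
```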
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
- Port LLaVA model config to new classification API
- Add 2 test cases (stripped LLaVA model variants added to git-lfs)
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
In #7780 we added FLUX Fill support, and needed the probe to be able to distinguish between "normal" FLUX models and FLUX Fill models.
Logic was added to the probe to check a particular state dict key (input channels), which should be 384 for FLUX Fill and 64 for other FLUX models.
The new logic was stricter and instead of falling back on the "normal" variant, it raised when an unexpected value for input channels was detected.
This caused failures to probe for BNB-NF4 quantized FLUX Dev/Schnell, which apparently only have 1 input channel.
After checking a variety of FLUX models, I loosened the strictness of the variant probing logic to only special-case the new FLUX Fill model, and otherwise fall back to returning the "normal" variant. This better matches the old behaviour and fixes the import errors.
Closes #7822
Currently translated at 100.0% (1827 of 1827 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1826 of 1826 strings)
translationBot(ui): update translation (Vietnamese)
Currently translated at 100.0% (1825 of 1825 strings)
Co-authored-by: Linos <linos.coding@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/vi/
Translation: InvokeAI/Web UI
Previously we used erode/dilate and a Gaussian blur to expand and fade the edges of Canvas masks. The implementation had a number of problems:
- Erode/dilate kernel sizes were not calculated correctly, and extra iterations were run to compensate. The result was that the blur size, which should have been in pixels, was very inaccurate and unreliable.
- What we want is to add a "soft bleed" - like a drop shadow with no offset - starting from the edge of the mask, extending out by however many pixels. But Gaussian blur does not do this. The blurred area starts _inside_ the mask and extends outside it. So it kinda blurs inwards and outwards. We compensated for this by expanding the mask.
- Using a Gaussian blur can cause banding artifacts. Gaussian blur doesn't have a "size" or "radius" parameter in the sense that you think it should. It's a convolution matrix and there are _no zero values in the result_. This means that, far away from the mask, once compositing completes, we have some values that are very close to zero but not quite zero. These values are quantized by HTML Canvas, resulting in banding artifacts where you'd expect the blur to have faded to 0% alpha. At least, that is my understanding of why the banding artifacts occur.
The new node uses a better strategy to expand the mask and add the fade out effect:
- Calculate the distance from each white pixel to the nearest black pixel.
- Normalize this distance by dividing by the fade size in px, then clip the values to 0 - 1. The result represents the distance of each white pixel to its nearest black pixel as a percentage of the fade size. At this point, it is a linear distribution.
- Create a polynomial to describe the fade's intensity so that we can have a smooth transition from the masked region (black) to unmasked (white). There are some magic numbers here, determined experimentally.
- Evaluate the polynomial over the normalized distances, so we now have a matrix representing the fade intensity for every pixel
- Convert this matrix back to uint8 and apply it to the mask
This works soooo much better than the previous method. Not only does it fix the banding issues, but when we enable "output only generated regions", we get a much smaller image. Will add images to the PR to clarify.
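A rough sketch of the approach (the smoothstep curve is a stand-in for the experimentally-determined polynomial):
```py
import cv2
import numpy as np


def fade_mask_edges(mask: np.ndarray, fade_px: int) -> np.ndarray:
    """mask: uint8, 0 = masked (black), 255 = unmasked (white)."""
    # Distance from each white pixel to the nearest black pixel.
    dist = cv2.distanceTransform((mask > 127).astype(np.uint8), cv2.DIST_L2, 5)
    # Normalize by the fade size and clip to [0, 1]: distance as a fraction of the fade.
    t = np.clip(dist / max(fade_px, 1), 0.0, 1.0)
    # Smooth 0 -> 1 transition; the real node uses experimentally-derived coefficients.
    intensity = 3 * t**2 - 2 * t**3
    # Back to uint8; pixels beyond the fade distance end up fully white (255).
    return (intensity * 255).astype(np.uint8)
```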
## Summary
- Integrate Git LFS to our automated Python tests in CI
- Add stripped model files with git-lfs
- `README.md` instructions to install and configure git-lfs
- Unrelated change (skip hashing to make unit test run faster)
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
**Problem**
We want to have automated tests for model classification/probing, but
model files are too large to include in the source.
**Proposed Solution**
Classification/probing only requires metadata (key names, tensor
shapes), not weights.
This PR introduces "stripped" models - lightweight versions that retain
only essential metadata.
- Added script to strip models
- Added stripped models to automated tests
**Model size before and after "stripping":**
```
LLaVA Onevision Qwen2 0.5b-ov-hf before: 1.8 GB, after: 11.6 MB
text_encoder before: 246.1 MB, after: 35.6 kB
llava-onevision-qwen2-7b-si-hf before: 16.1 GB, after: 11.7 MB
RealESRGAN_x2plus.pth before: 67.1 MB, after: 143.0 kB
IP Adapter SD1 before: 2.5 GB, after: 94.9 kB
Hard Edge Detection (canny) before: 722.6 MB, after: 63.6 kB
Lineart before: 722.6 MB, after: 63.6 kB
Segmentation Map before: 722.6 MB, after: 63.6 kB
EasyNegative before: 24.7 kB, after: 151 Bytes
Face Reference (IP Adapter Plus Face) before: 98.2 MB, after: 13.7 kB
Standard Reference (IP Adapter) before: 44.6 MB, after: 6.0 kB
shinkai_makoto_offset before: 151.1 MB, after: 160.0 kB
thickline_fp16 before: 151.1 MB, after: 160.0 kB
Alien Style before: 228.5 MB, after: 582.6 kB
Noodles Style before: 228.5 MB, after: 582.6 kB
Juggernaut XL v9 before: 6.9 GB, after: 3.7 MB
dreamshaper-8 before: 168.9 MB, after: 1.6 MB
```
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [ ] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
## Summary
The _goal_ of this PR is to make it easier to add a new config type.
The _scope_ of this PR is to integrate the API and does not include
adding new configs (outside tests) or porting existing ones.
One of the glaring issues of the existing *legacy probe* is that the
logic for each type is spread across multiple classes and intertwined
with the other configs. This means that adding a new config type (or
modifying an existing one) is complex and error prone.
This PR attempts to remedy this by providing a new API for adding
configs that:
- Is backwards compatible with the existing probe.
- Encapsulates fields and logic in a single class, keeping things
self-contained and easy to modify safely.
Below is a minimal toy example illustrating the proposed new structure:
```python
class MinimalConfigExample(ModelConfigBase):
    type: ModelType = ModelType.Main
    format: ModelFormat = ModelFormat.Checkpoint
    fun_quote: str

    @classmethod
    def matches(cls, mod: ModelOnDisk) -> bool:
        return mod.path.suffix == ".json"

    @classmethod
    def parse(cls, mod: ModelOnDisk) -> dict[str, Any]:
        with open(mod.path, "r") as f:
            contents = json.load(f)
        return {
            "fun_quote": contents["quote"],
            "base": BaseModelType.Any,
        }
```
To create a new config type, one needs to inherit from `ModelConfigBase`
and implement its interface.
The code falls back to the legacy model probe for existing models using
the old API.
This allows us to incrementally port the configs one by one.
## Related Issues / Discussions
<!--WHEN APPLICABLE: List any related issues or discussions on github or
discord. If this PR closes an issue, please use the "Closes #1234"
format, so that the issue will be automatically closed when the PR
merges.-->
## QA Instructions
<!--WHEN APPLICABLE: Describe how you have tested the changes in this
PR. Provide enough detail that a reviewer can reproduce your tests.-->
## Merge Plan
<!--WHEN APPLICABLE: Large PRs, or PRs that touch sensitive things like
DB schemas, may need some care when merging. For example, a careful
rebase by the change author, timing to not interfere with a pending
release, or a message to contributors on discord after merging.-->
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
In #7688 we optimized queuing preparation logic. This inadvertently broke retrying queue items.
Previously, a `NamedTuple` was used to store the values to insert in the DB when enqueuing. This handy class provides an API similar to a dataclass, where you can instantiate it with kwargs in any order. The resultant tuple re-orders the kwargs to match the order in the class definition.
For example, consider this `NamedTuple`:
```py
from typing import NamedTuple

class SessionQueueValueToInsert(NamedTuple):
    foo: str
    bar: str
```
When instantiating it, no matter the order of the kwargs, if you make a normal tuple out of it, the tuple values are in the same order as in the class definition:
```py
t1 = SessionQueueValueToInsert(foo="foo", bar="bar")
print(tuple(t1)) # -> ('foo', 'bar')
t2 = SessionQueueValueToInsert(bar="bar", foo="foo")
print(tuple(t2)) # -> ('foo', 'bar')
```
So, in the old code, when we used the `NamedTuple`, it implicitly normalized the order of the values we insert into the DB.
In the retry logic, the values of the tuple were not ordered correctly, but the use of `NamedTuple` had secretly fixed the order for us.
In the linked PR, `NamedTuple` was dropped for a normal tuple, after profiling showed `NamedTuple` to be meaningfully slower than a normal tuple.
The implicit order normalization behaviour wasn't understood, and the order wasn't fixed when changing the retry logic to use a normal tuple instead of `NamedTuple`. This results in a bug where we incorrectly create queue items in the DB. For example, we stored the `destination` in the `field_values` column.
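For contrast, a plain tuple does no such normalization - the order you build it in is the order that hits the DB:
```py
t3 = ("bar", "foo")  # meant to mirror (foo, bar), but nothing reorders it
print(t3)  # -> ('bar', 'foo')
```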
When such an incorrectly-created queue item is dequeued, it fails pydantic validation and causes what appears to be an endless loop of errors.
The only user-facing solution is to add this line to `invokeai.yaml` and restart the app:
```yaml
clear_queue_on_startup: true
```
On next startup, the queue is forcibly cleared before the error loop is triggered. Then the user should remove this line so their queue is persisted across app launches per usual.
The solution is simple - fix the ordering of the tuple. I also added a type annotation and comment to the tuple type alias definition.
Note: The endless error loop, as a general problem, will take some thinking to fix. The queue service methods to cancel and fail a queue item still retrieve it and parse it. And the list queue items methods parse the queue items. Bit of a catch 22, maybe the solution is to simply delete totally borked queue items and log an error.
Currently translated at 98.7% (1800 of 1822 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1798 of 1820 strings)
translationBot(ui): update translation (Italian)
Currently translated at 98.7% (1796 of 1818 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
There is now a single entrypoint for loading a workflow - `useLoadWorkflowWithDialog`.
The hook:
Handles loading workflows from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow.
It returns a function that:
Loads a workflow from various sources. If there are unsaved changes, the user will be prompted to confirm before loading the workflow. The workflow will be loaded immediately if there are no unsaved changes. On success, error or completion, the corresponding callback will be called.
WHEW
- Replace `get_counts` method with `get_tag_counts_with_filter` which gets the counts for a list of tags, filtering by a list of selected tags
- Update `get_many` logic to apply tag filtering with AND logic, to match the new `get_tag_counts_with_filter` method
- Update workflow library router
User facing:
When a FLUX main model is selected, users may now add Regional Reference Image layers.
When switching between FLUX Redux and FLUX IP Adapter, the settings will change to match the model type. (IP Adapter has weight, begin/end step, but Redux does not.) The image will be retained when switching between the two.
Otherwise it works the same way as IP Adapter - both in Global and Regional Reference Image layers.
---
Internal state handling:
Slightly awkward, but it was easiest to make FLUX Redux a second type of IP Adapter in redux state.
Global and regional reference images still have a single `ipAdapter` field, but it can have a type of `ip_adapter` or `flux_redux`.
Ideally, this field is called `config` or `settings` or something, but we are past that point. We _could_ do a migration to rename it, but I don't think it's worth the effort.
---
Other changes:
- Updated canvas layer validators to handle FLUX Redux.
- Updated model list loading logic to un-set FLUX Redux models in Canvas if they are not in the list (e.g. if the user deletes the model in the main app).
- Updated graph builders - new `addFLUXRedux` util & updated `addRegions` util.
- Updated the `buildModelsHook` util to return a hook that accepts a filter callback. This handles a discrepancy: FLUX IP Adapter does not support regional guidance, but FLUX Redux does. The Regional Guidance settings provide the filter to filter out FLUX IP Adapter models from the combined list of IP Adapter and Redux models.
This follows the same pattern for IP Adapter w/ its CLIP Vision model. The SigLIP model is unlikely to ever change and we don't want to force the user to select it anywhere. Hardcoding it is safe and makes the UX much nicer.
The alternative is a model dropdown that will likely only ever have one valid choice in it.
- We don't need to copy the init file. Just crawl the custom nodes dir for modules and import them all. Dunno why I didn't do this initially.
- Pass the logger in as an arg. There was a race condition where if we got the logger directly in the load_custom_nodes function, the config would not have been loaded fully yet and we'd end up with the wrong custom nodes path!
- Remove permissions-setting logic, I do not believe it is relevant for custom nodes
- Minor cleanup of the utility
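A minimal sketch of the crawl-and-import idea with an injected logger (the function shape and paths are illustrative, not the actual utility):
```py
import importlib.util
from logging import Logger
from pathlib import Path


def load_custom_nodes(custom_nodes_dir: Path, logger: Logger) -> None:
    # Import every node pack found in the custom nodes directory; the logger is
    # passed in explicitly to avoid racing config/logger initialization.
    for init_file in custom_nodes_dir.glob("*/__init__.py"):
        pack_name = init_file.parent.name
        spec = importlib.util.spec_from_file_location(pack_name, init_file)
        if spec is None or spec.loader is None:
            logger.warning(f"Skipping {pack_name}: could not create an import spec")
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        logger.info(f"Loaded custom node pack: {pack_name}")
```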
The version of Invoke you have installed. If it is not the latest version, please update and try again to confirm the issue still exists. If you are testing main, please include the commit hash instead.
placeholder: ex. 3.6.1
The version of Invoke you have installed. If it is not the [latest version](https://github.com/invoke-ai/InvokeAI/releases/latest), please update and try again to confirm the issue still exists. If you are testing main, please include the commit hash instead.
placeholder: ex. v6.0.2
validations:
required: true
@@ -85,17 +99,17 @@ body:
id: browser-version
attributes:
label: Browser
description: Your web browser and version.
description: Your web browser and version, if you do not use the Launcher's provided GUI.
placeholder: ex. Firefox 123.0b3
validations:
required: true
required: false
- type: textarea
id: python-deps
attributes:
label: Python dependencies
label: System Information
description: |
If the problem occurred during image generation, click the gear icon at the bottom left corner, click "About", click the copy button and then paste here.
Click the gear icon at the bottom left corner, then click "About". Click the copy button and then paste here.
@@ -60,16 +60,11 @@ Next, these jobs run and must pass. They are the same jobs that are run for ever
- **`frontend-checks`**: runs `prettier` (format), `eslint` (lint), `dpdm` (circular refs), `tsc` (static type check) and `knip` (unused imports)
- **`typegen-checks`**: ensures the frontend and backend types are synced
#### `build-installer` Job
#### `build-wheel` Job
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `installer/create_installer.sh` and uploads two artifacts:
This sets up both python and frontend dependencies and builds the python package. Internally, this runs `./scripts/build_wheel.sh` and uploads `dist.zip`, which contains the wheel and unarchived build.
- **`dist`**: the python distribution, to be published on PyPI
- **`InvokeAI-installer-${VERSION}.zip`**: the legacy install scripts
You don't need to download either of these files.
> The legacy install scripts are no longer used, but we haven't updated the workflow to skip building them.
You don't need to download or test these artifacts.
#### Sanity Check & Smoke Test
@@ -79,7 +74,7 @@ It's possible to test the python package before it gets published to PyPI. We've
But, if you want to be extra-super careful, here's how to test it:
- Download the `dist.zip` build artifact from the `build-installer` job
- Download the `dist.zip` build artifact from the `build-wheel` job
- Unzip it and find the wheel file
- Create a fresh Invoke install by following the [manual install guide](https://invoke-ai.github.io/InvokeAI/installation/manual/) - but instead of installing from PyPI, install from the wheel
@@ -18,9 +18,19 @@ If you just want to use Invoke, you should use the [launcher][launcher link].
2. [Fork and clone][forking link] the [InvokeAI repo][repo link].
3.Create an directory for user data (images, models, db, etc). This is typically at `~/invokeai`, but if you already have a non-dev install, you may want to create a separate directory for the dev install.
3. This repository uses Git LFS to manage large files. To ensure all assets are downloaded:
- Enable automatic LFS fetching for this repository:
```shell
git config lfs.fetchinclude "*"
```
- Fetch files from LFS (only needs to be done once; subsequent `git pull` will fetch changes automatically):
```shell
git lfs pull
```
4. Create a directory for user data (images, models, db, etc). This is typically at `~/invokeai`, but if you already have a non-dev install, you may want to create a separate directory for the dev install.
4. Follow the [manual install][manual install link] guide, with some modifications to the install command:
5. Follow the [manual install][manual install link] guide, with some modifications to the install command:
- Use `.` instead of `invokeai` to install from the current directory. You don't need to specify the version.
@@ -31,22 +41,22 @@ If you just want to use Invoke, you should use the [launcher][launcher link].
With the modifications made, the install command should look something like this:
5. At this point, you should have Invoke installed, a venv set up and activated, and the server running. But you will see a warning in the terminal that no UI was found. If you go to the URL for the server, you won't get a UI.
6. At this point, you should have Invoke installed, a venv set up and activated, and the server running. But you will see a warning in the terminal that no UI was found. If you go to the URL for the server, you won't get a UI.
This is because the UI build is not distributed with the source code. You need to build it manually. End the running server instance.
If you only want to edit the docs, you can stop here and skip to the **Documentation** section below.
6. Install the frontend dev toolchain:
7. Install the frontend dev toolchain, paying attention to versions:
- [`nodejs`](https://nodejs.org/) (v20+)
- [`nodejs`](https://nodejs.org/) (tested on LTS, v22)
- [`pnpm`](https://pnpm.io/8.x/installation) (must be v8 - not v9!)
- [`pnpm`](https://pnpm.io/installation) (tested on v10)
7. Do a production build of the frontend:
8. Do a production build of the frontend:
```sh
cd <PATH_TO_INVOKEAI_REPO>/invokeai/frontend/web
@@ -54,7 +64,7 @@ If you just want to use Invoke, you should use the [launcher][launcher link].
pnpm build
```
8. Restart the server and navigate to the URL. You should get a UI. After making changes to the python code, restart the server to see those changes.
9. Restart the server and navigate to the URL. You should get a UI. After making changes to the python code, restart the server to see those changes.
We recommend using the Invoke Launcher to install and update Invoke. It's a desktop application for Windows, macOS and Linux. It takes care of a lot of nitty gritty details for you.
Follow the [quick start guide](./quick_start.md) to get started.
!!! tip "Use the installer to update"
Using the installer for updates will not erase any of your data (images, models, boards, etc). It only updates the core libraries used to run Invoke.
Simply use the same path you installed to originally to update your existing installation.
Both release and pre-release versions can be installed using the installer. It also supports installing from a wheel if needed.
Be sure to review the [installation requirements] and ensure your system has everything it needs to install Invoke.
## Getting the Latest Installer
Download the `InvokeAI-installer-vX.Y.Z.zip` file from the [latest release] page. It is at the bottom of the page, under **Assets**.
After unzipping the installer, you should have a `InvokeAI-Installer` folder with some files inside, including `install.bat` and `install.sh`.
## Running the Installer
!!! tip
Windows users should first double-click the `WinLongPathsEnabled.reg` file to prevent a failed installation due to long file paths.
Double-click the install script:
=== "Windows"
```sh
install.bat
```
=== "Linux/macOS"
```sh
install.sh
```
!!! info "Running the Installer from the commandline"
You can also run the install script from cmd/powershell (Windows) or terminal (Linux/macOS).
!!! warning "Untrusted Publisher (Windows)"
You may get a popup saying the file comes from an `Untrusted Publisher`. Click `More Info` and `Run Anyway` to get past this.
The installation process is simple, with a few prompts:
- Select the version to install. Unless you have a specific reason to install a specific version, select the default (the latest version).
- Select location for the install. Be sure you have enough space in this folder for the base application, as described in the [installation requirements].
- Select a GPU device.
!!! info "Slow Installation"
The installer needs to download several GB of data and install it all. It may appear to get stuck at 99.9% when installing `pytorch` or during a step labeled "Installing collected packages".
If it is stuck for over 10 minutes, something has probably gone wrong and you should close the window and restart.
## Running the Application
Find the install location you selected earlier. Double-click the launcher script to run the app:
=== "Windows"
```sh
invoke.bat
```
=== "Linux/macOS"
```sh
invoke.sh
```
Choose the first option to run the UI. After a series of startup messages, you'll see something like this:
```sh
Uvicorn running on http://127.0.0.1:9090 (Press CTRL+C to quit)
```
Copy the URL into your browser and you should see the UI.
## Improved Outpainting with PatchMatch
PatchMatch is an extra add-on that can improve outpainting. Windows users are in luck - it works out of the box.
On macOS and Linux, a few extra steps are needed to set it up. See the [PatchMatch installation guide](./patchmatch.md).
## First-time Setup
You will need to [install some models] before you can generate.
Check the [configuration docs] for details on configuring the application.
## Updating
Updating is exactly the same as installing - download the latest installer, choose the latest version, enter your existing installation path, and the app will update. None of your data (images, models, boards, etc) will be erased.
!!! info "Dependency Resolution Issues"
We've found that pip's dependency resolution can cause issues when upgrading packages. One very common problem was pip "downgrading" torch from CUDA to CPU, but things broke in other novel ways.
The installer doesn't have this kind of problem, so we use it for updating as well.
## Installation Issues
If you have installation issues, please review the [FAQ]. You can also [create an issue] or ask for help on [discord].
This command creates a portable virtual environment at `.venv` complete with a portable python 3.11. It doesn't matter if your system has no python installed, or has a different version - `uv` will handle everything.
This command creates a portable virtual environment at `.venv` complete with a portable python 3.12. It doesn't matter if your system has no python installed, or has a different version - `uv` will handle everything.
4. Activate the virtual environment:
@@ -64,37 +64,51 @@ The following commands vary depending on the version of Invoke being installed a
5. Choose a version to install. Review the [GitHub releases page](https://github.com/invoke-ai/InvokeAI/releases).
6. Determine the package package specifier to use when installing. This is a performance optimization.
6. Determine the package specifier to use when installing. This is a performance optimization.
- If you have an Nvidia 20xx series GPU or older, use `invokeai[xformers]`.
- If you have an Nvidia 30xx series GPU or newer, or do not have an Nvidia GPU, use `invokeai`.
7. Determine the `PyPI` index URL to use for installation, if any. This is necessary to get the right version of torch installed.
7. Determine the torch backend to use for installation, if any. This is necessary to get the right version of torch installed. This is achieved by using [UV's built-in torch support.](https://docs.astral.sh/uv/guides/integration/pytorch/#automatic-backend-selection)
=== "Invoke v5 or later"
=== "Invoke v5.12 and later"
- If you are on Windows with an Nvidia GPU, use `https://download.pytorch.org/whl/cu124`.
- If you are on Linux with no GPU, use `https://download.pytorch.org/whl/cpu`.
- If you are on Linux with an AMD GPU, use `https://download.pytorch.org/whl/rocm6.1`.
- If you are on Windows or Linux with an Nvidia GPU, use `--torch-backend=cu128`.
- If you are on Linux with no GPU, use `--torch-backend=cpu`.
- If you are on Linux with an AMD GPU, use `--torch-backend=rocm6.3`.
- **In all other cases, do not use a torch backend.**
=== "Invoke v5.10.0 to v5.11.0"
- If you are on Windows or Linux with an Nvidia GPU, use `--torch-backend=cu126`.
- If you are on Linux with no GPU, use `--torch-backend=cpu`.
- If you are on Linux with an AMD GPU, use `--torch-backend=rocm6.2.4`.
- **In all other cases, do not use a torch backend.**
=== "Invoke v5.0.0 to v5.9.1"
- If you are on Windows with an Nvidia GPU, use `--torch-backend=cu124`.
- If you are on Linux with no GPU, use `--torch-backend=cpu`.
- If you are on Linux with an AMD GPU, use `--torch-backend=rocm6.1`.
- **In all other cases, do not use a torch backend.**
=== "Invoke v4"
- If you are on Windows with an Nvidia GPU, use `--torch-backend=cu124`.
- If you are on Linux with no GPU, use `--torch-backend=cpu`.
- If you are on Linux with an AMD GPU, use `--torch-backend=rocm5.2`.
- **In all other cases, do not use a torch backend.**
8. Install the `invokeai` package. Substitute the package specifier and version.
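As a concrete illustration, and assuming the install is done with `uv pip install` inside the environment from the earlier steps (a sketch, not the exact official command), substitute the package specifier, version, and torch backend you determined above:

```sh
# Placeholders in angle brackets must be replaced with the values chosen above,
# e.g. invokeai[xformers] or invokeai, a release version, and a backend like cu128
uv pip install "<package specifier>==<version>" --torch-backend=<torch backend>
```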
Hardware requirements vary significantly depending on model and image output size.
More detail on system requirements can be found [here](./requirements.md).
## Step 2: Download and Set Up the Launcher

The Launcher manages your Invoke install. Follow these instructions to download and set up the Launcher.

!!! info "Instructions for each OS"

=== "Windows"
- [Download for Windows](https://github.com/invoke-ai/launcher/releases/latest/download/Invoke.Community.Edition.Setup.latest.exe)
- Run the `EXE` to install the Launcher and start it.
- A desktop shortcut will be created; use this to run the Launcher in the future.
- You can delete the `EXE` file you downloaded.
=== "macOS"
- [Download for macOS](https://github.com/invoke-ai/launcher/releases/latest/download/Invoke.Community.Edition-latest-arm64.dmg)
- Open the `DMG` and drag the app into `Applications`.
- Run the Launcher using its entry in `Applications`.
- You can delete the `DMG` file you downloaded.
=== "Linux"
- [Download for Linux](https://github.com/invoke-ai/launcher/releases/latest/download/Invoke.Community.Edition-latest.AppImage)
- You may need to edit the `AppImage` file properties and make it executable (see the example command after these instructions).
- Optionally move the file to a location that does not require admin privileges and add a desktop shortcut for it.
- Run the Launcher by double-clicking the `AppImage` or the shortcut you made.
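If you prefer the terminal, the `AppImage` can be made executable with `chmod`; a minimal example (the download path is an assumption):

```sh
# Make the AppImage executable, then run it (path is an example)
chmod +x ~/Downloads/Invoke.Community.Edition-latest.AppImage
~/Downloads/Invoke.Community.Edition-latest.AppImage
```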
## Step 3: Install Invoke
Run the Launcher you just set up if you haven't already. Click **Install** and follow the instructions to install (or update) Invoke.
If you have an existing Invoke installation, you can select it and let the launcher manage the install. You'll be able to update or launch the installation.
!!! warning "Problem running the launcher on macOS"

macOS may not allow you to run the launcher. We are working to resolve this by signing the launcher executable. Until that is done, you can either use the [legacy scripts](./legacy_scripts.md) to install, or manually flag the launcher as safe:

- Open the **Invoke-Installer-mac-arm64.dmg** file.
- Drag the launcher to **Applications**.
- Open a terminal.
- Run `xattr -d 'com.apple.quarantine' /Applications/Invoke\ Community\ Edition.app`.

You should now be able to run the launcher.

!!! tip "Updating"

The Launcher will check for updates for itself _and_ Invoke.

- When the Launcher detects an update is available for itself, you'll get a small popup window. Click through this and the Launcher will update itself.
- When the Launcher detects an update for Invoke, you'll see a small green alert in the Launcher. Click that and follow the instructions to update Invoke.
## Step 4: Launch
If you still have problems, ask for help on the Invoke [discord].
- You can install the Invoke application as a python package. See our [manual install](./manual.md) docs.
- You can run Invoke with docker. See our [docker install](./docker.md) docs.
- You can still use our legacy scripts to install and run Invoke. See the [legacy scripts](./legacy_scripts.md) docs.
The requirements below are rough guidelines for best performance. GPUs with less VRAM may still work, but performance may suffer.
You don't need to do this if you are installing with the [Invoke Launcher](./quick_start.md).
Invoke requires python 3.10 through 3.12. If you don't already have one of these versions installed, we suggest installing 3.12, as it will be supported for longer.
Check that your system has an up-to-date Python installed by running `python3 --version` in the terminal (Linux, macOS) or cmd/powershell (Windows).
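For example, the check and its output might look like this (the reported version will depend on your installation):

```sh
python3 --version
# e.g. output: Python 3.12.4
```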
=== "Windows"
- Install python with [an official installer].
- The installer includes an option to add python to your PATH. Be sure to enable this. If you missed it, re-run the installer, choose to modify an existing installation, and tick that checkbox.
- You may need to install [Microsoft Visual C++ Redistributable].
=== "macOS"
- Install python with [an official installer].
- If model installs fail with a certificate error, you may need to run this command (changing the python version to match what you have installed): `/Applications/Python\ 3.10/Install\ Certificates.command`
- If you haven't already, you will need to install the XCode CLI Tools by running `xcode-select --install` in a terminal.
=== "Linux"
- Installing python varies depending on your system. We recommend [using `uv` to manage your python installation](https://docs.astral.sh/uv/concepts/python-versions/#installing-a-python-version).
- You'll need to install `libglib2.0-0` and `libgl1-mesa-glx` for OpenCV to work. For example, on a Debian system: `sudo apt update && sudo apt install -y libglib2.0-0 libgl1-mesa-glx`
Nodes have a "Use Cache" option in their footer. This allows for performance improvements by reusing cached outputs.
There are several node grouping concepts that can be examined with a narrow focus. These (and other) groupings can be pieced together to make up functional graph setups, and are important to understanding how groups of nodes work together as part of a whole. Note that the screenshots below aren't examples of complete functioning node graphs (see Examples).
### Create Latent Noise
An initial noise tensor is necessary for the latent diffusion process. As a result, the Denoising node requires a noise node input.
**Description:** This node will flip an openpose image horizontally, recoloring it to make sure that it isn't facing the wrong direction. Note that it does not work with openpose hands.
**Description:** This node returns an ideal size to use for the first stage of a Flux image generation pipeline. Generating at the right size helps limit duplication and odd subject placement.
set err_msg=No python was detected on your system. Please install Python version %MINIMUM_PYTHON_VERSION% or higher. We recommend Python 3.10.12 from %PYTHON_URL%
set err_msg=Your version of Python is too low. You need at least %MINIMUM_PYTHON_VERSION% but you have %python_version%. We recommend Python 3.10.12 from %PYTHON_URL%
goto err_exit
)
@rem Cleanup
del /q .tmp1 .tmp2
@rem -------------- Install and Configure ---------------
echo"A suitable Python interpreter could not be found"
echo"Please install Python $MINIMUM_PYTHON_VERSION or higher (maximum $MAXIMUM_PYTHON_VERSION) before running this script. See instructions at $INSTRUCTIONS for help."
read -p "Press any key to exit"
exit -1
fi
echo"For the best user experience we suggest enlarging or maximizing this window now."
"Some of the installation steps take a long time to run. Please be patient. If the script appears to hang for more than 10 minutes, please interrupt with [i]Control-C[/] and retry.",
"We will now apply a registry fix to enable long paths on Windows. InvokeAI needs this to function correctly. We are asking your permission to modify the Windows Registry on your behalf.",
"",
"This is the change that will be applied:",
str(syntax),
]
)
),
title="Windows Long Paths registry fix",
box=box.HORIZONTALS,
padding=(1,1),
)
)
def _platform_specific_help() -> Text | None:
    if OS == "Darwin":
        text = Text.from_markup(
            """[b wheat1]macOS Users![/]\n\nPlease be sure you have the [b wheat1]Xcode command-line tools[/] installed before continuing.\nIf not, cancel with [i]Control-C[/] and follow the Xcode install instructions at [deep_sky_blue1]https://www.freecodecamp.org/news/install-xcode-command-line-tools/[/]."""
        )
    elif OS == "Windows":
        text = Text.from_markup(
"""[b wheat1]Windows Users![/]\n\nBefore you start, please do the following:
1. Double-click on the file [b wheat1]WinLongPathsEnabled.reg[/] in order to
enable long path support on your system.
2. Make sure you have the [b wheat1]Visual C++ core libraries[/] installed. If not, install from
description="Optional dictionary of metadata for the invocation output, unrelated to the invocation's actual output value. This is not exposed as an output field.",
:param Optional[str] version: Adds a version to the invocation. Must be a valid semver string. Defaults to None.
:param Optional[bool] use_cache: Whether or not to use the invocation cache. Defaults to True. The user may override this in the workflow editor.
:param Classification classification: The classification of the invocation. Defaults to FeatureClassification.Stable. Use Beta or Prototype if the invocation is unstable.
:param Bottleneck bottleneck: The bottleneck of the invocation. Defaults to Bottleneck.GPU. Use Network if the invocation is network-bound.
class UIType(str, Enum, metaclass=MetaEnum):
    # region Model Field Types
    MainModel = "MainModelField"
    CogView4MainModel = "CogView4MainModelField"
    FluxMainModel = "FluxMainModelField"
    SD3MainModel = "SD3MainModelField"
    SDXLMainModel = "SDXLMainModelField"
    ControlLoRAModel = "ControlLoRAModelField"
    SigLipModel = "SigLipModelField"
    FluxReduxModel = "FluxReduxModelField"
    LlavaOnevisionModel = "LLaVAModelField"
    Imagen3Model = "Imagen3ModelField"
    Imagen4Model = "Imagen4ModelField"
    ChatGPT4oModel = "ChatGPT4oModelField"
    Gemini2_5Model = "Gemini2_5ModelField"
    FluxKontextModel = "FluxKontextModelField"
    Veo3Model = "Veo3ModelField"
    RunwayModel = "RunwayModelField"
    # endregion

    # region Misc Field Types
    Scheduler = "SchedulerField"
    Any = "AnyField"
    Video = "VideoField"
    # endregion

    # region Internal Field Types
class FieldDescriptions:
    noise = "Noise tensor"
    clip = "CLIP (tokenizer, text encoder, LoRAs) and skipped layer count"
    t5_encoder = "T5 tokenizer and text encoder"
    glm_encoder = "GLM (THUDM) tokenizer and text encoder"
    clip_embed_model = "CLIP Embed loader"
    clip_g_model = "CLIP-G Embed loader"
    unet = "UNet (scheduler, LoRAs)"
    main_model = "Main model (UNet, VAE, CLIP) to load"
    flux_model = "Flux model (Transformer) to load"
    sd3_model = "SD3 model (MMDiTX) to load"
    cogview4_model = "CogView4 model (Transformer) to load"
    sdxl_main_model = "SDXL Main model (UNet, VAE, CLIP1, CLIP2) to load"
    sdxl_refiner_model = "SDXL Refiner Main Model (UNet, VAE, CLIP2) to load"
    onnx_main_model = "ONNX Main model (UNet, VAE, CLIP) to load"
    freeu_b2 = "Scaling factor for stage 2 to amplify the contributions of backbone features."
    instantx_control_mode = "The control mode for InstantX ControlNet union models. Ignored for other ControlNet models. The standard mapping is: canny (0), tile (1), depth (2), blur (3), pose (4), gray (5), low quality (6). Negative values will be treated as 'None'."
Applies FreeU to the UNet. Suggested values (b1/b2/s1/s2):