InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI.git synced 2026-02-12 12:35:01 -05:00

Author	SHA1	Message	Date
skunkworxdark	bbb5d68146	Update flux_text_encoder.py Added tokenizer logging to flux	2025-06-30 10:40:31 +10:00
psychedelicious	9066dc1839	tidy(nodes): remove extraneous comments & add useful ones	2025-06-27 18:27:46 +10:00
Kent Keirsey	ca1df60e54	Explain the Magic	2025-06-27 18:27:46 +10:00
Cursor Agent	7549c1250d	Add FLUX Kontext conditioning support for reference images Co-authored-by: kent <kent@invoke.ai> Fix Kontext sequence length handling in Flux denoise invocation Co-authored-by: kent <kent@invoke.ai> Fix Kontext step callback to handle combined token sequences Co-authored-by: kent <kent@invoke.ai> fix ruff Fix Flux Kontext	2025-06-27 18:27:46 +10:00
Mary Hipp Rogers	2ad5b5cc2e	Flux Kontext UI support (#8111) * add support for flux-kontext models in nodes * flux kontext in canvas * add aspect ratio support * lint * restore aspect ratio logic * more linting * typegen * fix typegen --------- Co-authored-by: Mary Hipp <maryhipp@Marys-Air.lan>	2025-06-25 09:39:57 -04:00
psychedelicious	0794eb43e7	fix(nodes): ensure each invocation overrides _original_model_fields with own field data	2025-06-20 15:03:55 +10:00
Emmanuel Ferdman	c80ad90f72	Migrate to modern logger interface Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>	2025-06-13 13:07:09 +10:00
psychedelicious	35c7c59455	fix(app): reduce peak memory usage We've long suspected there is a memory leak in Invoke, but that may not be true. What looks like a memory leak may in fact be the expected behaviour for our allocation patterns. We observe ~20 to ~30 MB increase in memory usage per session executed. I did some prolonged tests, where I measured the process's RSS in bytes while doing 200 SDXL generations. I found that it eventually leveled off at around 100 generations, at which point memory usage had climbed by ~900MB from its starting point. I used tracemalloc to diff the allocations of single session executions and found that we are allocating ~20MB or so per session in `ModelPatcher.apply_ti()`. In `ModelPatcher.apply_ti()` we add tokens to the tokenizer when handling TIs. The added tokens should be scoped to only the current invocation, but there is no simple way to remove the tokens afterwards. As a workaround for this, we clone the tokenizer, add the TI tokens to the clone, and use the clone when running compel. Afterwards, this cloned tokenizer is discarded. The tokenizer uses ~20MB of memory, and it has referrers/referents to other compel stuff. This is what is causing the observed increases in memory per session! We'd expect these objects to be GC'd but python doesn't do it immediately. After creating the cond tensors, we quickly move on to denoising. So there isn't any time for the GC to happen to free up its existing memory arenas/blocks to reuse them. Instead, python needs to request more memory from the OS. We can improve the situation by immediately calling `del` on the tokenizer clone and related objects. In fact, we already had some code in the compel nodes to `del` some of these objects, but not all. Adding the `del`s vastly improves things. We hit peak RSS in half the sessions (~50 or less) and it's now ~100MB more than starting value. There is still a gradual increase in memory usage until we level off.	2025-06-11 12:56:16 +10:00
dunkeroni	bf0fdbd10e	Fix: inpaint model mask using wrong tensor name	2025-06-05 11:31:35 -04:00
psychedelicious	c848cbc2e3	feat(app): move output annotation checking to run_app Also change import order to ensure CLI args are handled correctly. Had to do this bc importing `InvocationRegistry` before parsing args resulted in the `--root` CLI arg being ignored.	2025-05-30 14:10:13 +10:00
psychedelicious	91db136cd1	feat(nodes): much faster heuristic resize utility Add `heuristic_resize_fast`, which does the same thing as `heuristic_resize`, except it's about 20x faster. This is achieved by using opencv for the binary edge handling instead of python, and checking only 100k pixels to determine what kind of image we are working with. Besides being much faster, it results in cleaner lines for resized binary canny edge maps, and results in fewer misidentified segmentation maps. Tested against normal images, binary canny edge maps, grayscale HED edge maps, and segmentation maps. Tested resizing up and down for each. Besides the new utility function, I needed to swap the `opencv-python` dep for `opencv-contrib-python`, which includes `cv2.ximgproc.thinning`. This function accounts for a good chunk of the perf improvement.	2025-05-29 13:49:07 +10:00
Kent Keirsey	d4c4926caa	Update Compel to 2.1.1 and apply Sentences Split logic	2025-05-26 22:54:15 -04:00
dunkeroni	9aa26f883e	chore: ruff	2025-05-27 07:28:47 +10:00
dunkeroni	9e90bf1b20	fix gradient mask broken with flux gen	2025-05-27 07:28:47 +10:00
dunkeroni	53ac9eafbf	reuse inpaint image noise seed for caching	2025-05-27 07:28:47 +10:00
dunkeroni	139ecc10ce	ruff	2025-05-27 07:28:47 +10:00
dunkeroni	174249ec15	gradient mask node works on greyscale now	2025-05-27 07:28:47 +10:00
dunkeroni	23627cf18d	compositing in frontend	2025-05-27 07:28:47 +10:00
Mary Hipp	a8e0c48ddc	add new method types to metadata	2025-05-22 14:09:10 -04:00
Mary Hipp	2f35d74902	backend updates	2025-05-22 13:50:15 +10:00
psychedelicious	ecc6e8a532	fix(nodes): transformers bug with SAM Upstream bug in `transformers` breaks use of `AutoModelForMaskGeneration` class to load SAM models Simple fix - directly load the model with `SamModel` class instead. See upstream issue https://github.com/huggingface/transformers/issues/38228	2025-05-22 11:32:37 +10:00
psychedelicious	19ecdb196e	chore: ruff	2025-05-20 10:47:02 +10:00
psychedelicious	021a334240	fix(nodes): fix spots where default of None was provided for non-optional fields	2025-05-20 10:47:02 +10:00
psychedelicious	cfed293d48	fix(nodes): do not make invocation field defaults None when they are not provided	2025-05-20 10:47:02 +10:00
Kent Keirsey	3bfb497764	ruff fixes	2025-05-19 13:50:04 +10:00
Kent Keirsey	b849c7d382	ruff fix	2025-05-19 13:50:04 +10:00
Kent Keirsey	b02ea1a898	Expanded styles & updated UI	2025-05-19 13:50:04 +10:00
Kent Keirsey	d709040f4b	Matt3o base changes	2025-05-19 13:50:04 +10:00
psychedelicious	37e790ae19	fix(app): address pydantic deprecation warning for accessing `BaseModel.model_fields`	2025-05-19 12:22:59 +10:00
psychedelicious	1566e29c19	feat(nodes): tidy some type annotations in baseinvocation	2025-05-14 06:55:15 +10:00
psychedelicious	6a2e35f2c4	feat(nodes): store original field annotation & FieldInfo in invocations	2025-05-14 06:55:15 +10:00
psychedelicious	b6d58774f4	feat(nodes): improved error messages for invalid defaults	2025-05-14 06:55:15 +10:00
psychedelicious	9df0871754	fix(nodes): do not provide invalid defaults for batch nodes	2025-05-14 06:55:15 +10:00
psychedelicious	3011150a3a	feat(nodes): validate default values for all fields This prevents issues where the node is defined with an invalid default value, which would guarantee an error during a ser/de roundtrip. - Upstream issue requesting this functionality be built-in to pydantic: https://github.com/pydantic/pydantic/issues/8722 - Upstream PR that implements the functionality: https://github.com/pydantic/pydantic-core/pull/1593	2025-05-14 06:55:15 +10:00
psychedelicious	df81f3274a	feat(nodes): improved pydantic type annotation massaging When we do our field type overrides to allow invocations to be instantiated without all required fields, we were not modifying the annotation of the field but did set the default value of the field to `None`. This results in an error when doing a ser/de round trip. Here's what we end up doing: ```py from pydantic import BaseModel, Field class MyModel(BaseModel): foo: str = Field(default=None) ``` And here is a simple round-trip, which should not error but which does: ```py MyModel(**MyModel().model_dump()) # ValidationError: 1 validation error for MyModel # foo # Input should be a valid string [type=string_type, input_value=None, input_type=NoneType] # For further information visit https://errors.pydantic.dev/2.11/v/string_type ``` To fix this, we now check every incoming field and update its annotation to match its default value. In other words, when we override the default field value to `None`, we make its type annotation `<original type> | None`. This prevents the error during deserialization. This slightly alters the schema for all invocations and outputs - the values of all fields without default values are now typed as `<original type> | None`, reflecting the overrides. This means the autogenerated types for fields have also changed for fields without defaults: ```ts // Old image?: components["schemas"]["ImageField"]; // New image?: components["schemas"]["ImageField"] | null; ``` This does not break anything on the frontend.	2025-05-14 06:55:15 +10:00
psychedelicious	203fa04295	feat(nodes): support bottleneck flag for nodes	2025-05-13 11:56:40 +10:00
psychedelicious	1e85184c62	feat(nodes): add imagen3/chatgpt-4o field types	2025-05-06 09:07:52 -04:00
psychedelicious	cc54466db9	fix(nodes): default value for UIConfigBase.tags	2025-04-28 13:31:26 -04:00
psychedelicious	cbdafe7e38	feat(nodes): allow node clobbering	2025-04-28 13:31:26 -04:00
psychedelicious	8ed5585285	feat(nodes): move output metadata to BaseInvocationOutput	2025-04-28 09:19:43 -04:00
Mary Hipp	4a0df6b865	add optional output_metadata to baseinvocation	2025-04-28 09:19:43 -04:00
psychedelicious	814406d98a	feat(mm): siglip model loading supports partial loading In the previous commit, the LLaVA model was updated to support partial loading. In this commit, the SigLIP model is updated in the same way. This model is used for FLUX Redux. It's <4GB and only ever run in isolation, so it won't benefit from partial loading for the vast majority of users. Regardless, I think it is best if we make _all_ models work with partial loading. PS: I also fixed the initial load dtype issue, described in the prev commit. It's probably a non-issue for this model, but we may as well fix it.	2025-04-18 10:12:03 +10:00
psychedelicious	c054501103	feat(mm): llava model loading supports partial loading; fix OOM crash on initial load The model manager has two types of model cache entries: - `CachedModelOnlyFullLoad`: The model may only ever be loaded and unloaded as a single object. - `CachedModelWithPartialLoad`: The model may be partially loaded and unloaded. Partial loading is enabled by overwriting certain torch layer classes, adding the ability to autocast the layer to a device on-the-fly. See `CustomLinear` for an example. So, to take advantage of partial loading and be cached as a `CachedModelWithPartialLoad`, the model must inherit from `torch.nn.Module`. The LLaVA classes provided by `transformers` do inherit from `torch.nn.Module`, but we wrap those classes in a separate class called `LlavaOnevisionModel`. The wrapper encapsulates both the LLaVA model and its "processor" - a lightweight class that prepares model inputs like text and images. While it is more elegant to encapsulate both model and processor classes in a single entity, this prevents the model cache from enabling partial loading for the chunky vLLM model. Fixing this involved a few changes. - Update the `LlavaOnevisionModelLoader` class to operate on the vLLM model directly, instead of the `LlavaOnevisionModel` wrapper class. - Instantiate the processor directly in the node. The processor is lightweight and does its business on the CPU. We don't need to worry about caching in the model manager. - Remove caching support code from the `LlavaOnevisionModel` wrapper class. It's not needed, because we do not cache this class. The class now only handles running the models provided to it. - Rename `LlavaOnevisionModel` to `LlavaOnevisionPipeline` to better represent its purpose. These changes have a bonus effect of fixing an OOM crash when initially loading the models. This was most apparent when loading LLaVA 7B, which is pretty chunky. The initial load is onto CPU RAM. In the old version of the loaders, we ignored the loader's target dtype for the initial load. Instead, we loaded the model at `transformers`'s "default" dtype of fp32. LLaVA 7B is fp16 and weighs ~17GB. Loading as fp32 means we need double that amount (~34GB) of CPU RAM. Many users only have 32GB RAM, so this causes a _CPU_ OOM - which is a hard crash of the whole process. With the updated loaders, the initial load logic now uses the target dtype for the initial load. LLaVA now needs the expected ~17GB RAM for its initial load. PS: If we didn't make the accompanying partial loading changes, we still could have solved this OOM. We'd just need to pass the initial load dtype to the wrapper class and have it load on that dtype. But we may as well fix both issues. PPS: There are other models whose model classes are wrappers around a torch module class, and thus cannot be partially loaded. However, these models are typically fairly small and/or are run only on their own, so they don't benefit as much from partial loading. It's the really big models (like LLaVA 7B) that benefit most from the partial loading.	2025-04-18 10:12:03 +10:00
skunkworxdark	566282bff0	Update metadata_linked.py added metadata_to_string_collection, metadata_to_integer_collection, metadata_to_float_collection, metadata_to_bool_collection	2025-04-16 06:28:22 +10:00
psychedelicious	a5bc21cf50	feat(nodes): extract LaMa model url to constant	2025-04-15 07:13:25 +10:00
psychedelicious	ae8d1f26d6	fix(app): import CogView4Transformer2DModel from the module that exports it	2025-04-10 10:50:13 +10:00
psychedelicious	ad582c8cc5	feat(nodes): rename CogView4 nodes to match naming format	2025-04-10 10:50:13 +10:00
maryhipp	305c5761d0	add generation modes for cogview linear	2025-04-10 10:50:13 +10:00
Ryan Dick	d86cd66994	Add CogView4 VAE approximation for progress images.	2025-04-10 10:50:13 +10:00
Ryan Dick	13850271ab	Add inpainting to CogView4DenoiseInvocation.	2025-04-10 10:50:13 +10:00
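
Note on commit df81f3274a: the round-trip failure described in that message can be reproduced outside the codebase. The sketch below is illustrative only. The class names `Before` and `After` and the helper `round_trip` are hypothetical and do not come from the InvokeAI source, and it assumes pydantic v2 (the `model_dump` API). `Optional[str]` is used here as the equivalent of the `<original type> | None` annotation the commit describes.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Before(BaseModel):
    # Annotation says `str`, but the default is None. Pydantic does not
    # validate defaults unless asked to, so the class definition succeeds.
    foo: str = Field(default=None)


class After(BaseModel):
    # Annotation widened to match the None default, as the commit describes.
    foo: Optional[str] = Field(default=None)


def round_trip(model_cls) -> bool:
    """Return True if dump -> re-validate succeeds for a default instance."""
    try:
        model_cls(**model_cls().model_dump())
        return True
    except ValidationError:
        return False


before_ok = round_trip(Before)  # re-validation rejects foo=None for `str`
after_ok = round_trip(After)    # None is now a valid value for the field
```

The widened annotation is what lets the dumped `None` pass validation on the way back in. The commit applies this to every overridden field rather than to a single hand-written class.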

1556 Commits
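
Note on commit 35c7c59455: that message mentions using tracemalloc to diff allocations between session executions. A minimal sketch of that kind of snapshot diff is shown below, using only the standard library. `allocate_clone` and `growth_bytes` are hypothetical names, and the 20MB `bytearray` is a stand-in for the cloned tokenizer, not the real object. It is also simpler than the real case: the buffer here has no reference cycles, so `del` frees it at once, whereas the commit describes objects with referrers/referents that are not collected immediately.

```python
import tracemalloc

CLONE_SIZE = 20 * 1024 * 1024  # stand-in for the ~20MB tokenizer clone


def allocate_clone() -> bytearray:
    # Hypothetical placeholder for cloning the tokenizer and adding TI tokens.
    return bytearray(CLONE_SIZE)


def growth_bytes(new: tracemalloc.Snapshot, old: tracemalloc.Snapshot) -> int:
    # Sum the per-line size differences between two snapshots.
    return sum(stat.size_diff for stat in new.compare_to(old, "lineno"))


tracemalloc.start()
before = tracemalloc.take_snapshot()

clone = allocate_clone()
held = tracemalloc.take_snapshot()      # clone is still referenced

del clone                               # drop the only reference right away
released = tracemalloc.take_snapshot()
tracemalloc.stop()

held_growth = growth_bytes(held, before)
released_growth = growth_bytes(released, before)
```

Comparing `held_growth` with `released_growth` shows how much traced memory is attributable to the object that was dropped. Sorting the result of `compare_to` by `size_diff` instead of summing it points at the allocating lines.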