Commit Graph

10180 Commits

Author SHA1 Message Date
Ryan Dick
5b42b7bd45 Add a utility to help with determining the working memory required for expensive operations. 2025-01-07 01:20:15 +00:00
Ryan Dick
71b97ce7be Reduce the likelihood of encountering https://github.com/invoke-ai/InvokeAI/issues/7513 by elminating places where the door was left open for this to happen. 2025-01-07 01:20:15 +00:00
Ryan Dick
b343f81644 Use torch.cuda.memory_allocated() rather than torch.cuda.memory_reserved() to be more conservative in setting dynamic VRAM cache limits. 2025-01-07 01:20:15 +00:00
Ryan Dick
4abfb35321 Tune SD3 VAE decode working memory estimate. 2025-01-07 01:20:15 +00:00
Ryan Dick
cba6528ea7 Add a 20% buffer to all VAE decode working memory estimates. 2025-01-07 01:20:15 +00:00
Ryan Dick
6a5cee61be Tune the working memory estimate for FLUX VAE decoding. 2025-01-07 01:20:15 +00:00
Ryan Dick
bd8017ecd5 Update working memory estimate for VAE decoding when tiling is being applied. 2025-01-07 01:20:15 +00:00
Ryan Dick
299eb94a05 Estimate the working memory required for VAE decoding, since this operations tends to be memory intensive. 2025-01-07 01:20:15 +00:00
Ryan Dick
fc4a22fe78 Allow expensive operations to request more working memory. 2025-01-07 01:20:13 +00:00
Ryan Dick
a167632f09 Calculate model cache size limits dynamically based on the available RAM / VRAM. 2025-01-07 01:14:20 +00:00
Ryan Dick
1321fac8f2 Remove get_cache_size() and set_cache_size() endpoints. These were unused by the frontend and refer to cache fields that are no longer accessible. 2025-01-07 01:06:20 +00:00
Ryan Dick
6a9de1fcf3 Change definition of VRAM in use for the ModelCache from sum of model weights to the total torch.cuda.memory_allocated(). 2025-01-07 00:31:53 +00:00
Ryan Dick
e5180c4e6b Add get_effective_device(...) utility to aid in determining the effective device of models that are partially loaded. 2025-01-07 00:31:00 +00:00
Ryan Dick
2619ef53ca Handle device casting in ia2_layer.py. 2025-01-07 00:31:00 +00:00
Ryan Dick
bcd29c5d74 Remove all cases where we check the 'model.device'. This is no longer trustworthy now that partial loading is permitted. 2025-01-07 00:31:00 +00:00
Ryan Dick
1b7bb70bde Improve handling of cases when application code modifies the size of a model after registering it with the model cache. 2025-01-07 00:31:00 +00:00
Ryan Dick
7127040c3a Remove unused function set_nested_attr(...). 2025-01-07 00:31:00 +00:00
Ryan Dick
ceb2498a67 Add log prefix to model cache logs. 2025-01-07 00:31:00 +00:00
Ryan Dick
d0bfa019be Add 'enable_partial_loading' config flag. 2025-01-07 00:31:00 +00:00
Ryan Dick
535e45cedf First pass at adding partial loading support to the ModelCache. 2025-01-07 00:30:58 +00:00
Ryan Dick
c579a218ef Allow models to be locked in VRAM, even if they have been dropped from the RAM cache (related: https://github.com/invoke-ai/InvokeAI/issues/7513). 2025-01-06 23:02:52 +00:00
Riku
f4f7415a3b fix(app): remove obsolete DEFAULT_PRECISION variable 2025-01-06 11:14:58 +11:00
Mary Hipp
7d6c443d6f fix(api): limit board_name length to 300 characters 2025-01-06 10:49:49 +11:00
psychedelicious
4815b4ea80 feat(ui): tweak verbiage for model install errors 2025-01-03 11:21:23 -05:00
psychedelicious
d77a6ccd76 fix(ui): model install error toasts not updating correctly 2025-01-03 11:21:23 -05:00
psychedelicious
3e860c8338 feat(ui): starter models filter works with model base
For example, "flux" now matches any starter model with a model base of "FLUX".
2025-01-03 11:21:23 -05:00
psychedelicious
4f2ef7ce76 refactor(ui): handle hf vs civitai/other url model install errors separately
Previously, we didn't differentiate between model install errors for different types of model install sources, resulting in a buggy UX:
- If a HF model install failed, but it was a HF URL install and not a repo id install, the link to the HF model page was incorrect.
- If a non-HF URL install (e.g. civitai) failed, we treated it as a HF URL install. In this case, if the user's HF token was invalid or unset, we directed the user to set it. If the HF token was valid, we displayed an empty red toast. If it's not a HF URL install, then of course neither of these are correct.

Also, the logic for handling the toasts was a bit complicated.

This change does a few things:
- Consolidate the model install error toasts into one place - the socket.io event handler for the model install error event. There is no more global state for the toasts and there are no hooks managing them.
- Handling the different cases for errors, including all combinations of HF/non-HF and unauthorized/forbidden/unknown.
2025-01-03 11:21:23 -05:00
psychedelicious
d7e9ad52f9 chore(ui): typegen 2025-01-03 11:21:23 -05:00
psychedelicious
b6d7a44004 refactor(events): include full model source in model install events
This is required to fix an issue with the MM UI's error handling.

Previously, we only included the model source as a string. That could be an arbitrary URL, file path or HF repo id, but the frontend has no parsing logic to differentiate between these different model sources.

Without access to the type of model source, it is difficult to determine how the user should proceed. For example, if it's HF URL with an HTTP unauthorized error, we should direct the user to log in to HF. But if it's a civitai URL with the same error, we should not direct the user to HF.

There are a variety of related edge cases.

With this change, the full `ModelSource` object is included in each model install event, including error events.

I had to fix some circular import issues, hence the import changes to files other than `events_common.py`.
2025-01-03 11:21:23 -05:00
psychedelicious
e18100ae7e refactor(ui): move model install error event handling to own file
No logic change.
2025-01-03 11:21:23 -05:00
psychedelicious
ad0aa0e6b2 feat(ui): reset canvas layers only resets the layers 2025-01-03 11:02:04 -05:00
Kent Keirsey
94785231ce Update href to correct link 2025-01-02 09:39:41 +11:00
Ryan Dick
477d87ec31 Fix layer patch dtype selection for CLIP text encoder models. 2024-12-29 21:48:51 +00:00
Ryan Dick
8b4b0ff0cf Fix bug in CustomConv1d and CustomConv2d patch calculations. 2024-12-29 19:10:19 +00:00
Ryan Dick
6fd9b0a274 Delete old sidecar wrapper implementation. This functionality has moved into the custom layers. 2024-12-29 17:33:08 +00:00
Ryan Dick
a8bef59699 First pass at making custom layer patches work with weights streamed from the CPU to the GPU. 2024-12-29 17:01:37 +00:00
Ryan Dick
6d49ee839c Switch the LayerPatcher to use 'custom modules' to manage layer patching. 2024-12-29 01:18:30 +00:00
Ryan Dick
0525f967c2 Fix the _autocast_forward_with_patches() function for CustomConv1d and CustomConv2d. 2024-12-29 00:22:37 +00:00
Ryan Dick
2855bb6b41 Update BaseLayerPatch.get_parameters(...) to accept a dict of orig_parameters rather than orig_module. This will enable compatibility between patching and cpu->gpu streaming. 2024-12-28 21:12:53 +00:00
Ryan Dick
20acfc9a00 Raise in CustomEmbedding and CustomGroupNorm if a patch is applied. 2024-12-28 20:49:17 +00:00
Ryan Dick
918f541af8 Add unit test for a SetParameterLayer patch applied to a CustomFluxRMSNorm layer. 2024-12-28 20:44:48 +00:00
Ryan Dick
93e76b61d6 Add CustomFluxRMSNorm layer. 2024-12-28 20:33:38 +00:00
Ryan Dick
f692e217ea Add patch support to CustomConv1d and CustomConv2d (no unit tests yet). 2024-12-27 22:23:17 +00:00
Ryan Dick
f2981979f9 Get custom layer patches working with all quantized linear layer types. 2024-12-27 22:00:22 +00:00
Ryan Dick
ef970a1cdc Add support for FluxControlLoRALayer in CustomLinear layers and add a unit test for it. 2024-12-27 21:00:47 +00:00
Ryan Dick
e24e386a27 Add support for patches to CustomModuleMixin and add a single unit test (more to come). 2024-12-27 18:57:13 +00:00
Ryan Dick
b06d61e3c0 Improve custom layer wrap/unwrap logic. 2024-12-27 16:29:48 +00:00
Ryan Dick
7d6ab0ceb2 Add a CustomModuleMixin class with a flag for enabling/disabling autocasting (since it incurs some runtime speed overhead.) 2024-12-26 20:08:30 +00:00
Ryan Dick
a8b2c4c3d2 Add inference tests for all custom module types (i.e. to test autocasting from cpu to device). 2024-12-26 18:33:46 +00:00
Ryan Dick
987c9ae076 Move custom autocast modules to separate files in a custom_modules/ directory. 2024-12-24 22:21:31 +00:00