Commit Graph

69 Commits

Author SHA1 Message Date
Ryan Dick
d7ab464176 Offload the current model when locking if it is already partially loaded and we have insufficient VRAM. 2025-01-07 02:53:44 +00:00
Ryan Dick
5eafe1ec7a Fix ModelCache execution device selection in unit tests. 2025-01-07 01:20:15 +00:00
Ryan Dick
a167632f09 Calculate model cache size limits dynamically based on the available RAM / VRAM. 2025-01-07 01:14:20 +00:00
Ryan Dick
402dd840a1 Add seed to flaky unit test. 2025-01-07 00:31:00 +00:00
Ryan Dick
d0bfa019be Add 'enable_partial_loading' config flag. 2025-01-07 00:31:00 +00:00
Ryan Dick
535e45cedf First pass at adding partial loading support to the ModelCache. 2025-01-07 00:30:58 +00:00
Ryan Dick
9a0a226ce1 Fix bitsandbytes imports in unit tests on MacOS. 2024-12-30 10:41:48 -05:00
Ryan Dick
52fc5a64d4 Add a unit test for a LoRA patch applied to a quantized linear layer with weights streamed from CPU to GPU. 2024-12-29 17:14:55 +00:00
Ryan Dick
a8bef59699 First pass at making custom layer patches work with weights streamed from the CPU to the GPU. 2024-12-29 17:01:37 +00:00
Ryan Dick
6d49ee839c Switch the LayerPatcher to use 'custom modules' to manage layer patching. 2024-12-29 01:18:30 +00:00
Ryan Dick
918f541af8 Add unit test for a SetParameterLayer patch applied to a CustomFluxRMSNorm layer. 2024-12-28 20:44:48 +00:00
Ryan Dick
93e76b61d6 Add CustomFluxRMSNorm layer. 2024-12-28 20:33:38 +00:00
Ryan Dick
f2981979f9 Get custom layer patches working with all quantized linear layer types. 2024-12-27 22:00:22 +00:00
Ryan Dick
ef970a1cdc Add support for FluxControlLoRALayer in CustomLinear layers and add a unit test for it. 2024-12-27 21:00:47 +00:00
Ryan Dick
5ee7405f97 Add more unit tests for custom module LoRA patching: multiple LoRAs and ConcatenatedLoRALayers. 2024-12-27 19:47:21 +00:00
Ryan Dick
e24e386a27 Add support for patches to CustomModuleMixin and add a single unit test (more to come). 2024-12-27 18:57:13 +00:00
Ryan Dick
b06d61e3c0 Improve custom layer wrap/unwrap logic. 2024-12-27 16:29:48 +00:00
Ryan Dick
7d6ab0ceb2 Add a CustomModuleMixin class with a flag for enabling/disabling autocasting (since it incurs some runtime speed overhead.) 2024-12-26 20:08:30 +00:00
Ryan Dick
9692a36dd6 Use a fixture to parameterize tests in test_all_custom_modules.py so that a fresh instance of the layer under test is initialized for each test. 2024-12-26 19:41:25 +00:00
Ryan Dick
b0b699a01f Add unit test to test that isinstance(...) behaves as expected with custom module types. 2024-12-26 18:45:56 +00:00
Ryan Dick
a8b2c4c3d2 Add inference tests for all custom module types (i.e. to test autocasting from cpu to device). 2024-12-26 18:33:46 +00:00
Ryan Dick
03944191db Split test_autocast_modules.py into separate test files to mirror the source file structure. 2024-12-24 22:29:11 +00:00
Ryan Dick
987c9ae076 Move custom autocast modules to separate files in a custom_modules/ directory. 2024-12-24 22:21:31 +00:00
Ryan Dick
0fc538734b Skip flaky test when running on Github Actions, and further reduce peak unit test memory. 2024-12-24 14:32:11 +00:00
Ryan Dick
7214d4969b Workaround a weird quirk of QuantState.to() and add a unit test to exercise it. 2024-12-24 14:32:11 +00:00
Ryan Dick
a83a999b79 Reduce peak memory used for unit tests. 2024-12-24 14:32:11 +00:00
Ryan Dick
f8a6accf8a Fix bitsandbytes imports to avoid ImportErrors on MacOS. 2024-12-24 14:32:11 +00:00
Ryan Dick
f8ab414f99 Add CachedModelOnlyFullLoad to mirror the CachedModelWithPartialLoad for models that cannot or should not be partially loaded. 2024-12-24 14:32:11 +00:00
Ryan Dick
c6795a1b47 Make CachedModelWithPartialLoad work with models that have non-persistent buffers. 2024-12-24 14:32:11 +00:00
Ryan Dick
0a8fc74ae9 Add CachedModelWithPartialLoad to manage partially-loaded models using the new autocast modules. 2024-12-24 14:32:11 +00:00
Ryan Dick
dc54e8763b Add CustomInvokeLinearNF4 to enable CPU -> GPU streaming for InvokeLinearNF4 layers. 2024-12-24 14:32:11 +00:00
Ryan Dick
1b56020876 Add CustomInvokeLinear8bitLt layer for device streaming with InvokeLinear8bitLt layers. 2024-12-24 14:32:11 +00:00
Ryan Dick
97d56f7dc9 Add torch module autocast unit test for GGUF-quantized models. 2024-12-24 14:32:11 +00:00
Ryan Dick
fe0ef2c27c Add torch module autocast utilities. 2024-12-24 14:32:11 +00:00
Ryan Dick
d30a9ced38 Rename model_cache_default.py -> model_cache.py. 2024-12-24 14:23:18 +00:00
Ryan Dick
e0bfa6157b Remove ModelCacheBase. 2024-12-24 14:23:18 +00:00
Ryan Dick
fef26a5f2f Consolidate all LoRA patching logic in the LoRAPatcher. 2024-09-15 04:39:56 +03:00
Ryan Dick
92b8477299 Fixup FLUX LoRA unit tests. 2024-09-15 04:39:56 +03:00
Ryan Dick
8518ae9ccb Remove unused LoRAModelRaw.name attribute. 2024-09-15 04:39:56 +03:00
Ryan Dick
ade75b4748 Get convert_flux_kohya_state_dict_to_invoke_format(...) working, with unit tests. 2024-09-15 04:39:56 +03:00
Ryan Dick
7a80d9ebe2 Add state_dict keys for two FLUX LoRA formats to be used in unit tests. 2024-09-15 04:39:56 +03:00
Ryan Dick
2b3e4e123d Split LoRA layer implementations into separate files. 2024-09-12 15:53:30 +00:00
Ryan Dick
0b77511271 Update HF download logic to work for black-forest-labs/FLUX.1-schnell. 2024-08-26 20:17:50 -04:00
Ryan Dick
69af099532 Warn on invalid model configs in the DB rather than crashing. 2024-07-11 21:05:55 -04:00
Lincoln Stein
3e0fb45dd7 Load single-file checkpoints directly without conversion (#6510)
* use model_class.load_singlefile() instead of converting; works, but performance is poor

* adjust the convert api - not right just yet

* working, needs sql migrator update

* rename migration_11 before conflict merge with main

* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* Update invokeai/backend/model_manager/load/model_loaders/stable_diffusion.py

Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>

* implement lightweight version-by-version config migration

* simplified config schema migration code

* associate sdxl config with sdxl VAEs

* remove use of original_config_file in load_single_file()

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: Ryan Dick <ryanjdick3@gmail.com>
2024-06-27 17:31:28 -04:00
Lincoln Stein
dc134935c8 replace load_and_cache_model() with load_remote_model() and load_local_odel() 2024-06-07 14:12:16 +10:00
Lincoln Stein
9f9379682e ruff fixes 2024-06-07 13:54:41 +10:00
Lincoln Stein
f81b8bc9f6 add support for generic loading of diffusers directories 2024-06-07 13:54:30 +10:00
Lincoln Stein
34e1eb19f9 merge with main and resolve conflicts 2024-05-27 22:20:34 -04:00
psychedelicious
a876675448 tests: update tests to use new events 2024-05-27 09:06:02 +10:00