Commit Graph

227 Commits

Author | SHA1 | Message | Date
Kent Keirsey
af58a75e97 Support PEFT Loras with Base_Model.model prefix (#8433)
* Support PEFT Loras with Base_Model.model prefix

* update tests

* ruff

* fix python complaints

* update keys

* format keys

* remove unneeded test
2025-08-18 09:14:46 -04:00
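
The commit above adds handling for PEFT-exported LoRAs whose state-dict keys carry a `base_model.model.` prefix. A minimal sketch of the general idea, stripping that wrapper prefix before the keys are matched against the base model (the function name and key layout are illustrative, not the repository's actual implementation):

```python
from typing import Dict

import torch

# Illustrative only: PEFT-exported LoRAs commonly prefix every key with
# "base_model.model.". Removing that prefix lets the remaining keys be
# matched against the base model's own module names.
_PEFT_PREFIX = "base_model.model."


def strip_peft_prefix(state_dict: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """Return a copy of state_dict with the PEFT wrapper prefix removed."""
    return {
        (k[len(_PEFT_PREFIX):] if k.startswith(_PEFT_PREFIX) else k): v
        for k, v in state_dict.items()
    }
```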
psychedelicious
a8a07598c8 chore: ruff 2025-08-18 21:14:00 +10:00
psychedelicious
23206e22e8 tests: skip excessively flaky MPS-specific tests in CI 2025-08-18 21:14:00 +10:00
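A hedged sketch of how an MPS-specific test can be skipped in CI with pytest; the `CI` environment variable check follows the common GitHub Actions convention, and the exact markers used in the repository may differ:

```python
import os

import pytest
import torch


@pytest.mark.skipif(
    not torch.backends.mps.is_available(),
    reason="requires an MPS (Apple Silicon) device",
)
@pytest.mark.skipif(
    os.environ.get("CI") == "true",
    reason="excessively flaky on MPS runners in CI",
)
def test_mps_example():
    # Placeholder body; the real tests exercise MPS-specific behaviour.
    x = torch.ones(2, device="mps")
    assert x.sum().item() == 2.0
```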
Heathen711
8cef0f5bf5 Update supported cuda slot input. 2025-06-16 19:33:19 +10:00
Kevin Turner
50cf285efb fix: group aitoolkit lora layers 2025-06-16 19:08:11 +10:00
Kevin Turner
a214f4fff5 fix: group aitoolkit lora layers 2025-06-16 19:08:11 +10:00
Kevin Turner
2981591c36 test: add some aitoolkit lora tests 2025-06-16 19:08:11 +10:00
Kevin Turner
52a8ad1c18 chore: rename model.size to model.file_size
to disambiguate from RAM size or pixel size
2025-04-10 09:53:03 +10:00
Kevin Turner
98260a8efc test: add size field to test model configs 2025-04-10 09:53:03 +10:00
psychedelicious
aaa6211625 chore(backend): ruff C420 2025-03-28 18:28:32 -04:00
Billy
182580ff69 Imports 2025-03-26 12:55:10 +11:00
Billy
8e9d5c1187 Ruff formatting 2025-03-26 12:30:31 +11:00
Billy
99aac5870e Remove star imports 2025-03-26 12:27:00 +11:00
Ryan Dick
f1fde792ee Get FLUX Redux working: model loading and inference. 2025-03-06 10:31:17 +11:00
Ryan Dick
5357d6e08e Rename ConcatenatedLoRALayer to MergedLayerPatch. And other minor cleanup. 2025-01-28 14:51:35 +00:00
Ryan Dick
28514ba59a Update ConcatenatedLoRALayer to work with all sub-layer types. 2025-01-28 14:51:35 +00:00
Ryan Dick
206f261e45 Add utils for loading FLUX OneTrainer DoRA models. 2025-01-28 14:51:35 +00:00
Ryan Dick
dfa253e75b Add utils for working with Kohya LoRA keys. 2025-01-28 14:51:35 +00:00
Ryan Dick
faa4fa02c0 Expand unit tests to test for confusion between FLUX LoRA formats. 2025-01-28 14:51:35 +00:00
Ryan Dick
5bd6428fdd Add is_state_dict_likely_in_flux_onetrainer_format() util function. 2025-01-28 14:51:35 +00:00
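The format-detection utility named above is a heuristic over state-dict key names. A purely illustrative sketch of that style of check; the real prefixes and suffixes for the OneTrainer FLUX format live in the utility added by the commit, and the patterns below are placeholders:

```python
import re
from typing import Dict

import torch

# Placeholder pattern: a format check of this kind asserts that the keys
# follow the naming scheme a particular trainer emits.
_EXAMPLE_KEY_RE = re.compile(r"^lora_transformer_.*\.(lora_up|lora_down|alpha).*$")


def is_state_dict_likely_in_example_format(state_dict: Dict[str, torch.Tensor]) -> bool:
    """Return True if every key looks like it came from the example trainer."""
    if not state_dict:
        return False
    return all(_EXAMPLE_KEY_RE.match(k) for k in state_dict)
```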
Ryan Dick
8b4f411f7b Add a test state dict for the OneTrainer DoRA format. 2025-01-28 14:51:35 +00:00
Ryan Dick
e2f05d0800 Add unit tests for LoKR patch layers. The new tests trigger a bug when LoKR layers are applied to BnB-quantized layers (also impacts several other LoRA variant types). 2025-01-22 09:20:40 +11:00
Ryan Dick
36a3869af0 Add keep_ram_copy_of_weights config option. 2025-01-16 15:35:25 +00:00
Ryan Dick
c76d08d1fd Add keep_ram_copy option to CachedModelOnlyFullLoad. 2025-01-16 15:08:23 +00:00
Ryan Dick
04087c38ce Add keep_ram_copy option to CachedModelWithPartialLoad. 2025-01-16 14:51:44 +00:00
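The three `keep_ram_copy` commits above add an option to retain a CPU copy of the weights after a model is loaded onto the GPU, trading RAM for faster offload. A minimal sketch of the trade-off (class and method names are illustrative, not the cache's real API):

```python
import torch


class ExampleCachedModel:
    """Illustrative only: keep an optional CPU copy so offload is a cheap swap."""

    def __init__(self, model: torch.nn.Module, keep_ram_copy: bool = True):
        self._model = model
        # With keep_ram_copy=True the original CPU tensors stay alive, so
        # moving back off the GPU never has to copy VRAM -> RAM.
        self._cpu_state = (
            {k: v.clone() for k, v in model.state_dict().items()} if keep_ram_copy else None
        )

    def load_to(self, device: torch.device) -> None:
        self._model.to(device)

    def offload(self) -> None:
        if self._cpu_state is not None:
            # Reattach the retained CPU tensors directly (assign= requires PyTorch >= 2.1);
            # no VRAM -> RAM transfer is needed.
            self._model.load_state_dict(self._cpu_state, assign=True)
        else:
            # Without a RAM copy, every tensor must be copied back from the GPU.
            self._model.to("cpu")
```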
Ryan Dick
974b4671b1 Deprecate the ram and vram configs to make the migration to dynamic
memory limits smoother for users who had previously overridden these
values.
2025-01-07 16:45:29 +00:00
Ryan Dick
d7ab464176 Offload the current model when locking if it is already partially loaded and we have insufficient VRAM. 2025-01-07 02:53:44 +00:00
Ryan Dick
5eafe1ec7a Fix ModelCache execution device selection in unit tests. 2025-01-07 01:20:15 +00:00
Ryan Dick
a167632f09 Calculate model cache size limits dynamically based on the available RAM / VRAM. 2025-01-07 01:14:20 +00:00
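A sketch of the kind of dynamic sizing the commit above describes, deriving cache limits from the memory the machine actually has; the fractions are made-up placeholders, not the values the commit chose:

```python
import psutil
import torch


def example_cache_limits() -> tuple[int, int]:
    """Return illustrative (ram_limit_bytes, vram_limit_bytes) based on available memory."""
    # Leave headroom for the OS and other processes; 0.5 is an arbitrary placeholder.
    ram_limit = int(psutil.virtual_memory().total * 0.5)

    vram_limit = 0
    if torch.cuda.is_available():
        _, total_bytes = torch.cuda.mem_get_info()
        # Again, 0.9 is a placeholder fraction, not the project's actual policy.
        vram_limit = int(total_bytes * 0.9)

    return ram_limit, vram_limit
```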
Ryan Dick
402dd840a1 Add seed to flaky unit test. 2025-01-07 00:31:00 +00:00
Ryan Dick
d0bfa019be Add 'enable_partial_loading' config flag. 2025-01-07 00:31:00 +00:00
Ryan Dick
535e45cedf First pass at adding partial loading support to the ModelCache. 2025-01-07 00:30:58 +00:00
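Partial loading, as introduced above, means a model no longer has to fit entirely in VRAM: modules are moved to the GPU until a byte budget is exhausted and the remainder stays in RAM. A simplified sketch of that idea, not the ModelCache's real algorithm:

```python
import torch


def partially_load(model: torch.nn.Module, device: torch.device, vram_budget_bytes: int) -> int:
    """Move leaf modules to `device` until the budget is spent; return bytes moved."""
    used = 0
    for module in model.modules():
        # Only consider leaf modules that own parameters directly.
        if next(module.children(), None) is not None:
            continue
        size = sum(p.numel() * p.element_size() for p in module.parameters(recurse=False))
        if size == 0 or used + size > vram_budget_bytes:
            continue
        module.to(device)
        used += size
    return used
```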
Ryan Dick
9a0a226ce1 Fix bitsandbytes imports in unit tests on MacOS. 2024-12-30 10:41:48 -05:00
Ryan Dick
6fd9b0a274 Delete old sidecar wrapper implementation. This functionality has moved into the custom layers. 2024-12-29 17:33:08 +00:00
Ryan Dick
52fc5a64d4 Add a unit test for a LoRA patch applied to a quantized linear layer with weights streamed from CPU to GPU. 2024-12-29 17:14:55 +00:00
Ryan Dick
a8bef59699 First pass at making custom layer patches work with weights streamed from the CPU to the GPU. 2024-12-29 17:01:37 +00:00
Ryan Dick
6d49ee839c Switch the LayerPatcher to use 'custom modules' to manage layer patching. 2024-12-29 01:18:30 +00:00
Ryan Dick
2855bb6b41 Update BaseLayerPatch.get_parameters(...) to accept a dict of orig_parameters rather than orig_module. This will enable compatibility between patching and cpu->gpu streaming. 2024-12-28 21:12:53 +00:00
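The signature change described above means a patch computes its parameter deltas from a plain dict of original tensors rather than from the live module, which is what makes patching compatible with weights streamed from CPU to GPU. A toy LoRA-style patch written against that shape of interface (names and details are illustrative):

```python
from typing import Dict

import torch


class ExampleLoRAPatch:
    """Illustrative patch: produces weight deltas from original parameters only."""

    def __init__(self, up: torch.Tensor, down: torch.Tensor, scale: float = 1.0):
        self.up = up      # (out_features, rank)
        self.down = down  # (rank, in_features)
        self.scale = scale

    def get_parameters(
        self, orig_parameters: Dict[str, torch.Tensor], weight: float
    ) -> Dict[str, torch.Tensor]:
        # The delta is computed on whatever device/dtype the original weight
        # currently lives on, so streamed (partially loaded) weights still work.
        orig_weight = orig_parameters["weight"]
        delta = (self.up @ self.down).to(device=orig_weight.device, dtype=orig_weight.dtype)
        return {"weight": weight * self.scale * delta}
```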
Ryan Dick
918f541af8 Add unit test for a SetParameterLayer patch applied to a CustomFluxRMSNorm layer. 2024-12-28 20:44:48 +00:00
Ryan Dick
93e76b61d6 Add CustomFluxRMSNorm layer. 2024-12-28 20:33:38 +00:00
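RMSNorm itself is simple: normalize by the root mean square over the last dimension, then apply a learned scale. A reference-style sketch of the formula, not the CustomFluxRMSNorm implementation:

```python
import torch


def rms_norm(x: torch.Tensor, scale: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """y = x / sqrt(mean(x^2) + eps) * scale, computed over the last dimension."""
    rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x * rms * scale
```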
Ryan Dick
f2981979f9 Get custom layer patches working with all quantized linear layer types. 2024-12-27 22:00:22 +00:00
Ryan Dick
ef970a1cdc Add support for FluxControlLoRALayer in CustomLinear layers and add a unit test for it. 2024-12-27 21:00:47 +00:00
Ryan Dick
5ee7405f97 Add more unit tests for custom module LoRA patching: multiple LoRAs and ConcatenatedLoRALayers. 2024-12-27 19:47:21 +00:00
Ryan Dick
e24e386a27 Add support for patches to CustomModuleMixin and add a single unit test (more to come). 2024-12-27 18:57:13 +00:00
Ryan Dick
b06d61e3c0 Improve custom layer wrap/unwrap logic. 2024-12-27 16:29:48 +00:00
Ryan Dick
7d6ab0ceb2 Add a CustomModuleMixin class with a flag for enabling/disabling autocasting (since it incurs some runtime speed overhead.) 2024-12-26 20:08:30 +00:00
Ryan Dick
9692a36dd6 Use a fixture to parameterize tests in test_all_custom_modules.py so that a fresh instance of the layer under test is initialized for each test. 2024-12-26 19:41:25 +00:00
Ryan Dick
b0b699a01f Add unit test to test that isinstance(...) behaves as expected with custom module types. 2024-12-26 18:45:56 +00:00
Ryan Dick
a8b2c4c3d2 Add inference tests for all custom module types (i.e. to test autocasting from cpu to device). 2024-12-26 18:33:46 +00:00
Ryan Dick
03944191db Split test_autocast_modules.py into separate test files to mirror the source file structure. 2024-12-24 22:29:11 +00:00