David Burnett | 6c0bd7d150 | fix import ordering, remove code I reverted that the resync added back | 2025-05-19 11:16:23 +10:00
David Burnett | 8abcc99ced | add check for state_dict, required to load TI's | 2025-05-19 11:16:23 +10:00
David Burnett | 73ab4b8895 | fix offload device | 2025-05-19 11:16:23 +10:00
David Burnett | 86719f2065 | revert to overload due to failing tests, use Torch futures instead | 2025-05-19 11:16:23 +10:00
Ryan Dick | da589b3f1f | Memory optimization to load state dicts one module at a time in CachedModelWithPartialLoad when we are not storing a CPU copy of the state dict (i.e. when keep_ram_copy_of_weights=False). | 2025-01-16 17:00:33 +00:00
Ryan Dick | c76d08d1fd | Add keep_ram_copy option to CachedModelOnlyFullLoad. | 2025-01-16 15:08:23 +00:00
Ryan Dick | 04087c38ce | Add keep_ram_copy option to CachedModelWithPartialLoad. | 2025-01-16 14:51:44 +00:00
Ryan Dick | d7ab464176 | Offload the current model when locking if it is already partially loaded and we have insufficient VRAM. | 2025-01-07 02:53:44 +00:00
Ryan Dick | 1b7bb70bde | Improve handling of cases when application code modifies the size of a model after registering it with the model cache. | 2025-01-07 00:31:00 +00:00
Ryan Dick | 7127040c3a | Remove unused function set_nested_attr(...). | 2025-01-07 00:31:00 +00:00
Ryan Dick | 6d49ee839c | Switch the LayerPatcher to use 'custom modules' to manage layer patching. | 2024-12-29 01:18:30 +00:00
Ryan Dick | f8ab414f99 | Add CachedModelOnlyFullLoad to mirror the CachedModelWithPartialLoad for models that cannot or should not be partially loaded. | 2024-12-24 14:32:11 +00:00
Ryan Dick | c6795a1b47 | Make CachedModelWithPartialLoad work with models that have non-persistent buffers. | 2024-12-24 14:32:11 +00:00
Ryan Dick | 0a8fc74ae9 | Add CachedModelWithPartialLoad to manage partially-loaded models using the new autocast modules. | 2024-12-24 14:32:11 +00:00