Ryan Dick
a8b2c4c3d2
Add inference tests for all custom module types (i.e. to test autocasting from cpu to device).
2024-12-26 18:33:46 +00:00
Ryan Dick
03944191db
Split test_autocast_modules.py into separate test files to mirror the source file structure.
2024-12-24 22:29:11 +00:00
Ryan Dick
987c9ae076
Move custom autocast modules to separate files in a custom_modules/ directory.
2024-12-24 22:21:31 +00:00
Ryan Dick
6d7314ac0a
Consolidate the LayerPatching patching modes into a single implementation.
2024-12-24 15:57:54 +00:00
Ryan Dick
80db9537ff
Rename model_patcher.py -> layer_patcher.py.
2024-12-24 15:57:54 +00:00
Ryan Dick
6f926f05b0
Update apply_smart_model_patches() so that layer restore matches the behavior of non-smart mode.
2024-12-24 15:57:54 +00:00
Ryan Dick
61253b91f1
Enable LoRAPatcher.apply_smart_lora_patches(...) throughout the stack.
2024-12-24 15:57:54 +00:00
Ryan Dick
0148512038
(minor) Rename num_layers -> num_loras in unit tests.
2024-12-24 15:57:54 +00:00
Ryan Dick
d0f35fceed
Add test_apply_smart_lora_patches_to_partially_loaded_model(...).
2024-12-24 15:57:54 +00:00
Ryan Dick
cefcb340d9
Add LoRAPatcher.smart_apply_lora_patches()
2024-12-24 15:57:54 +00:00
Ryan Dick
0fc538734b
Skip flaky test when running on GitHub Actions, and further reduce peak unit test memory.
2024-12-24 14:32:11 +00:00
Ryan Dick
7214d4969b
Work around a weird quirk of QuantState.to() and add a unit test to exercise it.
2024-12-24 14:32:11 +00:00
Ryan Dick
a83a999b79
Reduce peak memory used for unit tests.
2024-12-24 14:32:11 +00:00
Ryan Dick
f8a6accf8a
Fix bitsandbytes imports to avoid ImportErrors on macOS.
2024-12-24 14:32:11 +00:00
Ryan Dick
f8ab414f99
Add CachedModelOnlyFullLoad to mirror the CachedModelWithPartialLoad for models that cannot or should not be partially loaded.
2024-12-24 14:32:11 +00:00
Ryan Dick
c6795a1b47
Make CachedModelWithPartialLoad work with models that have non-persistent buffers.
2024-12-24 14:32:11 +00:00
Ryan Dick
0a8fc74ae9
Add CachedModelWithPartialLoad to manage partially-loaded models using the new autocast modules.
2024-12-24 14:32:11 +00:00
Ryan Dick
dc54e8763b
Add CustomInvokeLinearNF4 to enable CPU -> GPU streaming for InvokeLinearNF4 layers.
2024-12-24 14:32:11 +00:00
Ryan Dick
1b56020876
Add CustomInvokeLinear8bitLt layer for device streaming with InvokeLinear8bitLt layers.
2024-12-24 14:32:11 +00:00
Ryan Dick
3f990393a1
Simplify the state management in InvokeLinear8bitLt and add unit tests. This is in preparation for wrapping it to support streaming of weights from cpu to gpu.
2024-12-24 14:32:11 +00:00
Ryan Dick
97d56f7dc9
Add torch module autocast unit test for GGUF-quantized models.
2024-12-24 14:32:11 +00:00
Ryan Dick
fe0ef2c27c
Add torch module autocast utilities.
2024-12-24 14:32:11 +00:00
Ryan Dick
65fcbf5f60
Bump bitsandbytes. The new version contains improvements to state_dict loading/saving for LLM.int8 and promises improved speed on some HW.
2024-12-24 14:32:11 +00:00
Ryan Dick
d3916dbdb6
Partial Loading PR1: Tidy ModelCache (#7492)
## Summary
This PR tidies up the model cache code in preparation for further
refactoring to support partial loading of models onto the GPU. **These
code changes should not change the functional behavior in any way.**
Changes:
- Remove the `ModelCacheBase` class. `ModelCache` is the only
implementation, so there is no benefit to the separate abstract class.
- Split `CacheRecord` and `CacheStats` out into their own files.
- Remove the `ModelLocker` class. This extra layer of indirection was
not providing any benefit. Locking is now done directly with the
`ModelCache`.
- Tidy up relative imports that were contributing to circular import
issues.
- Pull the 'submodel' concern out of the `ModelCache`. The `ModelCache`
should not need to be aware of the model manager submodel system.
- Delete unused properties from the `ModelCache` (e.g.
`.lazy_offloading`, `.storage_device`, etc.)
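
As a rough illustration of the last few bullets (not the actual InvokeAI code), a cache that owns locking directly and leaves key construction to a free function might look like the sketch below. `TinyModelCache`, `make_cache_key`, the fields of this `CacheRecord`, and all signatures are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


def make_cache_key(model_key: str, submodel_type: Optional[str] = None) -> str:
    """Hypothetical helper: build the cache key outside the cache class,
    so the cache itself never needs to know about submodels."""
    return f"{model_key}:{submodel_type}" if submodel_type else model_key


@dataclass
class CacheRecord:
    """Hypothetical record: one cached model plus its lock count."""

    key: str
    model: Any
    locks: int = 0


class TinyModelCache:
    """Hypothetical cache that handles locking itself (no separate locker class)."""

    def __init__(self) -> None:
        self._records: Dict[str, CacheRecord] = {}

    def put(self, key: str, model: Any) -> None:
        # Store the model under an opaque key; the cache does not parse it.
        self._records[key] = CacheRecord(key=key, model=model)

    def lock(self, key: str) -> Any:
        # Mark the model as in use and hand it back to the caller.
        record = self._records[key]
        record.locks += 1
        return record.model

    def unlock(self, key: str) -> None:
        # Release one lock; refuse to go below zero.
        record = self._records[key]
        if record.locks == 0:
            raise RuntimeError(f"Model '{key}' is not locked.")
        record.locks -= 1

    def is_locked(self, key: str) -> bool:
        return self._records[key].locks > 0
```

In this sketch a caller would compute `make_cache_key("some-model", "unet")` first and pass only the resulting string to the cache, which keeps the submodel concern out of the cache class.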
## QA Instructions
I ran smoke tests with a variety of SD1, SDXL and FLUX models. No change
to behavior is expected.
## Merge Plan
## Checklist
- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2024-12-24 09:30:44 -05:00
Ryan Dick
55b13c1da3
(minor) Add TODO comment regarding the location of get_model_cache_key().
2024-12-24 14:23:19 +00:00
Ryan Dick
7dc3e0fdbe
Get rid of ModelLocker. It was an unnecessary layer of indirection.
2024-12-24 14:23:18 +00:00
Ryan Dick
a39bcf7e85
Move lock(...) and unlock(...) logic from ModelLocker to the ModelCache and make a bunch of ModelCache properties/methods private.
2024-12-24 14:23:18 +00:00
Ryan Dick
a7c72992a6
Pull get_model_cache_key(...) out of ModelCache. The ModelCache should not be concerned with implementation details like the submodel_type.
2024-12-24 14:23:18 +00:00
Ryan Dick
d30a9ced38
Rename model_cache_default.py -> model_cache.py.
2024-12-24 14:23:18 +00:00
Ryan Dick
e0bfa6157b
Remove ModelCacheBase.
2024-12-24 14:23:18 +00:00
Ryan Dick
83ea6420e2
Move CacheStats to its own file.
2024-12-24 14:23:18 +00:00
Ryan Dick
ce11a1952e
Move CacheRecord out to its own file.
2024-12-24 14:23:18 +00:00
Ryan Dick
e48dee4c4a
Rip out ModelLockerBase.
2024-12-24 14:23:18 +00:00
Simon Fuhrmann
712674b6dd
Add Stereogram Nodes to communityNodes.md
2024-12-23 13:51:53 -05:00
psychedelicious
de0043f443
docs: update download links for launcher
2024-12-23 13:23:14 +11:00
Riku
d21506da6f
feat(ci): add typegen check workflow
2024-12-22 06:05:17 +11:00
psychedelicious
a49894901a
docs: fix installation docs home again
2024-12-20 17:35:50 +11:00
psychedelicious
e7e26c8a93
docs: fix installation docs home
2024-12-20 17:12:44 +11:00
psychedelicious
9adcd2cc31
docs: update install-related docs
2024-12-20 17:01:34 +11:00
Kent Keirsey
f9edd009f5
Update README.md
2024-12-20 17:01:34 +11:00
Kent Keirsey
91a4160e36
Update Installation Docs
2024-12-20 17:01:34 +11:00
Kent Keirsey
9c9cec1b43
Update README.md
2024-12-20 17:01:34 +11:00
psychedelicious
948ecf9333
chore: bump version to v5.5.0
v5.5.0
2024-12-20 16:17:23 +11:00
psychedelicious
1038f7bcab
Update invokeai_version.py
v5.5.0rc1
2024-12-20 10:17:09 +11:00
Riccardo Giovanetti
c7d9e2d62a
translationBot(ui): update translation (Italian)
Currently translated at 99.3% (1635 of 1645 strings)
translationBot(ui): update translation (Italian)
Currently translated at 99.3% (1634 of 1645 strings)
Co-authored-by: Riccardo Giovanetti <riccardo.giovanetti@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/it/
Translation: InvokeAI/Web UI
2024-12-20 10:07:15 +11:00
Riku
11c3a2e15d
translationBot(ui): update translation (German)
Currently translated at 70.8% (1165 of 1645 strings)
Co-authored-by: Riku <riku.block@gmail.com>
Translate-URL: https://hosted.weblate.org/projects/invokeai/web-ui/de/
Translation: InvokeAI/Web UI
2024-12-20 10:07:15 +11:00
psychedelicious
9e3ca383ec
fix(ui): add missing model config to AnyModelConfig union type
2024-12-20 09:45:04 +11:00
Riku
bda83c2634
chore(ui): update typegen schema
2024-12-20 09:45:04 +11:00
Riku
525cb38c71
fix(app): fixed InputField default values
2024-12-20 09:30:56 +11:00
psychedelicious
a9a6720bad
feat(app): change queue item execution log from debug to info
This provides useful context for subsequent logs during queue item execution.
2024-12-20 09:19:04 +11:00