InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI.git synced 2026-04-23 03:00:31 -04:00

Author	SHA1	Message	Date
Lincoln Stein	8cf4c6944a	(style) ruff fix	2026-01-03 14:54:15 -05:00
Lincoln Stein	db228ddc4f	(style) add @record_activity and @synchronized to locked methods	2026-01-03 14:52:31 -05:00
copilot-swe-agent[bot]	4987b4da1c	Fix timeout message appearing during active generation Only log "Clearing model cache" message when there are actually unlocked models to clear. This prevents the misleading message from appearing during active generation when all models are locked. Changes: - Check for unlocked models before logging clear message - Add count of unlocked models in log message - Add debug log when all models are locked - Improves user experience by avoiding confusing messages Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 05:31:11 +00:00
copilot-swe-agent[bot]	8d76b4e4d4	Fix ruff whitespace errors and improve timeout logging - Remove all trailing whitespace (W293 errors) - Add debug logging when timeout fires but activity detected - Add debug logging when timeout fires but cache is empty - Only log "Clearing model cache" message when actually clearing - Prevents misleading timeout messages during active generation Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 04:05:57 +00:00
copilot-swe-agent[bot]	c3217d8a08	Address code review feedback - Remove unused variable in test - Add clarifying comment for daemon thread setting - Add detailed comment explaining cache clearing with 1000 GB value - Improve code documentation Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 00:27:39 +00:00
copilot-swe-agent[bot]	2500153ed8	Fix race condition in timeout mechanism - Added clarifying comment that _record_activity is called with lock held - Enhanced double-check in _on_timeout for thread safety - Added lock protection to shutdown method - Improved handling of edge cases where timer fires during activity Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 00:26:01 +00:00
copilot-swe-agent[bot]	9bbd2b3f11	Add model_cache_keep_alive config option and timeout mechanism - Added model_cache_keep_alive config field (minutes, default 0 = infinite) - Implemented timeout tracking in ModelCache class - Added _record_activity() to track model usage - Added _on_timeout() to auto-clear cache when timeout expires - Added shutdown() method to clean up timers - Integrated timeout with get(), lock(), unlock(), and put() operations - Updated ModelManagerService to pass keep_alive parameter - Added cleanup in stop() method Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 00:22:59 +00:00
psychedelicious	5f12b9185f	feat(mm): add cache_snapshot to model cache clear callback	2025-05-15 16:06:47 +10:00
psychedelicious	d958d2e5a0	feat(mm): iterate on cache callbacks API	2025-05-15 14:37:22 +10:00
psychedelicious	823ca214e6	feat(mm): iterate on cache callbacks API	2025-05-15 13:28:51 +10:00
psychedelicious	a33da450fd	feat(mm): support cache callbacks	2025-05-15 11:23:58 +10:00
Billy	182580ff69	Imports	2025-03-26 12:55:10 +11:00
Billy	f2689598c0	Formatting	2025-03-06 09:11:00 +11:00
Ryan Dick	cc9d215a9b	Add endpoint for emptying the model cache. Also, adds a threading lock to the ModelCache to make it thread-safe.	2025-01-30 09:18:28 -05:00
Ryan Dick	f7315f0432	Make the default max RAM cache size more conservative.	2025-01-30 08:46:59 -05:00
Ryan Dick	0cf51cefe8	Revise the logic for calculating the RAM model cache limit.	2025-01-16 23:46:07 +00:00
Ryan Dick	36a3869af0	Add keep_ram_copy_of_weights config option.	2025-01-16 15:35:25 +00:00
Ryan Dick	04087c38ce	Add keep_ram_copy option to CachedModelWithPartialLoad.	2025-01-16 14:51:44 +00:00
Ryan Dick	d7ab464176	Offload the current model when locking if it is already partially loaded and we have insufficient VRAM.	2025-01-07 02:53:44 +00:00
Ryan Dick	b343f81644	Use torch.cuda.memory_allocated() rather than torch.cuda.memory_reserved() to be more conservative in setting dynamic VRAM cache limits.	2025-01-07 01:20:15 +00:00
Ryan Dick	fc4a22fe78	Allow expensive operations to request more working memory.	2025-01-07 01:20:13 +00:00
Ryan Dick	a167632f09	Calculate model cache size limits dynamically based on the available RAM / VRAM.	2025-01-07 01:14:20 +00:00
Ryan Dick	6a9de1fcf3	Change definition of VRAM in use for the ModelCache from sum of model weights to the total torch.cuda.memory_allocated().	2025-01-07 00:31:53 +00:00
Ryan Dick	ceb2498a67	Add log prefix to model cache logs.	2025-01-07 00:31:00 +00:00
Ryan Dick	d0bfa019be	Add 'enable_partial_loading' config flag.	2025-01-07 00:31:00 +00:00
Ryan Dick	535e45cedf	First pass at adding partial loading support to the ModelCache.	2025-01-07 00:30:58 +00:00
Ryan Dick	c579a218ef	Allow models to be locked in VRAM, even if they have been dropped from the RAM cache (related: https://github.com/invoke-ai/InvokeAI/issues/7513 ).	2025-01-06 23:02:52 +00:00
Ryan Dick	6d49ee839c	Switch the LayerPatcher to use 'custom modules' to manage layer patching.	2024-12-29 01:18:30 +00:00
Ryan Dick	55b13c1da3	(minor) Add TODO comment regarding the location of get_model_cache_key().	2024-12-24 14:23:19 +00:00
Ryan Dick	7dc3e0fdbe	Get rid of ModelLocker. It was an unnecessary layer of indirection.	2024-12-24 14:23:18 +00:00
Ryan Dick	a39bcf7e85	Move lock(...) and unlock(...) logic from ModelLocker to the ModelCache and make a bunch of ModelCache properties/methods private.	2024-12-24 14:23:18 +00:00
Ryan Dick	a7c72992a6	Pull get_model_cache_key(...) out of ModelCache. The ModelCache should not be concerned with implementation details like the submodel_type.	2024-12-24 14:23:18 +00:00
Ryan Dick	d30a9ced38	Rename model_cache_default.py -> model_cache.py.	2024-12-24 14:23:18 +00:00

33 Commits
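
The keep-alive commits above (9bbd2b3f11 through 4987b4da1c) describe an idle timeout for the model cache: a `model_cache_keep_alive` value in minutes (0 = never expire), activity recorded on `get()`, `lock()`, `unlock()` and `put()`, a double-check inside `_on_timeout()`, and a `shutdown()` that cancels the timer. The following is only a rough sketch of how such a mechanism might be put together, not the code from the repository. The class name `KeepAliveCache`, the `keys()` helper, the internal attributes, and the choice to re-arm the timer for the remaining interval are assumptions made for illustration.

```python
import threading
import time
from typing import Any, Dict, List, Optional, Set


class KeepAliveCache:
    """Hypothetical stand-in for a model cache with an idle timeout."""

    def __init__(self, keep_alive_minutes: float = 0.0) -> None:
        # 0 means "keep models forever", mirroring the commit message.
        self._keep_alive_s = keep_alive_minutes * 60.0
        self._lock = threading.RLock()
        self._models: Dict[str, Any] = {}
        self._locked: Set[str] = set()
        self._last_activity = time.monotonic()
        self._timer: Optional[threading.Timer] = None

    def _arm_timer(self, delay_s: float) -> None:
        # Must be called with self._lock held.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(delay_s, self._on_timeout)
        self._timer.daemon = True  # do not block interpreter exit
        self._timer.start()

    def _record_activity(self) -> None:
        # Must be called with self._lock held.
        self._last_activity = time.monotonic()
        if self._keep_alive_s > 0:
            self._arm_timer(self._keep_alive_s)

    def _on_timeout(self) -> None:
        with self._lock:
            # Double-check: activity may have occurred while the timer fired.
            remaining = self._keep_alive_s - (time.monotonic() - self._last_activity)
            if remaining > 0:
                self._arm_timer(remaining)
                return
            # Only unlocked models are eligible for eviction.
            for key in [k for k in self._models if k not in self._locked]:
                del self._models[key]

    def put(self, key: str, model: Any) -> None:
        with self._lock:
            self._models[key] = model
            self._record_activity()

    def get(self, key: str) -> Any:
        with self._lock:
            self._record_activity()
            return self._models[key]

    def lock(self, key: str) -> None:
        with self._lock:
            self._locked.add(key)
            self._record_activity()

    def unlock(self, key: str) -> None:
        with self._lock:
            self._locked.discard(key)
            self._record_activity()

    def keys(self) -> List[str]:
        with self._lock:
            return sorted(self._models)

    def shutdown(self) -> None:
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```

In this sketch, every public operation restarts the countdown, so the cache is cleared only after a full idle interval, and locked entries survive the timeout, which matches the intent stated in commit 4987b4da1c of not clearing models during active generation.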