Author | Commit | Message | Date
Billy | b9972be7f1 | Merge branch 'model-classification-api' into stripped-models | 2025-03-18 14:57:23 +11:00
Billy | e61c5a3f26 | Merge | 2025-03-18 14:55:11 +11:00
Ryan Dick | 9a389e6b93 | Add a LLaVA OneVision starter model. | 2025-03-18 11:53:06 +11:00
Ryan Dick | 2ef1ecf381 | Fix copy-paste errors. | 2025-03-18 11:53:06 +11:00
Ryan Dick | e9714fe476 | Add LLaVA Onevision model loading and inference support. | 2025-03-18 11:53:06 +11:00
Ryan Dick | 3f29293e39 | Add LlavaOnevision model type and probing logic. | 2025-03-18 11:53:06 +11:00
Billy | 3469fc9843 | Ruff | 2025-03-18 09:22:16 +11:00
Billy | 7cdd4187a9 | Update classify script | 2025-03-18 09:21:38 +11:00
Billy | 24218b34bf | Make ruff happy | 2025-03-17 12:04:26 +11:00
Billy | d970c6d6d5 | Use override fixture | 2025-03-17 11:58:13 +11:00
Billy | 8bcd9fe4b7 | Extend ModelOnDisk | 2025-03-17 09:18:51 +11:00
Billy | 4377158503 | Variant | 2025-03-13 13:32:57 +11:00
Billy | d8b9a8d0dd | Merge branch 'main' into model-classification-api | 2025-03-13 13:03:51 +11:00
Billy | 39a4608d15 | Fix annotations compatibility 3.11 | 2025-03-13 13:01:19 +11:00
Billy | b86ac5e049 | Explicit union | 2025-03-13 10:28:07 +11:00
Billy | 665236bb79 | Type hints | 2025-03-13 09:21:58 +11:00
Billy | f45400a275 | Remove hash algo | 2025-03-12 18:39:29 +11:00
psychedelicious | e35537e60a | fix(mm): move flux_redux starter model to the flux bundle, make siglip a dependency of it | 2025-03-11 11:17:19 +11:00
Billy | d86b392bfd | Remove redundant hash_algo field | 2025-03-11 09:16:59 +11:00
Billy | 3e9e45b177 | Update comments | 2025-03-11 09:04:19 +11:00
Billy | 907d960745 | PR suggestions | 2025-03-11 08:37:43 +11:00
Billy | bfdace6437 | New API for model classification | 2025-03-11 08:34:34 +11:00
psychedelicious | cf0cbaf0ae | chore: ruff (more) | 2025-03-06 10:57:54 +11:00
psychedelicious | ac6fc6eccb | chore: ruff | 2025-03-06 10:57:54 +11:00
Ryan Dick | 8e28888bc4 | Fix SigLipPipeline model size calculation. | 2025-03-06 10:31:17 +11:00
Ryan Dick | f1fde792ee | Get FLUX Redux working: model loading and inference. | 2025-03-06 10:31:17 +11:00
Ryan Dick | e82393f7ed | Add FLUX Redux to starter models list. | 2025-03-06 10:31:17 +11:00
Ryan Dick | d5211a8088 | Add FluxRedux model type and probing logic. | 2025-03-06 10:31:17 +11:00
Ryan Dick | 3b095b5945 | Add SigLIP starter model. | 2025-03-06 10:31:17 +11:00
Ryan Dick | 34959ef573 | Add SigLIP model type and probing. | 2025-03-06 10:31:17 +11:00
Billy | f2689598c0 | Formatting | 2025-03-06 09:11:00 +11:00
Ryan Dick | cc9d215a9b | Add endpoint for emptying the model cache. Also, adds a threading lock to the ModelCache to make it thread-safe. | 2025-01-30 09:18:28 -05:00
Ryan Dick | f7315f0432 | Make the default max RAM cache size more conservative. | 2025-01-30 08:46:59 -05:00
Ryan Dick | 229834a5e8 | Performance optimizations for LoRAs applied on top of GGML-quantized tensors. | 2025-01-28 14:51:35 +00:00
Ryan Dick | 5d472ac1b8 | Move quantized weight handling for patch layers up from ConcatenatedLoRALayer to CustomModuleMixin. | 2025-01-28 14:51:35 +00:00
Ryan Dick | 28514ba59a | Update ConcatenatedLoRALayer to work with all sub-layer types. | 2025-01-28 14:51:35 +00:00
Ryan Dick | 0db6639b4b | Add FLUX OneTrainer model probing. | 2025-01-28 14:51:35 +00:00
Ryan Dick | 0cf51cefe8 | Revise the logic for calculating the RAM model cache limit. | 2025-01-16 23:46:07 +00:00
Ryan Dick | da589b3f1f | Memory optimization to load state dicts one module at a time in CachedModelWithPartialLoad when we are not storing a CPU copy of the state dict (i.e. when keep_ram_copy_of_weights=False). | 2025-01-16 17:00:33 +00:00
Ryan Dick | 36a3869af0 | Add keep_ram_copy_of_weights config option. | 2025-01-16 15:35:25 +00:00
Ryan Dick | c76d08d1fd | Add keep_ram_copy option to CachedModelOnlyFullLoad. | 2025-01-16 15:08:23 +00:00
Ryan Dick | 04087c38ce | Add keep_ram_copy option to CachedModelWithPartialLoad. | 2025-01-16 14:51:44 +00:00
Ryan Dick | b2bb359d47 | Update the model loading logic for several of the large FLUX-related models to ensure that the model is initialized on the meta device prior to loading the state dict into it. This helps to keep peak memory down. | 2025-01-16 02:30:28 +00:00
Ryan Dick | d7ab464176 | Offload the current model when locking if it is already partially loaded and we have insufficient VRAM. | 2025-01-07 02:53:44 +00:00
Ryan Dick | 5b42b7bd45 | Add a utility to help with determining the working memory required for expensive operations. | 2025-01-07 01:20:15 +00:00
Ryan Dick | b343f81644 | Use torch.cuda.memory_allocated() rather than torch.cuda.memory_reserved() to be more conservative in setting dynamic VRAM cache limits. | 2025-01-07 01:20:15 +00:00
Ryan Dick | fc4a22fe78 | Allow expensive operations to request more working memory. | 2025-01-07 01:20:13 +00:00
Ryan Dick | a167632f09 | Calculate model cache size limits dynamically based on the available RAM / VRAM. | 2025-01-07 01:14:20 +00:00
Ryan Dick | 6a9de1fcf3 | Change definition of VRAM in use for the ModelCache from sum of model weights to the total torch.cuda.memory_allocated(). | 2025-01-07 00:31:53 +00:00
Ryan Dick | e5180c4e6b | Add get_effective_device(...) utility to aid in determining the effective device of models that are partially loaded. | 2025-01-07 00:31:00 +00:00