InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI.git synced 2026-02-14 13:44:59 -05:00

Author	SHA1	Message	Date
Ryan Dick	3f990393a1	Simplify the state management in InvokeLinear8bitLt and add unit tests. This is in preparation for wrapping it to support streaming of weights from CPU to GPU.	2024-12-24 14:32:11 +00:00
Ryan Dick	65fcbf5f60	Bump bitsandbytes. The new version contains improvements to state_dict loading/saving for LLM.int8() and promises improved speed on some hardware.	2024-12-24 14:32:11 +00:00
Ryan Dick	29fe1533f2	Fix bug in InvokeLinear8bitLt that was causing old state information to persist after loading from a state dict. This manifested as state tensors being left on the GPU even when a model had been offloaded to the CPU cache.	2024-08-29 19:08:18 +00:00
Ryan Dick	19a68afb3a	Fix bug in InvokeInt8Params that was causing it to use double the necessary VRAM.	2024-08-26 20:17:50 -04:00
Ryan Dick	d3a5ca5247	More improvements for LLM.int8() - not fully tested.	2024-08-26 20:17:50 -04:00
Ryan Dick	f01f56a98e	LLM.int8() quantization is working, but there are still some rough edges to solve.	2024-08-26 20:17:50 -04:00