InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI.git synced 2026-01-14 02:08:00 -05:00

Author	SHA1	Message	Date
jiangmencity	5259693ed1	chore: fix some comments Signed-off-by: jiangmencity <jiangmen@52it.net>	2025-08-14 09:32:54 +10:00
Kevin Turner	8bd52ed744	fix: improve gguf performance with torch.compile. pytorch 2.7 does not implement `set.__contains__`, so make this a list instead. See https://github.com/pytorch/pytorch/issues/145761	2025-05-22 13:42:09 +10:00
David Burnett	6c0bd7d150	fix import ordering, remove code I reverted that the resync added back	2025-05-19 11:16:23 +10:00
David Burnett	99e154d773	fix picky ruff issue	2025-05-19 11:16:23 +10:00
David Burnett	e4e43ae126	fix missing bracket	2025-05-19 11:16:23 +10:00
David Burnett	a07fac6180	raise expected exception when attempting to change dtype	2025-05-19 11:16:23 +10:00
David Burnett	93d4b00082	Add `to` overload for GGMLTensor, so calling `to` on the model moves the quantized data as well	2025-05-19 11:16:23 +10:00
David Burnett	86719f2065	revert to overload due to failing tests, use Torch futures instead	2025-05-19 11:16:23 +10:00
David Burnett	5271fc1cac	fix picky ruff issue	2025-05-19 11:16:23 +10:00
David Burnett	96ff7d9093	fix missing bracket	2025-05-19 11:16:23 +10:00
David Burnett	6f73d9e9c6	raise expected exception when attempting to change dtype	2025-05-19 11:16:23 +10:00
David Burnett	29b406a84b	Add `to` overload for GGMLTensor, so calling `to` on the model moves the quantized data as well	2025-05-19 11:16:23 +10:00
Ryan Dick	5ea7953537	Update GGMLTensor with ops necessary to work with ConcatenatedLoRALayer.	2025-01-28 14:51:35 +00:00
Ryan Dick	a8b2c4c3d2	Add inference tests for all custom module types (i.e. to test autocasting from cpu to device).	2024-12-26 18:33:46 +00:00
Ryan Dick	3f990393a1	Simplify the state management in InvokeLinear8bitLt and add unit tests. This is in preparation for wrapping it to support streaming of weights from cpu to gpu.	2024-12-24 14:32:11 +00:00
Ryan Dick	65fcbf5f60	Bump bitsandbytes. The new version contains improvements to state_dict loading/saving for LLM.int8 and promises improved speed on some HW.	2024-12-24 14:32:11 +00:00
Ryan Dick	9369b39a12	Add GGMLTensor op.	2024-12-17 13:20:19 +00:00
David Burnett	9bd17ea02f	Get flux working with MPS on 2.4.1, with GGUF support	2024-10-23 10:20:42 +11:00
Brandon Rising	d328eaf743	Remove no longer used dequantize_tensor function	2024-10-02 18:33:05 -04:00
Ryan Dick	bc63e2acc5	Add workaround for FLUX GGUF models with incorrect img_in.weight shape.	2024-10-02 18:33:05 -04:00
Ryan Dick	ec7e771942	Add a compute_dtype field to GGMLTensor.	2024-10-02 18:33:05 -04:00
Ryan Dick	fe84013392	Add unit tests for GGMLTensor.	2024-10-02 18:33:05 -04:00
Ryan Dick	710f81266b	Fix type errors in GGMLTensor.	2024-10-02 18:33:05 -04:00
Brandon Rising	446e2884bc	Remove no longer used code paths, general cleanup of new dequantization code, update probe	2024-10-02 18:33:05 -04:00
Brandon Rising	7d9f125232	Run ruff and update imports	2024-10-02 18:33:05 -04:00
Brandon Rising	66bbd62758	Run ruff and fix typing in torch patcher	2024-10-02 18:33:05 -04:00
Brandon Rising	0875e861f5	Various updates to gguf performance	2024-10-02 18:33:05 -04:00
Ryan Dick	f06765dfba	Get alternative GGUF implementation working... barely.	2024-10-02 18:33:05 -04:00
Ryan Dick	f347b26999	Initial experimentation with Tensor-like extension for GGUF.	2024-10-02 18:33:05 -04:00
Brandon Rising	2bfb0ddff5	Initial GGUF support for flux models	2024-10-02 18:33:05 -04:00
Ryan Dick	29fe1533f2	Fix bug in InvokeLinear8bitLt that was causing old state information to persist after loading from a state dict. This manifested as state tensors being left on the GPU even when a model had been offloaded to the CPU cache.	2024-08-29 19:08:18 +00:00
Brandon Rising	65bb46bcca	Rename params for flux and flux vae, add comments explaining use of the config_path in model config	2024-08-26 20:17:50 -04:00
Ryan Dick	635d2f480d	ruff	2024-08-26 20:17:50 -04:00
Brandon Rising	56b9906e2e	Setup scaffolding for in progress images and add ability to cancel the flux node	2024-08-26 20:17:50 -04:00
Ryan Dick	dff4a88baa	Move quantization scripts to a scripts/ subdir.	2024-08-26 20:17:50 -04:00
Ryan Dick	a21f6c4964	Update docs for T5 quantization script.	2024-08-26 20:17:50 -04:00
Ryan Dick	97562504b7	Remove all references to optimum-quanto and downgrade diffusers.	2024-08-26 20:17:50 -04:00
Ryan Dick	b9dd354e2b	Fixes to the T5XXL quantization script.	2024-08-26 20:17:50 -04:00
Ryan Dick	33c2fbd201	Add script for quantizing a T5 model.	2024-08-26 20:17:50 -04:00
Ryan Dick	b66f19d4d1	Add docs to the quantization scripts.	2024-08-26 20:17:50 -04:00
Ryan Dick	4105a78b83	Update load_flux_model_bnb_llm_int8.py to work with a single-file FLUX transformer checkpoint.	2024-08-26 20:17:50 -04:00
Ryan Dick	19a68afb3a	Fix bug in InvokeInt8Params that was causing it to use double the necessary VRAM.	2024-08-26 20:17:50 -04:00
Ryan Dick	cfac7c8189	Move requantize.py to the quantization/ dir.	2024-08-26 20:17:50 -04:00
Ryan Dick	ac96f187bd	Remove duplicate log_time(...) function.	2024-08-26 20:17:50 -04:00
Brandon Rising	57168d719b	Fix styling/lint	2024-08-26 20:17:50 -04:00
Brandon Rising	4bd7fda694	Install sub directories with folders correctly, ensure consistent dtype of tensors in flux pipeline and vae	2024-08-26 20:17:50 -04:00
Brandon Rising	2d9042fb93	Run Ruff	2024-08-26 20:17:50 -04:00
Brandon Rising	9ed53af520	Run Ruff	2024-08-26 20:17:50 -04:00
Brandon Rising	56fda669fd	Manage quantization of models within the loader	2024-08-26 20:17:50 -04:00
Ryan Dick	1fa6bddc89	WIP on moving from diffusers to FLUX	2024-08-26 20:17:50 -04:00

54 Commits