Kevin Turner
|
8bd52ed744
|
fix: improve gguf performance with torch.compile
PyTorch 2.7 does not implement `set.__contains__` under torch.compile, so make this a list instead.
See https://github.com/pytorch/pytorch/issues/145761
|
2025-05-22 13:42:09 +10:00 |
|
David Burnett
|
6c0bd7d150
|
fix import ordering, remove code I reverted that the resync added back
|
2025-05-19 11:16:23 +10:00 |
|
David Burnett
|
99e154d773
|
fix picky ruff issue
|
2025-05-19 11:16:23 +10:00 |
|
David Burnett
|
e4e43ae126
|
fix missing bracket
|
2025-05-19 11:16:23 +10:00 |
|
David Burnett
|
a07fac6180
|
raise expected exception when attempting to change dtype
|
2025-05-19 11:16:23 +10:00 |
|
David Burnett
|
93d4b00082
|
Add `to` overload for GGMLTensor, so calling `to` on the model moves the quantized data as well
|
2025-05-19 11:16:23 +10:00 |
|
David Burnett
|
86719f2065
|
revert `to` overload due to failing tests, use Torch futures instead
|
2025-05-19 11:16:23 +10:00 |
|
David Burnett
|
5271fc1cac
|
fix picky ruff issue
|
2025-05-19 11:16:23 +10:00 |
|
David Burnett
|
96ff7d9093
|
fix missing bracket
|
2025-05-19 11:16:23 +10:00 |
|
David Burnett
|
6f73d9e9c6
|
raise expected exception when attempting to change dtype
|
2025-05-19 11:16:23 +10:00 |
|
David Burnett
|
29b406a84b
|
Add `to` overload for GGMLTensor, so calling `to` on the model moves the quantized data as well
|
2025-05-19 11:16:23 +10:00 |
|
Ryan Dick
|
5ea7953537
|
Update GGMLTensor with ops necessary to work with ConcatenatedLoRALayer.
|
2025-01-28 14:51:35 +00:00 |
|
Ryan Dick
|
a8b2c4c3d2
|
Add inference tests for all custom module types (i.e. to test autocasting from cpu to device).
|
2024-12-26 18:33:46 +00:00 |
|
Ryan Dick
|
9369b39a12
|
Add GGMLTensor op.
|
2024-12-17 13:20:19 +00:00 |
|
David Burnett
|
9bd17ea02f
|
Get flux working with MPS on 2.4.1, with GGUF support
|
2024-10-23 10:20:42 +11:00 |
|
Brandon Rising
|
d328eaf743
|
Remove no longer used dequantize_tensor function
|
2024-10-02 18:33:05 -04:00 |
|
Ryan Dick
|
bc63e2acc5
|
Add workaround for FLUX GGUF models with incorrect img_in.weight shape.
|
2024-10-02 18:33:05 -04:00 |
|
Ryan Dick
|
ec7e771942
|
Add a compute_dtype field to GGMLTensor.
|
2024-10-02 18:33:05 -04:00 |
|
Ryan Dick
|
fe84013392
|
Add unit tests for GGMLTensor.
|
2024-10-02 18:33:05 -04:00 |
|
Ryan Dick
|
710f81266b
|
Fix type errors in GGMLTensor.
|
2024-10-02 18:33:05 -04:00 |
|
Brandon Rising
|
446e2884bc
|
Remove no longer used code paths, general cleanup of new dequantization code, update probe
|
2024-10-02 18:33:05 -04:00 |
|
Brandon Rising
|
7d9f125232
|
Run ruff and update imports
|
2024-10-02 18:33:05 -04:00 |
|
Brandon Rising
|
66bbd62758
|
Run ruff and fix typing in torch patcher
|
2024-10-02 18:33:05 -04:00 |
|
Brandon Rising
|
0875e861f5
|
Various updates to gguf performance
|
2024-10-02 18:33:05 -04:00 |
|
Ryan Dick
|
f06765dfba
|
Get alternative GGUF implementation working... barely.
|
2024-10-02 18:33:05 -04:00 |
|
Ryan Dick
|
f347b26999
|
Initial experimentation with Tensor-like extension for GGUF.
|
2024-10-02 18:33:05 -04:00 |
|
Brandon Rising
|
2bfb0ddff5
|
Initial GGUF support for flux models
|
2024-10-02 18:33:05 -04:00 |
|