Commit Graph

3 Commits

Author SHA1 Message Date
George Hotz
e6879035a0 work to make GEMV fast (#5824)
* work to make GEMV fast

* half8 cast

* align struct

* fix amd

* float8 is a later problem
2024-07-30 17:41:40 -07:00
George Hotz
489a5b99a5 hotfix: triton_nv_matmul touchups
2024-07-24 23:24:29 +00:00
George Hotz
4d47968580 fix acc folding for NV tensor cores (#5658)
* fix acc folding for NV tensor cores

* fix correctness of reduce_before_expand
2024-07-23 13:03:02 -07:00