Commit Graph

4 Commits

Author SHA1 Message Date
George Hotz
267be7fc5e fp16 acc 2025-11-02 12:53:04 +08:00
George Hotz
25c2da1579 check SPEC=2 in CI (#12945)
* check SPEC=2 in CI

* split SPEC=2

* fast enough
2025-10-27 21:53:57 +08:00
George Hotz
a3c78d47b3 speed docs + upgrades [pr] (#8964)
* add some docs about speed [pr]

* better torch gemm

* enable locals on llvm/clang

* disable locals for beam speed on LLVM/CLANG

* 0x20 alignment in llvm allows ymm use
2025-02-08 17:28:52 +08:00
George Hotz
fe71282ba1 faster RDNA assembly backend (#990)
* fast asm

* torch gemm
2023-06-16 12:06:38 -07:00