d71fe7faa5  George Hotz  2024-11-20 00:10:29 +08:00
  rename allocator methods to not conflict [pr] (#7788)
  * rename allocator methods to not conflict [pr]
  * forgot those
  * transfer + offset

654a8b9ef7  nimlgen  2024-06-09 11:33:03 +03:00
  retire hsa (#4885)
  * retire hsa
  * EMULATE_AMD

04746022b1  Francis Lam  2024-03-30 15:42:23 -04:00
  extra/gemm/hip_matmul: fix to use new HSA devices and no headers (#3999)
  * extra/gemm/hip_matmul: fix to use new HSA devices and no headers
  * remove compile_hip import

168b1f879c  Ahmed Harmouche  2024-01-25 16:03:04 -08:00
  Fix hip_matmul gemm in extra (#3241)

a280cfe169  George Hotz  2024-01-01 14:58:48 -08:00
  move dtypes to dtype.py (#2964)
  * move dtypes to dtype.py
  * fix urllib

fde44aed76  Yixiang Gao  2023-12-04 13:37:10 -08:00
  update hip_matmul with new abstraction (#2605)

186ac77ec3  Davi Silva  2023-11-27 18:36:19 -08:00
  Update hip_matmul.py (#2480)

c417cd3c97  George Hotz  2023-08-09 06:54:15 -07:00
  fast HIP gemm -> 100 TFLOPS (#1476)
  * fast HIP gemm
  * wmma
  * correct b
  * fix spilling
  * 60 TFLOPS
  * 64 TFLOPS
  * 65 TFLOPS

3300d0aeaf  David Hou  2023-07-31 17:05:49 -07:00
  syncthreads before wmma (#1389)
  (venv) chaos@tiny3:~/tinygrad$ KX=2 KY=2 N=2048 python extra/gemm/hip_matmul.py
  4194304 289.60 us, would be 59322.55 GFLOPS matmul, 173.80 GB/s

b8dfbba703  George Hotz  2023-07-08 00:35:45 +00:00
  hip_matmul: f16 gemm 2048x2048 gets 36 TFLOPS

e234bf2298  George Hotz  2023-06-28 19:54:33 +00:00
  hip matmul : add K support

0e93b9642a  George Hotz  2023-06-28 19:21:01 +00:00
  hip matmul