| Author | Commit | Message | Date |
|---|---|---|---|
| qazal | 2cc64d71b0 | simplify mi350x gemm / viz asm tests (#13984): mi350x gemm cleanup; asm tests work; simpler asm tests | 2026-01-03 11:11:07 +09:00 |
| qazal | 5f52266225 | mi350x gemm: use Tensor.custom_kernel in asm test (#13969): mi350x gemm: use Tensor.custom_kernel in asm test; A @ B for baseline | 2026-01-02 18:30:50 +09:00 |
| qazal | c0f52c9dcb | split assembly gemm to per arch directory (#13953) | 2026-01-02 00:10:22 +09:00 |
| qazal | 6a5430ab00 | correct args order in mi350x gemm (#13949) | 2026-01-01 23:01:46 +09:00 |
| qazal | b23f4517ab | prep mi350x gemm for python dsl (#13918): start by pruning existing asm; better branch names; split to template and real instructions | 2025-12-31 20:00:57 +09:00 |
| qazal | b557c46233 | assembly gemm clean ups, instructions for cli (#13892) | 2025-12-30 16:14:06 +09:00 |
| qazal | f541540129 | variable N for asm gemm (#13869): variable N for asm gemm; cleanup spacing | 2025-12-29 19:35:50 +09:00 |
| qazal | fc5278746f | mi350x assembly gemm cleanups (#13867) | 2025-12-29 18:47:23 +09:00 |
| qazal | 066d96c397 | print tflops in asm gemm test (#13859): print tflops in asm gemm test; change order | 2025-12-29 02:26:40 +09:00 |
| qazal | 2cfbabdc34 | mi350x 1tflop bf16 gemm in extra (#13702) | 2025-12-28 21:45:42 +09:00 |