George Hotz
|
cb500466c2
|
assembly/amd: amd_asm_matmul (#13989)
* amd_asm_matmul
* dsl transform
* asm roundtrip
* fixed
* less
* better
* more
* simpler
* simplify
* lil
* simpler
* compact
* work
* cleanups
* simplify
* simpler
* cleanup
* name the regs
* simp
* big simp
* big simp
* simp
* acc grid
* fast
* stuff
* fast
* simpler
* works
* save vgprs
* save vgprs
* Compact
* less VGPRs
* after
* SQTT support
* fastest
* faster
* lil faster
* tile regs
* faster
* readable
* one more
* simpler
* lil simpler
* NO_GLOBAL skips early globals
* stock kernel
* cleanups
* cleanups
* one b reg
* safe reg changes
* acc is compact now
* remove confusing stuff
* sregs
* lds cleanups
* vopd
|
2026-01-07 20:11:05 -08:00 |
|