George Hotz
267bbb163e
progress
2026-01-01 21:11:29 -05:00
George Hotz
de29a49ea3
all the ones i can find
2026-01-01 20:56:30 -05:00
George Hotz
742e10a572
remove fake ones
2026-01-01 20:26:53 -05:00
George Hotz
447fe8907b
more
2026-01-01 20:22:52 -05:00
George Hotz
b0cfcec183
good
2026-01-01 20:12:20 -05:00
George Hotz
1726084b2a
filt
2026-01-01 19:40:43 -05:00
George Hotz
de069a4876
many
2026-01-01 19:21:46 -05:00
George Hotz
4573e91e61
more
2026-01-01 18:51:31 -05:00
George Hotz
8d43212bc6
assembly/amd: start work on SQTT parsing/emulation
2026-01-01 18:40:58 -05:00
George Hotz
a8bea4ec52
remove __all__
2026-01-01 16:14:15 -05:00
George Hotz
729bb04d8c
fix test failure
2026-01-01 13:21:55 -05:00
George Hotz
a5959ef0f1
fix all tests
2026-01-01 13:11:51 -05:00
George Hotz
5ba06892c0
generic
2026-01-01 12:46:08 -05:00
George Hotz
469efe313d
that's a hack
2026-01-01 12:40:14 -05:00
George Hotz
e3b3cb163d
fix emu test
2026-01-01 12:12:47 -05:00
George Hotz
3e32185faf
more tests
2026-01-01 12:04:41 -05:00
George Hotz
5328913d2b
fix flat bug
2026-01-01 11:51:10 -05:00
George Hotz
9c49ec1cc1
update autogen
2026-01-01 11:36:33 -05:00
George Hotz
000d4a125b
fix ds op
2026-01-01 10:36:37 -05:00
George Hotz
63289902d8
refactors
2025-12-31 17:57:27 -05:00
George Hotz
b596f77e33
assembly/amd: add pcode ds ops
2025-12-31 16:59:02 -05:00
George Hotz
2bb07d4824
assembly/amd: move Reg out of the psuedocode (#13934)
* assembly/amd: move Reg out of the psuedocode
* remove extra
* fix pcode tests
* simpler pcode
* simpler
* simpler
* cleaner
* fix mypy
2025-12-31 15:34:51 -05:00
George Hotz
f14428090f
assembly/amd: speed up emulator (#13932)
2025-12-31 13:32:25 -05:00
George Hotz
29402034a1
assembly/amd: cleanups to asm and emu (#13912)
* a bunch of cleanups
* ops are back
* bug fixes
* cleanups
* a lil simpler
* more refactors
* _disasm_vop1
* sops
* more
* continue
* more
* num_srcs
* simpler
* no _is16
* op cleanups
* isinstnace
2025-12-31 12:46:11 -05:00
George Hotz
b998a80b5d
assembly/amd: split generated stuff into enum/ins (#13924)
2025-12-31 10:10:52 -05:00
qazal
b23f4517ab
prep mi350x gemm for python dsl (#13918)
* start by pruning existing asm
* better branch names
* split to template and real instructions
2025-12-31 20:00:57 +09:00
qazal
3f3786ded9
mmapeak: fix compiler import (#13915)
2025-12-31 16:52:23 +09:00
George Hotz
0221b96761
assembly/amd: fix all ops tests (#13910)
* assembly/amd: fix all ops tests
* test_ops with smaller sizes
* ds store/load 2addr
2025-12-30 18:01:34 -05:00
George Hotz
efc99d0c55
assembly/amd: more refactors (#13907)
* assembly/amd: more refactors
* more refactors
* more refactors
* simpler emu
* generate.py
* regen all
* cleanups
* more
* work
* more readme
* lil
2025-12-30 16:13:24 -05:00
George Hotz
49d1bf93d6
assembly/amd: refactor asm.py to be simpler (#13900)
* assembly/amd: refactor asm.py
* assembly/amd: refactor asm.py to be simpler
* multiple fxns
* fast
* more tests pass
* regen
* stop decode
2025-12-30 13:51:40 -05:00
George Hotz
7e14cdcb06
assembly/amd: clean up clt/ctz hack (#13901)
* assembly/amd: clean up clt/ctz hack
* add breaks
2025-12-30 11:59:28 -05:00
George Hotz
69cdc8066d
assembly/amd: add dtype tests to AMD IDE CI (#13899)
* add dtype tests to AMD IDE CI
* more tests
* add trig preop
* regen done
* split to amd autogen
* simpler
2025-12-30 11:09:51 -05:00
George Hotz
9c89be5235
assembly/amd: fix v_perm_b32 + PC fixes (#13897)
* assembly/amd: fix v_perm_b32
* add pc support
2025-12-30 09:25:40 -05:00
George Hotz
2b838dc1d8
assembly/amd: fix AMD_LLVM=1 support in emulator (#13881)
* fix AMD_LLVM=1 support in emulator
* more llvm with dtype
* work
* more fixes
* fix dtype
2025-12-30 09:09:57 -05:00
qazal
b557c46233
assembly gemm clean ups, instructions for cli (#13892)
2025-12-30 16:14:06 +09:00
qazal
d7e1f26e3d
command line interface for sqtt viz (#13891)
* command line interface for sqtt viz
* cleanup
* api surface area
* this confuses the llms
* document
2025-12-30 12:33:21 +09:00
George Hotz
94bca91f3e
assembly/amd: have asm go through the dsl (#13886)
* assembly/amd: have asm go through the dsl
* lil
2025-12-29 17:39:11 -05:00
George Hotz
7322d9ec4a
assembly/amd: add new instruction support to pcode (#13885)
* assembly/amd: add new instruction support
* more
* regen all
2025-12-29 17:30:17 -05:00
George Hotz
0d326f5b9b
fix missing instructions in psuedocode (#13884)
2025-12-29 16:11:22 -05:00
George Hotz
9d8397be11
add CDNA3+RDNA4 support (#13882)
* fix CI
* remove junk
* rename lib to dsl
* correct
* cleanups
2025-12-29 15:51:29 -05:00
George Hotz
81cf9ea0ab
rename to extra.assembly.amd (#13879)
2025-12-29 14:10:55 -05:00
George Hotz
37f0fa11b6
rdna3 test cleanups (#13878)
* rdna3 test cleanups
* cleanups
* ugh DONT SKIP
2025-12-29 13:41:59 -05:00
George Hotz
35db73b231
add cdna4 support to parsers (#13877)
* add cdna4 support to parsers
* cdna4
2025-12-29 13:23:43 -05:00
George Hotz
ff856a74cb
minor refactoring for rdna3 (#13873)
* minor refactoring for rdna3
* fix div scale stuff
* more bugfixes
2025-12-29 13:20:00 -05:00
George Hotz
f1471a3b99
speed up rdna3 unit tests + add to CI (#13871)
* speed up rdna3 unit tests
* add test to CI
* faster and simpler
* speedups
* bugfixes
* use helper
* fix CI maybe
* test fixes
* llvm-21 on 24.04
* upd
* llvm-21
* fix test
* bring that back
* merge gen into lib
* test generators
2025-12-29 10:26:48 -05:00
George Hotz
25ef866e89
write python emulator from RDNA3 psuedocode in pdf (#13841)
* write python emulator from RDNA3 psuedocode in pdf
* emu2
* more emu
* working
* more psueod
* progress
* cleanups
* delete junk
* delete stale files
* just emu
* work
* emu compare
* bemu
* cleanups and more failures
* revert bench emu
* fix emu cmp
* four tests fail
* bugfixes
* dsl
* ext
* refactor
* dsl
* div scale fix
* test_emu
* fix emu tests
* pcode
* test pcode
* top imports
* fix test_emu to use run_asm
* emu tests on real hardware
* more tests
* more emu tests
* more
* work
* work
* bug fix
* bugfixes
* fix fp16 gemm
* all ops tests pass in emulator
* fix llvm tests
* fix a few more tests
* fix mockgpu timeout
2025-12-29 07:39:53 -05:00
qazal
f541540129
variable N for asm gemm (#13869)
* variable N for asm gemm
* cleanup spacing
2025-12-29 19:35:50 +09:00
qazal
fc5278746f
mi350x assembly gemm cleanups (#13867)
2025-12-29 18:47:23 +09:00
George Hotz
f07c39cfa4
hwtest fixes for rdna3 dsl (#13865)
2025-12-28 20:42:29 -05:00
George Hotz
d9603c1bee
improve asm dsl syntax (#13864)
* improve asm dsl syntax
* improve asm dsl syntax
2025-12-28 20:04:59 -05:00