Commit Graph

23 Commits

Author SHA1 Message Date
George Hotz
7ebda28692 assembly/amd: add CDNA support to asm (#13982)
* add CDNA support

* more cdna tests

* something

* fix more stuff

* more work

* simpler

* simplier

* cdna

* disasm

* less skip

* fixes

* simpler
2026-01-04 08:53:56 -08:00
George Hotz
34ea053b26 assembly/amd: clean up pcode, jit pcode instead of static (#14001)
* assembly/amd: clean up pcode

* regen

* lil

* jit the pcode

* sendmsg

* cleanups

* inst prefetch lol
2026-01-03 23:06:15 -08:00
George Hotz
8328511808 assembly/amd: make the emu.py code shine (#13996)
* assembly/amd: make the code shine

* lil clean

* reg back in pcode

* cleanups

* gen fma_mix

* no writelane hacks

* fn cleanup

* dead vgpr_write

* readable

* smem

* cleanup bench_emu

* speedups

* simpler and faster

* direct inst._fn

* split fxn

* Revert "simpler and faster"

This reverts commit e85f6594b3.

* move lds to wavestate

* dispatcher

* pc in dispatch

* literal isn't wavestate

* cleanups + program

* one readlane

* exec_vop3sd in exec_vop

* cleaner exec_vopd

* fully merge VOP3P

* no special paths

* no SliceProxy

* low=0

* no bigint

* failing tests

* fma on python 3.13
2026-01-03 20:33:09 -08:00
George Hotz
0e282025ff assembly/amd: split test_emu into hw tests (#13966)
* assmebly/amd: split test_emu into hw tests

* hw tests

* bugfixes

* more tests and fix
2026-01-02 08:04:56 -08:00
nietras
f49e4714af Fix spelling errors in README for AMD assembly (#13975) 2026-01-02 10:15:20 -05:00
George Hotz
5a1a561e0f assembly/amd: rdna4 autogen (#13967)
* assembly/amd: add pcode ds ops

* refactors

* fix ds op

* update autogen

* fix flat bug

* more tests

* fix emu test

* that's a hack

* generic

* fix all tests

* two tests

* fix test failure

* better

* remove __all__

* assembly/amd: fix autogen for RDNA4
2026-01-01 23:12:18 -05:00
George Hotz
dfb813b760 assembly/amd: add pcode ds ops (#13939)
* assembly/amd: add pcode ds ops

* refactors

* fix ds op

* update autogen

* fix flat bug

* more tests

* fix emu test

* that's a hack

* generic

* fix all tests

* two tests

* fix test failure

* better

* remove __all__
2026-01-01 16:24:13 -05:00
George Hotz
2bb07d4824 assembly/amd: move Reg out of the psuedocode (#13934)
* assembly/amd: move Reg out of the psuedocode

* remove extra

* fix pcode tests

* simpler pcode

* simpler

* simpler

* cleaner

* fix mypy
2025-12-31 15:34:51 -05:00
George Hotz
f14428090f assembly/amd: speed up emulator (#13932) 2025-12-31 13:32:25 -05:00
George Hotz
29402034a1 assembly/amd: cleanups to asm and emu (#13912)
* a bunch of cleanups

* ops are back

* bug fixes

* cleanups

* a lil simpler

* more refactors

* _disasm_vop1

* sops

* more

* continue

* more

* num_srcs

* simpler

* no _is16

* op cleanups

* isinstnace
2025-12-31 12:46:11 -05:00
George Hotz
b998a80b5d assembly/amd: split generated stuff into enum/ins (#13924) 2025-12-31 10:10:52 -05:00
George Hotz
0221b96761 assembly/amd: fix all ops tests (#13910)
* assembly/amd: fix all ops tests

* test_ops with smaller sizes

* ds store/load 2addr
2025-12-30 18:01:34 -05:00
George Hotz
efc99d0c55 assembly/amd: more refactors (#13907)
* assembly/amd: more refactors

* more refactors

* more refactors

* simpler emu

* generate.py

* regen all

* cleanups

* more

* work

* more readme

* lil
2025-12-30 16:13:24 -05:00
George Hotz
49d1bf93d6 assembly/amd: refactor asm.py to be simpler (#13900)
* assembly/amd: refactor asm.py

* assembly/amd: refactor asm.py to be simpler

* multiple fxns

* fast

* more tests pass

* regen

* stop decode
2025-12-30 13:51:40 -05:00
George Hotz
7e14cdcb06 assembly/amd: clean up clt/ctz hack (#13901)
* assembly/amd: clean up clt/ctz hack

* add breaks
2025-12-30 11:59:28 -05:00
George Hotz
69cdc8066d assembly/amd: add dtype tests to AMD IDE CI (#13899)
* add dtype tests to AMD IDE CI

* more tests

* add trig preop

* regen done

* split to amd autogen

* simpler
2025-12-30 11:09:51 -05:00
George Hotz
9c89be5235 assembly/amd: fix v_perm_b32 + PC fixes (#13897)
* assembly/amd: fix v_perm_b32

* add pc support
2025-12-30 09:25:40 -05:00
George Hotz
2b838dc1d8 assembly/amd: fix AMD_LLVM=1 support in emulator (#13881)
* fix AMD_LLVM=1 support in emulator

* more llvm with dtype

* work

* more fixes

* fix dtype
2025-12-30 09:09:57 -05:00
George Hotz
94bca91f3e assembly/amd: have asm go through the dsl (#13886)
* assembly/amd: have asm go through the dsl

* lil
2025-12-29 17:39:11 -05:00
George Hotz
7322d9ec4a assembly/amd: add new instruction support to pcode (#13885)
* assembly/amd: add new instruction support

* more

* regen all
2025-12-29 17:30:17 -05:00
George Hotz
0d326f5b9b fix missing instructions in psuedocode (#13884) 2025-12-29 16:11:22 -05:00
George Hotz
9d8397be11 add CDNA3+RDNA4 support (#13882)
* fix CI

* remove junk

* rename lib to dsl

* correct

* cleanups
2025-12-29 15:51:29 -05:00
George Hotz
81cf9ea0ab rename to extra.assembly.amd (#13879) 2025-12-29 14:10:55 -05:00