nimlgen
f7ee644950
amd: lazy sdma queue allocation ( #13920 )
* ams: lazy queue
* nv
* linter
* f
2025-12-31 15:17:13 +03:00
George Hotz
25ef866e89
write python emulator from RDNA3 psuedocode in pdf ( #13841 )
* write python emulator from RDNA3 psuedocode in pdf
* emu2
* more emu
* working
* more psueod
* progress
* cleanups
* delete junk
* delete stale files
* just emu
* work
* emu compare
* bemu
* cleanups and more failures
* revert bench emu
* fix emu cmp
* four tests fail
* bugfixes
* dsl
* ext
* refactor
* dsl
* div scale fix
* test_emu
* fix emu tests
* pcode
* test pcode
* top imports
* fix test_emu to use run_asm
* emu tests on real hardware
* more tests
* more emu tests
* more
* work
* work
* bug fix
* bugfixes
* fix fp16 gemm
* all ops tests pass in emulator
* fix llvm tests
* fix a few more tests
* fix mockgpu timeout
2025-12-29 07:39:53 -05:00
George Hotz
f1111ac7de
move amd compilers to new style ( #13831 )
* move amd compilers to new style
* simplest diff
* AMDHIPrenderer
2025-12-25 13:42:24 -05:00
George Hotz
9d94b8c6b2
python asm dsl in extra + python REMU ( #13436 )
* having fun with python asm dsl
* rdna3
* meh
* all in rdna3
* work
* more work
* work
* integration
* tests
* simpler
* simpler
* asm
* better
* simpler
* progress
* emu
* simpler
* emu
* tests
* types
* vopd
* cleaups
* work
* memory ranges
* add tracing
* refactors
* run_asm exit
* more readable
* compare to remu
* test gemm
* bug + stale
* more tests
* refactor
* tests fix
* more ins
* more instructions
* refactor
* faster
* match case
* match case
* simpler
* work
* tests
* run_asm
* work
* bug fixes
* more emu
* alu/emu
* refactor
* no pipeline emu yet
* alu direct
* fix
* bugfixes + new test
* fix exceptions in emulators
* update gen.py
* pylint
* no pdf
* improve bench_emu
* speedups
* cleanups
* more tests
2025-12-25 13:04:14 -05:00
nimlgen
90b217896f
am: xgmi p2p ( #13811 )
* system: use addr space
* am: xgmi
* fix
* ugh
2025-12-23 20:11:38 +03:00
nimlgen
f6bda6ae4e
am: continue from saved state ( #13799 )
* am: gfx queue cont
* f
* reset
* f
* l
2025-12-22 15:55:07 +03:00
George Hotz
a987a8ed44
add neg VIZ support to not start server ( #13772 )
2025-12-20 00:36:38 -04:00
nimlgen
3eecb4f123
am: mi350 support ( #13733 )
2025-12-17 14:57:21 +03:00
nimlgen
5778722979
am: restore queues ( #13714 )
* am: restore queues
* l
* cmnt
2025-12-16 15:21:42 +03:00
nimlgen
615dcab767
am: minimal mi300 boot ( #13679 )
* nbio7_9
* psp
* gmc
* gfx
* sdma
* ih
* linter
* linter
* minor
* finish
* add missing
* do not allow warm boot for now
2025-12-15 15:55:03 +03:00
nimlgen
0b15c573ca
amd: xccs in PCIIface ( #13669 )
2025-12-13 17:22:11 +03:00
qazal
019e71f8ca
lds bank count tests from pmc counters ( #13667 )
* lds bank count tests from pmc counters
* these tests run on the RDNA3 card too
* rename duration to cycles, other rename comment
* add SQ_LDS_IDX_ACTIVE to gfx9 defaults
2025-12-13 17:39:32 +08:00
nimlgen
b4796e2d32
amd: set queue prio to normal ( #13658 )
2025-12-12 18:25:41 +03:00
nimlgen
dd8a1a10d4
amd: tiny cleanups ( #13616 )
2025-12-08 13:15:56 +03:00
nimlgen
dcd50baca4
amd/nv: cleanup ( #13608 )
2025-12-07 17:05:26 +03:00
nimlgen
abafb96441
hcq: check all subbufs are free ( #13599 )
* hcq: check all subbufs are free
* fix
* Update ops_amd.py
2025-12-06 17:43:18 +03:00
nimlgen
f2b549d921
amd: refactor scratch calc ( #13595 )
* amd: refactor scratch calc
* fix
2025-12-06 16:41:35 +03:00
chenyu
0977206b1c
Revert am ( #13591 )
* Revert "hotfix: amd: tmpring (#13589)"
This reverts commit 4d8b283b36.
* Revert "amd: use correct structs (#13583)"
This reverts commit d8b09eda57.
2025-12-05 11:03:12 -05:00
nimlgen
4d8b283b36
hotfix: amd: tmpring ( #13589 )
* hotfix: amd: tmpring
* more
2025-12-05 18:19:05 +03:00
nimlgen
d8b09eda57
amd: use correct structs ( #13583 )
2025-12-05 14:46:38 +03:00
qazal
f21c9dbf4b
enable PMC with VIZ=2 ( #13575 )
2025-12-05 03:09:53 +08:00
qazal
8390de39e6
amd: static flag check for sqtt/pmc ( #13545 )
2025-12-03 18:36:15 +08:00
nimlgen
77a76d1b13
device: respect compiler ContextVars ( #13523 )
* device: envvars for cc
* fix
* fix
* x
* um
* fix
* remote
* em
* cleanup
* typing
* fix
* debug
* lvp?
* ugh
* singl
* rm
* lol
* fix
* ?
* this?
* why?
* rev
* mod test
* l
2025-12-02 14:42:04 +03:00
nimlgen
759b41ab91
amd: fix rsrc_word3 on gfx9 ( #13509 )
2025-12-01 12:47:54 +03:00
qazal
d457ee0ba4
viz: correctly handle multiple sqtt traces of the same prg ( #13460 )
2025-11-29 20:52:41 +08:00
nimlgen
192bf4e00a
amd,nv: remove unused env vars ( #13487 )
2025-11-28 23:12:53 +03:00
nimlgen
18cfb54736
amd: a bit better se limiting ( #13440 )
* amd: a bit better se limiting
* SQTT_LIMIT_SE=0
2025-11-24 21:51:47 +03:00
George Hotz
5110409339
continue work on parse sqtt, enable with SQTT_PARSE ( #13425 )
* continue work on parse sqtt, enable with SQTT_PARSE
* fix timing
* delta is pre instruction
* hi8 values
* a few more
* a bit more
* let it crash if you enabled it
* figure out simd
* hide 0x11
2025-11-22 19:03:17 -08:00
George Hotz
dabb02767f
set AMD profile mode with sudo on SQTT or PMC ( #13403 )
* require profile mode
* add mode setter
* cleanup
* not needed
* SQTT_LIMIT_SE
2025-11-20 23:19:11 -08:00
qazal
5623e765c8
VIZ=2 enables SQTT ( #13330 )
2025-11-18 22:20:31 +08:00
wozeparrot
8894a5409d
feat: hipcc compiler ( #13319 )
2025-11-17 15:13:32 -08:00
nimlgen
e2cee64050
Revert "hcq: add tag to exec events ( #13311 )" ( #13314 )
This reverts commit f63ded5817.
2025-11-17 22:15:31 +03:00
nimlgen
f63ded5817
hcq: add tag to exec events ( #13311 )
* hcq: add tag to exec events
* f
* fix
* fix
2025-11-17 16:59:30 +03:00
nimlgen
9bb17c53ea
amd: timer fix ( #13267 )
2025-11-17 13:59:03 +03:00
Christopher Milan
09f3aae169
In-tree autogen: all C libraries ( #13220 )
* checkout files from autogen branch
* ioctl with payload
* fix am generations
* properly fix generations
This reverts commit b2a54f4f41.
* revert discovery.h
* support pragma pack(1)
* typo
* better getter
* typo
* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE
* align support
* anon handling fix
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
nimlgen
f9586b38ba
system: pci mask and val ( #13251 )
2025-11-13 20:44:58 +08:00
George Hotz
a23dea202b
actually make AMD_LLVM not default ( #13238 )
2025-11-12 15:07:23 -08:00
nimlgen
9a53fcbde4
amd: sqtt on rdna3.5 ( #13233 )
2025-11-13 03:30:42 +08:00
nimlgen
b9b68bf437
amd: add kern to sqtt event ( #13126 )
* amd: add kern to sqtt event
* fix
2025-11-06 22:02:02 +08:00
nimlgen
05e2ff4d87
system: fix flock on pcidevs ( #13123 )
* system: fix locking of hcq devices
* rename and fullrun
* force ok
* fix
* fix
2025-11-06 19:02:13 +08:00
nimlgen
eff80beeed
amd: props in device not sqtt ( #13106 )
* amd: props in device not sqtt
* fix
* f
* fix
* fix
2025-11-05 23:43:20 +08:00
nimlgen
eaf7cbc178
amd: flush sqtt after each kernel ( #13092 )
* amd: flush sqtt after each kernel
* merge for rgp
2025-11-04 22:12:48 +08:00
nimlgen
16f1f644ba
amd: remove sqtt=2 ( #13090 )
2025-11-04 18:29:24 +08:00
nimlgen
dfde3f54d9
rocprof: use llvm disasm ( #13077 )
* rocprof: use llvm disasm
* rm
2025-11-03 23:58:58 +08:00
nimlgen
08855c162b
amd: correct sqtt_read for several xccs ( #13075 )
* amd: correct sqtt_read for several xccs
* default mask
2025-11-03 19:59:56 +08:00
nimlgen
be0028d3ce
amd: universal set_grbm ( #13062 )
* amd: universal set_grbm
* fix
2025-11-03 03:35:55 +08:00
nimlgen
37a730abce
amd: fix pmc sq gfx11+ ( #13058 )
* amd: fix pmc sq gfx11+
* fix
2025-11-02 21:56:47 +08:00
nimlgen
2db57f3a97
amd: better msg when out of perf regs ( #13042 )
2025-11-01 22:47:50 +08:00
nimlgen
a23226e61e
amd: pmc for gfx9 ( #13036 )
* amd: pmc for gfx9
* xcc
* vmid mask
* ugh
* tiny
* minor
* sorryg
2025-11-01 04:26:34 +08:00
nimlgen
d532117df5
amd: rename set_grbm_se -> set_grbm_se_sh ( #13037 )
2025-11-01 01:37:57 +08:00