328 Commits

Author SHA1 Message Date
nimlgen
f7ee644950 amd: lazy sdma queue allocation (#13920)
* ams: lazy queue

* nv

* linter

* f
2025-12-31 15:17:13 +03:00
George Hotz
25ef866e89 write python emulator from RDNA3 psuedocode in pdf (#13841)
* write python emulator from RDNA3 psuedocode in pdf

* emu2

* more emu

* working

* more psueod

* progress

* cleanups

* delete junk

* delete stale files

* just emu

* work

* emu compare

* bemu

* cleanups and more failures

* revert bench emu

* fix emu cmp

* four tests fail

* bugfixes

* dsl

* ext

* refactor

* dsl

* div scale fix

* test_emu

* fix emu tests

* pcode

* test pcode

* top imports

* fix test_emu to use run_asm

* emu tests on real hardware

* more tests

* more emu tests

* more

* work

* work

* bug fix

* bugfixes

* fix fp16 gemm

* all ops tests pass in emulator

* fix llvm tests

* fix a few more tests

* fix mockgpu timeout
2025-12-29 07:39:53 -05:00
George Hotz
f1111ac7de move amd compilers to new style (#13831)
* move amd compilers to new style

* simplest diff

* AMDHIPrenderer
2025-12-25 13:42:24 -05:00
George Hotz
9d94b8c6b2 python asm dsl in extra + python REMU (#13436)
* having fun with python asm dsl

* rdna3

* meh

* all in rdna3

* work

* more work

* work

* integration

* tests

* simpler

* simpler

* asm

* better

* simpler

* progress

* emu

* simpler

* emu

* tests

* types

* vopd

* cleaups

* work

* memory ranges

* add tracing

* refactors

* run_asm exit

* more readable

* compare to remu

* test gemm

* bug + stale

* more tests

* refactor

* tests fix

* more ins

* more instructions

* refactor

* faster

* match case

* match case

* simpler

* work

* tests

* run_asm

* work

* bug fixes

* more emu

* alu/emu

* refactor

* no pipeline emu yet

* alu direct

* fix

* bugfixes + new test

* fix exceptions in emulators

* update gen.py

* pylint

* no pdf

* improve bench_emu

* speedups

* cleanups

* more tests
2025-12-25 13:04:14 -05:00
nimlgen
90b217896f am: xgmi p2p (#13811)
* system: use addr space

* am: xgmi

* fix

* ugh
2025-12-23 20:11:38 +03:00
nimlgen
f6bda6ae4e am: continue from saved state (#13799)
* am: gfx queue cont

* f

* reset

* f

* l
2025-12-22 15:55:07 +03:00
George Hotz
a987a8ed44 add neg VIZ support to not start server (#13772) 2025-12-20 00:36:38 -04:00
nimlgen
3eecb4f123 am: mi350 support (#13733) 2025-12-17 14:57:21 +03:00
nimlgen
5778722979 am: restore queues (#13714)
* am: restore queues

* l

* cmnt
2025-12-16 15:21:42 +03:00
nimlgen
615dcab767 am: minimal mi300 boot (#13679)
* nbio7_9

* psp

* gmc

* gfx

* sdma

* ih

* linter

* linter

* minor

* finish

* add missing

* do not allow warm boot for now
2025-12-15 15:55:03 +03:00
nimlgen
0b15c573ca amd: xccs in PCIIface (#13669) 2025-12-13 17:22:11 +03:00
qazal
019e71f8ca lds bank count tests from pmc counters (#13667)
* lds bank count tests from pmc counters

* these tests run on the RDNA3 card too

* rename duration to cycles, other rename comment

* add SQ_LDS_IDX_ACTIVE to gfx9 defaults
2025-12-13 17:39:32 +08:00
nimlgen
b4796e2d32 amd: set queue prio to normal (#13658) 2025-12-12 18:25:41 +03:00
nimlgen
dd8a1a10d4 amd: tiny cleanups (#13616) 2025-12-08 13:15:56 +03:00
nimlgen
dcd50baca4 amd/nv: cleanup (#13608) 2025-12-07 17:05:26 +03:00
nimlgen
abafb96441 hcq: check all subbufs are free (#13599)
* hcq: check all subbufs are free

* fix

* Update ops_amd.py
2025-12-06 17:43:18 +03:00
nimlgen
f2b549d921 amd: refactor scratch calc (#13595)
* amd: refactor scratch calc

* fix
2025-12-06 16:41:35 +03:00
chenyu
0977206b1c Revert am (#13591)
* Revert "hotfix: amd: tmpring (#13589)"

This reverts commit 4d8b283b36.

* Revert "amd: use correct structs (#13583)"

This reverts commit d8b09eda57.
2025-12-05 11:03:12 -05:00
nimlgen
4d8b283b36 hotfix: amd: tmpring (#13589)
* hotfix: amd: tmpring

* more
2025-12-05 18:19:05 +03:00
nimlgen
d8b09eda57 amd: use correct structs (#13583) 2025-12-05 14:46:38 +03:00
qazal
f21c9dbf4b enable PMC with VIZ=2 (#13575) 2025-12-05 03:09:53 +08:00
qazal
8390de39e6 amd: static flag check for sqtt/pmc (#13545) 2025-12-03 18:36:15 +08:00
nimlgen
77a76d1b13 device: respect compiler ContextVars (#13523)
* device: envvars for cc

* fix

* fix

* x

* um

* fix

* remote

* em

* cleanup

* typing

* fix

* debug

* lvp?

* ugh

* singl

* rm

* lol

* fix

* ?

* this?

* why?

* rev

* mod test

* l
2025-12-02 14:42:04 +03:00
nimlgen
759b41ab91 amd: fix rsrc_word3 on gfx9 (#13509) 2025-12-01 12:47:54 +03:00
qazal
d457ee0ba4 viz: correctly handle multiple sqtt traces of the same prg (#13460) 2025-11-29 20:52:41 +08:00
nimlgen
192bf4e00a amd,nv: remove unused env vars (#13487) 2025-11-28 23:12:53 +03:00
nimlgen
18cfb54736 amd: a bit better se limiting (#13440)
* amd: a bit better se limiting

* SQTT_LIMIT_SE=0
2025-11-24 21:51:47 +03:00
George Hotz
5110409339 continue work on parse sqtt, enable with SQTT_PARSE (#13425)
* continue work on parse sqtt, enable with SQTT_PARSE

* fix timing

* delta is pre instruction

* hi8 values

* a few more

* a bit more

* let it crash if you enabled it

* figure out simd

* hide 0x11
2025-11-22 19:03:17 -08:00
George Hotz
dabb02767f set AMD profile mode with sudo on SQTT or PMC (#13403)
* require profile mode

* add mode setter

* cleanup

* not needed

* SQTT_LIMIT_SE
2025-11-20 23:19:11 -08:00
qazal
5623e765c8 VIZ=2 enables SQTT (#13330) 2025-11-18 22:20:31 +08:00
wozeparrot
8894a5409d feat: hipcc compiler (#13319) 2025-11-17 15:13:32 -08:00
nimlgen
e2cee64050 Revert "hcq: add tag to exec events (#13311)" (#13314)
This reverts commit f63ded5817.
2025-11-17 22:15:31 +03:00
nimlgen
f63ded5817 hcq: add tag to exec events (#13311)
* hcq: add tag to exec events

* f

* fix

* fix
2025-11-17 16:59:30 +03:00
nimlgen
9bb17c53ea amd: timer fix (#13267) 2025-11-17 13:59:03 +03:00
Christopher Milan
09f3aae169 In-tree autogen: all C libraries (#13220)
* checkout files from autogen branch

* ioctl with payload

* fix am generations

* properly fix generations

This reverts commit b2a54f4f41.

* revert discovery.h

* support pragma pack(1)

* typo

* better getter

* typo

* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE

* align support

* anon handling fix

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
nimlgen
f9586b38ba system: pci mask and val (#13251) 2025-11-13 20:44:58 +08:00
George Hotz
a23dea202b actually make AMD_LLVM not default (#13238) 2025-11-12 15:07:23 -08:00
nimlgen
9a53fcbde4 amd: sqtt on rdna3.5 (#13233) 2025-11-13 03:30:42 +08:00
nimlgen
b9b68bf437 amd: add kern to sqtt event (#13126)
* amd: add kern to sqtt event

* fix
2025-11-06 22:02:02 +08:00
nimlgen
05e2ff4d87 system: fix flock on pcidevs (#13123)
* system: fix locking of hcq devices

* rename and fullrun

* force ok

* fix

* fix
2025-11-06 19:02:13 +08:00
nimlgen
eff80beeed amd: props in device not sqtt (#13106)
* amd: props in device not sqtt

* fix

* f

* fix

* fix
2025-11-05 23:43:20 +08:00
nimlgen
eaf7cbc178 amd: flush sqtt after each kernel (#13092)
* amd: flush sqtt after each kernel

* merge for rgp
2025-11-04 22:12:48 +08:00
nimlgen
16f1f644ba amd: remove sqtt=2 (#13090) 2025-11-04 18:29:24 +08:00
nimlgen
dfde3f54d9 rocprof: use llvm disasm (#13077)
* rocprof: use llvm disasm

* rm
2025-11-03 23:58:58 +08:00
nimlgen
08855c162b amd: correct sqtt_read for several xccs (#13075)
* amd: correct sqtt_read for several xccs

* default mask
2025-11-03 19:59:56 +08:00
nimlgen
be0028d3ce amd: universal set_grbm (#13062)
* amd: universal set_grbm

* fix
2025-11-03 03:35:55 +08:00
nimlgen
37a730abce amd: fix pmc sq gfx11+ (#13058)
* amd: fix pmc sq gfx11+

* fix
2025-11-02 21:56:47 +08:00
nimlgen
2db57f3a97 amd: better msg when out of perf regs (#13042) 2025-11-01 22:47:50 +08:00
nimlgen
a23226e61e amd: pmc for gfx9 (#13036)
* amd: pmc for gfx9

* xcc

* vmid mask

* ugh

* tiny

* minor

* sorryg
2025-11-01 04:26:34 +08:00
nimlgen
d532117df5 amd: rename set_grbm_se -> set_grbm_se_sh (#13037) 2025-11-01 01:37:57 +08:00