Commit Graph

77 Commits

Author SHA1 Message Date
chenyu
2e2b5fed12 fix misspellings (#13976) 2026-01-02 10:37:38 -05:00
qazal
d7e1f26e3d command line interface for sqtt viz (#13891)
* command line interface for sqtt viz

* cleanup

* api surface area

* this confuses the llms

* document
2025-12-30 12:33:21 +09:00
qazal
2180eee5e4 use the asm dsl in remu hwtest.py (#13856)
* remu hw test with the asm dsl

* simpler

* nthreads and exec mask

* cmp/cmpx

* assembler error in s_mov_b32

* vopd in dsl?
2025-12-28 11:32:41 +09:00
qazal
f6c660f7fa simplify sqtt decoder infra (#13849)
* more work

* simpler
2025-12-28 00:31:16 +09:00
qazal
a2da61d096 use new style amd compiler in viz (#13848)
* working version, handcode gfx1100 arch

* get target from device properties

* lib in cfg test program spec
2025-12-27 23:59:30 +09:00
qazal
389f01c7f4 viz: amdgpu assembly basic block graph (#13755) 2025-12-22 23:17:16 +08:00
qazal
81d9053013 roc: cast to nullptr instead of changing header (#13801) 2025-12-22 22:34:06 +08:00
qazal
019e71f8ca lds bank count tests from pmc counters (#13667)
* lds bank count tests from pmc counters

* these tests run on the RDNA3 card too

* rename duration to cycles, other rename comment

* add SQ_LDS_IDX_ACTIVE to gfx9 defaults
2025-12-13 17:39:32 +08:00
qazal
93ad1f7732 viz: readable pmc print, share unpacker with tests (#13655)
* viz: readable pmc print, share unpacker with tests

* sections

* static analyzer

* rm that
2025-12-12 19:29:59 +08:00
qazal
d7caae5f61 viz: tabulate pmc (#13574)
* viz: tabulate pmc

* linter

* enable nesting

* pmc comes before waves
2025-12-05 03:08:39 +08:00
qazal
512a8f3dd4 viz: start global memory PMC tests (#13569) 2025-12-05 00:40:27 +08:00
George Hotz
ddf3f2d0c4 rdna3 asm + zip_extract (#13499)
* rdna3 asm + zip_extract

* include sqtt

* fix end parsing

* disassembler working

* parsing fields

* instruction

* op

* more parsing
2025-12-02 22:56:01 -08:00
qazal
c65aa93081 refactor sqtt loader to enable PMC=1 SQTT=0 (#13526) 2025-12-02 22:50:38 +08:00
qazal
a5ec3b24be viz: start PMC in the counters view (#13510) 2025-12-02 00:01:57 +08:00
qazal
9023ca30ef show number of waves in each SE/CU (#13491)
* show number of waves in each SE/CU

* update to test_ones
2025-11-30 22:29:16 +08:00
qazal
d457ee0ba4 viz: correctly handle multiple sqtt traces of the same prg (#13460) 2025-11-29 20:52:41 +08:00
qazal
5520f1fb0b viz: per cu timeline (#13451)
* add cu_loc

* work

* WAVE -> W
2025-11-26 00:05:20 +08:00
qazal
2a9bd12700 sqtt: add occupancy events to the timeline (#13430) 2025-11-24 22:28:05 +08:00
qazal
712c7a6448 sqtt loader cleanups from the occupancy branch (#13431)
* cleanup err handling

* from disasms

* s/wave_execs/wave_insts
2025-11-23 21:50:34 +08:00
George Hotz
9d7a17ee39 beautiful SQTT_PARSE=1 with color (#13428)
* beautiful SQTT_PARSE=1 with color

* linter

* linter 2

* a few more labels

* filter and or

* wave alloc

* a few more
2025-11-23 01:05:14 -08:00
George Hotz
da0aa57a3b add cu parsing to attempt_sqtt_parse 2025-11-22 22:09:05 -08:00
qazal
320ed78803 can view wave timeline with SQTT_ITRACE_SE_MASK=0 (#13427) 2025-11-23 13:55:47 +08:00
George Hotz
5110409339 continue work on parse sqtt, enable with SQTT_PARSE (#13425)
* continue work on parse sqtt, enable with SQTT_PARSE

* fix timing

* delta is pre instruction

* hi8 values

* a few more

* a bit more

* let it crash if you enabled it

* figure out simd

* hide 0x11
2025-11-22 19:03:17 -08:00
George Hotz
423b76a852 improve sqtt format parser (saturday coffee shop project) (#13419)
* improve sqtt format parser

* actually read the trash code ChatGPT wrote

* cleanups

* hand written parser

* quality

* more

* was missing first packet

* maybe

* filt

* fixups

* label the waves

* progress
2025-11-22 15:04:10 -08:00
qazal
c14033e10f viz: faster startup time with SQTT=1 (#13337)
* roc.py cleanups

* direct append

* viz index cleanup

* simd row details

* add kernel arg

* late instructions decode

* more instruction decode to sep server request

* 200ms startup, 6 second to waves timeline

* sort units

* creating new http paths is easy now

* instructions unpacker

* min diff, use hyphens

* summary table
2025-11-22 22:02:30 +08:00
George Hotz
dabb02767f set AMD profile mode with sudo on SQTT or PMC (#13403)
* require profile mode

* add mode setter

* cleanup

* not needed

* SQTT_LIMIT_SE
2025-11-20 23:19:11 -08:00
qazal
9dbc550692 roc: map disassembly to prog name (#13384) 2025-11-20 23:47:19 +08:00
qazal
a647c9eca6 sqtt ui minor fixes (#13335)
* roc.py cleanups

* direct append

* viz index cleanup

* simd row details
2025-11-19 01:27:56 +08:00
nimlgen
331f70aa75 roc: ctrlc (#13255)
* roc: ctrl-c works

* rm
2025-11-18 19:29:28 +08:00
George Hotz
6d3385c284 print special ops in postrange (#13318)
* print special ops in postrange

* fix on OSX
2025-11-17 14:43:23 -08:00
qazal
e7e1935225 cleanup sqtt/test_timing (#13315) 2025-11-18 04:28:05 +08:00
nimlgen
e2cee64050 Revert "hcq: add tag to exec events (#13311)" (#13314)
This reverts commit f63ded5817.
2025-11-17 22:15:31 +03:00
nimlgen
f63ded5817 hcq: add tag to exec events (#13311)
* hcq: add tag to exec events

* f

* fix

* fix
2025-11-17 16:59:30 +03:00
qazal
50a443f558 viz: add shader engine to wave exec payload (#13310)
* viz: show sqtt shader engine

* order it from smallest unit

* easier to config
2025-11-17 19:11:34 +08:00
George Hotz
55be95da15 cleanup sqtt raw parser (#13309)
* cleanup sqtt raw parser

* better names (don't merge yet)

* clean up amd

* a few more names

* one more filter
2025-11-16 13:11:51 -08:00
George Hotz
cabd4add48 more work parsing SQTT, separate VIZ/PROFILE (#13308)
* more work parsing SQTT

* more minimal runner

* sep VIZ/PROFILE

* parse print new

* improve parser

* more filter

* that

* split them

* lil cleanup

* skip flaky test

* AQL in mmapeak
2025-11-16 10:40:39 -08:00
qazal
13efdf8c31 test s_nop stall (#13307) 2025-11-17 00:59:39 +08:00
George Hotz
295600dc5a saturday coffee shop work parsing the att format (#13295)
* saturday coffee shop work parsing the att format

* add examples

* parser

* classes of packets

* fully vibe coded parser

* vibing

* empty

* some vibe names

* vibes

* most of these are wrong

* more vibes

* better names

* parsing

* parse

* cleanup parser

* touchups
2025-11-16 08:25:51 -08:00
qazal
c70b06ec19 sqtt test_timing work (#13304)
* sqtt test_timing cleanups

* only the instruction

* v_mfma_f32_16x16x32_f16 16 cycles, only after second one though
2025-11-16 23:49:24 +08:00
qazal
7c110e1a57 viz: minor cleanups for sqtt (#13275)
* small prg cleanup

* test_timing
2025-11-15 01:08:56 +08:00
qazal
2ee701a009 roc: fix CEnum access (#13270)
* roc: add decoder to ci

* also add installer

* use CEnum syntax

* try 2

* add to setup

* revert ci change

* the other enum too
2025-11-14 21:41:24 +08:00
Christopher Milan
09f3aae169 In-tree autogen: all C libraries (#13220)
* checkout files from autogen branch

* ioctl with payload

* fix am generations

* properly fix generations

This reverts commit b2a54f4f41.

* revert discovery.h

* support pragma pack(1)

* typo

* better getter

* typo

* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE

* align support

* anon handling fix

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
nimlgen
f9b7586e08 roc: fix blob gc (#13256) 2025-11-13 23:38:35 +08:00
qazal
006dea4c3e roc: only save instruction execs (#13254) 2025-11-13 21:28:40 +08:00
qazal
be2e24cb25 roc: requires sudo to install (#13237) 2025-11-12 16:59:22 -05:00
qazal
8b26cf2b3d sqtt: update rcp timing test (#13231)
* sqtt: assert correct output in timing test

* found why
2025-11-13 02:01:54 +08:00
nimlgen
af17e07251 viz: sqtt touchups (#13228)
* viz: sqtt touchups

* revert

* matches
2025-11-12 22:40:37 +08:00
nimlgen
fcd8d0751a test_timing for hip (#13229) 2025-11-12 20:28:58 +08:00
qazal
50934050bc sqtt: append all wave execs (#13190) 2025-11-10 23:50:08 +08:00
qazal
38a24731a1 cleanup sqtt tooling (#13188)
* cleanup viz/serve.py

* use latest profile in rgptool.py

* unwrap nullable in roc.py, fix disasms typing
2025-11-10 20:52:57 +08:00