qazal
a647c9eca6
sqtt ui minor fixes ( #13335 )
...
* roc.py cleanups
* direct append
* viz index cleanup
* simd row details
2025-11-19 01:27:56 +08:00
nimlgen
331f70aa75
roc: ctrlc ( #13255 )
...
* roc: ctrl-c works
* rm
2025-11-18 19:29:28 +08:00
George Hotz
6d3385c284
print special ops in postrange ( #13318 )
...
* print special ops in postrange
* fix on OSX
2025-11-17 14:43:23 -08:00
qazal
e7e1935225
cleanup sqtt/test_timing ( #13315 )
2025-11-18 04:28:05 +08:00
nimlgen
e2cee64050
Revert "hcq: add tag to exec events ( #13311 )" ( #13314 )
...
This reverts commit f63ded5817.
2025-11-17 22:15:31 +03:00
nimlgen
f63ded5817
hcq: add tag to exec events ( #13311 )
...
* hcq: add tag to exec events
* f
* fix
* fix
2025-11-17 16:59:30 +03:00
qazal
50a443f558
viz: add shader engine to wave exec payload ( #13310 )
...
* viz: show sqtt shader engine
* order it from smallest unit
* easier to config
2025-11-17 19:11:34 +08:00
George Hotz
55be95da15
cleanup sqtt raw parser ( #13309 )
...
* cleanup sqtt raw parser
* better names (don't merge yet)
* clean up amd
* a few more names
* one more filter
2025-11-16 13:11:51 -08:00
George Hotz
cabd4add48
more work parsing SQTT, separate VIZ/PROFILE ( #13308 )
...
* more work parsing SQTT
* more minimal runner
* sep VIZ/PROFILE
* parse print new
* improve parser
* more filter
* that
* split them
* lil cleanup
* skip flaky test
* AQL in mmapeak
2025-11-16 10:40:39 -08:00
qazal
13efdf8c31
test s_nop stall ( #13307 )
2025-11-17 00:59:39 +08:00
George Hotz
295600dc5a
saturday coffee shop work parsing the att format ( #13295 )
...
* saturday coffee shop work parsing the att format
* add examples
* parser
* classes of packets
* fully vibe coded parser
* vibing
* empty
* some vibe names
* vibes
* most of these are wrong
* more vibes
* better names
* parsing
* parse
* cleanup parser
* touchups
2025-11-16 08:25:51 -08:00
qazal
c70b06ec19
sqtt test_timing work ( #13304 )
...
* sqtt test_timing cleanups
* only the instruction
* v_mfma_f32_16x16x32_f16 16 cycles, only after second one though
2025-11-16 23:49:24 +08:00
qazal
7c110e1a57
viz: minor cleanups for sqtt ( #13275 )
...
* small prg cleanup
* test_timing
2025-11-15 01:08:56 +08:00
qazal
2ee701a009
roc: fix CEnum access ( #13270 )
...
* roc: add decoder to ci
* also add installer
* use CEnum syntax
* try 2
* add to setup
* revert ci change
* the other enum too
2025-11-14 21:41:24 +08:00
Christopher Milan
09f3aae169
In-tree autogen: all C libraries ( #13220 )
...
* checkout files from autogen branch
* ioctl with payload
* fix am generations
* properly fix generations
This reverts commit b2a54f4f41.
* revert discovery.h
* support pragma pack(1)
* typo
* better getter
* typo
* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE
* align support
* anon handling fix
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
nimlgen
f9b7586e08
roc: fix blob gc ( #13256 )
2025-11-13 23:38:35 +08:00
qazal
006dea4c3e
roc: only save instruction execs ( #13254 )
2025-11-13 21:28:40 +08:00
qazal
be2e24cb25
roc: requires sudo to install ( #13237 )
2025-11-12 16:59:22 -05:00
qazal
8b26cf2b3d
sqtt: update rcp timing test ( #13231 )
...
* sqtt: assert correct output in timing test
* found why
2025-11-13 02:01:54 +08:00
nimlgen
af17e07251
viz: sqtt touchups ( #13228 )
...
* viz: sqtt touchups
* revert
* matches
2025-11-12 22:40:37 +08:00
nimlgen
fcd8d0751a
test_timing for hip ( #13229 )
2025-11-12 20:28:58 +08:00
qazal
50934050bc
sqtt: append all wave execs ( #13190 )
2025-11-10 23:50:08 +08:00
qazal
38a24731a1
cleanup sqtt tooling ( #13188 )
...
* cleanup viz/serve.py
* use latest profile in rgptool.py
* unwrap nullable in roc.py, fix disasms typing
2025-11-10 20:52:57 +08:00
qazal
7e94369464
add helper for test_timing custom ops ( #13140 )
2025-11-07 17:13:55 +08:00
nimlgen
b9b68bf437
amd: add kern to sqtt event ( #13126 )
...
* amd: add kern to sqtt event
* fix
2025-11-06 22:02:02 +08:00
qazal
88245d6579
qol improvements to sqtt decoder and timing tests ( #13125 )
2025-11-06 20:51:30 +08:00
George Hotz
bcfe42937f
move permute/flip/shrink to mixins ( #13113 )
...
* move permute to mixins
* move more stuff
* two more
* fix local mypy
* fix tests
* fix shrink
2025-11-05 14:14:15 -08:00
nimlgen
eff80beeed
amd: props in device not sqtt ( #13106 )
...
* amd: props in device not sqtt
* fix
* f
* fix
* fix
2025-11-05 23:43:20 +08:00
qazal
8119d9f082
sqtt: decode each instruction exec ( #13093 )
...
* sqtt: decode each instruction exec
* start tests
* run_asm
* capture sqtt per kernel
* chaining vgprs
* test things
* inst_execs in viz
* can also configure l and g
* 1l + cleanup
* test_sleep
* test_wmma
* work
* test sleep with llvm builtin
2025-11-05 17:30:27 +08:00
nimlgen
eaf7cbc178
amd: flush sqtt after each kernel ( #13092 )
...
* amd: flush sqtt after each kernel
* merge for rgp
2025-11-04 22:12:48 +08:00
nimlgen
49191ada77
roc: install sqtt decoder ( #13091 )
...
* roc: install?
* msg
* 0.1.4
2025-11-04 18:56:01 +08:00
nimlgen
2e97eaa866
roc: no nullptr when no wave instructions ( #13087 )
2025-11-04 17:32:14 +08:00
qazal
6df34a5887
lint sqtt parser with mypy ( #13079 )
...
* llvm address table errs
* mypy likes annotated dicts
* unwrap nullable
2025-11-04 00:53:59 +08:00
nimlgen
dfde3f54d9
rocprof: use llvm disasm ( #13077 )
...
* rocprof: use llvm disasm
* rm
2025-11-03 23:58:58 +08:00
qazal
27d42fd575
sqtt decoder print behind DEBUG>=5 ( #13076 )
...
* sqtt decoder print behind DEBUG>=5
* gfx version stuff also behind 5
2025-11-03 23:20:03 +08:00
qazal
1c0d4f1cd2
viz: counters loader ( #12987 )
...
* standalone custom loader
* first iteration on the ui
* work
* add center helper
* add edge offsets
* enumerate all edge types
* try dagre layout algorithm
* simpler spec
* bring back double edges
* more work on edge paths
* aesthetics
* custom edges also works
* dimmer inactive links
* cleanup
* cleanup
* split out the ncu layout
* this is just a k/v map now
* rm that
* more cleanup and comments
* do work
* also this work
* simpler start
* rm that
* sqtt work
* view sqtt
* sqtt
* --custom is just in profile
* wrap c call
* from tinygrad install
* eg. module not found
2025-11-03 19:42:36 +08:00
nimlgen
a23226e61e
amd: pmc for gfx9 ( #13036 )
...
* amd: pmc for gfx9
* xcc
* vmid mask
* ugh
* tiny
* minor
* sorryg
2025-11-01 04:26:34 +08:00
nimlgen
f6786c1bfd
autogen: py314 ( #13038 )
...
* autogen: py314
* bump py?
2025-11-01 04:02:19 +08:00
nimlgen
629b177b66
amd: sqtt works in profile mode ( #13019 )
2025-10-30 23:48:52 +08:00
nimlgen
4d7a7096c9
am: enable perfmon ( #13013 )
...
* am: enable perfmon
* try
* msg
2025-10-30 22:28:36 +08:00
nimlgen
a6f5b1482e
amd: perf counters ( #12975 )
...
* amd: perf counters
* sq
* cleaner
* fix
* if enabled
* ruff
* mypy
* counters
* reset
* fix
* no cpu
2025-10-30 00:10:31 +08:00
nimlgen
1ad6598963
amd: trace all instructions ( #12831 )
2025-10-21 20:52:24 +08:00
qazal
cd6aeebfee
sqtt: osx decoder installer ( #12637 )
2025-10-13 17:26:12 +08:00
nimlgen
89be3590aa
amd: sqtt on gfx12 ( #12564 )
...
* amd: sqtt on gfx12
* cleaner
* thi
* and this
* ops
* ugh
* back
* rm this
* rm
2025-10-10 17:54:14 +08:00
nimlgen
1309cea247
rocprof parser in extra ( #12569 )
...
* rocprof parser
* viewer
* vw
* skip
2025-10-10 14:56:42 +08:00
nimlgen
a11b686c71
amd: sqtt for all gfx11 ( #12546 )
...
* amd: general sqtt for gfx11
* target
* ops
* no gfx12 here
2025-10-09 17:04:06 +08:00
qazal
a388d2cb1a
remove PROFILE=1 option, it's just VIZ=1 [pr] ( #12176 )
...
* remove PROFILE=1 option, it's just VIZ=1 [pr]
* sqtt
* sqtt 2
* return last
* rename
2025-09-15 12:51:50 +03:00
qazal
577e581943
fix typo in sqtt/readme ( #11281 )
2025-07-19 15:10:24 +03:00
Andrey
7b865ed03d
use tuple in isinstance for type checking ( #9583 )
2025-03-26 19:36:48 +08:00
uuuvn
e85001b6ee
SQTT profiling ( #9278 )
...
* sqtt
* docs
* multi-device
* ProfileSQTTEvent
* exec update
* 256mb default
* don't let people hang their gpus
* bitfields from autogen
* asic info from mesa
* more bitfields from autogen
* SQTT_ITRACE_SE_MASK
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-03-11 13:19:56 +08:00