SQTT Profiling

Getting SQ Thread Trace

Set VIZ=2 to enable SQTT profiling.

SQTT_ITRACE_SE_MASK=X selects the shader engines (SEs) used for instruction tracing: -1 traces all SEs, 0 disables instruction tracing, and a positive value is an SE bitmask. The default is 0b11.
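
A positive mask value can be built from a list of shader engine indices. The snippet below is a minimal sketch, assuming bit i of the mask selects shader engine i (the usual bitmask convention, which this README does not spell out). `se_mask` is a hypothetical helper for illustration and is not part of tinygrad.

```python
def se_mask(engines):
    """Return an integer with bit i set for every shader engine index i in `engines`.

    Assumption: bit i of SQTT_ITRACE_SE_MASK corresponds to shader engine i.
    """
    mask = 0
    for i in engines:
        if i < 0:
            raise ValueError("shader engine indices must be non-negative")
        mask |= 1 << i
    return mask

# Under this assumption the default 0b11 means shader engines 0 and 1.
print(se_mask([0, 1]))  # 3
```

The result would then be passed as SQTT_ITRACE_SE_MASK=3 alongside VIZ=2 when launching the program to be profiled.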

SQTT_BUFFER_SIZE=X sets the size of the SQTT buffer in megabytes. The size applies per shader engine (the 7900 XTX has 6 SEs). The default is 256.
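
Because the buffer size is per shader engine, the total memory reserved for traces scales with the SE count. The sketch below works through that arithmetic, assuming one buffer is allocated for every shader engine (the README does not say whether untraced SEs also get a buffer). `total_sqtt_buffer_mb` is a hypothetical helper, not a tinygrad function.

```python
def total_sqtt_buffer_mb(per_se_mb=256, num_se=6):
    """Total SQTT buffer size in megabytes.

    Assumption: one buffer of `per_se_mb` megabytes exists per shader engine.
    Defaults mirror the README: 256 MB per SE, 6 SEs on a 7900 XTX.
    """
    if per_se_mb <= 0 or num_se <= 0:
        raise ValueError("sizes and counts must be positive")
    return per_se_mb * num_se

# Default settings on a 7900 XTX: 256 MB * 6 SEs.
print(total_sqtt_buffer_mb())  # 1536
```

Doubling SQTT_BUFFER_SIZE to 512 would, under the same assumption, reserve 3072 MB in total on that card.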

Viewing the traces

  • Web UI: tinygrad/viz/serve.py
  • Command line: python -m tinygrad.renderer.amd.sqtt