Commit Graph

1363 Commits

Author SHA1 Message Date
nimlgen
0c9fbf87e1 nvioctl: classes (#13346) 2025-11-19 16:14:15 +03:00
wozeparrot
be72b78dcb tk: small fixes (#13345)
* fix: handle case where final uop isn't a tk wrapped one

* clean: remove after from mma
2025-11-19 00:58:50 -08:00
qazal
a647c9eca6 sqtt ui minor fixes (#13335)
* roc.py cleanups

* direct append

* viz index cleanup

* simd row details
2025-11-19 01:27:56 +08:00
nimlgen
331f70aa75 roc: ctrlc (#13255)
* roc: ctrl-c works

* rm
2025-11-18 19:29:28 +08:00
George Hotz
6d3385c284 print special ops in postrange (#13318)
* print special ops in postrange

* fix on OSX
2025-11-17 14:43:23 -08:00
George Hotz
98e9e73286 hotfix: amd_uop_matmul getenvs 2025-11-17 13:26:01 -08:00
qazal
e7e1935225 cleanup sqtt/test_timing (#13315) 2025-11-18 04:28:05 +08:00
wozeparrot
33773fda87 tk initial mi350 (#13289) 2025-11-17 11:46:32 -08:00
nimlgen
e2cee64050 Revert "hcq: add tag to exec events (#13311)" (#13314)
This reverts commit f63ded5817.
2025-11-17 22:15:31 +03:00
nimlgen
f63ded5817 hcq: add tag to exec events (#13311)
* hcq: add tag to exec events

* f

* fix

* fix
2025-11-17 16:59:30 +03:00
qazal
50a443f558 viz: add shader engine to wave exec payload (#13310)
* viz: show sqtt shader engine

* order it from smallest unit

* easier to config
2025-11-17 19:11:34 +08:00
George Hotz
55be95da15 cleanup sqtt raw parser (#13309)
* cleanup sqtt raw parser

* better names (don't merge yet)

* clean up amd

* a few more names

* one more filter
2025-11-16 13:11:51 -08:00
George Hotz
cabd4add48 more work parsing SQTT, separate VIZ/PROFILE (#13308)
* more work parsing SQTT

* more minimal runner

* sep VIZ/PROFILE

* parse print new

* improve parser

* more filter

* that

* split them

* lil cleanup

* skip flaky test

* AQL in mmapeak
2025-11-16 10:40:39 -08:00
qazal
13efdf8c31 test s_nop stall (#13307) 2025-11-17 00:59:39 +08:00
George Hotz
295600dc5a saturday coffee shop work parsing the att format (#13295)
* saturday coffee shop work parsing the att format

* add examples

* parser

* classes of packets

* fully vibe coded parser

* vibing

* empty

* some vibe names

* vibes

* most of these are wrong

* more vibes

* better names

* parsing

* parse

* cleanup parser

* touchups
2025-11-16 08:25:51 -08:00
qazal
c70b06ec19 sqtt test_timing work (#13304)
* sqtt test_timing cleanups

* only the instruction

* v_mfma_f32_16x16x32_f16 16 cycles, only after second one though
2025-11-16 23:49:24 +08:00
wozeparrot
ef42334239 tk: load store cleanup (#13290) 2025-11-15 17:08:23 -08:00
qazal
7c110e1a57 viz: minor cleanups for sqtt (#13275)
* small prg cleanup

* test_timing
2025-11-15 01:08:56 +08:00
qazal
2ee701a009 roc: fix CEnum access (#13270)
* roc: add decoder to ci

* also add installer

* use CEnum syntax

* try 2

* add to setup

* revert ci change

* the other enum too
2025-11-14 21:41:24 +08:00
nimlgen
14eb48b13a autogen: rename nv_gpu to nv_570 (#13273)
* autogen: rename nv_gpu to nv_570

* rename
2025-11-14 20:07:19 +08:00
Christopher Milan
09f3aae169 In-tree autogen: all C libraries (#13220)
* checkout files from autogen branch

* ioctl with payload

* fix am generations

* properly fix generations

This reverts commit b2a54f4f41.

* revert discovery.h

* support pragma pack(1)

* typo

* better getter

* typo

* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE

* align support

* anon handling fix

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
wozeparrot
777cbec5b3 tk: rename rt tile dims to base (#13265) 2025-11-13 18:43:02 -08:00
wozeparrot
7eb0d8e744 feat: mixins on tiles (#13246) 2025-11-13 16:52:52 -08:00
George Hotz
ba84d415fe work from benchmarking tinybox red v2 (#13264)
* work from benchmarking tinybox red v2

* gpuburn
2025-11-13 16:38:40 -08:00
wozeparrot
547304c471 tk: group cleanup (#13262) 2025-11-13 14:19:51 -08:00
wozeparrot
4ada51618f tk: don't flatten in clear (#13249) 2025-11-13 13:38:01 -08:00
George Hotz
faf68c03a8 more mi350x matmul work (#13138)
* more mi350x matmul work

* broken compute
2025-11-13 09:09:28 -08:00
alpharush
7e0aaadecd feat: add repro command to summary (#10930) 2025-11-13 08:52:27 -08:00
nimlgen
f9b7586e08 roc: fix blob gc (#13256) 2025-11-13 23:38:35 +08:00
qazal
006dea4c3e roc: only save instruction execs (#13254) 2025-11-13 21:28:40 +08:00
George Hotz
17aa3379e9 hotfix: improve self_tokenize 2025-11-13 00:18:57 -08:00
qazal
be2e24cb25 roc: requires sudo to install (#13237) 2025-11-12 16:59:22 -05:00
George Hotz
8f1f195b6d hotfix: no hexdump for usbgpu patch.py 2025-11-12 12:05:37 -08:00
qazal
8b26cf2b3d sqtt: update rcp timing test (#13231)
* sqtt: assert correct output in timing test

* found why
2025-11-13 02:01:54 +08:00
nimlgen
af17e07251 viz: sqtt touchups (#13228)
* viz: sqtt touchups

* revert

* matches
2025-11-12 22:40:37 +08:00
nimlgen
fcd8d0751a test_timing for hip (#13229) 2025-11-12 20:28:58 +08:00
wozeparrot
371c1f2355 tk: move tiles to class (#13224) 2025-11-11 21:53:46 -08:00
wozeparrot
787f0070ed feat: don't use output reg as local reduce reg (#13203) 2025-11-11 14:35:16 -08:00
George Hotz
0c978d45e6 stub attention (#13196)
* stub attention

* name the kernels
2025-11-10 13:48:38 -08:00
qazal
50934050bc sqtt: append all wave execs (#13190) 2025-11-10 23:50:08 +08:00
qazal
38a24731a1 cleanup sqtt tooling (#13188)
* cleanup viz/serve.py

* use latest profile in rgptool.py

* unwrap nullable in roc.py, fix disasms typing
2025-11-10 20:52:57 +08:00
wozeparrot
6252831ceb feat: initial tk library (#13160) 2025-11-09 22:54:29 -08:00
George Hotz
d7369de048 hotfix: update weekly commits table 2025-11-09 19:37:06 -08:00
nimlgen
614783693e nv: remove hardcoded expansion_rom_off (#13180)
* nv: remove hardcoded expansion_rom_off

* to max size
2025-11-09 21:43:19 +08:00
nimlgen
10dc8335d2 tinygpu: fix teardown crash (#13143)
* tinygpu: fix crash

* um?

* double relase

* restore
2025-11-07 19:52:54 +08:00
qazal
7e94369464 add helper for test_timing custom ops (#13140) 2025-11-07 17:13:55 +08:00
nimlgen
95620426d5 tinygpu: unmap dma when client closed (#13129)
* tinygpu: unmap dma when client closed

* syn

* tiny fixes
2025-11-07 16:08:43 +08:00
nimlgen
b9b68bf437 amd: add kern to sqtt event (#13126)
* amd: add kern to sqtt event

* fix
2025-11-06 22:02:02 +08:00
qazal
88245d6579 qol improvements to sqtt decoder and timing tests (#13125) 2025-11-06 20:51:30 +08:00
George Hotz
bcfe42937f move permute/flip/shrink to mixins (#13113)
* move permute to mixins

* move more stuff

* two more

* fix local mypy

* fix tests

* fix shrink
2025-11-05 14:14:15 -08:00