nimlgen
0c9fbf87e1
nvioctl: classes (#13346)
2025-11-19 16:14:15 +03:00
wozeparrot
be72b78dcb
tk: small fixes (#13345)
* fix: handle case where final uop isn't a tk wrapped one
* clean: remove after from mma
2025-11-19 00:58:50 -08:00
qazal
a647c9eca6
sqtt ui minor fixes (#13335)
* roc.py cleanups
* direct append
* viz index cleanup
* simd row details
2025-11-19 01:27:56 +08:00
nimlgen
331f70aa75
roc: ctrlc (#13255)
* roc: ctrl-c works
* rm
2025-11-18 19:29:28 +08:00
George Hotz
6d3385c284
print special ops in postrange (#13318)
* print special ops in postrange
* fix on OSX
2025-11-17 14:43:23 -08:00
George Hotz
98e9e73286
hotfix: amd_uop_matmul getenvs
2025-11-17 13:26:01 -08:00
qazal
e7e1935225
cleanup sqtt/test_timing (#13315)
2025-11-18 04:28:05 +08:00
wozeparrot
33773fda87
tk initial mi350 (#13289)
2025-11-17 11:46:32 -08:00
nimlgen
e2cee64050
Revert "hcq: add tag to exec events (#13311)" (#13314)
This reverts commit f63ded5817.
2025-11-17 22:15:31 +03:00
nimlgen
f63ded5817
hcq: add tag to exec events (#13311)
* hcq: add tag to exec events
* f
* fix
* fix
2025-11-17 16:59:30 +03:00
qazal
50a443f558
viz: add shader engine to wave exec payload (#13310)
* viz: show sqtt shader engine
* order it from smallest unit
* easier to config
2025-11-17 19:11:34 +08:00
George Hotz
55be95da15
cleanup sqtt raw parser (#13309)
* cleanup sqtt raw parser
* better names (don't merge yet)
* clean up amd
* a few more names
* one more filter
2025-11-16 13:11:51 -08:00
George Hotz
cabd4add48
more work parsing SQTT, separate VIZ/PROFILE (#13308)
* more work parsing SQTT
* more minimal runner
* sep VIZ/PROFILE
* parse print new
* improve parser
* more filter
* that
* split them
* lil cleanup
* skip flaky test
* AQL in mmapeak
2025-11-16 10:40:39 -08:00
qazal
13efdf8c31
test s_nop stall (#13307)
2025-11-17 00:59:39 +08:00
George Hotz
295600dc5a
saturday coffee shop work parsing the att format (#13295)
* saturday coffee shop work parsing the att format
* add examples
* parser
* classes of packets
* fully vibe coded parser
* vibing
* empty
* some vibe names
* vibes
* most of these are wrong
* more vibes
* better names
* parsing
* parse
* cleanup parser
* touchups
2025-11-16 08:25:51 -08:00
qazal
c70b06ec19
sqtt test_timing work (#13304)
* sqtt test_timing cleanups
* only the instruction
* v_mfma_f32_16x16x32_f16 16 cycles, only after second one though
2025-11-16 23:49:24 +08:00
wozeparrot
ef42334239
tk: load store cleanup (#13290)
2025-11-15 17:08:23 -08:00
qazal
7c110e1a57
viz: minor cleanups for sqtt (#13275)
* small prg cleanup
* test_timing
2025-11-15 01:08:56 +08:00
qazal
2ee701a009
roc: fix CEnum access (#13270)
* roc: add decoder to ci
* also add installer
* use CEnum syntax
* try 2
* add to setup
* revert ci change
* the other enum too
2025-11-14 21:41:24 +08:00
nimlgen
14eb48b13a
autogen: rename nv_gpu to nv_570 (#13273)
* autogen: rename nv_gpu to nv_570
* rename
2025-11-14 20:07:19 +08:00
Christopher Milan
09f3aae169
In-tree autogen: all C libraries (#13220)
* checkout files from autogen branch
* ioctl with payload
* fix am generations
* properly fix generations
This reverts commit b2a54f4f41.
* revert discovery.h
* support pragma pack(1)
* typo
* better getter
* typo
* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE
* align support
* anon handling fix
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
wozeparrot
777cbec5b3
tk: rename rt tile dims to base (#13265)
2025-11-13 18:43:02 -08:00
wozeparrot
7eb0d8e744
feat: mixins on tiles (#13246)
2025-11-13 16:52:52 -08:00
George Hotz
ba84d415fe
work from benchmarking tinybox red v2 (#13264)
* work from benchmarking tinybox red v2
* gpuburn
2025-11-13 16:38:40 -08:00
wozeparrot
547304c471
tk: group cleanup (#13262)
2025-11-13 14:19:51 -08:00
wozeparrot
4ada51618f
tk: don't flatten in clear (#13249)
2025-11-13 13:38:01 -08:00
George Hotz
faf68c03a8
more mi350x matmul work (#13138)
* more mi350x matmul work
* broken compute
2025-11-13 09:09:28 -08:00
alpharush
7e0aaadecd
feat: add repro command to summary (#10930)
2025-11-13 08:52:27 -08:00
nimlgen
f9b7586e08
roc: fix blob gc (#13256)
2025-11-13 23:38:35 +08:00
qazal
006dea4c3e
roc: only save instruction execs (#13254)
2025-11-13 21:28:40 +08:00
George Hotz
17aa3379e9
hotfix: improve self_tokenize
2025-11-13 00:18:57 -08:00
qazal
be2e24cb25
roc: requires sudo to install (#13237)
2025-11-12 16:59:22 -05:00
George Hotz
8f1f195b6d
hotfix: no hexdump for usbgpu patch.py
2025-11-12 12:05:37 -08:00
qazal
8b26cf2b3d
sqtt: update rcp timing test (#13231)
* sqtt: assert correct output in timing test
* found why
2025-11-13 02:01:54 +08:00
nimlgen
af17e07251
viz: sqtt touchups (#13228)
* viz: sqtt touchups
* revert
* matches
2025-11-12 22:40:37 +08:00
nimlgen
fcd8d0751a
test_timing for hip (#13229)
2025-11-12 20:28:58 +08:00
wozeparrot
371c1f2355
tk: move tiles to class (#13224)
2025-11-11 21:53:46 -08:00
wozeparrot
787f0070ed
feat: don't use output reg as local reduce reg (#13203)
2025-11-11 14:35:16 -08:00
George Hotz
0c978d45e6
stub attention (#13196)
* stub attention
* name the kernels
2025-11-10 13:48:38 -08:00
qazal
50934050bc
sqtt: append all wave execs (#13190)
2025-11-10 23:50:08 +08:00
qazal
38a24731a1
cleanup sqtt tooling (#13188)
* cleanup viz/serve.py
* use latest profile in rgptool.py
* unwrap nullable in roc.py, fix disasms typing
2025-11-10 20:52:57 +08:00
wozeparrot
6252831ceb
feat: initial tk library (#13160)
2025-11-09 22:54:29 -08:00
George Hotz
d7369de048
hotfix: update weekly commits table
2025-11-09 19:37:06 -08:00
nimlgen
614783693e
nv: remove hardcoded expansion_rom_off (#13180)
* nv: remove hardcoded expansion_rom_off
* to max size
2025-11-09 21:43:19 +08:00
nimlgen
10dc8335d2
tinygpu: fix teardown crash (#13143)
* tinygpu: fix crash
* um?
* double release
* restore
2025-11-07 19:52:54 +08:00
qazal
7e94369464
add helper for test_timing custom ops (#13140)
2025-11-07 17:13:55 +08:00
nimlgen
95620426d5
tinygpu: unmap dma when client closed (#13129)
* tinygpu: unmap dma when client closed
* syn
* tiny fixes
2025-11-07 16:08:43 +08:00
nimlgen
b9b68bf437
amd: add kern to sqtt event (#13126)
* amd: add kern to sqtt event
* fix
2025-11-06 22:02:02 +08:00
qazal
88245d6579
qol improvements to sqtt decoder and timing tests (#13125)
2025-11-06 20:51:30 +08:00
George Hotz
bcfe42937f
move permute/flip/shrink to mixins (#13113)
* move permute to mixins
* move more stuff
* two more
* fix local mypy
* fix tests
* fix shrink
2025-11-05 14:14:15 -08:00