George Hotz
28c2776999
fix on OSX
2025-11-17 14:19:39 -08:00
George Hotz
e13580a1d7
print special ops in postrange
2025-11-17 13:35:29 -08:00
George Hotz
98e9e73286
hotfix: amd_uop_matmul getenvs
2025-11-17 13:26:01 -08:00
qazal
e7e1935225
cleanup sqtt/test_timing ( #13315 )
2025-11-18 04:28:05 +08:00
wozeparrot
33773fda87
tk initial mi350 ( #13289 )
2025-11-17 11:46:32 -08:00
nimlgen
e2cee64050
Revert "hcq: add tag to exec events ( #13311 )" ( #13314 )
...
This reverts commit f63ded5817.
2025-11-17 22:15:31 +03:00
chenyu
646372490c
move tiktoken import in llama3 ( #13316 )
...
only Tokenizer requires that
2025-11-17 14:09:37 -05:00
qazal
a37f221e44
viz: visualize waves in the timeline ( #13292 )
...
* viz: visualize waves in the timeline
* timeline in format
* per step
* rm that
2025-11-17 22:04:21 +08:00
nimlgen
f63ded5817
hcq: add tag to exec events ( #13311 )
...
* hcq: add tag to exec events
* f
* fix
* fix
2025-11-17 16:59:30 +03:00
qazal
50a443f558
viz: add shader engine to wave exec payload ( #13310 )
...
* viz: show sqtt shader engine
* order it from smallest unit
* easier to config
2025-11-17 19:11:34 +08:00
nimlgen
9bb17c53ea
amd: timer fix ( #13267 )
2025-11-17 13:59:03 +03:00
George Hotz
55be95da15
cleanup sqtt raw parser ( #13309 )
...
* cleanup sqtt raw parser
* better names (don't merge yet)
* clean up amd
* a few more names
* one more filter
2025-11-16 13:11:51 -08:00
George Hotz
cabd4add48
more work parsing SQTT, separate VIZ/PROFILE ( #13308 )
...
* more work parsing SQTT
* more minimal runner
* sep VIZ/PROFILE
* parse print new
* improve parser
* more filter
* that
* split them
* lil cleanup
* skip flaky test
* AQL in mmapeak
2025-11-16 10:40:39 -08:00
qazal
13efdf8c31
test s_nop stall ( #13307 )
2025-11-17 00:59:39 +08:00
George Hotz
295600dc5a
saturday coffee shop work parsing the att format ( #13295 )
...
* saturday coffee shop work parsing the att format
* add examples
* parser
* classes of packets
* fully vibe coded parser
* vibing
* empty
* some vibe names
* vibes
* most of these are wrong
* more vibes
* better names
* parsing
* parse
* cleanup parser
* touchups
2025-11-16 08:25:51 -08:00
Christopher Milan
a9ed241172
properly suppress NIRRenderer.__del__ error ( #13299 )
2025-11-16 18:58:04 +03:00
qazal
c70b06ec19
sqtt test_timing work ( #13304 )
...
* sqtt test_timing cleanups
* only the instruction
* v_mfma_f32_16x16x32_f16 16 cycles, only after second one though
2025-11-16 23:49:24 +08:00
chenyu
8f0e747b3a
Tensor._tri with arange ( #13297 )
2025-11-16 10:21:16 -05:00
chenyu
6372c95094
disable benchmark MobileNetV2 on DSP ( #13305 )
...
failed on tinyc2
2025-11-16 09:42:52 -05:00
Christopher Milan
61625a3898
fix objc finalizing bug ( #13296 )
2025-11-16 12:43:04 +03:00
nimlgen
acbe6361ab
qcom: suppress_finalizing to free ( #13294 )
2025-11-16 11:49:23 +03:00
wozeparrot
ef42334239
tk: load store cleanup ( #13290 )
2025-11-15 17:08:23 -08:00
chenyu
e8844853ed
Tensor.eye with arange ( #13287 )
...
with rangify we can write these with arange
2025-11-15 12:32:27 -05:00
Christopher Milan
5b823af696
Remove (pypi) clang dep for autogen ( #13284 )
...
* no more clang
* regen comgr_3
* ci doesn't need pypi clang
* fix objc
* REGEN for libclang
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-15 09:05:11 -08:00
George Hotz
df53c62a9f
bump line count
2025-11-15 08:16:20 -08:00
nimlgen
d37e1fe065
nv: wait for wpr to reset ( #13282 )
...
* nv: wait for wpr to reset
* fix
* comment
* wai
* f
* fix
2025-11-15 20:00:49 +08:00
George Hotz
22c08b470c
fold using outerworld range ( #13286 )
...
* scan using outerworld range
* almost
* sched
* simple range
* mypy
* woooo outer range
* spec passes
* print the numbers
* lol it runs
* real test
2025-11-14 20:43:41 -08:00
George Hotz
567066f51f
tests for cast there and back ( #13195 )
...
* fix cast folding in llama
* dtypes that work everywhere
* Skip test_cast_there_and_back for backend casts
Skip test due to backend casting issues.
2025-11-14 16:56:09 -08:00
George Hotz
6c5fa349e1
add (unused) outer range ( #13285 )
2025-11-14 16:47:52 -08:00
Christopher Milan
d1bb08c5a1
In-tree autogen: objective c ( #13223 )
...
* checkout changes from autogen branch
* move assert
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-14 14:08:42 -08:00
George Hotz
e5351699bd
openpilot warp ( #13283 )
...
* openpilot image warp test
* 0.4 ms on metal, 1 ms on CPU
* new inputs each time
* reshape
2025-11-14 13:55:32 -08:00
qazal
7c110e1a57
viz: minor cleanups for sqtt ( #13275 )
...
* small prg cleanup
* test_timing
2025-11-15 01:08:56 +08:00
chenyu
888aaab151
test_tiny cleanup ( #13276 )
2025-11-14 11:11:32 -05:00
nimlgen
3e63831b98
nv: support 580+ drivers ( #13269 )
...
* nv: 580+ support
* start
* f
* fake
* fix
2025-11-14 21:44:16 +08:00
qazal
2ee701a009
roc: fix CEnum access ( #13270 )
...
* roc: add decoder to ci
* also add installer
* use CEnum syntax
* try 2
* add to setup
* revert ci change
* the other enum too
2025-11-14 21:41:24 +08:00
nimlgen
c80d459d99
autogen: fix packed args structs ( #13274 )
...
* autogen: fix packed args structs
* and test this
2025-11-14 20:24:06 +08:00
nimlgen
14eb48b13a
autogen: rename nv_gpu to nv_570 ( #13273 )
...
* autogen: rename nv_gpu to nv_570
* rename
2025-11-14 20:07:19 +08:00
nimlgen
734bfa07b4
nv: refactor uvm calls ( #13272 )
2025-11-14 19:53:04 +08:00
nimlgen
f72b1fbca4
nv: read numClasses ( #13271 )
...
* nv: read numClasses
* fix
* d
2025-11-14 19:43:25 +08:00
nimlgen
84f065f2a2
autogen: warning and msg ( #13268 )
...
* autogen: warning and msg
* f
2025-11-14 19:10:26 +08:00
George Hotz
44d84228ff
move comgr_3 logic back to the old place ( #13266 )
...
* move comgr_3 logic back to the old place
* explicit
2025-11-13 20:05:54 -08:00
Christopher Milan
09f3aae169
In-tree autogen: all C libraries ( #13220 )
...
* checkout files from autogen branch
* ioctl with payload
* fix am generations
* properly fix generations
This reverts commit b2a54f4f41.
* revert discovery.h
* support pragma pack(1)
* typo
* better getter
* typo
* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE
* align support
* anon handling fix
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
wozeparrot
777cbec5b3
tk: rename rt tile dims to base ( #13265 )
2025-11-13 18:43:02 -08:00
wozeparrot
7eb0d8e744
feat: mixins on tiles ( #13246 )
2025-11-13 16:52:52 -08:00
George Hotz
ba84d415fe
work from benchmarking tinybox red v2 ( #13264 )
...
* work from benchmarking tinybox red v2
* gpuburn
2025-11-13 16:38:40 -08:00
wozeparrot
547304c471
tk: group cleanup ( #13262 )
2025-11-13 14:19:51 -08:00
wozeparrot
4ada51618f
tk: don't flatten in clear ( #13249 )
2025-11-13 13:38:01 -08:00
George Hotz
6b1bae6614
ruff format mixin ( #13261 )
2025-11-13 10:10:38 -08:00
Faizaan Gagan
3049f3edda
support _rebuild_tensor method interception ( #13253 )
2025-11-13 09:41:21 -08:00
Harald Schäfer
3af231904e
openpilot compile tests: assert pre-rangify speeds ( #12775 )
...
* assert pre-rangify speeds
* typo
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 09:39:06 -08:00