11201 Commits

Author SHA1 Message Date
nimlgen
d37e1fe065 nv: wait for wpr to reset (#13282)
* nv: wait for wpr to reset

* fix

* comment

* wai

* f

* fix
2025-11-15 20:00:49 +08:00
George Hotz
22c08b470c fold using outerworld range (#13286)
* scan using outerworld range

* almost

* sched

* simple range

* mypy

* woooo outer range

* spec passes

* print the numbers

* lol it runs

* real test
2025-11-14 20:43:41 -08:00
George Hotz
567066f51f tests for cast there and back (#13195)
* fix cast folding in llama

* dtypes that work everywhere

* Skip test_cast_there_and_back for backend casts

Skip test due to backend casting issues.
2025-11-14 16:56:09 -08:00
George Hotz
6c5fa349e1 add (unused) outer range (#13285) 2025-11-14 16:47:52 -08:00
Christopher Milan
d1bb08c5a1 In-tree autogen: objective c (#13223)
* checkout changes from autogen branch

* move assert

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-14 14:08:42 -08:00
George Hotz
e5351699bd openpilot warp (#13283)
* openpilot image warp test

* 0.4 ms on metal, 1 ms on CPU

* new inputs each time

* reshape
2025-11-14 13:55:32 -08:00
qazal
7c110e1a57 viz: minor cleanups for sqtt (#13275)
* small prg cleanup

* test_timing
2025-11-15 01:08:56 +08:00
chenyu
888aaab151 test_tiny cleanup (#13276) 2025-11-14 11:11:32 -05:00
nimlgen
3e63831b98 nv: support 580+ drivers (#13269)
* nv: 580+ support

* start

* f

* fake

* fix
2025-11-14 21:44:16 +08:00
qazal
2ee701a009 roc: fix CEnum access (#13270)
* roc: add decoder to ci

* also add installer

* use CEnum syntax

* try 2

* add to setup

* revert ci change

* the other enum too
2025-11-14 21:41:24 +08:00
nimlgen
c80d459d99 autogen: fix packed args structs (#13274)
* autogen: fix packed args structs

* and test this
2025-11-14 20:24:06 +08:00
nimlgen
14eb48b13a autogen: rename nv_gpu to nv_570 (#13273)
* autogen: rename nv_gpu to nv_570

* rename
2025-11-14 20:07:19 +08:00
nimlgen
734bfa07b4 nv: refactor uvm calls (#13272) 2025-11-14 19:53:04 +08:00
nimlgen
f72b1fbca4 nv: read numClasses (#13271)
* nv: read numClasses

* fix

* d
2025-11-14 19:43:25 +08:00
nimlgen
84f065f2a2 autogen: warning and msg (#13268)
* autogen: warning and msg

* f
2025-11-14 19:10:26 +08:00
George Hotz
44d84228ff move comgr_3 logic back to the old place (#13266)
* move comgr_3 logic back to the old place

* explicit
2025-11-13 20:05:54 -08:00
Christopher Milan
09f3aae169 In-tree autogen: all C libraries (#13220)
* checkout files from autogen branch

* ioctl with payload

* fix am generations

* properly fix generations

This reverts commit b2a54f4f41.

* revert discovery.h

* support pragma pack(1)

* typo

* better getter

* typo

* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE

* align support

* anon handling fix

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
wozeparrot
777cbec5b3 tk: rename rt tile dims to base (#13265) 2025-11-13 18:43:02 -08:00
wozeparrot
7eb0d8e744 feat: mixins on tiles (#13246) 2025-11-13 16:52:52 -08:00
George Hotz
ba84d415fe work from benchmarking tinybox red v2 (#13264)
* work from benchmarking tinybox red v2

* gpuburn
2025-11-13 16:38:40 -08:00
wozeparrot
547304c471 tk: group cleanup (#13262) 2025-11-13 14:19:51 -08:00
wozeparrot
4ada51618f tk: don't flatten in clear (#13249) 2025-11-13 13:38:01 -08:00
George Hotz
6b1bae6614 ruff format mixin (#13261) 2025-11-13 10:10:38 -08:00
Faizaan Gagan
3049f3edda support _rebuild_tensor method interception (#13253) 2025-11-13 09:41:21 -08:00
Harald Schäfer
3af231904e openpilot compile tests: assert pre-rangify speeds (#12775)
* assert pre-rangify speeds

* typo

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 09:39:06 -08:00
George Hotz
faf68c03a8 more mi350x matmul work (#13138)
* more mi350x matmul work

* broken compute
2025-11-13 09:09:28 -08:00
Ayman Jabr
256f81bb02 Fix tracemeta 0 (#13049)
* chore: tclesius branch resolved

* fix: indentation

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 09:07:11 -08:00
alpharush
7e0aaadecd feat: add repro command to summary (#10930) 2025-11-13 08:52:27 -08:00
nimlgen
6be86dde17 nv: add timeout when responding to rpc (#13260) 2025-11-14 00:42:21 +08:00
nimlgen
f9b7586e08 roc: fix blob gc (#13256) 2025-11-13 23:38:35 +08:00
George Hotz
263b724143 one cache and bump it (#13258) 2025-11-13 07:33:31 -08:00
George Hotz
5efa727b83 move _pool to MovementMixins (#13257) 2025-11-13 07:28:52 -08:00
George Hotz
bcdfc109b5 hotfix: disable flaky test 2025-11-13 06:19:28 -08:00
qazal
006dea4c3e roc: only save instruction execs (#13254) 2025-11-13 21:28:40 +08:00
nimlgen
f9586b38ba system: pci mask and val (#13251) 2025-11-13 20:44:58 +08:00
George Hotz
7316da3253 new readme (#13250)
* new readme

* update
2025-11-13 00:48:28 -08:00
George Hotz
17aa3379e9 hotfix: improve self_tokenize 2025-11-13 00:18:57 -08:00
chenyu
4e5a9132e7 JIT_BATCH_SIZE=0 in compile3 (#13245)
fixed some enqueue time
2025-11-12 23:12:45 -05:00
wozeparrot
759557f633 feat: move tk tests to testextra (#13242) 2025-11-12 17:06:53 -08:00
chenyu
3f939f3d3c update pm_simplify_valid (#13241)
* update pm_simplify_valid

fixed openpilot conv regression

* IMAGE training is broken
2025-11-12 19:40:02 -05:00
chenyu
f9851a852f minor update to uop_given_valid [pr] (#13243)
split from #13241
2025-11-12 19:03:18 -05:00
qazal
fe2876a6d8 hotfix: second GB/s in viz (#13240) 2025-11-13 07:14:27 +08:00
George Hotz
a23dea202b actually make AMD_LLVM not default (#13238) 2025-11-12 15:07:23 -08:00
George Hotz
ab9fa964d8 DISABLE_COMPILER_CACHE -> CCACHE (#13234)
* DISABLE_COMPILER_CACHE -> CCACHE

* Fix cachekey assignment in Compiler constructor
2025-11-12 15:07:09 -08:00
qazal
be2e24cb25 roc: requires sudo to install (#13237) 2025-11-12 16:59:22 -05:00
George Hotz
8f1f195b6d hotfix: no hexdump for usbgpu patch.py 2025-11-12 12:05:37 -08:00
nimlgen
9a53fcbde4 amd: sqtt on rdna3.5 (#13233) 2025-11-13 03:30:42 +08:00
George Hotz
13f10a31dc AMD_LLVM default off (#13232) 2025-11-12 11:06:33 -08:00
qazal
8b26cf2b3d sqtt: update rcp timing test (#13231)
* sqtt: assert correct output in timing test

* found why
2025-11-13 02:01:54 +08:00
Jan Akhremchik
bc8e537423 Add NONZERO op to onnx backend (#13211) 2025-11-12 08:55:51 -08:00