alpharush
|
7e0aaadecd
|
feat: add repro command to summary (#10930)
|
2025-11-13 08:52:27 -08:00 |
|
nimlgen
|
6be86dde17
|
nv: add timeout when responding to rpc (#13260)
|
2025-11-14 00:42:21 +08:00 |
|
nimlgen
|
f9b7586e08
|
roc: fix blob gc (#13256)
|
2025-11-13 23:38:35 +08:00 |
|
George Hotz
|
263b724143
|
one cache and bump it (#13258)
|
2025-11-13 07:33:31 -08:00 |
|
George Hotz
|
5efa727b83
|
move _pool to MovementMixins (#13257)
|
2025-11-13 07:28:52 -08:00 |
|
George Hotz
|
bcdfc109b5
|
hotfix: disable flaky test
|
2025-11-13 06:19:28 -08:00 |
|
qazal
|
006dea4c3e
|
roc: only save instruction execs (#13254)
|
2025-11-13 21:28:40 +08:00 |
|
nimlgen
|
f9586b38ba
|
system: pci mask and val (#13251)
|
2025-11-13 20:44:58 +08:00 |
|
George Hotz
|
7316da3253
|
new readme (#13250)
* new readme
* update
|
2025-11-13 00:48:28 -08:00 |
|
George Hotz
|
17aa3379e9
|
hotfix: improve self_tokenize
|
2025-11-13 00:18:57 -08:00 |
|
chenyu
|
4e5a9132e7
|
JIT_BATCH_SIZE=0 in compile3 (#13245)
fixed some enqueue time
|
2025-11-12 23:12:45 -05:00 |
|
wozeparrot
|
759557f633
|
feat: move tk tests to testextra (#13242)
|
2025-11-12 17:06:53 -08:00 |
|
chenyu
|
3f939f3d3c
|
update pm_simplify_valid (#13241)
* update pm_simplify_valid
fixed openpilot conv regression
* IMAGE training is broken
|
2025-11-12 19:40:02 -05:00 |
|
chenyu
|
f9851a852f
|
minor update to uop_given_valid [pr] (#13243)
split from #13241
|
2025-11-12 19:03:18 -05:00 |
|
qazal
|
fe2876a6d8
|
hotfix: second GB/s in viz (#13240)
|
2025-11-13 07:14:27 +08:00 |
|
George Hotz
|
a23dea202b
|
actually make AMD_LLVM not default (#13238)
|
2025-11-12 15:07:23 -08:00 |
|
George Hotz
|
ab9fa964d8
|
DISABLE_COMPILER_CACHE -> CCACHE (#13234)
* DISABLE_COMPILER_CACHE -> CCACHE
* Fix cachekey assignment in Compiler constructor
|
2025-11-12 15:07:09 -08:00 |
|
qazal
|
be2e24cb25
|
roc: requires sudo to install (#13237)
|
2025-11-12 16:59:22 -05:00 |
|
George Hotz
|
8f1f195b6d
|
hotfix: no hexdump for usbgpu patch.py
|
2025-11-12 12:05:37 -08:00 |
|
nimlgen
|
9a53fcbde4
|
amd: sqtt on rdna3.5 (#13233)
|
2025-11-13 03:30:42 +08:00 |
|
George Hotz
|
13f10a31dc
|
AMD_LLVM default off (#13232)
|
2025-11-12 11:06:33 -08:00 |
|
qazal
|
8b26cf2b3d
|
sqtt: update rcp timing test (#13231)
* sqtt: assert correct output in timing test
* found why
|
2025-11-13 02:01:54 +08:00 |
|
Jan Akhremchik
|
bc8e537423
|
Add NONZERO op to onnx backend (#13211)
|
2025-11-12 08:55:51 -08:00 |
|
nimlgen
|
af17e07251
|
viz: sqtt touchups (#13228)
* viz: sqtt touchups
* revert
* matches
|
2025-11-12 22:40:37 +08:00 |
|
qazal
|
7a6853fa40
|
viz: show python callstack in the first graph (#13218)
|
2025-11-12 20:52:28 +08:00 |
|
nimlgen
|
82eb63d3ad
|
qcom: auto switch idle timer when profiling (#13230)
* qcom: auto switch idle timer when profiling
* fi
|
2025-11-12 20:31:24 +08:00 |
|
nimlgen
|
fcd8d0751a
|
test_timing for hip (#13229)
|
2025-11-12 20:28:58 +08:00 |
|
qazal
|
74b9d33acb
|
viz: direct link to program source (#13227)
|
2025-11-12 16:27:13 +08:00 |
|
wozeparrot
|
371c1f2355
|
tk: move tiles to class (#13224)
|
2025-11-11 21:53:46 -08:00 |
|
Christopher Milan
|
41a098a82d
|
In-tree autogen: libc.py (#13217)
* checkout changes from autogen branch
* parents
* pylint happy
* move sys to system in helpers.py
* typo
* typo
|
2025-11-11 19:13:48 -08:00 |
|
wozeparrot
|
222bb12ddf
|
tk softmax (#13205)
|
2025-11-11 15:13:16 -08:00 |
|
wozeparrot
|
787f0070ed
|
feat: don't use output reg as local reduce reg (#13203)
|
2025-11-11 14:35:16 -08:00 |
|
chenyu
|
ece1415def
|
clean up image_dot and image_conv2d (#13222)
* clean up image_dot and image_conv2d
* those are fine
* interesting
|
2025-11-11 15:53:03 -05:00 |
|
nimlgen
|
2f0ea29b34
|
qcom: 48bit timestamps (#13214)
* qcom: 48bit timestamps
* f
* lol
* fix
|
2025-11-12 04:14:33 +08:00 |
|
qazal
|
bc55bc4849
|
cleanup test_viz profiler tests (#13221)
|
2025-11-12 03:46:48 +08:00 |
|
chenyu
|
23b90945c3
|
add a benchmark for openpilot vision with DEBUG=2 (#13219)
see per kernel speed, also disable the jobs for 0.9.9
|
2025-11-11 14:41:52 -05:00 |
|
George Hotz
|
c2075f3613
|
gc disable during big rewrites (#13215)
* gc disable during big rewrites
* cleaner with helper
|
2025-11-11 10:30:47 -08:00 |
|
Roelof van Dijk
|
e59313da08
|
migrate pytest and ruff (#13216)
|
2025-11-11 13:27:51 -05:00 |
|
Gaétan Lepage
|
6fd7ce3832
|
migrate to pyproject.toml (#13189)
* migrate to pyproject.toml
* move mypy config to pyproject.toml
|
2025-11-11 09:09:27 -08:00 |
|
qazal
|
8002921a04
|
viz: improve the program run tooltip (#13212)
* add tflops to tooltip format
* show if the run was batched
|
2025-11-12 00:56:03 +08:00 |
|
qazal
|
f91e366a17
|
viz: display the graph layout recursion error (#13194)
* viz: display the graph layout recursion error
* share styles
* +min-width
* same thing
* inline the append
|
2025-11-11 15:25:12 +08:00 |
|
wozeparrot
|
73497af4c0
|
clean: use np for allclose (#13204)
|
2025-11-10 23:02:43 -08:00 |
|
George Hotz
|
a6360fd94d
|
store can have shape (#13202)
* store can have shape
* _shape
|
2025-11-10 22:16:47 -08:00 |
|
b1tg
|
f3692b7406
|
clean up hip renderer (#13063)
* clean up hip renderer
* ocml
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
|
2025-11-11 00:44:24 -05:00 |
|
chenyu
|
22b8579234
|
one last regressed dm kernel (#13201)
|
2025-11-10 23:30:52 -05:00 |
|
chenyu
|
58b7e4fab3
|
GROUPTOP heuristic on more axes (#13206)
fixed dm speed
|
2025-11-10 23:30:37 -05:00 |
|
chenyu
|
829cdafccc
|
update openpilot slow conv uop ast (#13197)
the two remaining slow ones
|
2025-11-10 17:03:20 -05:00 |
|
George Hotz
|
0c978d45e6
|
stub attention (#13196)
* stub attention
* name the kernels
|
2025-11-10 13:48:38 -08:00 |
|
chenyu
|
58c30fc7ce
|
minor image_conv2d cleanup (#13193)
|
2025-11-10 16:05:40 -05:00 |
|
chenyu
|
60e55d9a2d
|
line count 18500 (#13191)
|
2025-11-10 13:52:13 -05:00 |
|