10728 Commits

Author SHA1 Message Date
George Hotz
076bfa50e3 fix that 2025-10-23 10:24:25 +08:00
George Hotz
f4cea6a403 simpler 2025-10-23 10:21:27 +08:00
wozeparrot
6e00dec95d feat: pin openpilot 0.10.1 models (#12878) 2025-10-22 14:57:54 -07:00
wozeparrot
3a9aa05359 feat: extra nvcc options (#12876) 2025-10-22 13:21:11 -07:00
chenyu
f0831c8c30 add 0.10.0 to comma benchmark (#12875)
* add 0.10.0 to comma benchmark

disabled the 0.10.1 ones which are pinned to master. it does not work because benchmark uses the cached old version

* that's pinned
2025-10-22 15:18:21 -04:00
nimlgen
e7e535cd53 amd: sqtt for gfx9 (#12844)
* amd: start sqtt for gfx9

* writes something, but sometimes zeroes

* HEADER!

* w

* tiny

* mypy
2025-10-23 02:31:07 +08:00
b1tg
81108f91ee amd tc: 16x16x32 (#12874)
* amd tc: 16x16x32

* test

* clean, test amd_cdna4
2025-10-22 13:48:01 -04:00
George Hotz
bf173c0a37 we don't support multi end yet (#12869) 2025-10-22 23:43:32 +08:00
nimlgen
a7bc0104c2 amd: clean up sqtt_stop (#12872) 2025-10-22 22:17:03 +08:00
nimlgen
b6eb9172ea amd: fix ip offsets (#12867) 2025-10-22 20:50:18 +08:00
George Hotz
174811fc0f hotfix: slightly looser load spec for AMD bfloat16 2025-10-22 19:55:59 +08:00
George Hotz
7762b3558b clean up the spec (#12868)
* tighten up the spec

* move validate into a different file

* that moved to validate

* after(barr)
2025-10-22 19:50:42 +08:00
George Hotz
726988fa4b late ifs try 2 (#12865)
* late ifs try 2

* fix image

* fix that test

* panic

* ptx fixups

* preserve toposort

* those pass locally

* Revert "those pass locally"

This reverts commit 063409f828.

* no ls

* make that explicit
2025-10-22 18:49:27 +08:00
George Hotz
6abe90fb7c fix linearizer non-determinism (#12866) 2025-10-22 17:51:35 +08:00
qazal
cebc2b5721 cleanup viz profiler metadata ui (#12860)
* cleanup viz profiler metadata ui

* text

* select over .args

* space
2025-10-22 17:31:12 +08:00
Sieds Lykles
8d0256c46b Move gate to load for loaded index (#12861)
* change condition

* change test to better represent how the uop looks irl
2025-10-22 09:53:07 +02:00
chenyu
6d86e962c7 update ASSERT_MIN_STEP_TIME (#12857)
0.10.1 driving_policy is good now, still need driving_vision and dmonitoring to be fast
2025-10-21 22:46:07 -04:00
George Hotz
92778c7a8b rename opts to ren, add store ranges back (#12856)
* rename opts to ren

* fix docs and bring store back
2025-10-22 09:15:38 +08:00
chenyu
c5cee74706 remove BLOCK_REORDER (#12854)
not used
2025-10-21 19:10:14 -04:00
chenyu
0b673eddec simpler newton_schulz transpose (#12853) 2025-10-21 17:21:45 -04:00
b1tg
60d7e232f2 cuda fp8 (#12782)
* cuda fp8

* tensor core

* tc test

* clean

* clean pm
2025-10-21 15:05:25 -04:00
Harald Schäfer
587ccc0e5c compile3: make selftests opt-in (#12851) 2025-10-21 11:32:27 -07:00
wozeparrot
c3149c618a feat: nvcc compiler (#12852) 2025-10-21 11:31:23 -07:00
chenyu
8baa61bd67 use torch 2.9 and its Muon in test (#12773)
* use torch 2.9 and its Muon in test

* relax and disable
2025-10-21 13:35:17 -04:00
chenyu
f51f9aaa16 muon ns_params -> ns_coefficients (#12850)
match the official torch one
2025-10-21 12:35:52 -04:00
wozeparrot
62e7b8b870 feat: just use compile3 (#12849) 2025-10-21 07:56:50 -07:00
nimlgen
c7336c3e31 amd: sqtt for aql (#12846) 2025-10-21 22:35:01 +08:00
George Hotz
8960ac54f3 remove RewriteStep premature optimization (#12840)
* remove RewriteStep premature optimization

* fix ebs

* core line count
2025-10-21 21:45:20 +08:00
Sieds Lykles
7f798a9630 Cleanup const buffers (#12829)
* split pm_cleanups

* update test_schedule

* shrink when we remove bufferize

* dont do shrink if shape is empty

* update tests

* remove *1 from metadata

* deal with the noop bufferize

* only noop on cvar

* cleanup

* fix if

* rename
2025-10-21 14:53:49 +02:00
nimlgen
1ad6598963 amd: trace all instructions (#12831) 2025-10-21 20:52:24 +08:00
Christopher Milan
cdc72556a1 no more brew (#12839) 2025-10-21 08:12:46 -04:00
George Hotz
20a232f1c5 bugfixes from multioutput + PCONTIG=3 for fa bw memory fix (#12837)
* bugfixes from multioutput

* PCONTIG=3 fixes fa memory usage

* that's base
2025-10-21 19:21:02 +08:00
qazal
0435d31f1c viz: generic back button functionality (#12838) 2025-10-21 18:52:00 +08:00
George Hotz
7d9551ce2e move to late/control_flow.py (#12835) 2025-10-21 18:15:06 +08:00
George Hotz
d711a4b933 delete old linearizer (#12834)
* new linearizer with early endrange

* cleanups

* second stage removal

* not store

* do that later

* end cleanup

* fix globals

* end

* multi end

* fix ends earlier

* work

* do_merge_ends

* mini change

* range_gate

* fix cpu

* test fixups

* ranges on index

* not for ptx

* delete linearizer

* remove more junk

* delete that test

* we insert endif

* all ends
2025-10-21 17:52:18 +08:00
qazal
40633ab34d list buffer args to kernel in profiler (#12826)
* list buffer args to kernel in profiler

* stable order

* back button works

* deselect also works
2025-10-21 17:51:36 +08:00
George Hotz
c780cd9abb new linearizer with early endrange (#12823)
* new linearizer with early endrange

* cleanups

* second stage removal

* not store

* do that later

* end cleanup

* fix globals

* end

* multi end

* fix ends earlier

* work

* do_merge_ends

* mini change

* range_gate

* fix cpu

* test fixups

* ranges on index

* not for ptx
2025-10-21 17:37:48 +08:00
George Hotz
d59d4cdbe4 lil less is okay 2025-10-21 17:09:44 +08:00
qazal
32af1ff84b viz graph drawing small cleanups (#12830)
* viz graph drawing small cleanups

* str literal
2025-10-21 15:51:32 +08:00
Sieds Lykles
367fbabc30 remove Ops.SUBSTITUTE (#12827)
* remove Ops.SUBSTITUTE

* remove from viz
2025-10-21 08:19:42 +02:00
qazal
57f6b6f229 style view codegen like a link in profiler (#12825) 2025-10-21 12:15:13 +08:00
qazal
154cdfe46d viz state cleanups (#12821)
* viz state cleanups

* more generic
2025-10-21 11:44:51 +08:00
George Hotz
a71a41f6d1 rename Ops.ENDRANGE -> Ops.END (#12824) 2025-10-21 11:32:18 +08:00
qazal
8521fd5263 viz: hierarchical rewrites (#12805)
* viz: hierarchical rewrites

* count of subrewrites

* arrows

* better keyboard things

* add select and deselect utils

* works

* diff

* event stopPropagation

* work

* don't change the rewrite

* walk tree back
2025-10-21 10:55:41 +08:00
George Hotz
df2f8b9295 use after on locals (#12815)
* use after on locals

* fix estimates

* too much compute

* correct for both ptx and normal

* err, that

* tighter spec

* keep that
2025-10-21 10:29:12 +08:00
Christopher Milan
68c045bf0a NIR: Check for brew packages tinymesa and tinymesa_cpu (#12739)
* brew install tinymesa_cpu

* brew --prefix tinygrad_cpu too

* fix brew paths

* check both brew paths

* better errors

* handle failure
2025-10-21 09:38:43 +08:00
wozeparrot
990e8b97ee feat: log openpilot 0.10.1 times (#12816) 2025-10-20 18:30:34 -07:00
George Hotz
565a7a6218 num_batches_tracked has shape () (#12820) 2025-10-21 09:22:39 +08:00
George Hotz
25beea5769 hotfix: suppress_finalizing on device __del__ 2025-10-21 09:04:36 +08:00
chenyu
c7c59e6dd7 unused UPat.or_broadcasted and GroupOp.Block [pr] (#12819) 2025-10-20 12:24:58 -04:00