wozeparrot
c3149c618a
feat: nvcc compiler ( #12852 )
2025-10-21 11:31:23 -07:00
chenyu
8baa61bd67
use torch 2.9 and its Muon in test ( #12773 )
...
* use torch 2.9 and its Muon in test
* relax and disable
2025-10-21 13:35:17 -04:00
chenyu
f51f9aaa16
muon ns_params -> ns_coefficients ( #12850 )
...
match the official torch one
2025-10-21 12:35:52 -04:00
wozeparrot
62e7b8b870
feat: just use compile3 ( #12849 )
2025-10-21 07:56:50 -07:00
nimlgen
c7336c3e31
amd: sqtt for aql ( #12846 )
2025-10-21 22:35:01 +08:00
George Hotz
8960ac54f3
remove RewriteStep premature optimization ( #12840 )
...
* remove RewriteStep premature optimization
* fix ebs
* core line count
2025-10-21 21:45:20 +08:00
Sieds Lykles
7f798a9630
Cleanup const buffers ( #12829 )
...
* split pm_cleanups
* update test_schedule
* shrink when we remove bufferize
* dont do shrink if shape is empty
* update tests
* remove *1 from metadata
* deal with the noop bufferize
* only noop on cvar
* cleanup
* fix if
* rename
2025-10-21 14:53:49 +02:00
nimlgen
1ad6598963
amd: trace all instructions ( #12831 )
2025-10-21 20:52:24 +08:00
Christopher Milan
cdc72556a1
no more brew ( #12839 )
2025-10-21 08:12:46 -04:00
George Hotz
20a232f1c5
bugfixes from multioutput + PCONTIG=3 for fa bw memory fix ( #12837 )
...
* bugfixes from multioutput
* PCONTIG=3 fixes fa memory usage
* that's base
2025-10-21 19:21:02 +08:00
qazal
0435d31f1c
viz: generic back button functionality ( #12838 )
2025-10-21 18:52:00 +08:00
George Hotz
7d9551ce2e
move to late/control_flow.py ( #12835 )
2025-10-21 18:15:06 +08:00
George Hotz
d711a4b933
delete old linearizer ( #12834 )
...
* new linearizer with early endrange
* cleanups
* second stage removal
* not store
* do that later
* end cleanup
* fix globals
* end
* multi end
* fix ends earlier
* work
* do_merge_ends
* mini change
* range_gate
* fix cpu
* test fixups
* ranges on index
* not for ptx
* delete linearizer
* remove more junk
* delete that test
* we insert endif
* all ends
2025-10-21 17:52:18 +08:00
qazal
40633ab34d
list buffer args to kernel in profiler ( #12826 )
...
* list buffer args to kernel in profiler
* stable order
* back button works
* deselect also works
2025-10-21 17:51:36 +08:00
George Hotz
c780cd9abb
new linearizer with early endrange ( #12823 )
...
* new linearizer with early endrange
* cleanups
* second stage removal
* not store
* do that later
* end cleanup
* fix globals
* end
* multi end
* fix ends earlier
* work
* do_merge_ends
* mini change
* range_gate
* fix cpu
* test fixups
* ranges on index
* not for ptx
2025-10-21 17:37:48 +08:00
George Hotz
d59d4cdbe4
lil less is okay
2025-10-21 17:09:44 +08:00
qazal
32af1ff84b
viz graph drawing small cleanups ( #12830 )
...
* viz graph drawing small cleanups
* str literal
2025-10-21 15:51:32 +08:00
Sieds Lykles
367fbabc30
remove Ops.SUBSTITUTE ( #12827 )
...
* remove Ops.SUBSTITUTE
* remove from viz
2025-10-21 08:19:42 +02:00
qazal
57f6b6f229
style view codegen like a link in profiler ( #12825 )
2025-10-21 12:15:13 +08:00
qazal
154cdfe46d
viz state cleanups ( #12821 )
...
* viz state cleanups
* more generic
2025-10-21 11:44:51 +08:00
George Hotz
a71a41f6d1
rename Ops.ENDRANGE -> Ops.END ( #12824 )
2025-10-21 11:32:18 +08:00
qazal
8521fd5263
viz: hierarchical rewrites ( #12805 )
...
* viz: hierarchical rewrites
* count of subrewrites
* arrows
* better keyboard things
* add select and deselect utils
* works
* diff
* event stopPropagation
* work
* don't change the rewrite
* walk tree back
2025-10-21 10:55:41 +08:00
George Hotz
df2f8b9295
use after on locals ( #12815 )
...
* use after on locals
* fix estimates
* too much compute
* correct for both ptx and normal
* err, that
* tighter spec
* keep that
2025-10-21 10:29:12 +08:00
Christopher Milan
68c045bf0a
NIR: Check for brew packages tinymesa and tinymesa_cpu ( #12739 )
...
* brew install tinymesa_cpu
* brew --prefix tinygrad_cpu too
* fix brew paths
* check both brew paths
* better errors
* handle failure
2025-10-21 09:38:43 +08:00
wozeparrot
990e8b97ee
feat: log openpilot 0.10.1 times ( #12816 )
2025-10-20 18:30:34 -07:00
George Hotz
565a7a6218
num_batches_tracked has shape () ( #12820 )
2025-10-21 09:22:39 +08:00
George Hotz
25beea5769
hotfix: suppress_finalizing on device __del__
2025-10-21 09:04:36 +08:00
chenyu
c7c59e6dd7
unused UPat.or_broadcasted and GroupOp.Block [pr] ( #12819 )
2025-10-20 12:24:58 -04:00
nimlgen
e284f6325a
llvm: fix compile key for different processors ( #12812 )
2025-10-20 19:46:48 +08:00
George Hotz
203a93363c
Revert "after clean up of locals ( #12813 )" ( #12814 )
...
This reverts commit 5d0d3d7aac.
2025-10-20 19:33:35 +08:00
George Hotz
5d0d3d7aac
after clean up of locals ( #12813 )
2025-10-20 19:24:24 +08:00
George Hotz
d1e2c393f8
after in sym, axis_letters in range ( #12811 )
...
* after in sym, axis_letters in range
* this is better
* this work?
2025-10-20 18:54:37 +08:00
Sieds Lykles
a8e4614436
remove REAL_SUBSTITUTE=0 and make it fast ( #12809 )
...
* fast REAL_substitute
* remove REAL_SUBSTITUTE=0
2025-10-20 12:44:20 +02:00
Sieds Lykles
1e93d19ee3
stable diffusion --fakeweights ( #12810 )
2025-10-20 12:41:06 +02:00
nimlgen
b5e36e3c6c
nv: check if jitlink is avail ( #12808 )
...
* nv: check if jitlink is avail
* why
* fix
* fix
2025-10-20 18:13:16 +08:00
George Hotz
b8a9cce783
replace NOOP with AFTER in reg init ( #12804 )
...
* after op
* fix tests
* replace NOOP with AFTER in reg init
* closer
* or_after there
* fix device
* fix all renderers
* better spec for after
2025-10-20 15:34:32 +08:00
qazal
12fd2c9c7b
explicitly set ignore_indexing for schedule only ( #12803 )
2025-10-20 13:11:57 +08:00
qazal
734c99f722
viz: show indexing rewrites during run_rangeify ( #12802 )
...
* viz: show indexing rewrites during run_rangeify
* sinking index
2025-10-20 12:37:03 +08:00
George Hotz
2e9082e0bc
after op ( #12801 )
...
* after op
* fix tests
2025-10-20 12:27:56 +08:00
qazal
339e6edb7d
viz: ui prereqs for hierarchical rewrites ( #12799 )
2025-10-20 12:15:15 +08:00
wozeparrot
357dac8425
feat: allow tuple indexing on uops ( #12797 )
2025-10-19 19:11:05 -07:00
George Hotz
ba593f7b98
don't render index ( #12796 )
...
* don't render index
* update to ignore_indexing
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2025-10-20 09:48:36 +08:00
George Hotz
cad3ada909
tinygpu: build with SIP off works
2025-10-20 09:11:09 +08:00
nimlgen
9cd35deae7
amd: fix alignment + pointers for aql over usb ( #12793 )
2025-10-19 23:55:57 +08:00
nimlgen
59784a5972
amd: ensure ts is written ( #12794 )
2025-10-19 23:55:49 +08:00
chenyu
63a23dfe80
test step 0 in TestTrainingOnnxOps ( #12790 )
...
and tighter rtol
2025-10-19 09:15:49 -04:00
chenyu
e8158afd4b
update test_qlinear_add_round_half_to_even ( #12789 )
...
this does not pass locally
2025-10-19 08:47:27 -04:00
Sieds Lykles
1df9c7d7e7
reduce_collapse uses symbolic_flat ( #12766 )
...
* sym->symbolic_flat
* cast invalid drops invalid
2025-10-19 12:27:47 +02:00
Sieds Lykles
fd6ef4801c
rangeify uses symbolic_flat ( #12786 )
...
* symbolic_simple -> symbolic_flat
* remove expected failures
2025-10-19 12:27:14 +02:00
George Hotz
89e7f2fa00
mmapeak: gfx1103 support
2025-10-19 16:57:28 +08:00