Commit Graph

11106 Commits

Author SHA1 Message Date
wozeparrot
c3149c618a feat: nvcc compiler (#12852) 2025-10-21 11:31:23 -07:00
chenyu
8baa61bd67 use torch 2.9 and its Muon in test (#12773)
* use torch 2.9 and its Muon in test

* relax and disable
2025-10-21 13:35:17 -04:00
chenyu
f51f9aaa16 muon ns_params -> ns_coefficients (#12850)
match the official torch one
2025-10-21 12:35:52 -04:00
wozeparrot
62e7b8b870 feat: just use compile3 (#12849) 2025-10-21 07:56:50 -07:00
nimlgen
c7336c3e31 amd: sqtt for aql (#12846) 2025-10-21 22:35:01 +08:00
George Hotz
8960ac54f3 remove RewriteStep premature optimization (#12840)
* remove RewriteStep premature optimization

* fix ebs

* core line count
2025-10-21 21:45:20 +08:00
Sieds Lykles
7f798a9630 Cleanup const buffers (#12829)
* split pm_cleanups

* update test_schedule

* shrink when we remove bufferize

* dont do shrink if shape is empty

* update tests

* remove *1 from metadata

* deal with the noop bufferize

* only noop on cvar

* cleanup

* fix if

* rename
2025-10-21 14:53:49 +02:00
nimlgen
1ad6598963 amd: trace all instructions (#12831) 2025-10-21 20:52:24 +08:00
Christopher Milan
cdc72556a1 no more brew (#12839) 2025-10-21 08:12:46 -04:00
George Hotz
20a232f1c5 bugfixes from multioutput + PCONTIG=3 for fa bw memory fix (#12837)
* bugfixes from multioutput

* PCONTIG=3 fixes fa memory usage

* that's base
2025-10-21 19:21:02 +08:00
qazal
0435d31f1c viz: generic back button functionality (#12838) 2025-10-21 18:52:00 +08:00
George Hotz
7d9551ce2e move to late/control_flow.py (#12835) 2025-10-21 18:15:06 +08:00
George Hotz
d711a4b933 delete old linearizer (#12834)
* new linearizer with early endrange

* cleanups

* second stage removal

* not store

* do that later

* end cleanup

* fix globals

* end

* multi end

* fix ends earlier

* work

* do_merge_ends

* mini change

* range_gate

* fix cpu

* test fixups

* ranges on index

* not for ptx

* delete linearizer

* remove more junk

* delete that test

* we insert endif

* all ends
2025-10-21 17:52:18 +08:00
qazal
40633ab34d list buffer args to kernel in profiler (#12826)
* list buffer args to kernel in profiler

* stable order

* back button works

* deselect also works
2025-10-21 17:51:36 +08:00
George Hotz
c780cd9abb new linearizer with early endrange (#12823)
* new linearizer with early endrange

* cleanups

* second stage removal

* not store

* do that later

* end cleanup

* fix globals

* end

* multi end

* fix ends earlier

* work

* do_merge_ends

* mini change

* range_gate

* fix cpu

* test fixups

* ranges on index

* not for ptx
2025-10-21 17:37:48 +08:00
George Hotz
d59d4cdbe4 lil less is okay 2025-10-21 17:09:44 +08:00
qazal
32af1ff84b viz graph drawing small cleanups (#12830)
* viz graph drawing small cleanups

* str literal
2025-10-21 15:51:32 +08:00
Sieds Lykles
367fbabc30 remove Ops.SUBSTITUTE (#12827)
* remove Ops.SUBSTITUTE

* remove from viz
2025-10-21 08:19:42 +02:00
qazal
57f6b6f229 style view codegen like a link in profiler (#12825) 2025-10-21 12:15:13 +08:00
qazal
154cdfe46d viz state cleanups (#12821)
* viz state cleanups

* more generic
2025-10-21 11:44:51 +08:00
George Hotz
a71a41f6d1 rename Ops.ENDRANGE -> Ops.END (#12824) 2025-10-21 11:32:18 +08:00
qazal
8521fd5263 viz: hierarchical rewrites (#12805)
* viz: hierarchical rewrites

* count of subrewrites

* arrows

* better keyboard things

* add select and deselect utils

* works

* diff

* event stopPropagation

* work

* don't change the rewrite

* walk tree back
2025-10-21 10:55:41 +08:00
George Hotz
df2f8b9295 use after on locals (#12815)
* use after on locals

* fix estimates

* too much compute

* correct for both ptx and normal

* err, that

* tighter spec

* keep that
2025-10-21 10:29:12 +08:00
Christopher Milan
68c045bf0a NIR: Check for brew packages tinymesa and tinymesa_cpu (#12739)
* brew install tinymesa_cpu

* brew --prefix tinygrad_cpu too

* fix brew paths

* check both brew paths

* better errors

* handle failure
2025-10-21 09:38:43 +08:00
wozeparrot
990e8b97ee feat: log openpilot 0.10.1 times (#12816) 2025-10-20 18:30:34 -07:00
George Hotz
565a7a6218 num_batches_tracked has shape () (#12820) 2025-10-21 09:22:39 +08:00
George Hotz
25beea5769 hotfix: suppress_finalizing on device __del__ 2025-10-21 09:04:36 +08:00
chenyu
c7c59e6dd7 unused UPat.or_broadcasted and GroupOp.Block [pr] (#12819) 2025-10-20 12:24:58 -04:00
nimlgen
e284f6325a llvm: fix compile key for different processors (#12812) 2025-10-20 19:46:48 +08:00
George Hotz
203a93363c Revert "after clean up of locals (#12813)" (#12814)
This reverts commit 5d0d3d7aac.
2025-10-20 19:33:35 +08:00
George Hotz
5d0d3d7aac after clean up of locals (#12813) 2025-10-20 19:24:24 +08:00
George Hotz
d1e2c393f8 after in sym, axis_letters in range (#12811)
* after in sym, axis_letters in range

* this is better

* this work?
2025-10-20 18:54:37 +08:00
Sieds Lykles
a8e4614436 remove REAL_SUBSTITUTE=0 and make it fast (#12809)
* fast REAL_substitute

* remove REAL_SUBSTITUTE=0
2025-10-20 12:44:20 +02:00
Sieds Lykles
1e93d19ee3 stable diffusion --fakeweights (#12810) 2025-10-20 12:41:06 +02:00
nimlgen
b5e36e3c6c nv: check if jitlink is avail (#12808)
* nv: check if jitlink is avail

* why

* fix

* fix
2025-10-20 18:13:16 +08:00
George Hotz
b8a9cce783 replace NOOP with AFTER in reg init (#12804)
* after op

* fix tests

* replace NOOP with AFTER in reg init

* closer

* or_after there

* fix device

* fix all renderers

* better spec for after
2025-10-20 15:34:32 +08:00
qazal
12fd2c9c7b explicitly set ignore_indexing for schedule only (#12803) 2025-10-20 13:11:57 +08:00
qazal
734c99f722 viz: show indexing rewrites during run_rangeify (#12802)
* viz: show indexing rewrites during run_rangeify

* sinking index
2025-10-20 12:37:03 +08:00
George Hotz
2e9082e0bc after op (#12801)
* after op

* fix tests
2025-10-20 12:27:56 +08:00
qazal
339e6edb7d viz: ui prereqs for hierarchical rewrites (#12799) 2025-10-20 12:15:15 +08:00
wozeparrot
357dac8425 feat: allow tuple indexing on uops (#12797) 2025-10-19 19:11:05 -07:00
George Hotz
ba593f7b98 don't render index (#12796)
* don't render index

* update to ignore_indexing

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2025-10-20 09:48:36 +08:00
George Hotz
cad3ada909 tinygpu: build with SIP off works 2025-10-20 09:11:09 +08:00
nimlgen
9cd35deae7 amd: fix alignment + pointers for aql over usb (#12793) 2025-10-19 23:55:57 +08:00
nimlgen
59784a5972 amd: ensure ts is written (#12794) 2025-10-19 23:55:49 +08:00
chenyu
63a23dfe80 test step 0 in TestTrainingOnnxOps (#12790)
and tighter rtol
2025-10-19 09:15:49 -04:00
chenyu
e8158afd4b update test_qlinear_add_round_half_to_even (#12789)
this does not pass locally
2025-10-19 08:47:27 -04:00
Sieds Lykles
1df9c7d7e7 reduce_collapse uses symbolic_flat (#12766)
* sym->symbolic_flat

* cast invalid drops invalid
2025-10-19 12:27:47 +02:00
Sieds Lykles
fd6ef4801c rangeify uses symbolic_flat (#12786)
* symbolic_simple -> symbolic_flat

* remove expected failures
2025-10-19 12:27:14 +02:00
George Hotz
89e7f2fa00 mmapeak: gfx1103 support 2025-10-19 16:57:28 +08:00