10698 Commits

Author SHA1 Message Date
George Hotz
c9f1ed10c3 gate MULTIOUTPUT 2025-10-21 18:25:38 +08:00
George Hotz
483cd44cbf Merge branch 'master' into multioutput 2025-10-21 18:16:34 +08:00
George Hotz
7d9551ce2e move to late/control_flow.py (#12835) 2025-10-21 18:15:06 +08:00
George Hotz
d711a4b933 delete old linearizer (#12834)
* new linearizer with early endrange

* cleanups

* second stage removal

* not store

* do that later

* end cleanup

* fix globals

* end

* multi end

* fix ends earlier

* work

* do_merge_ends

* mini change

* range_gate

* fix cpu

* test fixups

* ranges on index

* not for ptx

* delete linearizer

* remove more junk

* delete that test

* we insert endif

* all ends
2025-10-21 17:52:18 +08:00
qazal
40633ab34d list buffer args to kernel in profiler (#12826)
* list buffer args to kernel in profiler

* stable order

* back button works

* deselect also works
2025-10-21 17:51:36 +08:00
George Hotz
c780cd9abb new linearizer with early endrange (#12823)
* new linearizer with early endrange

* cleanups

* second stage removal

* not store

* do that later

* end cleanup

* fix globals

* end

* multi end

* fix ends earlier

* work

* do_merge_ends

* mini change

* range_gate

* fix cpu

* test fixups

* ranges on index

* not for ptx
2025-10-21 17:37:48 +08:00
George Hotz
d59d4cdbe4 lil less is okay 2025-10-21 17:09:44 +08:00
qazal
32af1ff84b viz graph drawing small cleanups (#12830)
* viz graph drawing small cleanups

* str literal
2025-10-21 15:51:32 +08:00
Sieds Lykles
367fbabc30 remove Ops.SUBSTITUTE (#12827)
* remove Ops.SUBSTITUTE

* remove from viz
2025-10-21 08:19:42 +02:00
qazal
57f6b6f229 style view codegen like a link in profiler (#12825) 2025-10-21 12:15:13 +08:00
qazal
154cdfe46d viz state cleanups (#12821)
* viz state cleanups

* more generic
2025-10-21 11:44:51 +08:00
George Hotz
a71a41f6d1 rename Ops.ENDRANGE -> Ops.END (#12824) 2025-10-21 11:32:18 +08:00
qazal
8521fd5263 viz: hierarchical rewrites (#12805)
* viz: hierarchical rewrites

* count of subrewrites

* arrows

* better keyboard things

* add select and deselect utils

* works

* diff

* event stopPropagation

* work

* don't change the rewrite

* walk tree back
2025-10-21 10:55:41 +08:00
George Hotz
df2f8b9295 use after on locals (#12815)
* use after on locals

* fix estimates

* too much compute

* correct for both ptx and normal

* err, that

* tighter spec

* keep that
2025-10-21 10:29:12 +08:00
Christopher Milan
68c045bf0a NIR: Check for brew packages tinymesa and tinymesa_cpu (#12739)
* brew install tinymesa_cpu

* brew --prefix tinygrad_cpu too

* fix brew paths

* check both brew paths

* better errors

* handle failure
2025-10-21 09:38:43 +08:00
wozeparrot
990e8b97ee feat: log openpilot 0.10.1 times (#12816) 2025-10-20 18:30:34 -07:00
George Hotz
565a7a6218 num_batches_tracked has shape () (#12820) 2025-10-21 09:22:39 +08:00
George Hotz
25beea5769 hotfix: suppress_finalizing on device __del__ 2025-10-21 09:04:36 +08:00
chenyu
c7c59e6dd7 unused UPat.or_broadcasted and GroupOp.Block [pr] (#12819) 2025-10-20 12:24:58 -04:00
nimlgen
e284f6325a llvm: fix compile key for different processors (#12812) 2025-10-20 19:46:48 +08:00
George Hotz
203a93363c Revert "after clean up of locals (#12813)" (#12814)
This reverts commit 5d0d3d7aac.
2025-10-20 19:33:35 +08:00
George Hotz
5d0d3d7aac after clean up of locals (#12813) 2025-10-20 19:24:24 +08:00
George Hotz
d1e2c393f8 after in sym, axis_letters in range (#12811)
* after in sym, axis_letters in range

* this is better

* this work?
2025-10-20 18:54:37 +08:00
Sieds Lykles
a8e4614436 remove REAL_SUBSTITUTE=0 and make it fast (#12809)
* fast REAL_substitute

* remove REAL_SUBSTITUTE=0
2025-10-20 12:44:20 +02:00
Sieds Lykles
1e93d19ee3 stable diffusion --fakeweights (#12810) 2025-10-20 12:41:06 +02:00
nimlgen
b5e36e3c6c nv: check if jitlink is avail (#12808)
* nv: check if jitlink is avail

* why

* fix

* fix
2025-10-20 18:13:16 +08:00
George Hotz
b8a9cce783 replace NOOP with AFTER in reg init (#12804)
* after op

* fix tests

* replace NOOP with AFTER in reg init

* closer

* or_after there

* fix device

* fix all renderers

* better spec for after
2025-10-20 15:34:32 +08:00
qazal
12fd2c9c7b explicitly set ignore_indexing for schedule only (#12803) 2025-10-20 13:11:57 +08:00
qazal
734c99f722 viz: show indexing rewrites during run_rangeify (#12802)
* viz: show indexing rewrites during run_rangeify

* sinking index
2025-10-20 12:37:03 +08:00
George Hotz
2e9082e0bc after op (#12801)
* after op

* fix tests
2025-10-20 12:27:56 +08:00
qazal
339e6edb7d viz: ui prereqs for hierarchical rewrites (#12799) 2025-10-20 12:15:15 +08:00
George Hotz
aecd51f54a start multioutput support 2025-10-20 11:17:00 +08:00
wozeparrot
357dac8425 feat: allow tuple indexing on uops (#12797) 2025-10-19 19:11:05 -07:00
George Hotz
ba593f7b98 don't render index (#12796)
* don't render index

* update to ignore_indexing

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2025-10-20 09:48:36 +08:00
George Hotz
cad3ada909 tinygpu: build with SIP off works 2025-10-20 09:11:09 +08:00
nimlgen
9cd35deae7 amd: fix alignment + pointers for aql over usb (#12793) 2025-10-19 23:55:57 +08:00
nimlgen
59784a5972 amd: ensure ts is written (#12794) 2025-10-19 23:55:49 +08:00
chenyu
63a23dfe80 test step 0 in TestTrainingOnnxOps (#12790)
and tighter rtol
2025-10-19 09:15:49 -04:00
chenyu
e8158afd4b update test_qlinear_add_round_half_to_even (#12789)
this does not pass locally
2025-10-19 08:47:27 -04:00
Sieds Lykles
1df9c7d7e7 reduce_collapse uses symbolic_flat (#12766)
* sym->symbolic_flat

* cast invalid drops invalid
2025-10-19 12:27:47 +02:00
Sieds Lykles
fd6ef4801c rangeify uses symbolic_flat (#12786)
* symbolic_simple -> symbolic_flat

* remove expected failures
2025-10-19 12:27:14 +02:00
George Hotz
89e7f2fa00 mmapeak: gfx1103 support 2025-10-19 16:57:28 +08:00
George Hotz
617614beb7 add mi350x support to mmapeak (#12784) 2025-10-19 16:11:07 +08:00
qazal
c8ef4b60f6 viz: share match tracing and TINY device profiler (#12783)
* set a default name for the traces

* set profile_matches + renames

* profile_matches test

* traces 4 steps total
2025-10-19 14:30:07 +08:00
chenyu
350a4754a9 Update openpilot models (#12780)
* Update openpilot models

* Update slower model

* fix that

---------

Co-authored-by: Bruce Wayne <harald.the.engineer@gmail.com>
2025-10-18 20:32:35 -04:00
chenyu
30ff84d050 update test_conv2d_ceildiv_edge_case (#12779) 2025-10-18 16:43:32 -04:00
nimlgen
442218266d qcom: fix profiler (#12778)
* qcom: fix profiler

* this way
2025-10-19 01:27:59 +08:00
Harald Schäfer
addc54b96c Simplify openpilot compile3.py (#12748)
* Simpler compile3

* tests

* remove default args

* onnx file is still fp16

* self-test FP16 too

* allow test disable

* absurd tolerance

* Just do latest

* Try simplest

* use later models

* kernel count not relevant if speed is good

* dead improts

* Revert "dead improts"

This reverts commit f68c2cd15d.

* Revert "kernel count not relevant if speed is good"

This reverts commit 0955ca4ee0.

* add back kernal count check on latest model
2025-10-18 10:12:22 -04:00
nimlgen
037f6e8fa0 qcom: ioctl for 7xx (#12777) 2025-10-18 20:33:14 +08:00
wozeparrot
82f10cfe2e feat: assert on bufferview math (#12772) 2025-10-17 14:20:08 -07:00