11201 Commits

Author SHA1 Message Date
George Hotz
1f8b24a6b9 track flag count and op count (#13416)
* track flag count and op count

* text

* more

* file count
2025-11-21 22:46:33 -08:00
George Hotz
4c0f4226b9 delete the PRECAST op [p] (#13415)
* don't use PRECAST in cstyle renderer [p]

* fix in metal

* fix opencl

* __builtin_bit_cast

* precast is unused

* cuda is c99?

* lambda_union_bitcast

* helper function

* delete precast op
2025-11-21 21:47:14 -08:00
wozeparrot
1f648bb1ba feat: reenable mobilenetv2 dsp (#13320) 2025-11-21 15:21:49 -08:00
chenyu
054477a44f remove full_symbolic in simplify (#13413)
only flip one schedule in winograd backward, no functional difference
2025-11-21 15:04:00 -05:00
chenyu
cb29265f23 add test that shows the validhack regression with bad rewrite order (#13411) 2025-11-21 13:48:30 -05:00
qazal
fdfe83880b viz: unique sqtt wave names (#13410)
* viz: unique sqtt wave names

* better name for the shape

* it's a per program counter now

* table view, refactor to wave:insts dict
2025-11-22 02:43:31 +08:00
chenyu
a6c9b4ff6a fix symbolic comments [pr] (#13408) 2025-11-21 09:18:50 -05:00
Sieds Lykles
114bb94c55 Fix load collapse MAX to ADD (#13406)
* add Ops.ADD to pattern

* add test
2025-11-21 12:26:14 +01:00
qazal
87c248eafa small cleanups from viz memory usage fixes (#13405)
* shape link cleanups

* cleanup findRectAtPosition
2025-11-21 17:05:08 +08:00
qazal
0de1b24154 viz: SE : CU : SIMD : WAVE in sqtt timeline (#13404)
* wave id in device rows

* SE : CU : SIMD : WAVE

* automatic width

* better styling

* rm the blue

* sort
2025-11-21 15:42:29 +08:00
George Hotz
dabb02767f set AMD profile mode with sudo on SQTT or PMC (#13403)
* require profile mode

* add mode setter

* cleanup

* not needed

* SQTT_LIMIT_SE
2025-11-20 23:19:11 -08:00
George Hotz
e1051d00d7 multi like on full_like as well as rand_like (#13402)
* multi like on full_like as well as rand_like

* add test and fix bug

* mismatch, optim match

* one line
2025-11-20 20:46:48 -08:00
chenyu
fa3def2f12 call less simplify in simplify_valid_load [pr] (#13401) 2025-11-20 19:54:22 -05:00
qazal
895ec7417e viz: enable mapping function names to colors (#13400) 2025-11-21 06:43:02 +08:00
George Hotz
a74f6020d5 track apply map to tensors (#13399)
* track apply map to tensors

* sub
2025-11-20 14:24:55 -08:00
chenyu
647fde64e6 no sym in pm_reduce [pr] (#13398)
* no sym in pm_reduce [pr]

* fix that
2025-11-20 16:49:09 -05:00
qazal
1313250e0d viz: use system helper for llvm-mca (#13395) 2025-11-21 04:47:25 +08:00
Christopher Milan
de3593957f Revert "Revert "autogen: fix formatting on zero-argument function-like macros…" (#13388)
This reverts commit 0901a40685.
2025-11-20 15:36:13 -05:00
qazal
1220072328 viz: refactor to generic steps api (#13393) 2025-11-21 04:33:23 +08:00
George Hotz
26ccbf7040 debufferize with symbolic in one pm (#13392) 2025-11-20 11:47:03 -08:00
George Hotz
c46f608703 top down remove_bufferize (#13391)
* top down remove_bufferize

* removable if ALWAYS_CONTIGUOUS
2025-11-20 11:32:00 -08:00
Christopher Milan
4043489803 set curl -f in setup-tinygrad (#13389)
* set curl -f in setup-tinygrad

* test bad redirect

* Revert "test bad redirect"

This reverts commit ad945e7ffc.
2025-11-20 13:45:47 -05:00
chenyu
0251a8e628 parse_valid minor cleanup [pr] (#13385)
* stricter parse_valid [pr]

* not stricter

* no VCONST

* Revert "no VCONST"

This reverts commit 330dbdf4060562596febcbf970bda6051a35012f.
2025-11-20 13:15:06 -05:00
Christopher Milan
0901a40685 Revert "autogen: fix formatting on zero-argument function-like macros (#13386)" (#13387)
This reverts commit 58d85d4bab.
2025-11-20 12:45:35 -05:00
b1tg
91e289cb14 amd fp8 llvm (#13186)
* amd fp8 llvm support

* fix max

* clean

* add test_mi350.sh

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-11-20 12:35:57 -05:00
Roelof van Dijk
1058748440 torch backend: no aten.detach for torch 2.10 compat (#13381)
* this works, less cpp?

* simpler = better

* keep torch 2.9 working as well
2025-11-20 09:12:15 -08:00
Christopher Milan
58d85d4bab autogen: fix formatting on zero-argument function-like macros (#13386)
* fix formatting on zero-argument function-like macros

* autogen tests should run

* ugh
2025-11-20 12:11:04 -05:00
qazal
9dbc550692 roc: map disassembly to prog name (#13384) 2025-11-20 23:47:19 +08:00
qazal
ebcdf68bab viz: use content headers for profiler (#13383) 2025-11-20 23:33:16 +08:00
nimlgen
0b0ea4981c hcq: unwrap signals (#13382) 2025-11-20 18:12:41 +03:00
qazal
9dcd52287a add external_benchmark_pyrender (#13378)
* add external_benchmark_pyrender

* can ctrlc it

* cpu_profile exists
2025-11-20 17:38:28 +08:00
George Hotz
cb38c704c3 delete nonfunctional ramp.py 2025-11-19 20:43:44 -08:00
George Hotz
8919c994b7 Revert "AxisType.PLACEHOLDER in reshape to do less graph_rewrite (#13373)" (#13375)
This reverts commit ac7559e33d.
2025-11-19 19:34:30 -08:00
George Hotz
ac7559e33d AxisType.PLACEHOLDER in reshape to do less graph_rewrite (#13373)
* AxisType.PLACEHOLDER in reshape to do less graph_rewrite

* _apply_movement_op cache
2025-11-19 19:19:58 -08:00
chenyu
050682ab40 use invalid_gate consistently [pr] (#13374) 2025-11-19 22:15:12 -05:00
Roelof van Dijk
0dc2ff431d fix: revive torch backend (#13280)
* fix: revive torch backend

* as_strided view vs copy

* Revert "as_strided view vs copy"

This reverts commit 82a61223f2.

* add extra tests (move inplace, add fusion tests)

* better fusion with inplace_op

* no optimizer hooks (break mnist training fusion)

* split off fusion tests in separate file, assert on resnet fusion

fix: remove comments

* cleanup, reduce diff

* reduce diff

* better fusion and identity checks

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-19 15:26:50 -08:00
wozeparrot
56b2540349 tk: keep extra tile data by replacing uop (#13370) 2025-11-19 15:11:43 -08:00
George Hotz
ab7df42c78 bring back fold_divmod_general with bugfix and test [pr] (#13369)
* Revert "Revert "merge to fold_divmod_general [p] (#13359)""

This reverts commit 05ccc69248.

* Revert "Revert "actually merge to fold_divmod_general [pr] (#13363)""

This reverts commit 90e5752199.

* Revert "Revert "add cache to fold_divmod_general (#13365)""

This reverts commit 8e17bd6791.

* bring back fold_divmod_general with bugfix and test
2025-11-19 14:51:51 -08:00
George Hotz
986d113024 symbolic fuzz failure (#13367)
* symbolic fuzz failure

* skip flaky test
2025-11-19 14:21:08 -08:00
George Hotz
05ccc69248 Revert "merge to fold_divmod_general [p] (#13359)"
This reverts commit 7711bbac7f.
2025-11-19 14:18:09 -08:00
George Hotz
90e5752199 Revert "actually merge to fold_divmod_general [pr] (#13363)"
This reverts commit 3d82b83cec.
2025-11-19 14:18:08 -08:00
George Hotz
8e17bd6791 Revert "add cache to fold_divmod_general (#13365)"
This reverts commit b5309a5043.
2025-11-19 14:18:08 -08:00
George Hotz
b5309a5043 add cache to fold_divmod_general (#13365) 2025-11-19 13:49:18 -08:00
George Hotz
3d82b83cec actually merge to fold_divmod_general [pr] (#13363)
* actually merge to fold_divmod_general [pr]

* one more merge

* Revert "one more merge"

This reverts commit aa79f6781c.

* avoid that case for speed

* faster and simpler
2025-11-19 13:17:56 -08:00
chenyu
a91f00925b remove VECTORIZE and WMMA rules from sym [pr] (#13362) 2025-11-19 14:51:21 -05:00
George Hotz
7711bbac7f merge to fold_divmod_general [p] (#13359)
* merge to fold_divmod_general [p]

* merge more

* merge more

* merge more
2025-11-19 11:37:45 -08:00
George Hotz
6fdbd03104 more divmod cleanup [p] (#13358)
* more divmod cleanup [p]

* lil cleanups, faster
2025-11-19 10:35:15 -08:00
George Hotz
bd88a72149 div and mod to its own file, try 2 [p] (#13357) 2025-11-19 10:10:06 -08:00
George Hotz
957cf717e7 Python speed (#13355)
* skip process replay by default

* work on python speed

* fix names of rewrite rules

* fix that test
2025-11-19 09:03:00 -08:00
chenyu
fc19ea76b5 clean up threefry rules (#13354) 2025-11-19 11:48:07 -05:00