George Hotz
1f8b24a6b9
track flag count and op count (#13416)
* track flag count and op count
* text
* more
* file count
2025-11-21 22:46:33 -08:00
George Hotz
4c0f4226b9
delete the PRECAST op [p] (#13415)
* don't use PRECAST in cstyle renderer [p]
* fix in metal
* fix opencl
* __builtin_bit_cast
* precast is unused
* cuda is c99?
* lambda_union_bitcast
* helper function
* delete precast op
2025-11-21 21:47:14 -08:00
wozeparrot
1f648bb1ba
feat: reenable mobilenetv2 dsp (#13320)
2025-11-21 15:21:49 -08:00
chenyu
054477a44f
remove full_symbolic in simplify (#13413)
only flip one schedule in winograd backward, no functional difference
2025-11-21 15:04:00 -05:00
chenyu
cb29265f23
add test that shows the validhack regression with bad rewrite order (#13411)
2025-11-21 13:48:30 -05:00
qazal
fdfe83880b
viz: unique sqtt wave names (#13410)
* viz: unique sqtt wave names
* better name for the shape
* it's a per program counter now
* table view, refactor to wave:insts dict
2025-11-22 02:43:31 +08:00
chenyu
a6c9b4ff6a
fix symbolic comments [pr] (#13408)
2025-11-21 09:18:50 -05:00
Sieds Lykles
114bb94c55
Fix load collapse MAX to ADD (#13406)
* add Ops.ADD to pattern
* add test
2025-11-21 12:26:14 +01:00
qazal
87c248eafa
small cleanups from viz memory usage fixes (#13405)
* shape link cleanups
* cleanup findRectAtPosition
2025-11-21 17:05:08 +08:00
qazal
0de1b24154
viz: SE : CU : SIMD : WAVE in sqtt timeline (#13404)
* wave id in device rows
* SE : CU : SIMD : WAVE
* automatic width
* better styling
* rm the blue
* sort
2025-11-21 15:42:29 +08:00
George Hotz
dabb02767f
set AMD profile mode with sudo on SQTT or PMC (#13403)
* require profile mode
* add mode setter
* cleanup
* not needed
* SQTT_LIMIT_SE
2025-11-20 23:19:11 -08:00
George Hotz
e1051d00d7
multi like on full_like as well as rand_like (#13402)
* multi like on full_like as well as rand_like
* add test and fix bug
* mismatch, optim match
* one line
2025-11-20 20:46:48 -08:00
chenyu
fa3def2f12
call less simplify in simplify_valid_load [pr] (#13401)
2025-11-20 19:54:22 -05:00
qazal
895ec7417e
viz: enable mapping function names to colors (#13400)
2025-11-21 06:43:02 +08:00
George Hotz
a74f6020d5
track apply map to tensors (#13399)
* track apply map to tensors
* sub
2025-11-20 14:24:55 -08:00
chenyu
647fde64e6
no sym in pm_reduce [pr] (#13398)
* no sym in pm_reduce [pr]
* fix that
2025-11-20 16:49:09 -05:00
qazal
1313250e0d
viz: use system helper for llvm-mca (#13395)
2025-11-21 04:47:25 +08:00
Christopher Milan
de3593957f
Revert "Revert "autogen: fix formatting on zero-argument function-like macros…" (#13388)
This reverts commit 0901a40685.
2025-11-20 15:36:13 -05:00
qazal
1220072328
viz: refactor to generic steps api (#13393)
2025-11-21 04:33:23 +08:00
George Hotz
26ccbf7040
debufferize with symbolic in one pm (#13392)
2025-11-20 11:47:03 -08:00
George Hotz
c46f608703
top down remove_bufferize (#13391)
* top down remove_bufferize
* removable if ALWAYS_CONTIGUOUS
2025-11-20 11:32:00 -08:00
Christopher Milan
4043489803
set curl -f in setup-tinygrad (#13389)
* set curl -f in setup-tinygrad
* test bad redirect
* Revert "test bad redirect"
This reverts commit ad945e7ffc.
2025-11-20 13:45:47 -05:00
chenyu
0251a8e628
parse_valid minor cleanup [pr] (#13385)
* stricter parse_valid [pr]
* not stricter
* no VCONST
* Revert "no VCONST"
This reverts commit 330dbdf4060562596febcbf970bda6051a35012f.
2025-11-20 13:15:06 -05:00
Christopher Milan
0901a40685
Revert "autogen: fix formatting on zero-argument function-like macros (#13386)" (#13387)
This reverts commit 58d85d4bab.
2025-11-20 12:45:35 -05:00
b1tg
91e289cb14
amd fp8 llvm (#13186)
* amd fp8 llvm support
* fix max
* clean
* add test_mi350.sh
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-11-20 12:35:57 -05:00
Roelof van Dijk
1058748440
torch backend: no aten.detach for torch 2.10 compat (#13381)
* this works, less cpp?
* simpler = better
* keep torch 2.9 working as well
2025-11-20 09:12:15 -08:00
Christopher Milan
58d85d4bab
autogen: fix formatting on zero-argument function-like macros (#13386)
* fix formatting on zero-argument function-like macros
* autogen tests should run
* ugh
2025-11-20 12:11:04 -05:00
qazal
9dbc550692
roc: map disassembly to prog name (#13384)
2025-11-20 23:47:19 +08:00
qazal
ebcdf68bab
viz: use content headers for profiler (#13383)
2025-11-20 23:33:16 +08:00
nimlgen
0b0ea4981c
hcq: unwrap signals (#13382)
2025-11-20 18:12:41 +03:00
qazal
9dcd52287a
add external_benchmark_pyrender (#13378)
* add external_benchmark_pyrender
* can ctrlc it
* cpu_profile exists
2025-11-20 17:38:28 +08:00
George Hotz
cb38c704c3
delete nonfunctional ramp.py
2025-11-19 20:43:44 -08:00
George Hotz
8919c994b7
Revert "AxisType.PLACEHOLDER in reshape to do less graph_rewrite (#13373)" (#13375)
This reverts commit ac7559e33d.
2025-11-19 19:34:30 -08:00
George Hotz
ac7559e33d
AxisType.PLACEHOLDER in reshape to do less graph_rewrite (#13373)
* AxisType.PLACEHOLDER in reshape to do less graph_rewrite
* _apply_movement_op cache
2025-11-19 19:19:58 -08:00
chenyu
050682ab40
use invalid_gate consistently [pr] (#13374)
2025-11-19 22:15:12 -05:00
Roelof van Dijk
0dc2ff431d
fix: revive torch backend (#13280)
* fix: revive torch backend
* as_strided view vs copy
* Revert "as_strided view vs copy"
This reverts commit 82a61223f2.
* add extra tests (move inplace, add fusion tests)
* better fusion with inplace_op
* no optimizer hooks (break mnist training fusion)
* split off fusion tests in separate file, assert on resnet fusion
fix: remove comments
* cleanup, reduce diff
* reduce diff
* better fusion and identity checks
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-19 15:26:50 -08:00
wozeparrot
56b2540349
tk: keep extra tile data by replacing uop (#13370)
2025-11-19 15:11:43 -08:00
George Hotz
ab7df42c78
bring back fold_divmod_general with bugfix and test [pr] (#13369)
* Revert "Revert "merge to fold_divmod_general [p] (#13359)""
This reverts commit 05ccc69248.
* Revert "Revert "actually merge to fold_divmod_general [pr] (#13363)""
This reverts commit 90e5752199.
* Revert "Revert "add cache to fold_divmod_general (#13365)""
This reverts commit 8e17bd6791.
* bring back fold_divmod_general with bugfix and test
2025-11-19 14:51:51 -08:00
George Hotz
986d113024
symbolic fuzz failure (#13367)
* symbolic fuzz failure
* skip flaky test
2025-11-19 14:21:08 -08:00
George Hotz
05ccc69248
Revert "merge to fold_divmod_general [p] (#13359)"
This reverts commit 7711bbac7f.
2025-11-19 14:18:09 -08:00
George Hotz
90e5752199
Revert "actually merge to fold_divmod_general [pr] (#13363)"
This reverts commit 3d82b83cec.
2025-11-19 14:18:08 -08:00
George Hotz
8e17bd6791
Revert "add cache to fold_divmod_general (#13365)"
This reverts commit b5309a5043.
2025-11-19 14:18:08 -08:00
George Hotz
b5309a5043
add cache to fold_divmod_general (#13365)
2025-11-19 13:49:18 -08:00
George Hotz
3d82b83cec
actually merge to fold_divmod_general [pr] (#13363)
* actually merge to fold_divmod_general [pr]
* one more merge
* Revert "one more merge"
This reverts commit aa79f6781c.
* avoid that case for speed
* faster and simpler
2025-11-19 13:17:56 -08:00
chenyu
a91f00925b
remove VECTORIZE and WMMA rules from sym [pr] (#13362)
2025-11-19 14:51:21 -05:00
George Hotz
7711bbac7f
merge to fold_divmod_general [p] (#13359)
* merge to fold_divmod_general [p]
* merge more
* merge more
* merge more
2025-11-19 11:37:45 -08:00
George Hotz
6fdbd03104
more divmod cleanup [p] (#13358)
* more divmod cleanup [p]
* lil cleanups, faster
2025-11-19 10:35:15 -08:00
George Hotz
bd88a72149
div and mod to its own file, try 2 [p] (#13357)
2025-11-19 10:10:06 -08:00
George Hotz
957cf717e7
Python speed (#13355)
* skip process replay by default
* work on python speed
* fix names of rewrite rules
* fix that test
2025-11-19 09:03:00 -08:00
chenyu
fc19ea76b5
clean up threefry rules (#13354)
2025-11-19 11:48:07 -05:00