George Hotz
1b1752b6f8
Revert "remove the on_stack stuff and any retries"
This reverts commit 49a2b328b9.
2025-10-08 21:11:36 +08:00
George Hotz
a4edf78351
don't use the p word
2025-10-08 20:00:27 +08:00
George Hotz
67b3e463b8
Merge branch 'master' into delete_slow_rangeify
2025-10-08 19:58:16 +08:00
George Hotz
49a2b328b9
remove the on_stack stuff and any retries
2025-10-08 19:57:13 +08:00
George Hotz
78b45838bf
delete the old rangeify path and all the children stuff
2025-10-08 19:50:20 +08:00
qazal
b6835f4134
remove Ops.VIEW and related UOp methods (#12522)
* remove Ops.VIEW and related UOp methods
* update abstractions2.py
* no ShapeTrackers in abstractions2.py
* it's a size 1
2025-10-08 14:47:02 +03:00
George Hotz
3b0b3a2e64
fast RANGEIFY (#12504)
* rtoposort is fast, can replace rangeify with this
* fast rangeify
* work
* fast rangeify works for mnist
* should work
* progress
* pad fix
* FAST
* tests passing
* don't delete those shape ops
* put in rangeify map
* ending ranges fix
* tests
* mstack/mselect no hacks
* move to indexing.py
* touch up tests + add comments
* disable failing test
* actually make the file readable
* failing
* error
2025-10-08 19:38:06 +08:00
qazal
9448924d9e
update gpt2 kernel count tests in CI=0 (#12523)
2025-10-08 14:29:11 +03:00
qazal
c5a1f9f5f9
no ShapeTrackers in multi.py (#12521)
* switch multi to all movement ops
* inline dvars
2025-10-08 14:04:05 +03:00
chenyu
ee0382ad99
remove ShapeTracker.invert (#12520)
2025-10-08 18:37:34 +08:00
chenyu
d5058427ea
remove ShapeTracker.real_size (#12519)
2025-10-08 06:15:29 -04:00
qazal
6f26603f06
delete swizzler.py (#12518)
* delete swizzler
* remove merge_views tests
* don't need rewrites_for_views
* apply_rewrites
2025-10-08 13:02:34 +03:00
qazal
7e0b14243e
delete grouper and kernelize (#12517)
* delete grouper and kernelize
* +sys.setrecursionlimit
2025-10-08 12:27:26 +03:00
chenyu
942022c309
smaller LLAMA_LAYER in Test llama 3 training (#12516)
very slow now
2025-10-08 05:10:51 -04:00
chenyu
e701106a64
remove FUSE_ARANGE (#12511)
it was the default already
2025-10-08 04:54:07 -04:00
qazal
291a19650b
move Kernel dataclass to rangeify (#12510)
2025-10-08 11:30:06 +03:00
qazal
ad49f8148b
switch process_replay to rangeify (#12509)
2025-10-08 11:26:43 +03:00
chenyu
da1f46ff3f
remove RANGEIFY specific test jobs (#12507)
2025-10-08 04:12:04 -04:00
George Hotz
1e567a5cf8
make RANGEIFY=1 the default (#12161)
Co-authored-by: chenyu <chenyu@fastmail.com>
Co-authored-by: Sieds Lykles <93992551+S-Lykles@users.noreply.github.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2025-10-08 03:46:09 -04:00
nimlgen
9e7103647d
amd: rename cmd_id to sqtt_next_cmd_id (#12503)
* amd: rename cmd_id to sqtt_next_cmd_id
* and typo
2025-10-08 15:16:19 +08:00
nimlgen
4a756a37d8
amd: support rocm7 (#12502)
* amd: support rocm7
* mock
2025-10-08 14:30:39 +08:00
qazal
60b6dca5ba
update some tests instead of expect_rangeify_fails (#12500)
* update test_clone_doesnt_dedup to use base
* new_flat_buffer passes
* fix test_reorder_expand
* remove the view stuff
* remove that test, we don't want this view const behavior
* test_setitem_becomes_subbuffer is good
2025-10-08 07:42:31 +03:00
qazal
84597ed53c
early assert for device mismatched asts in rangeify (#12499)
* early assert for device mismatched asts in rangeify
* alt also passes
2025-10-08 07:19:36 +03:00
qazal
2e19354c1c
viz: reorder timeline graphs (#12498)
* viz: reorder timeline graphs
* update test_viz with the new order
2025-10-08 07:10:23 +03:00
George Hotz
d06226b575
fix SPEC and all_tensors iterator (#12496)
2025-10-07 23:18:17 -04:00
qazal
a7cb80bfab
use recursive_property in UOp device (#12477)
* simple failing test with RecursionError
* switch to @recursive_property
* merge 2
* diff
2025-10-08 06:15:05 +03:00
George Hotz
a6d59a0b45
backward_slice to get srcs recursively (#12494)
* change name to backward_slice
* faster check
* clean up comments and names
* comment
2025-10-08 10:31:42 +08:00
chenyu
eb3bc277b3
remove ASSERT_MIN_STEP_TIME in external_benchmark_openpilot (#12495)
should add for compile3 and compile 3 only
2025-10-07 22:13:42 -04:00
qazal
239f9a3029
update viz to not use children [pr] (#12493)
2025-10-08 04:35:01 +03:00
Sieds Lykles
b465c17b56
Revert "UOp.factor and add chain sorting (#12413)" (#12492)
This reverts commit e74be4a140.
2025-10-08 03:20:23 +02:00
George Hotz
945cc46475
delete children tracking from uop (#12491)
* delete children tracking from uop
* uop children no longer exists
* no tracked children
* that test is flaky too
2025-10-08 09:04:14 +08:00
nimlgen
648e5bb223
hcq: do not raise when fini (#12487)
* hcq: do not raise when fini
* Revert "hcq: do not raise when fini"
This reverts commit 44af5f7d05.
* this way
* runtime is fine
* nn
2025-10-07 23:27:03 +08:00
George Hotz
a2345787b9
parents is faster than sparents (#12490)
2025-10-07 21:31:50 +08:00
George Hotz
12c4963489
add more rangeify pm tests (#12488)
2025-10-07 05:45:38 -04:00
George Hotz
403fdfcfd4
check spec in test, cleanup vectorize render (#12484)
2025-10-07 17:05:50 +08:00
qazal
22674798df
assert correctness in test_permuted_assignment [pr] (#12483)
2025-10-07 11:42:22 +03:00
George Hotz
75ce11593c
test_reshape_match should match (#12479)
2025-10-07 16:07:21 +08:00
chenyu
fe774a4319
more skip WINO on benchmark (#12482)
2025-10-07 03:43:51 -04:00
chenyu
8ad5f9e74f
skip slow benchmarks (#12481)
* skip slow benchmarks
padded tc is already slow, rest are slow with rangeify (correct if run locally)
* relax more
2025-10-07 03:28:56 -04:00
George Hotz
ea7672931f
fix test_matmul_relu_cat (#12478)
2025-10-07 02:32:23 -04:00
George Hotz
514d2a0774
merge tagless reshapes (#12474)
* merge tagless reshapes
* cleanup
2025-10-07 13:57:58 +08:00
chenyu
7b48f3cc45
failed test case repro for openpilot model (#12475)
* failed test case repro for openpilot model
* assertEqual
2025-10-07 13:46:43 +08:00
chenyu
a5484b767e
remove skipping cast in simplify_valid [pr] (#12472)
* remove skipping cast in simplify_valid [pr]
unsupported statements are handled in uop_given_valid already. the test failed because (100%x) somehow got simplified
* better test
2025-10-07 00:10:04 -04:00
George Hotz
b4509fba31
thundermittens (#12471)
* thundermittens
* give device a type
2025-10-07 11:47:39 +08:00
George Hotz
0f25b4b289
move frontend dir to nn [pr] (#12470)
2025-10-07 10:42:22 +08:00
qazal
f664bcc8bd
use recursive_property in UOp tracing (#12469)
* test
* simple passing
2025-10-06 21:10:52 +03:00
qazal
1af05dae77
fix rangeify in compile4.py (#12467)
* fix rangeify in compile4.py
* fix type_verify
2025-10-06 13:37:46 +03:00
qazal
76e8a3250c
rangeify: late zero folding (#12464)
* rangeify: late zero folding
* early
* not kernels
* none
* multi
* linter
* mstack is sink comment
* more comment
2025-10-06 12:52:33 +03:00
George Hotz
0c015a24fe
use recursive_property to prevent RecursionError (#12465)
* use recursive_property to prevent RecursionError
* not slower
* fix tests
* faster
* simpler
2025-10-06 15:59:18 +08:00
chenyu
a1881b0c17
update test_chicken (#12466)
logits are close, just numerical
2025-10-06 03:58:44 -04:00