Commit Graph

4476 Commits

Author SHA1 Message Date
chenyu
678f83e41b delete ShapeTracker to_valid_uop and substitute [pr] (#12563) 2025-10-09 05:06:10 -04:00
chenyu
cf8232ec6a clean up more RANGEIFY flag (#12556) 2025-10-09 03:06:48 -04:00
George Hotz
a8a9ac0e95 add more uop gc test (#12553) 2025-10-09 14:49:32 +08:00
chenyu
250f05a776 run some hashing test only on METAL (#12554)
quite slow on CPU
2025-10-09 02:39:49 -04:00
chenyu
ae51bdd06a remove trivial use of RANGEIFY flag (#12550)
some tests need update still
2025-10-09 02:29:38 -04:00
George Hotz
1dc500426e remove restrictions on range ending in indexing (#12543)
* remove restrictions on range ending in indexing

* early simplify

* Revert "early simplify"

This reverts commit 657d9972c2.

* disable const folding tests
2025-10-09 13:53:08 +08:00
chenyu
585bd95b50 fix ruff 0.14.0 [pr] (#12547) 2025-10-09 01:52:30 -04:00
chenyu
43bce1f39f delete View minify [pr] (#12538) 2025-10-08 23:25:53 -04:00
chenyu
20d98b19c3 delete more unused ShapeTracker stuff (#12536) 2025-10-08 23:09:44 -04:00
qazal
bb5671a837 some more ops.py cleanups (#12525)
* remove GroupOp.Meta and st_arg

* inline axis_arg

* only allow .buffer on reshapes (or the buffer)

* gate is the other way

* still want can_pad?

* use op_in_backward_slice_with_self

* .buffer is recursive

* lint

* pathlib there
2025-10-09 06:06:44 +03:00
chenyu
be05028419 move ASSERT_MIN_STEP_TIME to compile3 (#12535)
threshold is current time +20%
2025-10-08 22:16:59 -04:00
chenyu
c4732a18bd update tests that depend on SPLIT_REDUCEOP (#12534) 2025-10-08 21:53:30 -04:00
chenyu
28edea5d67 delete FUSE_CONV_BW (#12527) 2025-10-08 10:41:38 -04:00
George Hotz
0774575442 delete the old rangeify path and all the children stuff (#12524)
* delete the old rangeify path and all the children stuff

* remove the on_stack stuff and any retries

* don't use the p word

* Revert "remove the on_stack stuff and any retries"

This reverts commit 49a2b328b9.
2025-10-08 21:24:04 +08:00
qazal
b6835f4134 remove Ops.VIEW and related UOp methods (#12522)
* remove Ops.VIEW and related UOp methods

* update abstractions2.py

* no ShapeTrackers in abstractions2.py

* it's a size 1
2025-10-08 14:47:02 +03:00
George Hotz
3b0b3a2e64 fast RANGEIFY (#12504)
* rtoposort is fast, can replace rangeify with this

* fast rangeify

* work

* fast rangeify works for mnist

* should work

* progress

* pad fix

* FAST

* tests passing

* don't delete those shape ops

* put in rangeify map

* ending ranges fix

* tests

* mstack/mselect no hacks

* move to indexing.py

* touch up tests + add comments

* disable failing test

* actually make the file readable

* failing

* error
2025-10-08 19:38:06 +08:00
qazal
9448924d9e update gpt2 kernel count tests in CI=0 (#12523) 2025-10-08 14:29:11 +03:00
chenyu
ee0382ad99 remove ShapeTracker.invert (#12520) 2025-10-08 18:37:34 +08:00
chenyu
d5058427ea remove ShapeTracker.real_size (#12519) 2025-10-08 06:15:29 -04:00
qazal
6f26603f06 delete swizzler.py (#12518)
* delete swizzler

* remove merge_views tests

* don't need rewrites_for_views

* apply_rewrites
2025-10-08 13:02:34 +03:00
qazal
7e0b14243e delete grouper and kernelize (#12517)
* delete grouper and kernelize

* +sys.setrecursionlimit
2025-10-08 12:27:26 +03:00
chenyu
e701106a64 remove FUSE_ARANGE (#12511)
it was the default already
2025-10-08 04:54:07 -04:00
qazal
ad49f8148b switch process_replay to rangeify (#12509) 2025-10-08 11:26:43 +03:00
nimlgen
4a756a37d8 amd: support rocm7 (#12502)
* amd: support rocm7

* mock
2025-10-08 14:30:39 +08:00
qazal
60b6dca5ba update some tests instead of expect_rangeify_fails (#12500)
* update test_clone_doesnt_dedup to use base

* new_flat_buffer passes

* fix test_reorder_expand

* remove the view stuff

* remove that test, we don't want this view const behavior

* test_setitem_becomes_subbuffer is good
2025-10-08 07:42:31 +03:00
qazal
84597ed53c early assert for device mismatched asts in rangeify (#12499)
* early assert for device mismatched asts in rangeify

* alt also passes
2025-10-08 07:19:36 +03:00
qazal
2e19354c1c viz: reorder timeline graphs (#12498)
* viz: reorder timeline graphs

* update test_viz with the new order
2025-10-08 07:10:23 +03:00
qazal
a7cb80bfab use recursive_property in UOp device (#12477)
* simple failing test with RecursionError

* switch to @recursive_property

* merge 2

* diff
2025-10-08 06:15:05 +03:00
Sieds Lykles
b465c17b56 Revert "UOp.factor and add chain sorting (#12413)" (#12492)
This reverts commit e74be4a140.
2025-10-08 03:20:23 +02:00
George Hotz
945cc46475 delete children tracking from uop (#12491)
* delete children tracking from uop

* uop children no longer exists

* no tracked children

* that test is flaky too
2025-10-08 09:04:14 +08:00
George Hotz
12c4963489 add more rangeify pm tests (#12488) 2025-10-07 05:45:38 -04:00
George Hotz
403fdfcfd4 check spec in test, cleanup vectorize render (#12484) 2025-10-07 17:05:50 +08:00
qazal
22674798df assert correctness in test_permuted_assignment [pr] (#12483) 2025-10-07 11:42:22 +03:00
George Hotz
75ce11593c test_reshape_match should match (#12479) 2025-10-07 16:07:21 +08:00
George Hotz
ea7672931f fix test_matmul_relu_cat (#12478) 2025-10-07 02:32:23 -04:00
chenyu
7b48f3cc45 failed test case repro for openpilot model (#12475)
* failed test case repro for openpilot model

* assertEqual
2025-10-07 13:46:43 +08:00
chenyu
a5484b767e remove skipping cast in simplify_valid [pr] (#12472)
* remove skipping cast in simplify_valid [pr]

unsupported statements are handled in uop_given_valid already. the test failed because (100%x) somehow got simplified

* better test
2025-10-07 00:10:04 -04:00
George Hotz
0f25b4b289 move frontend dir to nn [pr] (#12470) 2025-10-07 10:42:22 +08:00
qazal
f664bcc8bd use recursive_property in UOp tracing (#12469)
* test

* simple passing
2025-10-06 21:10:52 +03:00
qazal
76e8a3250c rangeify: late zero folding (#12464)
* rangeify: late zero folding

* early

* not kernels

* none

* multi

* linter

* mstack is sink comment

* more comment
2025-10-06 12:52:33 +03:00
chenyu
a1881b0c17 update test_chicken (#12466)
logits are close, just numerical
2025-10-06 03:58:44 -04:00
qazal
1b1978b9c0 early copy fixup (#12463)
* simple failing test

* early copy fixup
2025-10-06 06:38:29 +03:00
chenyu
c1e85f699c multi test case for sharded ring allreduce (#12462)
* multi test case for sharded ring allreduce

triggers `children not making progress` with RANGEIFY

* expect_rangeify_fails
2025-10-05 23:18:24 -04:00
George Hotz
46e8ea15c1 split pm_substitute_recurse (#12460) 2025-10-05 21:35:50 -04:00
qazal
6ad9a688ed add failing test after "pend substitutes for speed" (#12457)
* add failing substitute test

* expect_rangeify_fails
2025-10-05 16:10:04 +03:00
qazal
4b60121498 fix bmnist torch with RANGEIFY=1 (#12442)
* fix bmnist torch with RANGEIFY=1

* alt

* test and comment

* this was always wrong

* simple failing test for rangeify

* simple upat to match the old behavior
2025-10-05 12:34:27 +03:00
George Hotz
b5f31d7505 earlier seen children (#12451) 2025-10-05 15:55:13 +08:00
qazal
865d5796f8 add a test for untested Tensor.assign behavior (#12448)
* add a test for untested Tensor.assign behavior

* better
2025-10-04 12:44:56 +03:00
Sieds Lykles
e74be4a140 UOp.factor and add chain sorting (#12413)
* add ordering

* fix some tests

* fix more tests

* shorten comment

* update test

* add rule and test

* add rule and test

* remove check

* use fold_divmod_congruence instead of simplify

* adjust tests

* shorten line

* new algo

* add test

* add function to un-nest the div

* add UOp.factor

* test UOp.factor

* uop_given_valid tries to factor simplex expression

* shorten line

* symbolic_flat is back

* change that back

* fix those new tests

* new rule for ordering

* factor multiple factors

* no symbolic_flat

* symbolic_flat to there

* move that back

* fix imports

* merge correctly

* linter happy

* add rule

* add a test

* cleanup

* revert that for now

* UOp.factor returns self instead of None

* try all_candidates

* remove or_else

* post index symbolic

* add test

* make this closer to the original

* increase mac hlb_cifar min step time

* add some ordering tests

* cleanup

* increase pytest timeout time

* check dtype
2025-10-04 06:05:38 +02:00
Sieds Lykles
394dc24110 post index symbolic (#12446)
* post index symbolic

* add test
2025-10-03 23:23:03 +02:00