George Hotz
b987b8b22a
work
2025-10-27 15:29:26 +08:00
Sieds Lykles
7f798a9630
Cleanup const buffers ( #12829 )
...
* split pm_cleanups
* update test_schedule
* shrink when we remove bufferize
* don't do shrink if shape is empty
* update tests
* remove *1 from metadata
* deal with the noop bufferize
* only noop on cvar
* cleanup
* fix if
* rename
2025-10-21 14:53:49 +02:00
chenyu
fcdf4ab37e
remove a contiguous in LARS ( #12770 )
2025-10-17 17:07:30 -04:00
George Hotz
062a6d68d7
test flash attention backward ( #12762 )
...
* test flash attention backward
* TODO: fix pcontig
* end ranges
* render colors
* very big
* multiout at every level
* reset ending ranges
* fix tests
* ugh
2025-10-17 23:15:59 +08:00
chenyu
9561803cb0
fix assert in test_schedule ( #12745 )
...
* fix assert in test_schedule
updated kernel counts and some old tests
* fix
2025-10-16 15:39:50 -04:00
chenyu
285534ce64
delete DONT_REALIZE_EXPAND and DONT_GROUP_REDUCES ( #12744 )
...
does nothing now
2025-10-16 14:11:33 -04:00
George Hotz
592e86f6f5
remove UOp.st ( #12716 )
...
* remove UOp.st
* fix tests
* torch backend disable
2025-10-16 14:44:09 +08:00
George Hotz
612e3d6143
replace mop arg with vectorized index ( #12695 )
...
* replace mop arg with vectorized index
* tests passing
* better viz
* no compile4
2025-10-15 20:50:06 +08:00
George Hotz
cab034b863
improve typing ( #12611 )
...
* improve typing and bump to 3.11
* no need for Self yet
* improve typing
* binop also
2025-10-11 16:20:23 +08:00
chenyu
f2c3a72b0c
remove RANGEIFY flag [pr] ( #12577 )
2025-10-09 21:52:54 -04:00
qazal
b86ad6053a
test_schedule independent of RANGEIFY flag ( #12568 )
...
* test_schedule independent of RANGEIFY flag
* comment for expectedFailure + test_cast_padded_view
* test_cast_padded_const works
* don't use full_shape it's fine
* add todos for the rest
2025-10-09 20:00:50 +03:00
chenyu
ae51bdd06a
remove trivial use of RANGEIFY flag ( #12550 )
...
some tests need update still
2025-10-09 02:29:38 -04:00
qazal
bb5671a837
some more ops.py cleanups ( #12525 )
...
* remove GroupOp.Meta and st_arg
* inline axis_arg
* only allow .buffer on reshapes (or the buffer)
* gate is the other way
* still want can_pad?
* use op_in_backward_slice_with_self
* .buffer is recursive
* lint
* pathlib there
2025-10-09 06:06:44 +03:00
chenyu
c4732a18bd
update tests that depend on SPLIT_REDUCEOP ( #12534 )
2025-10-08 21:53:30 -04:00
chenyu
28edea5d67
delete FUSE_CONV_BW ( #12527 )
2025-10-08 10:41:38 -04:00
qazal
b6835f4134
remove Ops.VIEW and related UOp methods ( #12522 )
...
* remove Ops.VIEW and related UOp methods
* update abstractions2.py
* no ShapeTrackers in abstractions2.py
* it's a size 1
2025-10-08 14:47:02 +03:00
George Hotz
3b0b3a2e64
fast RANGEIFY ( #12504 )
...
* rtoposort is fast, can replace rangeify with this
* fast rangeify
* work
* fast rangeify works for mnist
* should work
* progress
* pad fix
* FAST
* tests passing
* don't delete those shape ops
* put in rangeify map
* ending ranges fix
* tests
* mstack/mselect no hacks
* move to indexing.py
* touch up tests + add comments
* disable failing test
* actually make the file readable
* failing
* error
2025-10-08 19:38:06 +08:00
qazal
6f26603f06
delete swizzler.py ( #12518 )
...
* delete swizzler
* remove merge_views tests
* don't need rewrites_for_views
* apply_rewrites
2025-10-08 13:02:34 +03:00
qazal
7e0b14243e
delete grouper and kernelize ( #12517 )
...
* delete grouper and kernelize
* +sys.setrecursionlimit
2025-10-08 12:27:26 +03:00
chenyu
e701106a64
remove FUSE_ARANGE ( #12511 )
...
it was the default already
2025-10-08 04:54:07 -04:00
qazal
60b6dca5ba
update some tests instead of expect_rangeify_fails ( #12500 )
...
* update test_clone_doesnt_dedup to use base
* new_flat_buffer passes
* fix test_reorder_expand
* remove the view stuff
* remove that test, we don't want this view const behavior
* test_setitem_becomes_subbuffer is good
2025-10-08 07:42:31 +03:00
qazal
84597ed53c
early assert for device mismatched asts in rangeify ( #12499 )
...
* early assert for device mismatched asts in rangeify
* alt also passes
2025-10-08 07:19:36 +03:00
qazal
76e8a3250c
rangeify: late zero folding ( #12464 )
...
* rangeify: late zero folding
* early
* not kernels
* none
* multi
* linter
* mstack is sink comment
* more comment
2025-10-06 12:52:33 +03:00
qazal
1b1978b9c0
early copy fixup ( #12463 )
...
* simple failing test
* early copy fixup
2025-10-06 06:38:29 +03:00
chenyu
c1e85f699c
multi test case for sharded ring allreduce ( #12462 )
...
* multi test case for sharded ring allreduce
triggers `children not making progress` with RANGEIFY
* expect_rangeify_fails
2025-10-05 23:18:24 -04:00
George Hotz
46e8ea15c1
split pm_substitute_recurse ( #12460 )
2025-10-05 21:35:50 -04:00
qazal
6ad9a688ed
add failing test after "pend substitutes for speed" ( #12457 )
...
* add failing substitute test
* expect_rangeify_fails
2025-10-05 16:10:04 +03:00
qazal
13a25b2e67
rangeify: don't shape INDEX on kernelize ( #12417 )
2025-10-02 09:45:37 +03:00
qazal
6fc6b51b59
fix limit_bufs with kernelize ( #12415 )
2025-10-02 07:49:11 +03:00
George Hotz
89bed28716
split reduceop ( #12404 )
...
* some rangeify tests fixed
* bring split reduceop to rangeify
* fix tests
2025-10-01 18:45:16 +08:00
qazal
90b1c0dd96
rangeify: test_where_fold kernel count ( #12379 )
...
* rangeify: test_where_fold kernel count
* get these from the index
* replace ranges
* fine
* movement ops
* diff
* better
2025-10-01 09:35:12 +03:00
nimlgen
2c397eb2a2
rangeify: buf limit ( #12336 )
...
* limit bufs
* g
* fix buffer limit
* um?
* fix
* only these?
* typo
* f
* cleaner
2025-09-30 14:59:47 +03:00
qazal
4ff7f20b9d
rangeify: fix kernelize ( #12357 )
2025-09-30 10:10:08 +03:00
George Hotz
ab6b0d3a21
enable cleanup_dead_axes ( #12351 )
...
* enable cleanup_dead_axes
* don't mess with user contig
* correct tag behavior
* double reshape isn't correct
* block on assign too
* skip messing with symbolic
* Fix tests
* disable RANGEIFY=2
* test w rangeify
2025-09-30 14:09:39 +08:00
wozeparrot
2a0caa09c2
push copy to disk ( #12348 )
2025-09-29 21:55:05 -07:00
qazal
250cb10e8f
rangeify permuted assign ( #12299 )
...
* enable RANGEIFY=1 test_assign
* work
* rangeify=0 asserts this ast
* remove that
* beta test, it's correct though
* skip multi
* matches torch/np output
* memcopy without memcopy
* can remove this
* rangeify isn't silently wrong anymore
* diff cleanup
* use UOp toposort instead of global tags
* actual assert TestRangeifyAssign
* step
* work
* this isn't optimizing away now
* some todos
* test fusion schedule
* typo
* dedup idxs
* cleaner
* pre
* work
* diff
2025-09-29 07:27:57 +03:00
Sieds Lykles
ed90de6583
Revert "Bufferize early, fix "children not making progress" on big graphs (#1…" ( #12318 )
...
This reverts commit 6f1cf717de.
2025-09-28 19:10:21 +02:00
Sieds Lykles
6f1cf717de
Bufferize early, fix "children not making progress" on big graphs ( #12308 )
...
* bufferize children early
* cleaner
* fix types
* lower number of reduceops
* test openpilot
2025-09-27 04:17:15 +02:00
nimlgen
f5eb46a3d9
fix limit buf metal on non rangeify ( #12303 )
...
* add failure test for limit buf on non rangeify
* correct metal
* correct
* hm
2025-09-26 11:06:28 +03:00
qazal
6c9d8c7e41
rangeify: simplify noop copy ( #12289 )
2025-09-24 17:01:23 +03:00
nimlgen
02a7b7fe48
rangeify: fix test_setitem ( #12269 )
...
* rangeify: fix test_setitem
* um?
* better?
* simple where folding
* f
* revert
* x
2025-09-23 20:42:36 +03:00
nimlgen
b53a266254
rangeify: fix test_optim ( #12262 )
...
* rangeify: fix test_optim
* add to cl?
* these are good now
2025-09-21 18:08:35 +03:00
qazal
bb59eed82f
rangeify: don't tag consts, they are global ( #12247 )
...
* rangeify: don't tag consts, they are global
* don't map movement ops
* sym failing test
* remove that
* update comment
* simpler test
* work
2025-09-19 15:25:03 +03:00
qazal
825f148469
rangeify: fix copy size mismatch errs ( #12232 )
...
* rangeify: fix copy size mismatch errs
* const folding can happen in sym
assert it
* shippable
* rangeify copy is completely wrong
* pre_bufferize
* tag bufferize
* pre back
2025-09-18 18:23:32 +03:00
qazal
dbbc261075
rangeify: fix COPY simplifier ( #12233 )
2025-09-18 14:35:33 +03:00
qazal
525f80e0d2
rangeify: enable putting consts back in the tensor graph ( #12225 )
...
* rangeify: enable putting consts back in the tensor graph
* work
* sym in ci
2025-09-17 19:45:04 +03:00
qazal
7733c217c5
remove spam comments in test_schedule ( #12224 )
2025-09-17 18:24:55 +03:00
qazal
d917895569
map out rangeify errors in test_schedule ( #12211 )
...
* map out rangeify errors in test_schedule
* skip that
* add to ci
2025-09-17 09:10:28 +03:00
qazal
122a50fe8c
assert kernel count ( #12205 )
2025-09-16 14:24:39 +03:00
qazal
02054b53fe
remove tests that predate the uop spec ( #12168 )
...
* remove tests that predate the uop spec
* const src
* for RANGEIFY=1
* update with bind
* remove import
2025-09-14 18:47:42 +03:00