Commit Graph

11 Commits

Author SHA1 Message Date
George Hotz
bcafa72b7f use tags instead of graph_rewrite_map in rangeify (#12110)
* use tags instead of graph_rewrite_map in rangeify

* new style, add realize

* metadata works

* simple failure

* fix

* loops

* stuff becomes a NOOP when you remove it

* stuff becomes a NOOP when you remove it

* tags on bufferize

* bmnist works

* locals don't work

* shippable

* fix some tests

* simpler map_realize

* remove const hack

* debuggable test

* broke

* assign test

* straight up bug

* wooo it passes

* sink shouldn't be there

* fix ops

* bmnist

* kv cache ish

* Set RANGEIFY context variable to 0

* should work normal

* better

* types

* hacks to fix test_symbolic

* pm_add_buffers

* tests should pass
2025-09-14 11:39:01 +08:00
George Hotz
3ef0e5e01e rangeify: use Ops.REALIZE and not Ops.CONTIGUOUS if it's added by system (#12111)
* rangeify: use Ops.REALIZE and not Ops.CONTIGUOUS if it's added by system

* fix contig + BufferizeOpts

* no outerworld
2025-09-11 11:56:59 +08:00
George Hotz
d4eba5800d rangeify cost function infrastructure (#12091)
* one call to hc opt

* does that pass?

* add cost function to rangeify

* test

* more test

* gate thread

* bufferize has shape

* ish

* match old behavior

* no ci there
2025-09-11 07:19:53 +08:00
George Hotz
5cf42dc4db add Scheduler to replace Kernel with POSTOPT=2 (#11924)
* ** simple kernel to replace Kernel for postopt

* support old

* fix beam

* beaming

* beam on old

* bring tensor cores back

* raise

* postbeam

* test ops passes on mac

* skip that

* postopt default

* gate that

* fix tensor cores

* a few test fixes

* dsp fix

* tc fix

* loop

* support swap

* test_gemv

* fix beam for variable

* test opts from high level stuff

* range annoying

* compile slow

* metal slow

* better beam

* no POSTBEAM

* fix nolocals

* hc opt mostly works

* put that back

* lil

* some work

* fix that

* POSTOPT 2

* fix tests

* no postopt 2

* work

* back

* padded tensors cores

* shift_to

* postopt 0 passes?

* write PADTO

* fix padded tensor cores

* compare hcopt

* 18000 lines

* should pass tests

* fix rangeify

* put types back
2025-09-03 19:23:30 -07:00
George Hotz
0dfca4e74b add failing test for rangeify setitem (#11954)
2025-09-01 16:24:35 -07:00
George Hotz
afad7d0cd1 remove dtype from range, it will be dtypes.index soon [pr] (#11914)
* remove dtype from range, it will be dtypes.index soon [pr]

* a few more
2025-08-29 09:52:07 -07:00
George Hotz
b9b438c516 small updates from postopt (#11903)
* tests from postopt

* modernize

* skip lin tests

* that's fixed?

* skip, not failure
2025-08-28 12:34:52 -07:00
George Hotz
b268755d51 small changes from postopt (#11854)
2025-08-26 11:56:16 -07:00
George Hotz
9832599c9e test_vmap + permute isn't a sint (#11783)
* test_vmap + permute isn't a sint

* order
2025-08-21 22:39:35 -07:00
George Hotz
bb8de51e5f remove unused early cleanups + contig w range [pr] (#11780)
* remove unused early cleanups [pr]

* contiguous with range

* woah, this works
2025-08-21 20:04:45 -07:00
George Hotz
9635592141 ** rangeify, try 3 (#11683)
* ** rangeify, try 3

* bring that over

* bufferize, don't use contig tag

* work

* ish

* fix rangeify

* flash attention is back

* fix rangeify tests

* stuff passes

* fix test_log_softmax

* more stuff passes

* progress children

* new endrange solution

* progress

* progress counter

* basic assign

* contigs only

* symbolic in schedule

* unbind_kernel

* late children

* ops fixed

* beautiful mnist is close

* that seems to work

* mnist works

* improve names

* fix bmnist

* no pcontig

* testing backward

* work

* clone movement ops

* new_range helper

* MBLOCK/MERGE

* ops tests pass

* revert mblock stuff

* cleanups...but it breaks ops

* remove reindex

* hack for relu

* disable the hacks

* more hacks

* upd

* mostly works with cleanups disabled

* ndr

* ops tests pass

* terrible hacks for indexing to work

* context mismatch

* pcontig

* split pcontig v contig

* z3 trunc

* null

* no fuse in rangeify

* ops test passes

* lnorm

* fix assign

* nd rangeify

* both should work

* tests for rangeify

* cleanups

* stores pass the pointer through

* disable pcontig for now

* PARTIAL_CONTIG is a flag
2025-08-20 14:22:44 -07:00