George Hotz
744af193f0
remove ScheduleItem and merge it with ExecItem (#13759)
...
* remove ExecItem and merge it with ScheduleItem
* less diff
* fix issues
* min diff
* don't change bufs in _lower
* min diff
* update
* revert
* fixes
* diff
2025-12-19 17:04:24 -04:00
George Hotz
962d980919
fuse hasn't worked since rangeify, remove it (#13057)
2025-11-02 14:01:52 +08:00
chenyu
285534ce64
delete DONT_REALIZE_EXPAND and DONT_GROUP_REDUCES (#12744)
...
does nothing now
2025-10-16 14:11:33 -04:00
chenyu
cf8232ec6a
clean up more RANGEIFY flag (#12556)
2025-10-09 03:06:48 -04:00
George Hotz
44558a37f7
fix some rangeify tests (#12370)
...
* fix bad range merges
* fix rng
* fix uop gc
* fix some rangeify tests
* now that needs rangeify 2 also
2025-09-30 20:12:08 +08:00
George Hotz
fd2e4f2353
failing rng test (#12328)
...
* tighten spec: fixup devectorizer types / rangeify
* tighten assign
* failing rangeify test
* simpler
* otherwise contig
* more tolerance cause rng seed changed
2025-09-29 16:06:45 +08:00
Sieds Lykles
29f0886395
skip test_softmax_fusion tests if RANGEIFY==1 (#12310)
2025-09-27 05:57:40 +02:00
ttomsa
220a2a88d7
a*(1/b) -> a/b on LLVM, CPU (#11743)
...
* add fdiv rewrite
* :)
* use float_lop
* use reciprocal()
* revert
* move to decompositions
2025-08-20 09:35:10 -04:00
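The rewrite named in 220a2a88d7 replaces a multiply-by-reciprocal with a single division. The following is a minimal sketch in plain Python, not tinygrad's implementation; `mul_recip` and `fdiv` are hypothetical helper names used only for illustration. It checks that the two forms agree to within floating-point rounding, and counts how often they are not bit-identical.

```python
import math
import random

def mul_recip(a: float, b: float) -> float:
    # Original form: one reciprocal, then one multiply (two roundings).
    return a * (1.0 / b)

def fdiv(a: float, b: float) -> float:
    # Rewritten form: a single division (one rounding).
    return a / b

random.seed(0)
pairs = [(random.uniform(-100, 100), random.uniform(1, 100)) for _ in range(1000)]

# Both forms compute the same value up to rounding error.
assert all(math.isclose(mul_recip(a, b), fdiv(a, b), rel_tol=1e-12) for a, b in pairs)

# They are not guaranteed to match bit for bit, so count the differences.
mismatches = sum(mul_recip(a, b) != fdiv(a, b) for a, b in pairs)
print(f"{mismatches} of {len(pairs)} pairs differ in the last bits")
```

Any test that compares the two forms should therefore use a tolerance rather than exact equality.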
chenyu
4666df71c1
fix test_fuse_and_tc_opt (#11699)
2025-08-16 21:10:53 -04:00
geohotstan
3d7c35d615
add fuse and tc opt bug repro (#11695)
...
* FINALLY HAVE A SMALL REPRO OH BOY
* show failure in CI
* cleaner?
* 1 possible fix
* Revert "1 possible fix"
This reverts commit 9e0fd215dd.
2025-08-16 18:24:49 -04:00
George Hotz
9764c6cdee
fix mismatch reduce, try 2 (#11560)
...
* fix mismatch reduce, try 2
* fix heuristic
* delete that test
* don't start allowing ones
2025-08-07 07:57:58 -07:00
George Hotz
a1aa5670aa
Revert "fix mismatch reduce (#11547)" (#11549)
...
This reverts commit 49d21a9055.
2025-08-06 22:43:15 -07:00
George Hotz
49d21a9055
fix mismatch reduce (#11547)
...
* fix mismatch reduce
* cleanups
* fix shape
* fix mypy
* resolve
2025-08-06 21:12:51 -07:00
George Hotz
7c5e115747
test_mismatch_reduce (#11538)
2025-08-06 10:02:14 -07:00
chenyu
f02720ca2d
fix fuse gate_contiguous unique (#11504)
2025-08-04 23:43:31 -04:00
chenyu
e0106b6b25
1/(x*c) -> (1/c)*(1/x) (#11491)
...
example: 2*(2*a).reciprocal() -> a.reciprocal()
# TODO: bounds for reciprocal
# TODO: should z3 work?
2025-08-03 23:35:46 -04:00
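The rewrite in e0106b6b25 rests on the identity 1/(x*c) = (1/c)*(1/x), which is what lets the constant factors in the commit's example cancel. The following is a minimal sketch in plain Python with exact rationals, not tinygrad's rewrite rule; `recip` and `rewritten` are hypothetical helper names used only for illustration.

```python
from fractions import Fraction

def recip(v: Fraction) -> Fraction:
    # Exact reciprocal; Fraction avoids floating-point rounding.
    return 1 / v

def rewritten(x: Fraction, c: Fraction) -> Fraction:
    # Right-hand side of the rule: (1/c) * (1/x).
    return recip(c) * recip(x)

# The identity holds for any nonzero x and c.
for x in (Fraction(3), Fraction(-7, 2), Fraction(1, 5)):
    for c in (Fraction(2), Fraction(-3), Fraction(5, 4)):
        assert recip(x * c) == rewritten(x, c)

# The example from the commit message: 2*(2*a).reciprocal() -> a.reciprocal()
a = Fraction(3)
assert 2 * recip(2 * a) == recip(a)
```

With the constant pulled out as (1/c), the outer factor 2 meets 1/2 and folds away, leaving only the reciprocal of `a`.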
George Hotz
49a2583584
real new lowerer (#11419)
...
* real new lowerer
* fix group for reduce
* skip missing ranges
* fix wmma and unroll/contract
* real fix for wmma
* disable that test
* fix if gate
* simpler
* flash attention fusion works
* no end barriers
* still broken
* flash attention finally works
2025-07-29 15:35:51 -07:00
George Hotz
81ef879da3
non recursive top_down_rewrite (#10729)
...
* non recursive top_down_rewrite
* nicer algorithm
* rewrite bottom up also
* only top down is broken?
* simpler iterative algo
* no recursion errors
* top down and bottom up
* unified rewrite
* simpler rewrite
* clean up comments
* move that comment
2025-06-09 16:33:04 -07:00
George Hotz
eaceafecae
do fusion locally (#10095)
...
* do fusion locally
* oops, that's the right way
* explicit delete closure
2025-04-28 20:45:37 -04:00
George Hotz
dd52951dd0
fix single kernel softmax with cast (#9842)
...
* fix single kernel softmax with cast
* tolerate none
* 3e-4
* skip on dtype
2025-04-11 12:12:02 +08:00
chenyu
7fa5f29582
add test_embedding to test_softmax_fusion (#9832)
2025-04-10 08:25:34 -04:00
George Hotz
53f0b2aad7
fix infinite loop in flash attention (#9827)
...
* fix infinite loop in flash attention
* get_contraction_with_reduce
* skip that test
* SINGLE_KERNEL_SOFTMAX + fix multi
* default IGNORE_OOB
* print change
2025-04-10 20:06:44 +08:00
George Hotz
fce432d2e3
Ops.FUSE makes softmax a single kernel (#9808)
...
* KERNELIZE makes softmax a single kernel
* single kernel works
* softmax works
* broken
* correct
* skip that test
* kernelize tests
* rename to fuse
* better reduce_push_add_ones code
* correct now
* cleanups
* oops
* return None if we can't push ones
* rename + docs
* atol fixes group
* flash attention broken test
2025-04-09 22:56:28 +08:00
chenyu
7a28133b37
failed test for single softmax backward (#9778)
...
getting RecursionError with DONT_GROUP_REDUCES=1
2025-04-08 02:36:32 -04:00
George Hotz
fefee5d3ab
single kernel softmax (#9776)
...
* real single kernel softmax
* cleanup
* fix blockend insertion
* add to bert test
2025-04-08 12:35:48 +08:00