25 Commits

Author SHA1 Message Date
George Hotz
744af193f0 remove ScheduleItem and merge it with ExecItem (#13759)
* remove ExecItem and merge it with ScheduleItem

* less diff

* fix issues

* min diff

* don't change bufs in _lower

* min diff

* update

* revert

* fixes

* diff
2025-12-19 17:04:24 -04:00
George Hotz
962d980919 fuse hasn't worked since rangeify, remove it (#13057) 2025-11-02 14:01:52 +08:00
chenyu
285534ce64 delete DONT_REALIZE_EXPAND and DONT_GROUP_REDUCES (#12744)
does nothing now
2025-10-16 14:11:33 -04:00
chenyu
cf8232ec6a clean up more RANGEIFY flag (#12556) 2025-10-09 03:06:48 -04:00
George Hotz
44558a37f7 fix some rangeify tests (#12370)
* fix bad range merges

* fix rng

* fix uop gc

* fix some rangeify tests

* now that needs rangeify 2 also
2025-09-30 20:12:08 +08:00
George Hotz
fd2e4f2353 failing rng test (#12328)
* tighten spec: fixup devectorizer types / rangeify

* tighten assign

* failing rangeify test

* simpler

* otherwise contig

* more tolerance cause rng seed changed
2025-09-29 16:06:45 +08:00
Sieds Lykles
29f0886395 skip test_softmax_fusion tests if RANGEIFY==1 (#12310) 2025-09-27 05:57:40 +02:00
ttomsa
220a2a88d7 a*(1/b) -> a/b on LLVM, CPU (#11743)
* add fdiv rewrite

* :)

* use float_lop

* use reciprocal()

* revert

* move to decompositions
2025-08-20 09:35:10 -04:00
chenyu
4666df71c1 fix test_fuse_and_tc_opt (#11699) 2025-08-16 21:10:53 -04:00
geohotstan
3d7c35d615 add fuse and tc opt bug repro (#11695)
* FINALLY HAVE A SMALL REPRO OH BOY

* show failure in CI

* cleaner?

* 1 possible fix

* Revert "1 possible fix"

This reverts commit 9e0fd215dd.
2025-08-16 18:24:49 -04:00
George Hotz
9764c6cdee fix mismatch reduce, try 2 (#11560)
* fix mismatch reduce, try 2

* fix heuristic

* delete that test

* don't start allowing ones
2025-08-07 07:57:58 -07:00
George Hotz
a1aa5670aa Revert "fix mismatch reduce (#11547)" (#11549)
This reverts commit 49d21a9055.
2025-08-06 22:43:15 -07:00
George Hotz
49d21a9055 fix mismatch reduce (#11547)
* fix mismatch reduce

* cleanups

* fix shape

* fix mypy

* resolve
2025-08-06 21:12:51 -07:00
George Hotz
7c5e115747 test_mismatch_reduce (#11538) 2025-08-06 10:02:14 -07:00
chenyu
f02720ca2d fix fuse gate_contiguous unique (#11504) 2025-08-04 23:43:31 -04:00
chenyu
e0106b6b25 1/(x*c) -> (1/c)*(1/x) (#11491)
example: 2*(2*a).reciprocal() -> a.reciprocal()

# TODO: bounds for reciprocal
# TODO: should z3 work?
2025-08-03 23:35:46 -04:00
George Hotz
49a2583584 real new lowerer (#11419)
* real new lowerer

* fix group for reduce

* skip missing ranges

* fix wmma and unroll/contract

* real fix for wmma

* disable that test

* fix if gate

* simpler

* flash attention fusion works

* no end barriers

* still broken

* flash attention finally works
2025-07-29 15:35:51 -07:00
George Hotz
81ef879da3 non recursive top_down_rewrite (#10729)
* non recursive top_down_rewrite

* nicer algorithm

* rewrite bottom up also

* only top down is broken?

* simpler iterative algo

* no recursion errors

* top down and bottom up

* unified rewrite

* simpler rewrite

* clean up comments

* move that comment
2025-06-09 16:33:04 -07:00
George Hotz
eaceafecae do fusion locally (#10095)
* do fusion locally

* oops, that's the right way

* explicit delete closure
2025-04-28 20:45:37 -04:00
George Hotz
dd52951dd0 fix single kernel softmax with cast (#9842)
* fix single kernel softmax with cast

* tolerate none

* 3e-4

* skip on dtype
2025-04-11 12:12:02 +08:00
chenyu
7fa5f29582 add test_embedding to test_softmax_fusion (#9832) 2025-04-10 08:25:34 -04:00
George Hotz
53f0b2aad7 fix infinite loop in flash attention (#9827)
* fix infinite loop in flash attention

* get_contraction_with_reduce

* skip that test

* SINGLE_KERNEL_SOFTMAX + fix multi

* default IGNORE_OOB

* print change
2025-04-10 20:06:44 +08:00
George Hotz
fce432d2e3 Ops.FUSE makes softmax a single kernel (#9808)
* KERNELIZE makes softmax a single kernel

* single kernel works

* softmax works

* broken

* correct

* skip that test

* kernelize tests

* rename to fuse

* better reduce_push_add_ones code

* correct now

* cleanups

* oops

* return None if we can't push ones

* rename + docs

* atol fixes group

* flash attention broken test
2025-04-09 22:56:28 +08:00
chenyu
7a28133b37 failed test for single softmax backward (#9778)
getting RecursionError with DONT_GROUP_REDUCES=1
2025-04-08 02:36:32 -04:00
George Hotz
fefee5d3ab single kernel softmax (#9776)
* real single kernel softmax

* cleanup

* fix blockend insertion

* add to bert test
2025-04-08 12:35:48 +08:00