29 Commits

Author SHA1 Message Date
George Hotz
fa14f7b4fd switch contract arg to match expand arg [run_process_replay] (#5667)
* switch contract arg to match expand arg [run_process_replay]

* support multiaxis contract too, it's easy

* cancel contract/expand
2024-07-23 18:08:33 -07:00
George Hotz
a85493bdbe multiaxis contract test 2024-07-23 15:09:15 -07:00
chenyu
16c27ae400 update UOp.SPECIAL arg spec [run_process_replay] (#5661)
* update UOp.SPECIAL arg spec [run_process_replay]

from `(0, "gid0", 4)` to just `("gid0", 4)`. closer to a Variable

* fix ptx
2024-07-23 16:58:12 -04:00
chenyu
24505199fb UOp.const(x.dtype, y) -> x.const(y) [run_process_replay] (#5642) 2024-07-22 17:09:40 -04:00
chenyu
b991097d41 move UPat and PatternMatcher from uopgraph.py to uops.py (#5597)
* move UPat and PatternMatcher from uopgraph.py to uops.py

towards instant UOps rewrite on UOp.alu

[run_process_replay]

* fix imports
2024-07-19 19:28:24 -04:00
kormann
2c4add6844 pretty print lazy op per default (#5505)
* pretty lop

* min diff

* walrus

* fix

* min diff

* simplify

* pretty helper function

* ws

* pretty uop upat

* tests

* stricter tests

* test passes

* ws

* stronger upat test

* delete print_tree

* min diff

* stricter exp test

* fix merge

* stronger uops eval test

* +readable and deep upat test

* +readable and deep upat test

* sort inv fix

* fix

* revert allowed_len
2024-07-18 09:34:08 -07:00
George Hotz
1242b302fa expand UOps with rewrite rules (#5501)
* expand UOps with rewrite rules [run_process_replay]

* progress

* much closer

* close, way less bugs

* bunch of expander tests

* fix contract

* ops tests pass

* fix barrier

* mostly passing

* bitcast in expanded ops

* support more expand merges

* all tests pass maybe

* fix empty EXPAND

* fix LIN fuzzing

* add ALL_SAME assert

* all same

* all same work

* raise CompileError

* pass fuzz linearizer

* revert whitespace

* fix nv tensor core test

* fix mypy

* bug fix

* fuzzer passes

* put tests back

* expand arg to idx
2024-07-17 10:17:50 -07:00
George Hotz
158221b36b expand tests from uop_expander [run_process_replay] (#5524)
* expand tests from uop_expander

* more changes from the branch
2024-07-17 09:22:36 -07:00
qazal
0b3a34e3b1 vectorize folding [run_process_replay] (#5470)
* test_gep_vec_fold

* remove that

* fix process replay

* lint
2024-07-14 09:41:48 +03:00
George Hotz
d13654a820 move uopgraph to file [run_process_replay] (#5364)
* move uopgraph to file [run_process_replay]

* fix print tree test
2024-07-10 17:34:50 -07:00
kormann
2349d837fb Fix scope order in graph toposort [run_process_replay] (#5330)
* fix

* test

* nothing
2024-07-08 11:46:15 -07:00
greg-niemeyer
77b2ce9fc9 Add UOps.VECTORIZE [run_process_replay] (#5289)
* Add UOps.VECTORIZE to core

* Update vectorized cast tests

* Addresses code review comments

- Removes VECTORIZE from LLVMRenderer
- Add line breaks to unduly long lines
- Add noop CAST rule back
- Update asserts and add render_vectorize in
  CStyleLanguage renderer

* Add missing const folding rule for VECTORIZE

Also adds corresponding test

* Fixes test_const_vectorize_fold and add assert

- Use sane types with VECTORIZE in test_const_vectorize_fold
- Add assert that sanity checks the types for VECTORIZE

* Rename test_cast_vectorized_fold

Renames test_cast_vectorized_fold to test_noop_vectorize_fold
because the test targets a very specific rule and there are
other tests for VECTORIZE.

* Revert unrelated changes

---------

Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
Co-authored-by: qazal <qazal.software@gmail.com>
2024-07-07 09:59:57 +03:00
qazal
1cefbb33ab uop graph tests + type_verify cleanup (#5292)
* test_cast_alu_fold

* test_double_cast_fold + these should assert
2024-07-05 13:00:01 +03:00
qazal
f374fb77af assert bool dtype for valid [run_process_replay] (#5214)
* valid is always bool

* prevent NumNode to begin with

* part 2

* test: disable pattern matchers, asserts should pass

* test: store without cast

* test: if (0)

* cleanup time

* only pattern match bool literal

* better for upstream debug
2024-06-29 21:20:32 +03:00
kormann
6c456b6d66 remove uopgraph dedup + slight speedup (#5199)
* rm dedup

* rm dedup

* tests

* reduce diff

* oups

* reduce diff

* rm UOp.tuple
2024-06-28 09:26:32 -07:00
George Hotz
345bcc2099 move graph_dedup out of class [run_process_replay] (#5197) 2024-06-27 12:04:00 -07:00
George Hotz
d094a6828f single pass rewrite (#5159)
* single pass rewrite

* claude cleanups

* claude cleanups

* skip those tests

* restrict that to ints

* comment

* asserts i don't expect to fail do fail

* simplest...rewrite...ever

* simplest...rewrite...ever

* add that rule back

* tests pass?

* only collapse reduce loops

* second SHL/SHR arg must be 4 bytes

* fix verify

* no SHL/SHR in ptx

* put that back

* skip them in PTX...bad tests
2024-06-27 11:36:05 -07:00
qazal
24c89a2a33 move assert_equiv_uops to helpers + use == for dtypes (#5067)
* dtypes should use ==

* use TestUOps

* should use assertIs
2024-06-20 16:39:34 +03:00
qazal
55e02cdd84 generic gate folding (#5061)
* add assert

* fold truthy gates [run_process_replay]

* fold falsy gates [run_process_replay] [no_assert]

* redo asserts

* check both barriers

* spec start

* spec end

* assert srcs

* make test_fold_gated_load_local better

* [run_process_replay] [no_assert]
2024-06-20 16:10:08 +03:00
kormann
7c3b877216 rename uop [run_process_replay] (#5031)
* rename

* fix unittests

* rename vin

* fix test

* fix type [run_process_replay]

* rm pre commit hook change
2024-06-18 21:34:05 +03:00
George Hotz
63a8add2c2 move uops add logic to linearize (#4952)
* move logic to linearize

* idk how this should work

* empty
2024-06-14 03:52:37 -07:00
George Hotz
9823752397 make uops.add private (#4950)
* make uops.add private

* modernize all tests
2024-06-14 03:23:25 -07:00
chenyu
3afc914617 CMPEQ -> CMPNE and make it safe to pad (#4818)
* CMPNE

* new dataset
2024-06-03 18:02:15 -04:00
qazal
0e69b22629 multireduce OptOps tests (start) (#4733)
* start

* full tests

* add skips

* unrelated

* notes
2024-05-27 12:21:33 +03:00
chenyu
eb714a600d fix UOps.CAST noop for vectorized dtypes (#4704)
* ==

* add test

* not lazyop

* use str comparison for PtrDType

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2024-05-23 17:33:29 -04:00
George Hotz
07b350a8f4 new uops is an actual graph (#4560)
* new uops is an actual graph

* it's way slower

* simpler

* fix define acc

* render_loop unique

* ops test pass

* add pattern matcher back, there's bugs

* rewrite

* use priority queue

* recursive children

* fix tests

* fix tests with SINK

* fix abstractions

* fix assembly

* simpler

* link define_acc

* fix DEFINE_ACC placement

* type verify

* full cmp

* fix cmp

* ACCESS_ACC

* insert DEFINE_ACC

* fix PHI

* recursive rewrite

* fix many tests

* sum collapse

* more patterns

* correct change

* fold arange

* fix that lin test

* space

* big folding rule works

* close

* has more maxes, meh

* cached node replace

* set changed

* simplest folding yet

* works

* works

* DIV

* all tests pass

* del

* fuzz linearizer fails

* sum_collapse

* test depth 2 cf

* fix lin test 14

* fix clang depth

* disable that

* failure 14 is fixed

* fix ptx

* failure 27 is fixed

* fix llama

* run_cnt

* Revert "Optimize PTX gated loads index calculation (#4304)"

This reverts commit d97d5a7689.

* fix uops loop

* fix ptx bugs

* add barrier

* print

* mem_type in ptx direct

* bypass tests that fail in CI but pass locally

* ptx remove ptr_ar

* more ptx passing

* fix ptx tests

* assert compile support

* remove model inference benchmark from red
2024-05-17 18:00:18 -07:00
qazal
267bbb57f9 Revert "Add insert_before to Linearizer Functions (#4320)" (#4421)
This reverts commit 664b563c91.
2024-05-04 17:50:21 +03:00
Timmy
664b563c91 Add insert_before to Linearizer Functions (#4320)
* adding insert_before to linearizer functions

* uop insert_before test case

* formatting

* more formatting

* more formatting

* syntax

* removing self.cast

* addressing err

* removing noqa s
2024-04-28 11:38:36 -04:00
George Hotz
2024b24f35 add some graph tests (#3702)
* add some graph tests

* PatternMatcher class

* speedup

* const cast test

* fix tests

* itertools chain
2024-03-12 09:49:47 -07:00