kormann
6c456b6d66
remove uopgraph dedup + slight speedup (#5199)
* rm dedup
* rm dedup
* tests
* reduce diff
* oups
* reduce diff
* rm UOp.tuple
2024-06-28 09:26:32 -07:00
George Hotz
345bcc2099
move graph_dedup out of class [run_process_replay] (#5197)
2024-06-27 12:04:00 -07:00
George Hotz
d094a6828f
single pass rewrite (#5159)
* single pass rewrite
* claude cleanups
* claude cleanups
* skip those tests
* restrict that to ints
* comment
* asserts i don't expect to fail do fail
* simplest...rewrite...ever
* simplest...rewrite...ever
* add that rule back
* tests pass?
* only collapse reduce loops
* second SHL/SHR arg must be 4 bytes
* fix verify
* no SHL/SHR in ptx
* put that back
* skip them in PTX...bad tests
2024-06-27 11:36:05 -07:00
qazal
24c89a2a33
move assert_equiv_uops to helpers + use == for dtypes (#5067)
* dtypes should use ==
* use TestUOps
* should use assertIs
2024-06-20 16:39:34 +03:00
qazal
55e02cdd84
generic gate folding (#5061)
* add assert
* fold truthy gates [run_process_replay]
* fold falsy gates [run_process_replay] [no_assert]
* redo asserts
* check both barriers
* spec start
* spec end
* assert srcs
* make test_fold_gated_load_local better
* [run_process_replay] [no_assert]
2024-06-20 16:10:08 +03:00
kormann
7c3b877216
rename uop [run_process_replay] (#5031)
* rename
* fix unittests
* rename vin
* fix test
* fix type [run_process_replay]
* rm pre commit hook change
2024-06-18 21:34:05 +03:00
George Hotz
63a8add2c2
move uops add logic to linearize (#4952)
* move logic to linearize
* idk how this should work
* empty
2024-06-14 03:52:37 -07:00
George Hotz
9823752397
make uops.add private (#4950)
* make uops.add private
* modernize all tests
2024-06-14 03:23:25 -07:00
chenyu
3afc914617
CMPEQ -> CMPNE and make it safe to pad (#4818)
* CMPNE
* new dataset
2024-06-03 18:02:15 -04:00
qazal
0e69b22629
multireduce OptOps tests (start) (#4733)
* start
* full tests
* add skips
* unrelated
* notes
2024-05-27 12:21:33 +03:00
chenyu
eb714a600d
fix UOps.CAST noop for vectorized dtypes (#4704)
* ==
* add test
* not lazyop
* use str comparison for PtrDType
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2024-05-23 17:33:29 -04:00
George Hotz
07b350a8f4
new uops is an actual graph (#4560)
* new uops is an actual graph
* it's way slower
* simpler
* fix define acc
* render_loop unique
* ops test pass
* add pattern matcher back, there's bugs
* rewrite
* use priority queue
* recursive children
* fix tests
* fix tests with SINK
* fix abstractions
* fix assembly
* simpler
* link define_acc
* fix DEFINE_ACC placement
* type verify
* full cmp
* fix cmp
* ACCESS_ACC
* insert DEFINE_ACC
* fix PHI
* recursive rewrite
* fix many tests
* sum collapse
* more patterns
* correct change
* fold arange
* fix that lin test
* space
* big folding rule works
* close
* has more maxes, meh
* cached node replace
* set changed
* simplest folding yet
* works
* works
* DIV
* all tests pass
* del
* fuzz linearizer fails
* sum_collapse
* test depth 2 cf
* fix lin test 14
* fix clang depth
* disable that
* failure 14 is fixed
* fix ptx
* failure 27 is fixed
* fix llama
* run_cnt
* Revert "Optimize PTX gated loads index calculation (#4304)"
This reverts commit d97d5a7689.
* fix uops loop
* fix ptx bugs
* add barrier
* print
* mem_type in ptx direct
* bypass tests that fail in CI but pass locally
* ptx remove ptr_ar
* more ptx passing
* fix ptx tests
* assert compile support
* remove model inference benchmark from red
2024-05-17 18:00:18 -07:00
qazal
267bbb57f9
Revert "Add insert_before to Linearizer Functions (#4320)" (#4421)
This reverts commit 664b563c91.
2024-05-04 17:50:21 +03:00
Timmy
664b563c91
Add insert_before to Linearizer Functions (#4320)
* adding insert_before to linearizer functions
* uop insert_before test case
* formatting
* more formatting
* more formatting
* syntax
* removing self.cast
* addressing err
* removing noqa s
2024-04-28 11:38:36 -04:00
George Hotz
2024b24f35
add some graph tests (#3702)
* add some graph tests
* PatternMatcher class
* speedup
* const cast test
* fix tests
* itertools chain
2024-03-12 09:49:47 -07:00