qazal
|
24c89a2a33
|
move assert_equiv_uops to helpers + use == for dtypes (#5067)
* dtypes should use ==
* use TestUOps
* should use assertIs
|
2024-06-20 16:39:34 +03:00 |
|
qazal
|
55e02cdd84
|
generic gate folding (#5061)
* add assert
* fold truthy gates [run_process_replay]
* fold falsy gates [run_process_replay] [no_assert]
* redo asserts
* check both barriers
* spec start
* spec end
* assert srcs
* make test_fold_gated_load_local better
* [run_process_replay] [no_assert]
|
2024-06-20 16:10:08 +03:00 |
|
kormann
|
7c3b877216
|
rename uop [run_process_replay] (#5031)
* rename
* fix unittests
* rename vin
* fix test
* fix type [run_process_replay]
* rm pre commit hook change
|
2024-06-18 21:34:05 +03:00 |
|
George Hotz
|
63a8add2c2
|
move uops add logic to linearize (#4952)
* move logic to linearize
* idk how this should work
* empty
|
2024-06-14 03:52:37 -07:00 |
|
George Hotz
|
9823752397
|
make uops.add private (#4950)
* make uops.add private
* modernize all tests
|
2024-06-14 03:23:25 -07:00 |
|
chenyu
|
3afc914617
|
CMPEQ -> CMPNE and make it safe to pad (#4818)
* CMPNE
* new dataset
|
2024-06-03 18:02:15 -04:00 |
|
qazal
|
0e69b22629
|
multireduce OptOps tests (start) (#4733)
* start
* full tests
* add skips
* unrelated
* notes
|
2024-05-27 12:21:33 +03:00 |
|
chenyu
|
eb714a600d
|
fix UOps.CAST noop for vectorized dtypes (#4704)
* ==
* add test
* not lazyop
* use str comparison for PtrDType
---------
Co-authored-by: qazal <qazal.software@gmail.com>
|
2024-05-23 17:33:29 -04:00 |
|
George Hotz
|
07b350a8f4
|
new uops is an actual graph (#4560)
* new uops is an actual graph
* it's way slower
* simpler
* fix define acc
* render_loop unique
* ops test pass
* add pattern matcher back, there's bugs
* rewrite
* use priority queue
* recursive children
* fix tests
* fix tests with SINK
* fix abstractions
* fix assembly
* simpler
* link define_acc
* fix DEFINE_ACC placement
* type verify
* full cmp
* fix cmp
* ACCESS_ACC
* insert DEFINE_ACC
* fix PHI
* recursive rewrite
* fix many tests
* sum collapse
* more patterns
* correct change
* fold arange
* fix that lin test
* space
* big folding rule works
* close
* has more maxes, meh
* cached node replace
* set changed
* simplest folding yet
* works
* works
* DIV
* all tests pass
* del
* fuzz linearizer fails
* sum_collapse
* test depth 2 cf
* fix lin test 14
* fix clang depth
* disable that
* failure 14 is fixed
* fix ptx
* failure 27 is fixed
* fix llama
* run_cnt
* Revert "Optimize PTX gated loads index calculation (#4304)"
This reverts commit d97d5a7689.
* fix uops loop
* fix ptx bugs
* add barrier
* print
* mem_type in ptx direct
* bypass tests that fail in CI but pass locally
* ptx remove ptr_ar
* more ptx passing
* fix ptx tests
* assert compile support
* remove model inference benchmark from red
|
2024-05-17 18:00:18 -07:00 |
|
qazal
|
267bbb57f9
|
Revert "Add insert_before to Linearizer Functions (#4320)" (#4421)
This reverts commit 664b563c91.
|
2024-05-04 17:50:21 +03:00 |
|
Timmy
|
664b563c91
|
Add insert_before to Linearizer Functions (#4320)
* adding insert_before to linearizer functions
* uop insert_before test case
* formatting
* more formatting
* more formatting
* syntax
* removing self.cast
* addressing err
* removing noqa s
|
2024-04-28 11:38:36 -04:00 |
|
George Hotz
|
2024b24f35
|
add some graph tests (#3702)
* add some graph tests
* PatternMatcher class
* speedup
* const cast test
* fix tests
* itertools chain
|
2024-03-12 09:49:47 -07:00 |
|