chenyu
f55459c98e
failed validhack test for a 0.9.7 conv ( #6677 )
2024-09-23 04:43:47 -04:00
chenyu
0362dbbbe8
relax idx simplification given valid ( #6669 )
...
apply to kernels in op 0.9.7.
if a valid has a complicated expr, we cannot drop valid but it's possible to simplify idx given valid
2024-09-23 03:04:57 -04:00
chenyu
26ebb7cab4
don't use div_folding in lt_folding ( #6666 )
...
* don't use div_folding in lt_folding
valids 35 -> 13
* fails the same as before
2024-09-23 01:50:18 -04:00
chenyu
da5b741656
removed valid in openpilot conv ( #6619 )
...
35 valids left
2024-09-23 00:30:18 -04:00
George Hotz
52c2c4df9c
fix match of sz 0 + dedup kernel ast [run_process_replay] ( #6663 )
...
* fix match of sz 0 [run_process_replay]
* empty graph rewrite to dedup st
2024-09-23 11:56:53 +08:00
chenyu
1923932339
canonicalize simplex lt ( #6658 )
...
(X := a0*x0 + a1*x1 + ...) > 0 is equivalent to x0 + x1 + ... > 0 if xi >= 0 and ai > 0 for ints
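The equivalence can be sanity-checked by brute force over a small grid. This is only a sketch: the helper name `simplex_lt_equivalent` and the search ranges are illustrative assumptions, not code from this commit.

```python
from itertools import product

def simplex_lt_equivalent(n_terms=3, max_coeff=3, max_x=3):
    """Check (sum(a_i*x_i) > 0) == (sum(x_i) > 0) for all a_i > 0, x_i >= 0 in a small grid."""
    for coeffs in product(range(1, max_coeff + 1), repeat=n_terms):
        for xs in product(range(0, max_x + 1), repeat=n_terms):
            weighted = sum(a * x for a, x in zip(coeffs, xs)) > 0  # a0*x0 + a1*x1 + ... > 0
            plain = sum(xs) > 0                                    # x0 + x1 + ... > 0
            if weighted != plain:
                return False
    return True
```

Both sides are true exactly when at least one `xi` is nonzero, which is why the positive coefficients can be dropped.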
2024-09-22 23:04:47 -04:00
chenyu
90c1ccc402
simpler drop valid check in simplify_valid_image_load ( #6653 )
...
* simpler drop valid check in simplify_valid_image_load
* update tests
2024-09-22 21:46:39 -04:00
qazal
982086f54c
UOps.VALID try 2 ( #6623 )
...
* make UOps.VALID compile
* fixable tests
* bufs dedup
* cleanup the CONST spec
* regenerate dataset with graph_rewrite
```py
def rewrite_const(const:UOp, st_src:UOp) -> UOp:
  st: ShapeTracker = st_src.arg
  return UOp(UOps.VALID, dtypes.bool, (st.to_uop(),)).where(UOp.const(const.dtype, const.arg), UOp.const(const.dtype, 0))
pm = PatternMatcher([(UPat(UOps.CONST, name="const", src=(UPat(UOps.SHAPETRACKER, name="st_src"),)), rewrite_const)])
```
* rm arg
* remove arg
* revert arg removal
This reverts commit 2c35c75c95.
* red test_pickle_define_var
2024-09-21 14:19:25 +08:00
qazal
d2351af019
fixup non-void SINKs in tests [run_process_replay] ( #6624 )
2024-09-21 13:29:18 +08:00
qazal
391d14438e
DEFINE_VAR prereqs for VALID [run_process_replay] ( #6637 )
2024-09-21 13:28:39 +08:00
nimlgen
053c4dee55
qcom test for image pitch ( #6621 )
...
* qcom test for image pitch
* comment
2024-09-20 18:13:48 +08:00
chenyu
5707503048
x//a<b -> x<a*b for positive a ( #6622 )
...
openpilot valids 47 -> 37
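A quick brute-force check of the rewrite, assuming Python-style floor division for `//`. The function name and ranges are hypothetical. With truncating division the equivalence can fail for negative `x` (for example `x=-1, a=2, b=0`), so this sketch only covers the floor-division reading.

```python
def floordiv_lt_matches(a_max=6, x_range=range(-30, 31), b_range=range(-8, 9)):
    """Check (x // a < b) == (x < a * b) for every positive a in a small grid."""
    for a in range(1, a_max + 1):
        for x in x_range:
            for b in b_range:
                if (x // a < b) != (x < a * b):
                    return False
    return True
```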
2024-09-20 04:38:47 -04:00
chenyu
b14c1bc417
UOps.RANGE is_increasing ( #6615 )
...
* UOps.RANGE is_increasing
283 -> 47 valids
* test
2024-09-20 03:14:52 -04:00
chenyu
036c2f5b26
validhack use the new style ge for upper bound valid ( #6612 )
...
also relaxed the bound check to check vmin/vmax instead of just const.
valids 482 -> 283
2024-09-19 23:45:42 -04:00
chenyu
a37e92081a
fix unrolled arange folding ( #6606 )
...
* fix unrolled arange folding
also added flop test to test_arange to make sure it's 0 flop
* skip PTX
2024-09-19 09:03:01 -04:00
chenyu
d148a62f8d
more generic simplify_valid_image_load ( #6603 )
...
use graph_rewrite to simplify the expression with narrowed variables, and check boundary conditions on a monotonically increasing function to drop valid.
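The boundary-condition idea can be illustrated in isolation: if an index expression is non-decreasing in a variable that the valid narrows to `[lo, hi]`, checking the two endpoints is enough to know the index stays in bounds. This is a minimal sketch of that reasoning, not the commit's implementation; `in_bounds_by_endpoints` and `in_bounds_brute_force` are hypothetical helpers.

```python
def in_bounds_by_endpoints(idx, lo, hi, bound):
    """Assumes idx is non-decreasing on [lo, hi] with lo <= hi.

    Under that assumption, 0 <= idx(x) < bound for every x in [lo, hi]
    exactly when it holds at the two endpoints.
    """
    return idx(lo) >= 0 and idx(hi) < bound

def in_bounds_brute_force(idx, lo, hi, bound):
    """Reference check that evaluates every integer point in [lo, hi]."""
    return all(0 <= idx(x) < bound for x in range(lo, hi + 1))
```

When the endpoint check succeeds, the valid adds no information about the bounds of the load and can be dropped.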
2024-09-19 05:33:37 -04:00
chenyu
eeee032b14
tiny cleanup of test_image_valid ( #6597 )
...
* tiny cleanup of test_image_valid
Special and Variable to set up UOp
* typo
2024-09-19 03:09:47 -04:00
George Hotz
012a2c449a
fix lt_folding VCONST issue [run_process_replay] ( #6424 )
...
* le and ge [run_process_replay]
* bugfix
* fix divides bug
* fix lt_folding issue
2024-09-19 14:59:20 +08:00
chenyu
496806ce75
another example of openpilot conv with valid ( #6595 )
2024-09-19 01:54:01 -04:00
chenyu
7f9fd556b0
_min_max for WHERE ( #6564 )
...
prereq to gated load simplification
just for int
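One natural conservative bound for an integer WHERE is the union of the ranges of its two branches. The sketch below assumes that rule; the commit's actual code is not shown here, and `where_bounds` is a hypothetical name.

```python
def where_bounds(true_range, false_range):
    """Conservative (vmin, vmax) for where(cond, a, b), given (vmin, vmax) of a and of b.

    The condition is ignored: the result is either a or b, so it lies in the
    smallest interval covering both branch ranges.
    """
    (a_min, a_max), (b_min, b_max) = true_range, false_range
    return min(a_min, b_min), max(a_max, b_max)
```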
2024-09-18 23:47:48 -04:00
chenyu
1b6eee02ad
failed test case for openpilot validhack conv ( #6590 )
...
* failed test case for openpilot validhack conv
can save 2ms once this is fixed
* fix order
2024-09-18 23:12:30 -04:00
chenyu
bd40a26b8b
image valid test case that current approach does not work ( #6584 )
2024-09-18 06:06:03 -04:00
chenyu
162ead02a9
remove LOAD where valid is an empty set ( #6579 )
...
356 -> 354 valids
2024-09-18 03:49:41 -04:00
chenyu
a72d51e277
brute force VALIDHACK matching ( #6575 )
...
* brute force VALIDHACK matching
* cleanup
* 9700
2024-09-18 01:59:50 -04:00
George Hotz
67a03e72bb
remove expr_idxs [run_process_replay] ( #6567 )
...
* remove expr_idxs [run_process_replay]
* goodbye that test
2024-09-17 18:34:51 +08:00
chenyu
b947db3de1
don't fold mul mod for common factor ( #6566 )
...
it makes valid pattern more annoying
2024-09-17 06:01:27 -04:00
chenyu
5fb877c78c
generic valid match criteria of #6552 ( #6558 )
...
455 -> 364 valids.
generalize `idx < image bound` to `idx < image bound + c` for some `c`
2024-09-17 02:40:36 -04:00
chenyu
c62b6fd8f0
match any statement in valid for simplification ( #6554 )
2024-09-17 01:39:47 -04:00
chenyu
7c942418a1
other side of simple out of bound valid case ( #6552 )
...
462 -> 455
2024-09-16 23:57:15 -04:00
chenyu
aeaf7894a7
more generic version of #6548 ( #6549 )
...
x*(-1)<0 can be generalized to x*(-1)<c, 473 -> 462 valids
2024-09-16 23:17:16 -04:00
chenyu
596f41eb46
simple drop image valid case ( #6548 )
...
* simple drop image valid case
started unit test, 530 -> 473 valids
* cleanup
2024-09-16 22:54:07 -04:00
George Hotz
07bd6e070d
add more uops tests for vmin/vmax/const_factor/divides ( #6533 )
2024-09-16 13:06:31 +08:00
George Hotz
21835fc08c
more graph rewrite tests ( #6521 )
2024-09-16 09:20:54 +08:00
George Hotz
cd90092f14
graph rewrite tests ( #6519 )
...
* more graph rewrite tests
* more complex test cases
* more tests
* more tests
* cleanups
* 9600 lines
* cleanups
2024-09-15 17:29:16 +08:00
George Hotz
76487a3533
remove nop, use upat [run_process_replay] ( #6489 )
...
* remove nop, use upat [run_process_replay]
* mypy passes
* no wonder nothing worked
* fixes
2024-09-12 12:16:19 +08:00
George Hotz
bdd0c06f29
add void type to uop ( #6471 )
...
* unwrap_dtype maybe
* uopgraph stuff that hardcoded None
* test_ops passes
* dtypes.py fixups
* update test_linearizer and friends
* more ast updates
* test_beam and test_schedule too
* add void type to uop [run_process_replay]
* remove dumb casts
* start making it green
* more cast cleanups
* more cls methods to fix
* regenerate dataset
* split UOp and NOp const
* maybe that too
* fix docs
* update test_uop_symbolic
* test_verify_ast
* new sops with no diff
* meh, type_ignore is alright
* remove that assert
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-11 18:16:28 +08:00
chenyu
d9d1ae7248
more lt folding using gcd ( #6469 )
2024-09-11 02:09:35 -04:00
chenyu
15c4d4f406
fold unrolled arange div pattern ( #6465 )
2024-09-10 22:35:52 -04:00
chenyu
2105832b87
_min_max of MUL of 2 non-positive inputs ( #6454 )
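For two integer ranges that are both non-positive, the product is non-negative: its smallest value comes from the two upper bounds and its largest from the two lower bounds. A small sketch with a brute-force reference, using hypothetical helper names that are not from the commit:

```python
def mul_bounds_nonpositive(x_range, y_range):
    """(vmin, vmax) of x*y when x_max <= 0 and y_max <= 0."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    assert x_max <= 0 and y_max <= 0, "both inputs must be non-positive"
    # values closest to zero give the smallest product, values furthest give the largest
    return x_max * y_max, x_min * y_min

def mul_bounds_brute_force(x_range, y_range):
    """Reference: enumerate every integer pair in the two ranges."""
    products = [x * y
                for x in range(x_range[0], x_range[1] + 1)
                for y in range(y_range[0], y_range[1] + 1)]
    return min(products), max(products)
```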
2024-09-10 07:13:01 -04:00
chenyu
fcc69adfc5
simplify c0*x<c1 for negative int c0,c1 ( #6431 )
...
* simplify c0*x<c1 for negative int c0,c1
* fine if rhs is zero
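One equivalent form for integer `x`, derived here rather than copied from the commit: with `c0 < 0` and `c1 <= 0`, `c0*x < c1` holds exactly when `x > (-c1) // (-c0)`. The helper name below is an assumption used only for illustration.

```python
def neg_coeff_lt(c0, c1, x):
    """Equivalent of c0*x < c1 for integer x, with c0 < 0 and c1 <= 0.

    Multiplying by -1 flips the comparison: (-c0)*x > -c1. Since x is an
    integer and -c0 > 0, this is x > floor(-c1 / -c0).
    """
    assert c0 < 0 and c1 <= 0
    return x > (-c1) // (-c0)
```

The rewritten comparison has a unit coefficient on `x`, which is easier to match against range bounds.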
2024-09-09 21:05:53 -04:00
George Hotz
dbd4536167
Revert "add UOps.VALID ( #6387 )" ( #6441 )
...
This reverts commit 8186e4e7d6.
2024-09-09 21:33:00 +08:00
George Hotz
42e5c8335e
remove args from min/max [run_process_replay] ( #6430 )
...
* remove args from min/max [run_process_replay]
* it's a ConstType
* sconst_like unused
* any const is fine
2024-09-09 18:18:20 +08:00
George Hotz
8186e4e7d6
add UOps.VALID ( #6387 )
...
* uops valid
* broke full_shape
* fixup that st (hardcoded asts still red)
* fixup DEFINE_VAR
debug
more debug
* start moving stuff to ast_const
* move test_linearizer
* move test_linearizer_failures to ast_const
* fixup test_schedule
* small diff change
* regenerate dataset
* fixup test_multitensor
* regen dataset try 2
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-09 16:58:43 +08:00
chenyu
ac98f5056e
move lt-folding to a function [run_process_replay] ( #6422 )
...
and added more tests (some failed to match symbolic)
2024-09-09 02:04:52 -04:00
George Hotz
90fb17304f
put rewrite back in ops [run_process_replay] ( #6421 )
2024-09-09 13:53:51 +08:00
chenyu
25af78c593
failed uop_symbolic divmod test by variable ( #6414 )
2024-09-08 23:08:58 -04:00
chenyu
26c5d8346a
remove Variable from UOp.DEFINE_VAR ( #6393 )
...
now it's just arg = (expr as str, min as UOp.const, max as UOp.const)
2024-09-06 05:55:19 -04:00
chenyu
9a9fea7b8c
move DEFINE_VAR min/max from src to arg ( #6388 )
...
new arg is (Variable, min as CONST, max as CONST)
2024-09-06 05:01:02 -04:00
qazal
f1bd2a5519
fix BUFFER_UOPS sts in verify_ast [run_process_replay] ( #6389 )
2024-09-06 16:59:22 +08:00
chenyu
cc05016fa8
move test_pattern_matcher to test/unit ( #6386 )
2024-09-06 03:22:43 -04:00