72 Commits

Author SHA1 Message Date
chenyu
6fd24561d1 distribute MUL const into ADD for int (#6361)
pre-req for real_stride
2024-09-05 01:36:57 -04:00
chenyu
a666450e4d UOp pattern x + x -> x * 2 (#6224)
* UOp pattern x + x -> x * 2

now there's no NEG, with this it covers all kinds of a*x+b*x

* can remove x-x
2024-08-21 12:06:19 -04:00
chenyu
9ef82e1f2b UOp pattern DEFINE_VAR with min==max is also CONST (#6095)
* UOp pattern DEFINE_VAR with min==max is also CONST

* fix tests
2024-08-15 12:09:44 -04:00
chenyu
e6c7c3e499 update pylint path to check indent/space for all (#6022)
also fixed many errors. it was not checking nested dirs. exclude autogen for now.

can we use ruff for this?
2024-08-10 14:41:09 -04:00
chenyu
1f1eb46af6 more failed simplified UOp div test case (#5992)
this speculative div was handled by "divisor" in symbolic.
2024-08-08 18:39:25 -04:00
chenyu
c3e1ae2535 add failed simplified UOp div test case (#5990)
more cases!
2024-08-08 17:37:48 -04:00
chenyu
62c77a2831 trim const in UOp div_folding (#5982)
simplify `(4*x+4*y+7)//16` to `(x+y+1)//4`.
fixed `GPU=1 UOP_IS_SYMBOLIC=1 IMAGE=2 python -m pytest test/test_ops.py -k conv`
2024-08-08 12:49:05 -04:00
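
The constant trimming described in 62c77a2831 can be sanity-checked by brute force. This is only a sketch: it uses plain Python integers and Python's floor-division `//`, not tinygrad UOps, and the helper names (`original`, `simplified`, `forms_agree`) are hypothetical. Negative operands may be handled differently by the actual UOp backend.

```python
def original(x: int, y: int) -> int:
    # The untrimmed form from the commit message.
    return (4 * x + 4 * y + 7) // 16


def simplified(x: int, y: int) -> int:
    # The trimmed form: 4*x, 4*y and 16 share a factor of 4, and the
    # constant 7 is reduced to 7 // 4 == 1.
    return (x + y + 1) // 4


def forms_agree(lo: int = -50, hi: int = 50) -> bool:
    # Exhaustively compare both forms over a small square of inputs.
    return all(
        original(x, y) == simplified(x, y)
        for x in range(lo, hi + 1)
        for y in range(lo, hi + 1)
    )
```

The forms agree because `4*(x+y) + 7` and `4*(x+y) + 4` always fall in the same block of 16 under floor division: the leftover 3 can never push a multiple of 4 across a multiple of 16.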
chenyu
489575c3be more UOp sum div with gcd tests (#5936)
* more UOp sum div with gcd tests

* one more
2024-08-06 12:50:10 -04:00
chenyu
da61dea1b2 simple failed UOp sub symbolic test case (#5894) 2024-08-03 14:27:23 -04:00
chenyu
41bbd3f4c1 update UOp mod reduction patterns (#5883)
prepare generic mod folding, also some test changes from mod folding pr
2024-08-02 17:43:40 -04:00
chenyu
c2ffcf6887 remove the wrong mod UOp pattern (#5847)
don't think we are hitting it because of the stride construction, and it's wrong and not needed
2024-07-31 16:24:25 -04:00
chenyu
2e087ca8e4 UOp bound for div negative number (#5808) 2024-07-31 02:10:23 -04:00
chenyu
02f0be03f2 tests on UOp div negative number and arange opts (#5825) 2024-07-30 20:06:57 -04:00
chenyu
e7a14f398e more uop_symbolic tests for divmod pairs (#5785) 2024-07-28 21:27:06 -04:00
chenyu
71a64d8252 UOps.MUL bound when one is negative (#5781)
* UOps.MUL bound when one is negative

also one more distribute_mul rule

* don't always expand
2024-07-28 19:02:47 -04:00
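
For context on why a negative operand complicates MUL bounds, here is a generic interval-arithmetic sketch. It is not the code from 71a64d8252; `mul_bounds` is a hypothetical helper that takes the min and max over the four corner products, which is the standard way to bound a product when either range may include negative values.

```python
def mul_bounds(a_min: int, a_max: int, b_min: int, b_max: int) -> tuple[int, int]:
    # With non-negative ranges the bounds are simply (a_min*b_min, a_max*b_max).
    # Once a range can be negative, the extremes may come from any corner.
    corners = (a_min * b_min, a_min * b_max, a_max * b_min, a_max * b_max)
    return min(corners), max(corners)
```

For example, `mul_bounds(0, 3, -2, -2)` gives `(-6, 0)`: multiplying by a negative constant swaps which end of the range produces the minimum.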
chenyu
845b0d1c9d UOp more generic div folding (#5722)
old: `x // c` can fold if `0 <= x.vmin <= x.vmax < c`
new: `x // c` can fold if `0 < c and x.vmin // c == x.vmax // c`
2024-07-25 17:49:14 -04:00
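
The old and new folding conditions from 845b0d1c9d can be compared directly on integer bounds. This is a minimal sketch in plain Python rather than the UOp pattern itself; `can_fold_old`, `can_fold_new` and `folded_value` are hypothetical names, and `vmin`/`vmax` stand for the known bounds of `x`.

```python
from typing import Optional


def can_fold_old(vmin: int, vmax: int, c: int) -> bool:
    # Old rule: the whole range of x lies in [0, c), so x // c is always 0.
    return 0 <= vmin <= vmax < c


def can_fold_new(vmin: int, vmax: int, c: int) -> bool:
    # New rule: both ends of the range land in the same quotient bucket.
    return 0 < c and vmin // c == vmax // c


def folded_value(vmin: int, vmax: int, c: int) -> Optional[int]:
    # The constant that x // c folds to, or None when it cannot be folded.
    return vmin // c if can_fold_new(vmin, vmax, c) else None
```

With `x` in `[8, 11]` and `c = 4`, the old rule declines to fold while the new rule folds `x // 4` to `2`. Every range accepted by the old rule is also accepted by the new one, with folded value `0`.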
chenyu
66a9c372af UOp mod reduction (#5697) 2024-07-24 20:36:00 -04:00
chenyu
85710e86cb UOps div folding (#5690)
#5689, with just div folding and new test cases
2024-07-24 14:21:44 -04:00
chenyu
4f83da626e uop symbolic simple mul mod (#5648) 2024-07-22 23:17:41 -04:00
chenyu
f2d2afdaa4 dumb linearizer example that max is not simplified (#5644)
* dumb linearizer example that max is not simplified

this might just get fixed once basic mod simplification is done

* need local
2024-07-22 18:37:26 -04:00
chenyu
92e7e65712 one more test case for symbolic mod mul (#5615) 2024-07-20 17:23:06 -04:00
chenyu
519336cfea factor out partial in SumNode div int (#3841)
* factor out partial in SumNode div int

* div not rem

* space
2024-03-20 16:34:33 -04:00
chenyu
455f7bea9b test example from half resnet that idx has number outside of int32 (#3838)
* test example from half resnet that idx has number outside of int32

* ruff
2024-03-20 13:44:20 -04:00
Patrick Tsai
b436c9792f Fix factoring bug (O(n) arange related) (#3817)
* Factoring bug

* Another one in case

* It works now so change tests back

* large arange cumsum optimization

* More cleanup

* symbolic no factor div test

* name change

* Rename test

---------

Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>
2024-03-19 11:49:42 -04:00
chenyu
968d109453 apply more create_lt_node (#3597)
updated one in linearizer if condition, and various symbolic tests
2024-03-03 16:12:39 -05:00
Patrick Tsai
0082300a59 Fix symbolic negative floordiv (#3594)
Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>
2024-03-03 11:40:52 -05:00
chenyu
e09619ab6c explicitly create_lt_node when used in shapetracker _expr_view (#3561)
* explicitly create_lt_node when used in shapetracker

leave regular __lt__ and cmps for symbolic shape cmp

* hmm it fixed that?

* LtNode.substitute uses create_lt_node
2024-03-03 10:08:21 -05:00
chenyu
88939c3347 fix Node.max can be symbolic (#3514)
Also made sure taking max twice can get int.
2024-02-27 17:21:31 -05:00
chenyu
61605ccc69 Remove special case of SumNode div SumNode (#3502) 2024-02-26 09:42:06 -05:00
chenyu
0d326a48b8 fix LtNode simplification when lhs and rhs contain same variables (#3451)
* fix LtNode simplification when lhs and rhs contain same variables

`(Variable("a", 1, 5) < Variable("a", 1, 5))` should eval to `NumNode(0)`

* fix with less perf impact
2024-02-20 09:06:55 -05:00
chenyu
2da734920e use __getnewargs__ to fix unpickling Variable (#3441)
it's recommended to use __getnewargs__ to update the args of classes that use __new__ when unpickling.
It's preferred because it does not change the __new__ behavior.
2024-02-18 10:28:37 -05:00
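
The `__getnewargs__` approach mentioned in 2da734920e can be illustrated with a small stand-in class. `Var` below is hypothetical and is not tinygrad's `Variable`; it only mimics a class whose `__new__` requires arguments. Without `__getnewargs__`, unpickling would typically call `__new__` with no arguments and fail.

```python
import pickle


class Var:
    """Hypothetical stand-in for a class constructed through __new__."""

    def __new__(cls, name: str, lo: int, hi: int):
        obj = super().__new__(cls)
        obj.name, obj.lo, obj.hi = name, lo, hi
        return obj

    def __getnewargs__(self):
        # pickle (protocol 2 and later) passes these to __new__ when
        # rebuilding the object, so __new__ itself stays unchanged.
        return (self.name, self.lo, self.hi)


def roundtrip(v: Var) -> Var:
    # Serialize and deserialize with the default pickle protocol.
    return pickle.loads(pickle.dumps(v))
```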
David Hou
3378625773 name upcast variables (#3200)
* name upcast variables

* typing

* unused
2024-01-22 11:37:28 -05:00
chenyu
f018a55ea1 update NumNode.__hash__ to be hash(self.b) (#3105)
with this, `a:=NumNode(x) == b` implies `hash(a) == hash(b)`
2024-01-12 19:46:21 -05:00
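
The invariant stated in f018a55ea1 is Python's general rule that objects comparing equal must hash equal. A minimal sketch, assuming a hypothetical `Num` wrapper in place of `NumNode`:

```python
class Num:
    """Hypothetical int wrapper standing in for NumNode."""

    def __init__(self, b: int):
        self.b = b

    def __eq__(self, other):
        if isinstance(other, Num):
            return self.b == other.b
        if isinstance(other, int):
            return self.b == other
        return NotImplemented

    def __hash__(self):
        # Hash the wrapped int so that Num(x) == x implies equal hashes.
        return hash(self.b)
```

With this in place a `Num` and the plain int it wraps are interchangeable as dictionary or set keys, which would not be reliable if `__hash__` were derived from anything other than `self.b`.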
chenyu
74cc6fd3c2 remove AndNode.__floordiv__ special case (#2996)
* remove AndNode.__floordiv__

AndNode produces a Node that min/max is bounded by [0, 1] so `//` on top of that is almost always 0.
we don't really use that either

* keep the test
2024-01-03 17:44:55 -05:00
chenyu
8291986959 Variable.sum -> Node.sum, Variable.ands -> Node.ands (#2961) 2024-01-01 16:21:28 -05:00
chenyu
3d720b5761 move expand_idx, iter_idxs and expand_node from symbolic to linearizer (#2959) 2024-01-01 14:41:21 -05:00
Umut Zengin
8ad7cfeeb1 More simplification in to_image_idx and symbolic (#2679)
* less valid

* add test

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2023-12-13 12:30:44 -05:00
George Hotz
35b5e95097 parallel beam search (#2610)
* better print

* fix beam search with vars

* cleanups

* parallel is not default

* restore that

* bugfix

* cleanups

* bugfix
2023-12-05 10:09:45 -08:00
Amrit Sahu
e8d6a6ef2e view.reshape without symbolic (#2218)
* handle reshape of contiguous subparts with explicit mask

* remove the add/remove ones logic in reshape

* accommodate ones in accumulate logic

* make multiply commutative

* fix linting

* make mypy happy

* add test for commutative mul

* merge dimensions in shape_strides for 1 range masks

* add offsets for merging

* fix linting

* add back explicit 1 reshapes

* fix mypy errors

* fix accumulate by including state

* include non-zero stride dimension in acc

* small cleanup

* more compact to_shape_strides

* more logical cleanup

* compress more

* compress reshape mask

* adding some comments

* small bug fix

* improve test coverage

* remove explicit add remove ones

* small bug in test

* enable test_reshape_splitting_combining

* small fix

* 10 lines less to_shape_strides

* shorten reshape mask

* some more cleanup

* more cleanup

* introduce some symbols for compactness

* more symbols

* more cleaner

* lessen symbols, it became less readable

* remove merge_views from view.reshape

* change to_shape_strides to _merge_dims

* improve readability

* fix corner case

* cleanup

* better handling of 1 <= Variable('i',1,10) & new_dim = Variable('i',1,10)

* rewrite _reshape_mask for readability

* fix white space

* add comment

* nice shorthands for readability

* add proof in docs

* small nit

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2023-12-04 12:46:53 -05:00
chenyu
847f0a02b1 non-simplifiable mod should result in ModNode (#2490)
* non-simplifiable mod should result in ModNode

* space
2023-11-28 16:52:19 -05:00
Christopher Mauri Milan
7f01dd04f0 Apply ruff linting rules to tests (#2473)
* everything except F821

* enable F821 with noqa

* dumb fix

* fix remaining imports and (former) lambdas

* replace _ with noqa to avoid gc
2023-11-27 21:24:06 -08:00
Paul Gustafson
98cd9e8926 Add assertion to prevent nonsense mod values (#2474) 2023-11-27 18:37:44 -08:00
chenyu
61a80a0675 asserts LtNodes of SumNode with MulNode of Nodes (#2465) 2023-11-27 12:56:59 -05:00
Paul Gustafson
1d89c018fa Add isinstance check before gcd call in SumNode.__lt__ (#2450)
* Add isinstance check before gcd call

* Delete blank lines

* Fix unit test typo

* Delete blank lines again

---------

Co-authored-by: Paul Gustafson <paul.gustafson@theambrusgroup.com>
2023-11-26 13:05:04 -08:00
chenyu
d7d078c7f9 Node.vars() returns a set and properly dedup (#2356)
* dedup RedNode.vars()

* vars returns a set

* fix more vars

* unused import

* update to_movement_ops

* comment
2023-11-18 17:44:52 -05:00
chenyu
f02e17a967 Variable.num -> NumNode (#2354) 2023-11-18 15:45:52 -05:00
Umut Zengin
01b98b7f42 MulNode.__lt__ rule (#2086)
* Added the rule

* Added tests

* flake8

* self.b == -1 shortcut
2023-10-17 13:18:35 -07:00
Umut Zengin
776605f2fc O(1) VALIDHACKS (#2072)
* first refactoring

* O(1) validhacks

* O(1) validhacks

* Some cleaning

* mypy

* flake8

* Trim trim

* flake8

* clean

* less chaotic

* less chaotic

* flake8

* Symbolic, SumNode include mulnode for gcd

* fix tests

* small optim

* revert

* clean

* clean

* flake8

* small fix

* Add symbolic test
2023-10-15 11:26:41 -07:00
Umut Zengin
6b7ac5c431 ModNode __mod__ rule (#2039)
* Implement mod rule

* mypy

* feat: New test added
2023-10-12 11:30:10 -07:00
Umut Zengin
3987280daf Fix VALIDHACKS for Images and make it default (#1832)
* valid hacks

* valid hacks

* valid hacks

* new method

* new method

* handtune

* is gate load breaking?

* lint

ruff

less junk

new approach?

maybe this?

* Make it more clear

* Make it more clear

* Will deal with the linter later

* hack for linter

* subs the idx but dont touch the valid

* Updated the mod rules

* lint hack

* I believe bug fix lets see

* Mod Node left

* revert

* Maybe this wont break?

* revert

* implemented "handtuned garbage"

* revert and use VALIDHACKS

* Lets see the CI

* still broken?

* currently its jungle

* maybe this jungle ?

* This works for everything somehow

* Added test for symbolic

* lint

* final touch

* This still works

* lint

* midway clean

* less garbage

* lint

* final form

* Slow but working way

* lint and other stuff

* lint

* mypy

* Make sure CI test Openpilot valid checks

* test if CI break

* Convert back

* refactor

* refactor

* Managed to reduce openpilot time from 30 secs to 5 secs

* Refactor

* Substitute a node with variable

* flake8

* Comment and refactor

* More comprehensive mod

* refactor

* bug fix

* More shave off

* remove not sure part
2023-09-23 07:34:43 +08:00