George Hotz
21835fc08c
more graph rewrite tests ( #6521 )
2024-09-16 09:20:54 +08:00
George Hotz
cd90092f14
graph rewrite tests ( #6519 )
...
* more graph rewrite tests
* more complex test cases
* more tests
* more tests
* cleanups
* 9600 lines
* cleanups
2024-09-15 17:29:16 +08:00
George Hotz
76487a3533
remove nop, use upat [run_process_replay] ( #6489 )
...
* remove nop, use upat [run_process_replay]
* mypy passes
* no wonder nothing worked
* fixes
2024-09-12 12:16:19 +08:00
George Hotz
bdd0c06f29
add void type to uop ( #6471 )
...
* unwrap_dtype maybe
* uopgraph stuff that hardcoded None
* test_ops passes
* dtypes.py fixups
* update test_linearizer and friends
* more ast updates
* test_beam and test_schedule too
* add void type to uop [run_process_replay]
* remove dumb casts
* start making it green
* more cast cleanups
* more cls methods to fix
* regenerate dataset
* split UOp and NOp const
* maybe that too
* fix docs
* update test_uop_symbolic
* test_verify_ast
* new sops with no diff
* meh, type_ignore is alright
* remove that assert
---------
Co-authored-by: qazal <qazal.software@gmail.com >
2024-09-11 18:16:28 +08:00
chenyu
d9d1ae7248
more lt folding using gcd ( #6469 )
2024-09-11 02:09:35 -04:00
chenyu
15c4d4f406
fold unrolled arange div pattern ( #6465 )
2024-09-10 22:35:52 -04:00
chenyu
2105832b87
_min_max of MUL of 2 non-positive inputs ( #6454 )
2024-09-10 07:13:01 -04:00
chenyu
fcc69adfc5
simplify c0*x<c1 for negative int c0,c1 ( #6431 )
...
* simplify c0*x<c1 for negative int c0,c1
* fine if rhs is zero
2024-09-09 21:05:53 -04:00
George Hotz
dbd4536167
Revert "add UOps.VALID ( #6387 )" ( #6441 )
...
This reverts commit 8186e4e7d6 .
2024-09-09 21:33:00 +08:00
George Hotz
42e5c8335e
remove args from min/max [run_process_replay] ( #6430 )
...
* remove args from min/max [run_process_replay]
* it's a ConstType
* sconst_like unused
* any const is fine
2024-09-09 18:18:20 +08:00
George Hotz
8186e4e7d6
add UOps.VALID ( #6387 )
...
* uops valid
* broke full_shape
* fixup that st (hardcoded asts still red)
* fixup DEFINE_VAR
debug
more debug
* start moving stuff to ast_const
* move test_linearizer
* move test_linearizer_failures to ast_const
* fixup test_schedule
* small diff change
* regenerate dataset
* fixup test_multitensor
* regen dataset try 2
---------
Co-authored-by: qazal <qazal.software@gmail.com >
2024-09-09 16:58:43 +08:00
chenyu
ac98f5056e
move lt-folding to a function [run_process_replay] ( #6422 )
...
and added more tests (some failed to match symbolic)
2024-09-09 02:04:52 -04:00
George Hotz
90fb17304f
put rewrite back in ops [run_process_replay] ( #6421 )
2024-09-09 13:53:51 +08:00
chenyu
25af78c593
failed uop_symbolic divmod test by variable ( #6414 )
2024-09-08 23:08:58 -04:00
chenyu
26c5d8346a
remove Variable from UOp.DEFINE_VAR ( #6393 )
...
now it's just arg = (expr as str, min as UOp.const, max as UOp.const)
2024-09-06 05:55:19 -04:00
chenyu
9a9fea7b8c
move DEFINE_VAR min/max from src to arg ( #6388 )
...
new arg is (Variable, min as CONST, max as CONST)
2024-09-06 05:01:02 -04:00
qazal
f1bd2a5519
fix BUFFER_UOPS sts in verify_ast [run_process_replay] ( #6389 )
2024-09-06 16:59:22 +08:00
chenyu
cc05016fa8
move test_pattern_matcher to test/unit ( #6386 )
2024-09-06 03:22:43 -04:00
chenyu
6fd24561d1
distribute MUL const into ADD for int ( #6361 )
...
pre-req for real_stride
2024-09-05 01:36:57 -04:00
chenyu
a666450e4d
UOp pattern x + x -> x * 2 ( #6224 )
...
* UOp pattern x + x -> x * 2
now there's no NEG, with this it covers all kinds of a*x+b*x
* can remove x-x
2024-08-21 12:06:19 -04:00
chenyu
c9a9631818
no UnaryOps.NEG in generated UOp patterns ( #6209 )
...
* no UnaryOps.NEG in generated UOp patterns
removed pattern `x * (-1) -> -x` and `x != True`
* those are fine because NEG became CMPNE and True
* fix sd validation L2 norm
2024-08-21 11:08:22 -04:00
George Hotz
16f420f7a7
split full_graph_rewrite and linearize_uop [run_process_replay] ( #6215 )
...
* split full_graph_rewrite and linearize_uop
* fix tests
* graph rewrite in test uops
* add types
2024-08-20 20:12:33 -07:00
Max-We
53b20afa3f
Write tar_extract ( #6180 )
...
* Add tar_extract
* Add tar_extract tests
* Fix dtype for initialization from path
* Tests for path initialization
* rm print
---------
Co-authored-by: Maximilian Weichart <maximilian.weichart@icloud.com >
2024-08-19 12:06:17 -07:00
Eitan Turok
8556d0c642
Support gunzip in fetch ( #6176 )
...
* init
* update
* clean
* add type
* clean
* fix import order
* shorten variable names
2024-08-19 12:04:40 -07:00
chenyu
00578a021b
re:6125 switch real_size to use uops [run_process_replay] ( #6138 )
...
* switch real_size to use uops [run_process_replay]
* enough to pass
---------
Co-authored-by: George Hotz <geohot@gmail.com >
2024-08-19 13:20:24 -04:00
qazal
be6dda4093
hotfix: more lazyop rename to uop [run_process_replay] ( #6157 )
2024-08-18 17:28:44 +03:00
chenyu
f7950fc2b6
add E275 missing-whitespace-after-keyword linting rule ( #6149 )
...
requires space after keywords like `assert`, `not`, `return`, `else`
2024-08-17 16:44:34 -04:00
George Hotz
88edc2902d
axis_is_masked with graph_rewrite [run_process_replay] ( #6144 )
2024-08-17 10:28:49 -07:00
qazal
d9ce664350
add test_verify_ast [run_process_replay] ( #6134 )
2024-08-17 14:14:30 +03:00
George Hotz
d9cb45af09
only axis is masked [run_process_replay] ( #6123 )
2024-08-16 21:01:17 -07:00
George Hotz
94aa5f11b5
Revert "use vmax for real_size [run_process_replay] ( #6120 )" ( #6122 )
...
This reverts commit a6e3211444 .
2024-08-16 20:33:19 -07:00
George Hotz
a6e3211444
use vmax for real_size [run_process_replay] ( #6120 )
...
* use vmax for real_size [run_process_replay]
* axis is masked
2024-08-16 20:17:23 -07:00
George Hotz
912f01ed4b
UOpGraph -> linearize_uop [run_process_replay] ( #6119 )
2024-08-16 19:48:39 -07:00
George Hotz
74ee9febec
remove iter from uopgraph ( #6110 )
...
* remove iter from uopgraph
* linearize returns uops
* fix tests
* linearize in linearize
* tests fix
* touchup
* test failures
2024-08-16 15:58:29 -07:00
qazal
28c75bf2a6
merge uops with ops ( #6111 )
...
Co-authored-by: chenyu <chenyu@fastmail.com >
2024-08-16 18:17:57 -04:00
chenyu
9ef82e1f2b
UOp pattern DEFINE_VAR with min==max is also CONST ( #6095 )
...
* UOp pattern DEFINE_VAR with min==max is also CONST
* fix tests
2024-08-15 12:09:44 -04:00
chenyu
5accfe26a0
rewrite bool ADD to OR and MUL to AND ( #6084 )
...
* rewrite bool ADD to OR and MUL to AND
fixed running `tinyphysics.onnx`, which contains a getitem from a boolean tensor.
only can repro through BEAM_COMPARE, which i think is a different bug in test_linearizer_failure
* fold those, and fix tests
* only for bool
* move dtypes.bool
2024-08-15 10:11:57 -04:00
chenyu
1782e4f64d
use div folding to do lt folding ( #6065 )
2024-08-13 16:59:05 -04:00
chenyu
6ed9711898
UOps pattern (x%c)+(x//c)*c = x ( #6051 )
...
pretty cool that this is very easy to write now
2024-08-12 14:58:48 -04:00
chenyu
e6c7c3e499
update pylint path to check indent/space for all ( #6022 )
...
also fixed many errors. it was not checking nested dirs. exclude autogen for now.
can we use ruff for this?
2024-08-10 14:41:09 -04:00
George Hotz
cfb04c67d1
run unit tests separate from others (and only once) ( #6020 )
...
* run unit tests separate from others
* ignore unit tests elsewhere
2024-08-10 11:17:56 -07:00
chenyu
63a8bc29d4
addition divisor in UOp div_folding ( #6002 )
...
in addition to try gcd of all terms, also try least common divisor of all MULs
2024-08-09 20:09:05 -04:00
chenyu
5961faa4be
minor change to UOp div_fold ( #6004 )
...
remove an unnecessary gcd and swap the quo rem order, minimize diff for divisor pr
2024-08-09 17:09:59 -04:00
chenyu
1f1eb46af6
more failed simplified UOp div test case ( #5992 )
...
this speculative div was handled by "divisor" in symbolic.
2024-08-08 18:39:25 -04:00
chenyu
c3e1ae2535
add failed simplified UOp div test case ( #5990 )
...
more cases!
2024-08-08 17:37:48 -04:00
chenyu
62c77a2831
trim const in UOp div_folding ( #5982 )
...
simplify `(4*x+4*y+7)//16` to `(x+y+1)//4`.
fixed `GPU=1 UOP_IS_SYMBOLIC=1 IMAGE=2 python -m pytest test/test_ops.py -k conv`
2024-08-08 12:49:05 -04:00
chenyu
859d0e4709
UOp simplify (x+c0)*c1 -> x*c1+c0*c1 ( #5973 )
2024-08-07 21:25:22 -04:00
chenyu
fa3a36e576
fancier UOp div gcd folding ( #5953 )
...
combine and cancel the remaining const based on gcd of other terms like SumNode.
2024-08-07 02:04:25 -04:00
chenyu
aa7fd7ef74
Use (-self).lt(-x+1) for UOp.ge ( #5955 )
...
matched symbolic and fixed UOP_IS_SYMBOLIC=1 arange folding
2024-08-07 01:31:27 -04:00
chenyu
aee737bd9e
divide by gcd in UOp div folding ( #5949 )
...
* divide by gcd in UOp div folding
`(6x+6y)//16 -> (3x+3y)//8` etc
simpler version
* only factor out const
* don't apply for unsigned
* don't need that if
* space
2024-08-06 20:00:57 -04:00