Commit Graph

127 Commits

Author SHA1 Message Date
chenyu
f511ad9103 No pyint again (#7156)
* Revert "bring back pyint (#7150)"

This reverts commit 37e83ca6fc.

* remove truncate in const folding

* truncate_output=False
2024-10-19 13:48:59 -04:00
nimlgen
54c6a317f8 test_failure_54 (#7155)
* test_failure_54

* metal
2024-10-18 23:31:18 +03:00
chenyu
37e83ca6fc bring back pyint (#7150)
fixed test_failure_52 and resnet. need to understand this better
2024-10-18 14:54:37 -04:00
chenyu
12ff52b88b test_failure_52 fails on real METAL (#7138) 2024-10-17 15:37:28 -04:00
chenyu
84e98900e8 test linearizer failure 53 (#7137)
variable scope issue caused compile error
2024-10-17 15:23:43 -04:00
George Hotz
ded1b38b84 minor dtype cleanup [pr] (#7124)
* minor dtype cleanup [pr]

* use ptr() function
2024-10-17 17:41:23 +08:00
nimlgen
9f00eacde5 nv tagged memory + resnet failed kernel (#7061)
* nv tagged memory

* linter

* metal fix?
2024-10-15 18:19:58 +03:00
chenyu
1a27417262 remove arbitrary multiplication case (#7033)
adds the wrongly simplified kernel in test_linearizer_failures
#7019
2024-10-13 15:06:05 -04:00
qazal
20d3c2d113 unify UOps.SHAPETRACKER and UOps.SWIZZLE with UOps.VIEW (#6955)
* add UOps.VIEW

* update hardcoded asts

* update sops.gz
2024-10-09 02:00:17 +08:00
George Hotz
14ad47b515 rewrite to use uops if (#6764)
* rewrite to use uops if

* does this pass

* careful penalty

* fix tests

* remove unused stuff

* that's a cstyle rewrite

* Update test_linearizer_dumb.py
2024-09-26 18:09:09 +08:00
George Hotz
bdd0c06f29 add void type to uop (#6471)
* unwrap_dtype maybe

* uopgraph stuff that hardcoded None

* test_ops passes

* dtypes.py fixups

* update test_linearizer and friends

* more ast updates

* test_beam and test_schedule too

* add void type to uop [run_process_replay]

* remove dumb casts

* start making it green

* more cast cleanups

* more cls methods to fix

* regenerate dataset

* split UOp and NOp const

* maybe that too

* fix docs

* update test_uop_symbolic

* test_verify_ast

* new sops with no diff

* meh, type_ignore is alright

* remove that assert

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-11 18:16:28 +08:00
qazal
442150a8df more ast_const for hardcoding consts [run_process_replay] (#6418) 2024-09-09 11:35:08 +08:00
George Hotz
66e7e51c79 Revert beam failure (#6376)
* Revert "late gate creation for STORE [run_process_replay] (#6373)"

This reverts commit c26744de9f.

* Revert "gated store rewrite to UOps.IF (#5976)"

This reverts commit 48061e8400.
2024-09-06 09:36:44 +08:00
Ian Paul
48061e8400 gated store rewrite to UOps.IF (#5976)
* Core change to gate stores in IFs

* Updates to cstyle renderer to handle IFs around STOREs

* Make uops asserts happy

* Add tests and fix newly broken tests

* make ruff happy

* make mypy happy

* Simplify renderer to have all gated stores use IF

* Revert some changes

* Make test_where_fold happy

* Revert unnecessary handling of ifs rendering. Was included before when changes weren't fully built out

* Rewrite graph to have IFs be dependent on RANGEs if STORE is already dependent on RANGE

* Re-change broken test

* Make ifs be grouped together

* get non-merged IFs working. ALl tests pass except grouping related ifs together

* Fix tests by making the IF UOp dependent on the correct node of the STORE UOp

* Changes to uopgraph

* Simplify graph rewrite logic

* Changes to get test_padto_where_multireduce working

* Simplify uops.store renderer

* Make test_padto_where_multireduce pass but now other tests fail

* Clean up uopgraph from scrach work

* Ignore sudo IF srcs when rendering

* Attempt to fix llvm tests

* rm comment

* reduce lines

* Add line to make mypy happy :(

* llvmir fix pt 1

* Mods after rebasing to master

* Fix llvmir

* Fix ptx tests

* Fix other ptx tests

* Move changes from uops.py to ops.py

* rm uops.py

* Fix TestGateStoreRewrite tests

* Get multireduce tests working

* reset to remote branch

* Fix linearizer tests

* uop_graph test patch

* Add comment to create_gate

* hotfix: uncomment those tests

* Attempt to fix ptx tests by including whitespace inside if block

* Patch from remote tinybox. Tests passing here

* Min changes to get some ptx tests passsing

* Changes after rebase

* Exclude ifs and endifs from ptx

* IF conditional branching within ptx

* Save lines on delete_redundant_gates

* Simplify merge_gates

* rm noqa

* Remove unnecessary checks when merging gates

* Fix ops error msg

* Smarter check for if/endif in llvmir

* simplify delete redundant gates to only have 2 returns

* spacing

* Smarter check at beginning of merge_gates

* patches from comments

* Remove need for merge_gates

* include proper srcs in IF from the get-go

* test expand ifs dumb will result in 4 ifs, not 1 now

* Make tests happy

* Fix uops stats

* rm merge_gates method. Will add back in separate PR

* Spacing

* cleaner error msg

* Fix uops rendering when expanding. test_failure_43

* patch tests

* undo changes in delete_redundant_gates

* process replay attempt

* re-intro deletion of redundant gates

* fix addition of gates when they get nested in stores and loads

* patch tests

* smarter init of IF srcs when adding gate to STORE

* make ruff happy

* Resp to comment

* include all src[2]'s srcs in IF for gated store

* add reference of the storing value to the gate's src

* minor patch after rebasing

* change ptx renderer

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-06 01:05:30 +08:00
George Hotz
e882294c02 uops touchups [run_process_replay] (#6368)
* uops touchups [run_process_replay]

* those are classmethods

* oops, kwargs

* no kwargs there
2024-09-05 17:22:32 +08:00
George Hotz
0d6922edb4 faster local tests. copy torch permuted to defautl device [run_process_replay] (#6363) 2024-09-05 13:57:20 +08:00
George Hotz
406ec8240e hotfix: lin_fail_41 passes on my M3 Max 2024-08-31 11:46:46 -07:00
gswangg
e44653e25a migrate test_linearizer_failures.py to UOp AST (#6240)
* add imports and update test_failure_1 to UOp AST

* update test_failure_2 with UOp AST

* update test_failure_3

* test_failure_5

* test_failure_6

* test_failure_7

* test_failure_8

* test_failure_9

* test_failure_10

* test_failure_11

* test_failure_12

* test_failure_12_multireduce

* uncomment skip and migrate test_failure_13

* test_failure_14

* test_failure_15

* test_failure_16

* test_failure_17

* test_failure_18

* test_failure_19

* test_failure_20

* test_failure_21

* test_failure_22

* test_failure_23

* test_failure_24

* test_failure_25

* test_failure_26

* test_failure_27

* test_failure_28

* test_failure_29

* test_failure_30

* test_failure_31

* test_failure_32

* test_failure_33

* test_failure_34

* test_failure_36

* test_failure_37

* test_failure_38

* test_update_39

* test_failure_40

* test_failure_41

* test_failure_42

* test_failure_43

* test_failure_44

* test_failure_45

* test_failure_46

* test_failure_47

* test_failure_48

* test_failure_49

* test_failure_50

* remove LazyOp

* reskip test_failure_22

* remove extra/ops

* replace ReduceOps with BinaryOps

* fixup that import

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2024-08-24 16:26:58 +03:00
chenyu
3fc8203475 remove NEG from handwritten ast in tests (#6234)
* remove NEG from handwritten ast in tests

* test_linearizer_failures
2024-08-22 09:06:59 -04:00
chenyu
1c5ef5b793 format test_linearizer_failure (#6231)
made it easier to remove NEG
2024-08-21 21:10:56 -04:00
chenyu
c9a9631818 no UnaryOps.NEG in generated UOp patterns (#6209)
* no UnaryOps.NEG in generated UOp patterns

removed pattern `x * (-1) -> -x`  and `x != True`

* those are fine because NEG became CMPNE and True

* fix sd validation L2 norm
2024-08-21 11:08:22 -04:00
qazal
2242ff84be type verify intermediate UOps [run_process_replay] (#6140)
* type verify intermediate UOps [run_process_replay]

* merge asserts

* variable const
2024-08-19 20:59:01 +03:00
George Hotz
74ee9febec remove iter from uopgraph (#6110)
* remove iter from uopgraph

* linearize returns uops

* fix tests

* linearize in linearize

* tests fix

* touchup

* test failures
2024-08-16 15:58:29 -07:00
qazal
28c75bf2a6 merge uops with ops (#6111)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-08-16 18:17:57 -04:00
qazal
c23d44c779 AST is UOp (#6030)
* most of the work from the uops2 branch

* schedule

* realize

* kernel

* lowerer

* search

* green

* merge uops with ops

* Revert "merge uops with ops"

This reverts commit 1408a59f12.

* fix benchmark

* remove extra dedup
2024-08-16 22:09:00 +03:00
chenyu
5accfe26a0 rewrite bool ADD to OR and MUL to AND (#6084)
* rewrite bool ADD to OR and MUL to AND

fixed running `tinyphysics.onnx`, which contains a getitem from a boolean tensor.

only can repro through BEAM_COMPARE, which i think is a different bug in test_linearizer_failure

* fold those, and fix tests

* only for bool

* move dtypes.bool
2024-08-15 10:11:57 -04:00
chenyu
5048f9a4d5 test linearizer failure 49 (#6074)
with UOP_IS_SYMBOLIC=1, on METAL it breaks store fusion and have A+B and B+A being two different UOp
2024-08-14 11:29:10 -04:00
chenyu
3f2d24a6ec test_failure_48 for wrong truncation in idx on NV (#6055)
also added `RAWAST` to print pre-modified AST in DEBUG=3
2024-08-12 16:17:42 -04:00
George Hotz
cf7d3c1eb8 fix tests locally on metal (#6025)
* remove contiguous child, it was breaking tests locally

* hmm, it's still needed

* include NOOPT in method cache key
2024-08-10 12:36:22 -07:00
chenyu
0ec732b494 test lin fail 47 for UOP_IS_SYMBOLIC (#5853)
failed arange example with UOP_IS_SYMBOLIC
2024-07-31 23:09:22 -04:00
qazal
8174c438a3 pad test_failure_45 (#5846) 2024-07-31 23:08:48 +03:00
George Hotz
8672a9db3f add test to validate lazyops dims (#5845) 2024-07-31 12:59:38 -07:00
chenyu
4fe5b95568 fix UOp ALU bound (#5844)
* fix UOp ALU bound

root cause of resnet bug, the ALU bound is only correct for scalar, not vectorized

* it can be nan...
2024-07-31 15:19:31 -04:00
qazal
bcbd925001 hcopts failing test for fused arange kernel (#5815)
* add failure_43

* n 45
2024-07-31 09:02:44 +03:00
qazal
ed556c260e UOps.IF rules more tests (#5831)
* init tests

* split tests

* assert multiple gates simplicity
2024-07-31 00:11:02 -04:00
qazal
03d866b84f UOps.IF with rewrite rules (#5812)
* expand merge

* merge barriers

* gate_folder

* test_linearizer_failures

* this can be here

* bring the new repr back

* gate_folder2

* gate_creator is better

* gate_folder

* dedup conditions

* early gate folding

* dedup barrier

* fold noop conditions

* all consts can go away

* free lines
2024-07-30 20:50:56 +03:00
qazal
5e827e51d2 add llama3 BEAM=2 failures to test_linearizer_failures (#5553)
* skips

* opts.device

* benchmarks

* add to test_linearizer_failures

* remove hardcoded ones

* linter

* skip cpu
2024-07-30 00:37:32 +03:00
nimlgen
08f47d7dc3 more info on failure 41 (#5704) 2024-07-25 12:14:28 +03:00
nimlgen
69d4f474d8 amd resnet pf (#5703) 2024-07-25 11:21:22 +03:00
chenyu
3060e0be4f add vmin vmax of SPECIAL (#5670)
* add vmin vmax of SPECIAL

folded stuff like (-1 < gidx0)

* flaky
2024-07-23 22:55:54 -04:00
chenyu
01fe00e055 skip test_failure_39 in CI (#5660)
took more than 2 minutes in ci metal, it's basically the same as test_failure_37 but 20X bigger
2024-07-23 14:47:05 -04:00
George Hotz
386fb5e7f8 folding without UNMUL (#5628)
* folding without UNMUL

* fix failures, index_collapse

* import ReduceOps

* test_arange_4096 isn't folding
2024-07-21 20:14:44 -07:00
George Hotz
fa7e734b49 MetaOps.KERNEL (#5543) 2024-07-17 19:41:23 -07:00
Francis Lam
c4eb30a04c test/test_linearizer_failures: add a new beautiful_mnist one (#5531)
* test/test_linearizer_failures: add a new beautiful_mnist one

this one is from a DEPTH=2 fuzz_linearizer search

* add GPU to test_failure_40

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-07-17 16:27:04 -04:00
George Hotz
1a68854766 PatternMatcher add (#5532)
* PatternMatcher add [run_process_replay]

* f4 dynamic

* test_failure_36 is fixed

* fix PTX
2024-07-17 12:44:42 -07:00
George Hotz
158221b36b expand tests from uop_expander [run_process_replay] (#5524)
* expand tests from uop_expander

* more changes from the branch
2024-07-17 09:22:36 -07:00
George Hotz
42c25cc961 fix fixup_ast (#5523)
* fix fixup_ast

* these lin failures are fixed
2024-07-17 08:52:21 -07:00
Francis Lam
2d53abb04a test/external/fuzz_linearizer: fix for new AST changes (#5519)
* test/external/fuzz_linearizer: fix for new AST changes

also add beautiful_mnist failures

* add CLANG and LLVM to test_failure_35 failed_platforms

* fix test_linearizer_failure names
2024-07-17 00:08:07 -04:00
chenyu
07ff4b7d24 test_failure_33 ast that has UOps.UNMUL after linearize (#5504)
* test_failure_33 ast that has UOps.UNMUL after linearize

* smaller
2024-07-15 22:54:23 -04:00
George Hotz
03c2dc8bd7 lowerer is kernel [run_process_replay] (#5437) 2024-07-12 18:50:55 -07:00