156 Commits

Author SHA1 Message Date
qazal
1a2ee37dd3 hotfix: remove redundant test_schedules [pr] (#7412) 2024-10-31 01:10:31 +08:00
qazal
1383df95af track_rewrites by function call [pr] (#7165)
* named track_rewrites [pr]

* group all of create_schedule_with_vars
2024-10-20 17:45:25 +03:00
chenyu
287a198c4f increase test_strongly_connected_DAG threshold (#7131)
flaky
2024-10-17 11:08:50 -04:00
George Hotz
ded1b38b84 minor dtype cleanup [pr] (#7124)
* minor dtype cleanup [pr]

* use ptr() function
2024-10-17 17:41:23 +08:00
qazal
6acda43a2c test a rewrite of permuted reduce [pr] (#7093)
* test a rewrite of permuted reduce [pr]

* addd rewrite tracker

* expected

* passes
2024-10-16 12:49:54 +03:00
qazal
390171d686 delete SAVE_SCHEDULE=1 [pr] (#7087) 2024-10-16 07:13:20 +03:00
George Hotz
3169cb386d remove graph [pr] (#7085) 2024-10-16 11:40:07 +08:00
qazal
fb29de6cc3 split schedule to view_left and view_right [pr] (#7077)
* split schedule to view_left and view_right [pr]

* move valid
2024-10-16 03:39:38 +03:00
Louis Novy
2ac5aec66b Fix exponential complexity in _is_padding_okay [pr] (#7008)
* preliminary test

* missed Optional

* don't check for cache during recursion

* match style from st_fixup... may be marginally faster?

* pathological test case: strongly connected DAG

* move to test_schedule as this isn't really a fusion

* oops this shouldn't be edited

* Revert "oops this shouldn't be edited"

This reverts commit 487cb027dc.

* Revert "move to test_schedule as this isn't really a fusion"

This reverts commit 48d8c550ce.

* move to test_schedule as this isn't really a fusion

* ok no more merge error funny business
2024-10-14 02:34:47 +03:00
chenyu
04d9b46d51 derivative of softmax is indepedent of max (#7009)
* derivative of softmax is indepedent of max

* update test
2024-10-12 15:59:23 -04:00
chenyu
cae1c41755 test case of softmax backward kernel count (#7022) 2024-10-12 15:46:32 -04:00
qazal
7451812bbf delete AST_REWRITE ctx var (#6995) 2024-10-11 11:33:16 +03:00
qazal
20d3c2d113 unify UOps.SHAPETRACKER and UOps.SWIZZLE with UOps.VIEW (#6955)
* add UOps.VIEW

* update hardcoded asts

* update sops.gz
2024-10-09 02:00:17 +08:00
George Hotz
4df5c7a4ef move lazy to engine [pr] (#6886)
* move lazy to engine [pr]

* engine.lazy
2024-10-04 23:19:26 +08:00
George Hotz
547733e57c stunning_mnist [run_process_replay] (#6828)
* stunning_mnist [run_process_replay]

* add loss to stunning mnist
2024-10-01 15:00:48 +08:00
qazal
391497a311 schedule independent of Device [run_process_replay] (#6829) 2024-10-01 14:46:26 +08:00
qazal
0c24fec9f4 test current behavior of const schedule [run_process_replay] (#6817) 2024-09-30 21:02:01 +08:00
qazal
2ec73d6f05 push swizzle through dim change (#6801)
* push swizzle through dim change

* can this be generic

* generic version

* cleanups
2024-09-30 09:04:59 +08:00
wozeparrot
2b899164c6 no numpy (#6751) 2024-09-26 16:40:18 +08:00
qazal
8a15ccb414 start gc/mem usage tests for buffer schedule [run_process_replay] (#6737)
* gc tests for buffer schedule [run_process_replay]

* assert global counters, maybe del

* check init

* rm global counters
2024-09-26 08:26:31 +08:00
qazal
b629a7998d early assert buffer count limit [run_process_replay] (#6746)
* better error message for buffer count limit [run_process_replay]

* 3.9 needs that

* assert ScheduleItem

* new _test_buf_cnt
2024-09-26 08:24:26 +08:00
wozeparrot
c100f3d406 default threefry (#6116) 2024-09-25 17:45:13 +08:00
George Hotz
cb22ef379a truncate consts early (#6741)
* truncate consts early

* ptx still fails

* Update dtype.py
2024-09-25 16:49:51 +08:00
qazal
3bf25aae78 start work on global buffer count limit [run_process_replay] (#6722)
* add a bufs_max option

* simple spec
2024-09-25 09:51:56 +08:00
George Hotz
e015b41ce9 remove e( function just alu( [run_process_replay] (#6589)
* remove e( function just alu( [run_process_replay]

* missed two
2024-09-19 10:24:02 +08:00
George Hotz
bdd0c06f29 add void type to uop (#6471)
* unwrap_dtype maybe

* uopgraph stuff that hardcoded None

* test_ops passes

* dtypes.py fixups

* update test_linearizer and friends

* more ast updates

* test_beam and test_schedule too

* add void type to uop [run_process_replay]

* remove dumb casts

* start making it green

* more cast cleanups

* more cls methods to fix

* regenerate dataset

* split UOp and NOp const

* maybe that too

* fix docs

* update test_uop_symbolic

* test_verify_ast

* new sops with no diff

* meh, type_ignore is alright

* remove that assert

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-11 18:16:28 +08:00
qazal
3cde1503ce enable graph rewrite in the scheduler (#6249)
* test: enable

* skip those

* skip pads tests
2024-09-11 14:30:04 +08:00
qazal
262569a3eb green conv bw AST_REWRITE=1 (#6466)
* green conv bw AST_REWRITE=1

* new strides and dtype fix
2024-09-11 10:51:24 +08:00
qazal
4259311006 merge views in conv swizzle (#6464) 2024-09-11 10:11:01 +08:00
qazal
803b8b9313 conv bw schedule and correctness tests to iterate on (#6461)
first to fix AST_REWRITE=1, then to implement the same fusion for dtypes.half.
2024-09-11 08:47:07 +08:00
qazal
f4f705a07c can push SWIZZLE through reduce both ways (#6453) 2024-09-10 16:00:50 +08:00
qazal
1347e49e82 second iteration on UOps.SWIZZLE (#6451)
* new swizzle

* fix the failing tests

* test a double swizzle

* ci
2024-09-10 14:43:21 +08:00
qazal
95c9fe841e UOp.st infra for the new SWIZZLE (#6449) 2024-09-10 09:39:45 +08:00
qazal
29e63097a0 st is a cached_property on UOp [run_process_replay] (#6433) 2024-09-10 08:30:35 +08:00
George Hotz
90fb17304f put rewrite back in ops [run_process_replay] (#6421) 2024-09-09 13:53:51 +08:00
qazal
442150a8df more ast_const for hardcoding consts [run_process_replay] (#6418) 2024-09-09 11:35:08 +08:00
Tim Becker
dfb818788e Support reduction parameter in more loss functions (#6302) 2024-09-07 05:11:20 +08:00
George Hotz
c88329244b create rewrite.py [run_process_replay] (#6379)
* create rewrite.py [run_process_replay]

* fix tests

* not in rewrite or ops

* skip flaky test
2024-09-06 10:51:01 +08:00
qazal
e7f6b654ad cleanup uop eq asserts for swizzle [run_process_replay] (#6362)
* cleanup uop eq asserts for swizzle [run_process_replay]

* more stuff
2024-09-05 13:36:36 +08:00
qazal
2f00bf0c78 conv bw in one kernel with graph_rewrite (#6330)
* double reduce merger

* add test_fold_conv_relu_backward_ast_rewrite

* a correctness test to iterate on

* merge axes the other way around

* better
2024-09-03 03:53:53 +08:00
qazal
539654fbe1 graph_rewrite complexity tests [run_process_replay] (#6317) 2024-08-29 22:39:08 +03:00
qazal
07942ef361 Proposal: Better UOps.SWIZZLE (#6309)
* better UOps.SWIZZLE

* test_swizzle_rewrite

* add it to docs

* show a diff

* a lil more verbose

* two teeny notes

* hotfix: sink
2024-08-29 15:39:48 +03:00
qazal
f0cc8ca5f2 generic st_fixup in scheduler graph rewrite [compare_schedule] (#6278) 2024-08-25 11:02:17 +03:00
qazal
78d6bd8b41 start graph rewrite in the scheduler (#6248)
* start graph rewrite in the scheduler

* test: enable it

* test timings

* only fails in multi reduce

* more isolated tests
2024-08-23 13:15:55 +03:00
chenyu
3fc8203475 remove NEG from handwritten ast in tests (#6234)
* remove NEG from handwritten ast in tests

* test_linearizer_failures
2024-08-22 09:06:59 -04:00
madt2709
4bb98d8882 Fix track_running_stats in batchnorm (#6200)
* Fix track_running_stats in batchnorm

* Fix linter

* Update test_fold_conv_batchnorm_notrain to keep allowed at 1

* Add test_fold_conv_batchnorm_notrain_no_running_stats

* Save 1 line
2024-08-20 14:01:22 -07:00
qazal
1ba83cc7fa split test_sgd_4convs_fuse [run_process_replay] (#6158) 2024-08-18 18:35:42 +03:00
George Hotz
89c7989659 no shapetracker in ops [run_process_replay] (#6117) 2024-08-16 17:23:27 -07:00
qazal
28c75bf2a6 merge uops with ops (#6111)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-08-16 18:17:57 -04:00
qazal
c23d44c779 AST is UOp (#6030)
* most of the work from the uops2 branch

* schedule

* realize

* kernel

* lowerer

* search

* green

* merge uops with ops

* Revert "merge uops with ops"

This reverts commit 1408a59f12.

* fix benchmark

* remove extra dedup
2024-08-16 22:09:00 +03:00