Commit Graph

138 Commits

Author SHA1 Message Date
wozeparrot
2b899164c6 no numpy (#6751) 2024-09-26 16:40:18 +08:00
qazal
8a15ccb414 start gc/mem usage tests for buffer schedule [run_process_replay] (#6737)
* gc tests for buffer schedule [run_process_replay]

* assert global counters, maybe del

* check init

* rm global counters
2024-09-26 08:26:31 +08:00
qazal
b629a7998d early assert buffer count limit [run_process_replay] (#6746)
* better error message for buffer count limit [run_process_replay]

* 3.9 needs that

* assert ScheduleItem

* new _test_buf_cnt
2024-09-26 08:24:26 +08:00
wozeparrot
c100f3d406 default threefry (#6116) 2024-09-25 17:45:13 +08:00
George Hotz
cb22ef379a truncate consts early (#6741)
* truncate consts early

* ptx still fails

* Update dtype.py
2024-09-25 16:49:51 +08:00
qazal
3bf25aae78 start work on global buffer count limit [run_process_replay] (#6722)
* add a bufs_max option

* simple spec
2024-09-25 09:51:56 +08:00
George Hotz
e015b41ce9 remove e( function just alu( [run_process_replay] (#6589)
* remove e( function just alu( [run_process_replay]

* missed two
2024-09-19 10:24:02 +08:00
George Hotz
bdd0c06f29 add void type to uop (#6471)
* unwrap_dtype maybe

* uopgraph stuff that hardcoded None

* test_ops passes

* dtypes.py fixups

* update test_linearizer and friends

* more ast updates

* test_beam and test_schedule too

* add void type to uop [run_process_replay]

* remove dumb casts

* start making it green

* more cast cleanups

* more cls methods to fix

* regenerate dataset

* split UOp and NOp const

* maybe that too

* fix docs

* update test_uop_symbolic

* test_verify_ast

* new sops with no diff

* meh, type_ignore is alright

* remove that assert

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-11 18:16:28 +08:00
qazal
3cde1503ce enable graph rewrite in the scheduler (#6249)
* test: enable

* skip those

* skip pads tests
2024-09-11 14:30:04 +08:00
qazal
262569a3eb green conv bw AST_REWRITE=1 (#6466)
* green conv bw AST_REWRITE=1

* new strides and dtype fix
2024-09-11 10:51:24 +08:00
qazal
4259311006 merge views in conv swizzle (#6464) 2024-09-11 10:11:01 +08:00
qazal
803b8b9313 conv bw schedule and correctness tests to iterate on (#6461)
first to fix AST_REWRITE=1, then to implement the same fusion for dtypes.half.
2024-09-11 08:47:07 +08:00
qazal
f4f705a07c can push SWIZZLE through reduce both ways (#6453) 2024-09-10 16:00:50 +08:00
qazal
1347e49e82 second iteration on UOps.SWIZZLE (#6451)
* new swizzle

* fix the failing tests

* test a double swizzle

* ci
2024-09-10 14:43:21 +08:00
qazal
95c9fe841e UOp.st infra for the new SWIZZLE (#6449) 2024-09-10 09:39:45 +08:00
qazal
29e63097a0 st is a cached_property on UOp [run_process_replay] (#6433) 2024-09-10 08:30:35 +08:00
George Hotz
90fb17304f put rewrite back in ops [run_process_replay] (#6421) 2024-09-09 13:53:51 +08:00
qazal
442150a8df more ast_const for hardcoding consts [run_process_replay] (#6418) 2024-09-09 11:35:08 +08:00
Tim Becker
dfb818788e Support reduction parameter in more loss functions (#6302) 2024-09-07 05:11:20 +08:00
George Hotz
c88329244b create rewrite.py [run_process_replay] (#6379)
* create rewrite.py [run_process_replay]

* fix tests

* not in rewrite or ops

* skip flaky test
2024-09-06 10:51:01 +08:00
qazal
e7f6b654ad cleanup uop eq asserts for swizzle [run_process_replay] (#6362)
* cleanup uop eq asserts for swizzle [run_process_replay]

* more stuff
2024-09-05 13:36:36 +08:00
qazal
2f00bf0c78 conv bw in one kernel with graph_rewrite (#6330)
* double reduce merger

* add test_fold_conv_relu_backward_ast_rewrite

* a correctness test to iterate on

* merge axes the other way around

* better
2024-09-03 03:53:53 +08:00
qazal
539654fbe1 graph_rewrite complexity tests [run_process_replay] (#6317) 2024-08-29 22:39:08 +03:00
qazal
07942ef361 Proposal: Better UOps.SWIZZLE (#6309)
* better UOps.SWIZZLE

* test_swizzle_rewrite

* add it to docs

* show a diff

* a lil more verbose

* two teeny notes

* hotfix: sink
2024-08-29 15:39:48 +03:00
qazal
f0cc8ca5f2 generic st_fixup in scheduler graph rewrite [compare_schedule] (#6278) 2024-08-25 11:02:17 +03:00
qazal
78d6bd8b41 start graph rewrite in the scheduler (#6248)
* start graph rewrite in the scheduler

* test: enable it

* test timings

* only fails in multi reduce

* more isolated tests
2024-08-23 13:15:55 +03:00
chenyu
3fc8203475 remove NEG from handwritten ast in tests (#6234)
* remove NEG from handwritten ast in tests

* test_linearizer_failures
2024-08-22 09:06:59 -04:00
madt2709
4bb98d8882 Fix track_running_stats in batchnorm (#6200)
* Fix track_running_stats in batchnorm

* Fix linter

* Update test_fold_conv_batchnorm_notrain to keep allowed at 1

* Add test_fold_conv_batchnorm_notrain_no_running_stats

* Save 1 line
2024-08-20 14:01:22 -07:00
qazal
1ba83cc7fa split test_sgd_4convs_fuse [run_process_replay] (#6158) 2024-08-18 18:35:42 +03:00
George Hotz
89c7989659 no shapetracker in ops [run_process_replay] (#6117) 2024-08-16 17:23:27 -07:00
qazal
28c75bf2a6 merge uops with ops (#6111)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-08-16 18:17:57 -04:00
qazal
c23d44c779 AST is UOp (#6030)
* most of the work from the uops2 branch

* schedule

* realize

* kernel

* lowerer

* search

* green

* merge uops with ops

* Revert "merge uops with ops"

This reverts commit 1408a59f12.

* fix benchmark

* remove extra dedup
2024-08-16 22:09:00 +03:00
qazal
4d38fec8c1 rename lazyops to parents [run_process_replay] (#6091) 2024-08-15 17:27:32 +03:00
qazal
7d1f118731 use assertIs in test_schedule (#6035)
* use self.assertIs in test_schedule

* test_lazybuffer
2024-08-11 19:19:18 +03:00
chenyu
5820940d98 more relax rtol for test_arange_fuse_grouped_children (#6027)
one more https://github.com/chenyuxyz/tinygrad/actions/runs/10334072657/job/28607120462
2024-08-10 16:10:03 -04:00
chenyu
10374a2741 relax rtol for test_arange_fuse_grouped_children (#6026)
flaky https://github.com/tinygrad/tinygrad/actions/runs/10333939631/job/28606831006?pr=6023
2024-08-10 15:49:11 -04:00
qazal
3ef2788c4f hotfix: run the entire test_conv_bw schedule (#6014) 2024-08-10 17:55:41 +03:00
qazal
b67d521a07 assert test_conv_bw correctness (#6000)
* assert test_conv_bw correctness

* reorder half

* metal and clang still red
2024-08-09 18:30:36 +03:00
qazal
45b1761175 smaller test_llama_embedding + assert correctness (#5986)
* smaller test_llama_embedding in CI

* test correctness
2024-08-08 22:11:29 +03:00
George Hotz
bf8ec23b00 hotfix: contiguous on precompute_freqs_cis 2024-08-07 14:40:56 -07:00
qazal
7677361d90 test pushing through different expands in 1 kernel (#5963)
* test pushing through different expands in 1 kernel

* realize eye

* back to test_example_matmul
2024-08-07 19:33:18 +03:00
qazal
d5d7f4e7b8 more TestIndexing correctness asserts [run_process_replay] (#5948)
* use torch in test_mnist_val

* more asserts
2024-08-07 01:50:42 +03:00
qazal
7b6496f2e6 fix the reduceops cache breaking beautiful_mnist (#5938)
* fix the reduceops cache breaking beautiful_mnist

* test_sparse_categorical_crossentropy_simple

* starting tests

* atol from test_nn

* test_sparse_categorical_crossentropy_alt

* dont use torch
2024-08-07 00:02:54 +03:00
qazal
3d4742dd2e override output shape in fused assign (#5930)
* override output shape in fused assign

This makes

```
FUSE_ARANGE=1 JIT=0 python3 examples/llama.py --gen 1 --prompt "Hello." --count 10 --temperature 0 --timing
```
work. In general we should assert ASSIGN doesn't change shape.

* merge asserts
2024-08-06 13:28:50 +03:00
George Hotz
5d17f54e3c fast mnist indexing (#5921)
* fast mnist indexing

* more tests

* remove those tests, new indexing rule
2024-08-05 13:55:15 -07:00
qazal
e0c6520138 check arange fusing with VIEW and COPY (#5912)
* check arange fusing with VIEW and COPY

* gpu and clang
2024-08-05 17:09:21 +03:00
qazal
aad9234e52 test fused precompute_freqs_cis (#5900)
* test_precompute_freqs_cis

* tiny for ci
2024-08-04 21:01:05 +03:00
qazal
4c5ef2cc4f setitem with arange fusion 1 (#5898) 2024-08-04 16:09:21 +03:00
qazal
56ef9e453e pad reduceops to the max of each dimension (#5889)
* early verify

* pad reduceops to the max of each dim

* remove the function
2024-08-03 14:03:30 +03:00
qazal
65fa86901a indexing fusion 2 (#5888)
* arange fusion

* kernels that fuse

* tests
2024-08-03 13:13:39 +03:00