qazal
7d1f118731
use assertIs in test_schedule (#6035)
* use self.assertIs in test_schedule
* test_lazybuffer
2024-08-11 19:19:18 +03:00
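`assertIs` checks object identity rather than equality, which is the right assertion when a schedule test expects the exact same object back, not just an equal one. A minimal sketch of the distinction using only stdlib `unittest` (the test name is illustrative, not from the commit):
```
import unittest

class TestAssertIs(unittest.TestCase):
  def test_identity_vs_equality(self):
    a = [1, 2, 3]
    b = a        # the same object
    c = list(a)  # an equal but distinct object
    self.assertIs(a, b)     # identity: passes
    self.assertEqual(a, c)  # equality: passes
    # self.assertIs(a, c) would fail: equal values, different objects

if __name__ == "__main__":
  unittest.main()
```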
chenyu
5820940d98
relax rtol further for test_arange_fuse_grouped_children (#6027)
one more https://github.com/chenyuxyz/tinygrad/actions/runs/10334072657/job/28607120462
2024-08-10 16:10:03 -04:00
chenyu
10374a2741
relax rtol for test_arange_fuse_grouped_children (#6026)
flaky https://github.com/tinygrad/tinygrad/actions/runs/10333939631/job/28606831006?pr=6023
2024-08-10 15:49:11 -04:00
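Both tolerance commits loosen the relative tolerance of a flaky floating-point comparison rather than chase bit-exact results across backends. As a hedged illustration of what relaxing `rtol` buys (the numbers below are made up, not from the failing test):
```
import numpy as np

expected = np.array([1.0000, 2.0000])
actual = np.array([1.0001, 2.0003])  # small backend-dependent drift

# np.testing.assert_allclose(actual, expected, rtol=1e-5)  # would raise
np.testing.assert_allclose(actual, expected, rtol=1e-3)    # passes once relaxed
```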
qazal
3ef2788c4f
hotfix: run the entire test_conv_bw schedule (#6014)
2024-08-10 17:55:41 +03:00
qazal
b67d521a07
assert test_conv_bw correctness (#6000)
* assert test_conv_bw correctness
* reorder half
* metal and clang still red
2024-08-09 18:30:36 +03:00
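A common way to assert backward correctness is to compare gradients against a torch reference; a hedged sketch under that assumption (shapes and tolerances are illustrative, not the commit's):
```
import numpy as np
import torch
from tinygrad import Tensor

x_np = np.random.randn(1, 3, 8, 8).astype(np.float32)
w_np = np.random.randn(4, 3, 3, 3).astype(np.float32)

# tinygrad conv + backward
x, w = Tensor(x_np, requires_grad=True), Tensor(w_np, requires_grad=True)
x.conv2d(w).sum().backward()

# torch reference
xt, wt = torch.tensor(x_np, requires_grad=True), torch.tensor(w_np, requires_grad=True)
torch.nn.functional.conv2d(xt, wt).sum().backward()

np.testing.assert_allclose(w.grad.numpy(), wt.grad.numpy(), atol=1e-4, rtol=1e-3)
```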
qazal
45b1761175
smaller test_llama_embedding + assert correctness (#5986)
* smaller test_llama_embedding in CI
* test correctness
2024-08-08 22:11:29 +03:00
George Hotz
bf8ec23b00
hotfix: contiguous on precompute_freqs_cis
2024-08-07 14:40:56 -07:00
qazal
7677361d90
test pushing through different expands in 1 kernel (#5963)
* test pushing through different expands in 1 kernel
* realize eye
* back to test_example_matmul
2024-08-07 19:33:18 +03:00
qazal
d5d7f4e7b8
more TestIndexing correctness asserts [run_process_replay] (#5948)
* use torch in test_mnist_val
* more asserts
2024-08-07 01:50:42 +03:00
qazal
7b6496f2e6
fix the reduceops cache breaking beautiful_mnist (#5938)
* fix the reduceops cache breaking beautiful_mnist
* test_sparse_categorical_crossentropy_simple
* starting tests
* atol from test_nn
* test_sparse_categorical_crossentropy_alt
* don't use torch
2024-08-07 00:02:54 +03:00
qazal
3d4742dd2e
override output shape in fused assign (#5930)
* override output shape in fused assign
This makes
```
FUSE_ARANGE=1 JIT=0 python3 examples/llama.py --gen 1 --prompt "Hello." --count 10 --temperature 0 --timing
```
work. In general we should assert ASSIGN doesn't change shape.
* merge asserts
2024-08-06 13:28:50 +03:00
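The note in the commit body is the invariant that matters: ASSIGN writes back into an existing buffer, so fusing it must never change the target's shape. A hedged sketch of that invariant at the Tensor level (buffer names are illustrative):
```
from tinygrad import Tensor

target = Tensor.zeros(4, 4).contiguous().realize()  # an allocated buffer
out = target.assign(Tensor.ones(4, 4))              # write back in place
# the invariant the commit enforces: assign must not change the output shape
assert out.shape == target.shape == (4, 4)
out.realize()
```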
George Hotz
5d17f54e3c
fast mnist indexing (#5921)
* fast mnist indexing
* more tests
* remove those tests, new indexing rule
2024-08-05 13:55:15 -07:00
qazal
e0c6520138
check arange fusing with VIEW and COPY (#5912)
* check arange fusing with VIEW and COPY
* gpu and clang
2024-08-05 17:09:21 +03:00
qazal
aad9234e52
test fused precompute_freqs_cis (#5900)
* test_precompute_freqs_cis
* tiny for ci
2024-08-04 21:01:05 +03:00
qazal
4c5ef2cc4f
setitem with arange fusion 1 (#5898)
2024-08-04 16:09:21 +03:00
qazal
56ef9e453e
pad reduceops to the max of each dimension (#5889)
* early verify
* pad reduceops to the max of each dim
* remove the function
2024-08-03 14:03:30 +03:00
qazal
65fa86901a
indexing fusion 2 (#5888)
* arange fusion
* kernels that fuse
* tests
2024-08-03 13:13:39 +03:00
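These indexing-fusion commits all lean on the same decomposition: an integer gather can be written as an arange comparison (a one-hot mask) followed by a multiply and a reduce, and that whole chain is what the scheduler learns to fuse into one kernel. A rough sketch of the decomposition itself, not the scheduler code (the names are illustrative):
```
from tinygrad import Tensor

table = Tensor.rand(10, 4)  # e.g. a small embedding table
idx = Tensor([3, 7])

# one-hot mask from an arange comparison, then reduce over the table rows
onehot = (Tensor.arange(10).reshape(1, 10) == idx.reshape(2, 1))            # (2, 10) bools
gathered = (onehot.unsqueeze(-1).float() * table.unsqueeze(0)).sum(axis=1)  # (2, 4)

assert gathered.shape == (2, 4)
```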
qazal
af59b2eea9
tests from the indexing fusion branch (#5886)
2024-08-03 11:56:48 +03:00
qazal
26d0265d66
test schedule of LazyBuffers [run_process_replay] (#5859)
2024-08-01 19:06:29 +03:00
qazal
1b53207b4f
revert isolated dags scheduling (#5724)
2024-07-25 19:45:12 -04:00
qazal
9ceb3a3d1f
beautiful_mnist -4.3% kernels (#5709)
* add is_complete
* partially delete forced_realized
* p2
* start
* refactor to can_group
* remove steps
* _get_inputs is nicer
* fix the cache
* cache is dict now
* rename to group
2024-07-25 20:30:49 +03:00
George Hotz
dc21e63bd2
test: put conv in one reduce (#4441)
* test: put conv in one reduce
* put reduce at the end
* more expand
* generic, and that expand was breaking things
* ratio
* don't undo the expand
* arg 1
* strides
* warning, for resnet
* warning removed
* disable cast
* handle cast
* op
* err, that's right
* fixup
* fix that
* a test to play with
* add double_reduces
* working up to final reshape
* fold the last reshape
* moved to schedule
* fix axis
* ci, need to bring arange back
* FUSE_CONV_BW maybe
* valid in 3.9
* test_expand_reduce_is_folded_on_different_axes
* add FUSE_CONV_BW=1
* test_fold_batchnorm_backward
* test_sgd_4convs_fuse
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2024-07-22 12:16:13 +03:00
kormann
2c4add6844
pretty print lazy op by default (#5505)
* pretty lop
* min diff
* walrus
* fix
* min diff
* simplify
* pretty helper function
* ws
* pretty uop upat
* tests
* stricter tests
* test passes
* ws
* stronger upat test
* delete print_tree
* min diff
* stricter exp test
* fix merge
* stronger uops eval test
* +readable and deep upat test
* +readable and deep upat test
* sort inv fix
* fix
* revert allowed_len
2024-07-18 09:34:08 -07:00
George Hotz
fa7e734b49
MetaOps.KERNEL (#5543)
2024-07-17 19:41:23 -07:00
wozeparrot
90f0e2fc49
db in WAL mode (#5388)
2024-07-12 20:43:36 -07:00
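WAL (write-ahead logging) lets sqlite readers proceed while a writer holds the db, which matters once several processes share the same cache file. A minimal sketch of flipping a connection into WAL mode, assuming a plain sqlite3 db (the path is illustrative):
```
import sqlite3

conn = sqlite3.connect("/tmp/cache.db")
# sqlite reports the journal mode it actually ended up in
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
assert mode == "wal"
conn.close()
```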
chenyu
4df63da190
clean up rest of the loadop [run_process_replay] (#5440)
to metaop and filter_sink
2024-07-12 23:38:51 -04:00
George Hotz
03c2dc8bd7
lowerer is kernel [run_process_replay] (#5437)
2024-07-12 18:50:55 -07:00
George Hotz
870dc8c350
s/Linearizer/Lowerer [run_process_replay] (#5428)
2024-07-12 15:54:07 -07:00
George Hotz
6707c778d0
scheduleitem is not Tuple [run_process_replay] (#5425)
* scheduleitem is not Tuple [run_process_replay]
* fix tests
* fix op + fuzzers
* fix mop test
2024-07-12 15:13:19 -07:00
George Hotz
f6ef283e6a
s/loadops/metaops [run_process_replay] (#5421)
2024-07-12 13:26:50 -07:00
qazal
e22b377839
generalize FUSE_AS_ONE_KERNEL in the scheduler (#5397)
* test: use const
* hotfix: base
* asserts
* don't push through reshape
* cleanup
* don't need the cache
* test_reduceop_reshape_dont_push and test_index_fused are next
2024-07-12 10:23:16 +03:00
Timmy
bb7746985f
multireduce scheduler tests (#5141)
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-07-08 20:28:55 +03:00
qazal
5292d37db6
LoadOps.VIEW in the scheduler spec (#5296)
* refactor to allow_buffer_view
* tests
* fix multi
2024-07-05 19:43:50 +03:00
hikettei
1ab7a4cff0
Handling Multiple UnaryOps.BITCAST in Function for Proper Kernel Fusion [run_process_replay] (#5172)
* [Patch] added an option not to ignore view replacing when doing bitcast
* added the testcase
* [Add] reproduced the case where bitcast cannot be fused into a single kernel in the unittest
---------
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-07-05 19:16:44 +03:00
chenyu
ee0c6dfc15
build Tensor._tri with movements only (#5110)
* build Tensor._tri with movements only
doesn't need arange; saves a kernel in the attention mask
* simpler, more tests
2024-06-23 00:07:36 -04:00
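Movement ops (pad, reshape, shrink, flip) can synthesize a triangle because padding a run of ones and re-windowing the flat buffer shifts the run by one per row. This is not the commit's `_tri` code, just one verifiable construction of an upper-triangular mask under that idea:
```
import numpy as np
from tinygrad import Tensor

n = 4
# a period-2n pattern: n ones then n zeros, repeated n+1 times
flat = Tensor.ones(n + 1, n).pad(((0, 0), (0, n))).reshape(2 * n * (n + 1))
# rows of width 2n+1 slide the pattern one step per row; crop and mirror
tri = flat.shrink(((0, n * (2 * n + 1)),)).reshape(n, 2 * n + 1)
tri = tri.shrink(((0, n), (0, n))).flip(1)
np.testing.assert_equal(tri.numpy(), np.triu(np.ones((n, n), dtype=np.float32)))
```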
chenyu
2b2488f2e2
revert creating Tensor from a list without numpy (#5041)
the change was incomplete and broke creating a Tensor from a list of np arrays
2024-06-18 17:31:22 -04:00
qazal
04feeb37e6
look for unsafe pad ops in multiview ShapeTrackers (#5002)
2024-06-17 00:28:12 +03:00
Timmy
01b26756d6
Multireduce Scheduler Tests (#4972)
* scheduler tests
* linters
* cleaning up tests
* fixing tests
* syntax
* fixing metal
2024-06-16 16:30:22 +03:00
chenyu
5eee974b2a
construct Tensor from python list/tuple directly (#4947)
* construct Tensor from python list/tuple directly
no numpy. annoying that half memoryview is a 3.12 feature...
* simpler, and test
* flat already
* simpler
* cute
* 10% faster
* 5%
2024-06-14 11:36:05 -04:00
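Skipping numpy means flattening the python list yourself and packing the scalars into raw bytes a device buffer can take (float16 is the annoying one, since `memoryview` only gained half-float support in 3.12). A hedged stdlib-only sketch of the flatten-and-pack idea, not the commit's actual code path:
```
import struct

def flatten(x):
  # recursively flatten nested lists/tuples into a flat list of scalars
  if isinstance(x, (list, tuple)):
    return [s for item in x for s in flatten(item)]
  return [x]

data = [[1.0, 2.0], [3.0, 4.0]]
flat = flatten(data)
raw = struct.pack(f"{len(flat)}f", *flat)  # pack as float32
assert len(raw) == 4 * len(flat)
```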
chenyu
286b4dbdf2
compile raise CompileError and skip only RuntimeError in multiprocess beam (#4646)
* compile raise CompileError and skip only RuntimeError in multiprocess beam
renderer error with multiprocess should not be skipped by beam
* use `==` for dtype to dtype comparison
* that needs to be is
* typo
2024-05-19 00:25:25 -04:00
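The point of the custom exception is to separate a codegen bug from a candidate that merely fails on a backend: only the latter is safe for beam search to skip. A hedged sketch of that control flow (the function is illustrative; only the CompileError/RuntimeError split mirrors the commit):
```
class CompileError(Exception): pass

def try_candidate(compile_fn, src):
  try:
    return compile_fn(src)
  except CompileError:
    raise            # renderer/compiler bug: surface it, never skip
  except RuntimeError:
    return None      # candidate doesn't compile here: beam may skip it
```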
qazal
bf8f855838
assert kernel counts in unsupported fusions (#4643)
* replace with comments
* not relevant
* update comment
* custom exception maybe
* fix LoadOps.VIEW
2024-05-18 20:14:37 +03:00
qazal
f3f2b96583
pick schedule tests from external_test_opt (#4615)
* conv tests
* misc
* that shouldn't const fold
2024-05-16 15:43:41 +03:00
qazal
13200c6894
check simple_pads in all views (#4614)
2024-05-16 14:34:39 +03:00
qazal
0b464df605
base change scheduling spec (#4613)
* spec and kernel cnt
* dont use half
* skip half
2024-05-16 13:30:49 +03:00
George Hotz
ff64bcab69
move graph/search to engine (#4596)
2024-05-14 23:12:59 -07:00
George Hotz
fd02ab1e8b
move disassemblers and openpilot (#4592)
* move disassemblers and openpilot
* delete junk
* put that in pre-commit
* fixup readme
2024-05-14 19:30:02 -07:00
qazal
355e1c135c
pad fusion tests (#4570)
* what breaks
* Revert "what breaks"
This reverts commit e79f679283.
* simplest case
* one unsafe op
* expand+pad, shrink+pad
* safe case
* refactor
2024-05-14 20:34:46 +03:00
chenyu
7afca52796
replace pow in LAMB by tracking b1**t and b2**t per step (#4582)
* replace pow in LAMB by tracking b1**t and b2**t per step
* remove t, add [self.b1_t, self.b2_t] to return
* adam has one less kernel
2024-05-14 13:08:22 -04:00
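Computing b1**t fresh each step costs a pow; keeping a running product that gets one multiply per step is numerically the same, which is how adam ends up one kernel lighter. A hedged plain-python sketch of the equivalence (not the optimizer's tensor code):
```
b1, b2 = 0.9, 0.999
b1_t, b2_t = 1.0, 1.0
for t in range(1, 6):
  b1_t *= b1  # running product replaces b1**t
  b2_t *= b2
  assert abs(b1_t - b1**t) < 1e-12 and abs(b2_t - b2**t) < 1e-12
```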
George Hotz
17faae091b
optimizer shouldn't be run without training (#4460)
* optimizer shouldn't be run without training
* set training in relevant tests
* fix multitensor
* that too
2024-05-06 15:34:12 -07:00
qazal
6dbe5585b0
batchnorm + conv backward in test_schedule (#4420)
* test both optims
* batchnorm_backward
2024-05-06 16:40:17 +03:00