George Hotz
411392dfb7
move files into uop dir (#10399)
* move files into uop dir [pr]
* tinygrad.uop is a thing
* fix uop docs, no pr
* fix viz
2025-05-18 11:38:28 -07:00
qazal
04b23087d8
grouper tests from fuse_arange_default [pr] (#10394)
2025-05-18 18:42:43 +03:00
qazal
0294bfe507
simpler can_pad (#10364)
* simpler can_pad [pr]
* 3 kernels
* tests
* less kernels
2025-05-18 10:00:07 +03:00
qazal
e054b53a75
kernel count tests for pad [pr] (#10369)
* kernel count tests for pads
* handcoded rand one kernel
* comment
* prerealize device rng counter
* test_rand_handcoded generates /0
* remove track_rewrites
2025-05-17 17:20:46 +03:00
qazal
0a45cd0cbe
grouper: merge views in fuse elementwise (#10325)
* grouper: merge views in fuse elementwise
* with gradient api
2025-05-15 13:17:09 +03:00
qazal
89d8d5b25e
add dims check in FUSE_ARANGE (#10323)
2025-05-15 11:33:21 +03:00
qazal
8fad0f0124
grouper: check for unsafe PAD in FUSE (#10322)
2025-05-15 10:53:44 +03:00
qazal
d342f7688d
remove some skips in test_schedule + use assertRaisesRegex [pr] (#10296)
2025-05-14 14:54:07 +03:00
qazal
a2d6b0afe0
fix FUSE pushing through SHRINK (#10271)
2025-05-13 11:38:53 +03:00
qazal
b6904bbf83
Revert "split grouper into insert and finalize stages [pr] (#10222)" (#10224)
This reverts commit 2594e4db15.
2025-05-09 03:02:38 +03:00
qazal
2594e4db15
split grouper into insert and finalize stages [pr] (#10222)
2025-05-09 02:36:22 +03:00
qazal
1d0f239df7
use Tensor.train() in schedule test + typo [pr] (#10220)
2025-05-08 23:46:42 +03:00
George Hotz
8d4c563c01
all COPY can be clone (#10205)
* match old behavior
* simple
* it means the naive thing before the multi
* fix
2025-05-07 20:31:39 -07:00
qazal
94e07725a6
only reorder expand if it can fuse with input (#10186)
* failing test
* only reorder expand if it can fuse with input
* (16,) is reshaped to (4, 4)
2025-05-07 18:14:31 +08:00
qazal
62e86bc5ec
insert Ops.FUSE for arange (#10140)
* insert Ops.FUSE for arange
* reshape does not collapse
* do not fuse reshapes
* add children
* fixups
* work
* add Ops.WHERE support to z3
* fix fuse for cast
* diff
* ugh
* don't need this anymore
* contiguous
* add always_contiguous
* there too
2025-05-05 08:32:12 +03:00
George Hotz
36ccaa88a6
move merge views [pr] (#10156)
* move merge views [pr]
* move flow to __init__ [pr]
2025-05-04 14:41:47 -07:00
George Hotz
2ed3acd767
toposort is a function [pr] (#10004)
2025-04-23 16:25:03 +01:00
qazal
f4ec57baff
new schedule linearizer enqueues KERNEL UOps [pr] (#9993)
* new schedule linearizer enqueues kernels [pr]
* no defaultdict
* diff
* minor
2025-04-23 05:17:58 +08:00
qazal
6cb2d18c03
refactor schedule linearize to defaultdict [pr] (#9984)
* refactor schedule linearize to defaultdict [pr]
* skip that
* don't need .get
2025-04-23 00:00:23 +08:00
qazal
bbc324f5dc
remove CAST_AFTER_EXPAND (#9980)
2025-04-22 21:06:11 +08:00
qazal
7b55846e08
prep STORE UOp creation for multi output [pr] (#9975)
* prep STORE UOp creation for multi output [pr]
* test_multioutput_ast
2025-04-22 19:34:52 +08:00
qazal
1cf4e24ca5
fix kernelize usage with pm_gradient (#9953)
* fix kernelize usage with pm_gradient
* remove that
2025-04-22 17:26:05 +08:00
qazal
36ed3c3253
fix kernelize with VIEW children (#9961)
2025-04-21 23:38:46 +08:00
qazal
e8910540f6
Kernelize can be called multiple times on a Tensor (#9949)
* Kernelize can be called multiple times on a Tensor
* add (failing) test_kernelize_bw
2025-04-21 06:28:47 +08:00
qazal
e20ef7196a
Tensor.kernelize (#9845)
* add kernelize
* remove that
* kernelize returns self
* update abstractions2.py
* kernelize in test_schedule
* temp: assert BUFFER_VIEW's existence
* ASSIGN must have a buffer or subbuffer target
* assert and shrink
* fix
* padded setitem
* var
* toposort once
* extra
* base_buffer
* end with BUFFER_VIEW
* setitem for disk
* test_setitem_becomes_subbuffer
* mul slice test
* torch backend fix 1
* non-deterministic
* keep subbuffer
2025-04-20 20:53:49 +08:00
qazal
b58decac0c
fix diamond assigns before mapping tensors UOps to assigns (#9855)
* keep tensor_map until diamond assign fixup
* ctx
2025-04-18 14:17:43 +03:00
qazal
f13e9cf2d9
move view_left to grouper.py + tiny reorders [pr] (#9780)
* move view_left to grouper.py [pr]
* reorder grouper
* test_schedule
2025-04-08 15:39:28 +08:00
qazal
9963bb51e0
grouper tests cleanups [pr] (#9777)
* grouper tests cleanups [pr]
* viz
* tuple
* whitespace
2025-04-08 12:33:11 +08:00
qazal
891322fd51
split into grouper.py (#9768)
* split into grouper.py
* update tests
* reorder
2025-04-07 18:40:59 +08:00
qazal
ae688e4103
simple failing test for scheduling parallel reduce [pr] (#9501)
* simple failing test for scheduling parallel reduce [pr]
* atol
2025-03-19 10:52:13 +08:00
George Hotz
117b7a16ef
VALIDATE_WITH_CPU [pr] (#9488)
* VALIDATE_WITH_CPU [pr]
* fix test
2025-03-18 15:15:04 +08:00
qazal
e03c0aacf2
more explicit DONT_PUSH_VIEWS [pr] (#9479)
* more explicit DONT_PUSH_VIEWS [pr]
* update tests to not handcode ast
* lint
* test_recursive_swizzle and test_simple_store_reshape
2025-03-17 20:43:21 +08:00
qazal
3b00a778ba
fix view_left for unsafe pad ops [pr] (#9478)
2025-03-17 19:02:02 +08:00
qazal
813f713edc
merge_views for buffer ops + create valids last (#9472)
* merge_views for buffer ops + create valids last
* view.arg
* pass
2025-03-17 17:15:44 +08:00
qazal
bd1f71c1e2
simple failing test for extra ops in VALID [pr] (#9474)
* simple failing test for extra valids [pr]
* this has DEBUG=4
2025-03-17 17:02:40 +08:00
qazal
90ffa9bd45
swizzle without buffer ops try 2 [pr] (#9427)
* add DONT_PUSH_VIEWS to matchers
* swizzle without buffer ops try 2 [pr]
* swizzle reduceop
* simple failing test
* fix failing test
* s/on/for
2025-03-13 10:00:40 +01:00
qazal
59dfb234eb
replace hardcoded ast with tensors in TestSwizzle [pr] (#9401)
2025-03-10 19:33:57 +01:00
qazal
a1f41fadf6
test_schedule cleanups + add DONT_GROUP_REDUCES [pr] (#9392)
* test_schedule cleanups + add DONT_GROUP_REDUCES [pr]
* replace with test_swizzle_reduceop
* delete duplicate tests
* test_allow_push_permutes
* one kernel tests
2025-03-09 15:01:08 +01:00
qazal
286b480f82
do not replace assign with the offset buffer [pr] (#9387)
2025-03-08 11:57:44 +01:00
qazal
0d2762c010
prep refactor for adding buffer ops last [pr] (#9383)
* prep refactor for adding buffer ops last [pr]
* freeze buffers
* add swizzle_reduceop
* shape for reduceop_view_right
* simpler elementwise_view_right
* add shapetracker to const
* only const
* from process replay
2025-03-08 08:00:14 +01:00
qazal
23084fd850
merge merge_views and remove_movement_ops [pr] (#9333)
* merge merge_views and remove_movement_ops [pr]
* fix that assert
2025-03-03 12:38:59 +01:00
qazal
cdf66cc67f
test: recompute expanded CAST (#9286)
* those views should merge
* diff cleanup
* gpu
* put it behind CAST_AFTER_EXPAND
2025-02-27 19:22:17 +01:00
qazal
e162aa862d
is_realized only if buffer is allocated (#9253)
* is_realized only if the buffer is allocated
* fix the image check too
* assert test_lil_model after ExecItems run
2025-02-26 08:58:08 +01:00
George Hotz
3f4eb9006a
test for device mismatch [pr] (#9250)
* test for device mismatch [pr]
* fix bert
2025-02-26 13:06:33 +08:00
qazal
cbfe95d306
bring cast before view back (#9230)
* bring cast before view back
* tune it to only trigger on expands
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-25 01:50:39 +02:00
George Hotz
c9493e41a6
reorder expand (#9051)
* reorder expand
* symbolic ops needs resolve here
* s/arg/st + whitespace
* viz
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2025-02-24 13:55:47 +01:00
qazal
14aa2395d0
allow VIEW(BUFFER) in Tensor UOps [pr] (#9210)
* allow VIEW(BUFFER) in Tensor UOps [pr]
* still reshapes
* update becomes_map tests
* bring copy folder to the scheduler
* lint
* only sgd left
* optimizer assign
* 13 kernels
* rename to test_reorder_expand + assert VIEW
2025-02-24 13:06:15 +01:00
qazal
2eab8021fb
remove inputs+outputs attributes from ScheduleItem [pr] (#9192)
* remove inputs/outputs from ScheduleItem
* fix test_linearizer
* fix test_conv_shapetracker
* fix test_schedule + lint
* test_image_dtype + multitensor + search
2025-02-21 13:48:11 +01:00
chenyu
2e7c2780a9
CLANG -> CPU (#9189)
2025-02-20 18:03:09 -05:00
George Hotz
1bf66d62cf
symbolic gets its own file [pr] (#9132)
2025-02-17 18:55:21 +08:00