Commit Graph

303 Commits

Author SHA1 Message Date
qazal
ae688e4103 simple failing test for scheduling parallel reduce [pr] (#9501)
* simple failing test for scheduling parallel reduce [pr]

* atol
2025-03-19 10:52:13 +08:00
George Hotz
117b7a16ef VALIDATE_WITH_CPU [pr] (#9488)
* VALIDATE_WITH_CPU [pr]

* fix test
2025-03-18 15:15:04 +08:00
qazal
e03c0aacf2 more explicit DONT_PUSH_VIEWS [pr] (#9479)
* more explicit DONT_PUSH_VIEWS [pr]

* update tests to not handcode ast

* lint

* test_recursive_swizzle and test_simple_store_reshape
2025-03-17 20:43:21 +08:00
qazal
3b00a778ba fix view_left for unsafe pad ops [pr] (#9478) 2025-03-17 19:02:02 +08:00
qazal
813f713edc merge_views for buffer ops + create valids last (#9472)
* merge_views for buffer ops + create valids last

* view.arg

* pass
2025-03-17 17:15:44 +08:00
qazal
bd1f71c1e2 simple failing test for extra ops in VALID [pr] (#9474)
* simple failing test for extra valids [pr]

* this has DEBUG=4
2025-03-17 17:02:40 +08:00
qazal
90ffa9bd45 swizzle without buffer ops try 2 [pr] (#9427)
* add DONT_PUSH_VIEWS to matchers

* swizzle without buffer ops try 2 [pr]

* swizzle reduceop

* simple failing test

* fix failing test

* s/on/for
2025-03-13 10:00:40 +01:00
qazal
59dfb234eb replace hardcoded ast with tensors in TestSwizzle [pr] (#9401) 2025-03-10 19:33:57 +01:00
qazal
a1f41fadf6 test_schedule cleanups + add DONT_GROUP_REDUCES [pr] (#9392)
* test_schedule cleanups + add DONT_GROUP_REDUCES [pr]

* replace with test_swizzle_reduceop

* delete duplicate tests

* test_allow_push_permutes

* one kernel tests
2025-03-09 15:01:08 +01:00
qazal
286b480f82 do not replace assign with the offset buffer [pr] (#9387) 2025-03-08 11:57:44 +01:00
qazal
0d2762c010 prep refactor for adding buffer ops last [pr] (#9383)
* prep refactor for adding buffer ops last [pr]

* freeze buffers

* add swizzle_reduceop

* shape for reduceop_view_right

* simpler elementwise_view_right

* add shapetracker to const

* only const

* from process replay
2025-03-08 08:00:14 +01:00
qazal
23084fd850 merge merge_views and remove_movement_ops [pr] (#9333)
* merge merge_views and remove_movement_ops [pr]

* fix that assert
2025-03-03 12:38:59 +01:00
qazal
cdf66cc67f test: recompute expanded CAST (#9286)
* those views should merge

* diff cleanup

* gpu

* put it behind CAST_AFTER_EXPAND
2025-02-27 19:22:17 +01:00
qazal
e162aa862d is_realized only if buffer is allocated (#9253)
* is_realized only if the buffer is allocated

* fix the image check too

* assert test_lil_model after ExecItems run
2025-02-26 08:58:08 +01:00
George Hotz
3f4eb9006a test for device mismatch [pr] (#9250)
* test for device mismatch [pr]

* fix bert
2025-02-26 13:06:33 +08:00
qazal
cbfe95d306 bring cast before view back (#9230)
* bring cast before view back

* tune it to only trigger on expands

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-25 01:50:39 +02:00
George Hotz
c9493e41a6 reorder expand (#9051)
* reorder expand

* symbolic ops needs resolve here

* s/arg/st + whitespace

* viz

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2025-02-24 13:55:47 +01:00
qazal
14aa2395d0 allow VIEW(BUFFER) in Tensor UOps [pr] (#9210)
* allow VIEW(BUFFER) in Tensor UOps [pr]

* still reshapes

* update becomes_map tests

* bring copy folder to the scheduler

* lint

* only sgd left

* optimizer assign

* 13 kernels

* rename to test_reorder_expand + assert VIEW
2025-02-24 13:06:15 +01:00
qazal
2eab8021fb remove inputs+outputs attributes from ScheduleItem [pr] (#9192)
* remove inputs/outputs from ScheduleItem

* fix test_linearizer

* fix test_conv_shapetracker

* fix test_schedule + lint

* test_image_dtype + multitensor + search
2025-02-21 13:48:11 +01:00
chenyu
2e7c2780a9 CLANG -> CPU (#9189) 2025-02-20 18:03:09 -05:00
George Hotz
1bf66d62cf symbolic gets its own file [pr] (#9132) 2025-02-17 18:55:21 +08:00
qazal
2b9ce1235a simple failing case for reorder expand + keep views in tensor_map [pr] (#9057) 2025-02-13 11:22:55 +01:00
Ahmed Harmouche
916d5e7f08 WebGPU f16 support (f16 bounty part 2) (#8653)
* WebGPU f16 support

* Don't enable f16 yet

* dtype tests passing after bitcast fix

* Maybe all WebGPU green?

* Require shader-f16 in examples

* Minor wgsl touchup

* 1 line shorter

* Simpler

* Add transcendental support

* log2 nan location mismatch on Vulkan

* Nan skips
2025-02-12 19:46:53 +08:00
qazal
cd77e51810 fix tensor realization bug in #8975 (#8984)
* fix tensor realization bug in #8975

* that's a reshape now

* work

* works

* give those tests better names

* test when multiple mops result in the same ShapeTracker

* test_become_existing_buf_complex is enough

* that too
2025-02-10 13:51:30 +01:00
qazal
fd9f9ec772 realized base tensors become RESHAPE(BUFFER) [pr] (#8994) 2025-02-10 10:17:54 +01:00
qazal
7eba5fb413 Tensor.empty is RESHAPE(BUFFER) (#8987)
* empty is RESHAPE(BUFFER)

* eh

* add test_empty_buf

* can we unsupport this

* linter

* Revert "can we unsupport this"

This reverts commit 0f71e1aadb.
2025-02-09 18:42:51 +01:00
qazal
55351ebb31 minimal failing test for #8975 [pr] (#8982) 2025-02-09 14:10:37 +01:00
chenyu
cfd28517df move pow folding tests to test_schedule [pr] (#8955)
not really belongs to test_const_folding
2025-02-07 12:51:43 -05:00
chenyu
488200f16c move more pow const to rewrite (#8916)
* move more pow const to rewrite

one less use of _to_const_val

* fix
2025-02-05 20:30:12 -05:00
qazal
af4f9d1aa9 use matchers to verify AST shape [pr] (#8828)
* use matchers to verify kernel AST [pr]

* work

* use swizzle_cnt

* add comment

* imports

* modified_ast comment

* brief
2025-01-31 09:17:42 +02:00
George Hotz
643c09a6c6 tensor uop spec should be in spec.py [pr] (#8827)
* tensor uop spec should be in spec.py [pr]

* err, spec.py

* print uops can stay
2025-01-31 13:54:04 +08:00
qazal
a78f0f85d3 remove support for checking tensor uops in FUSE_ARANGE [pr] (#8829) 2025-01-31 07:48:28 +02:00
qazal
1fce864a6d delete multi output support (#8822)
* delete multioutput for now

* test_schedule

* test_assign too

* linter

* 515 for sd

* update tests and ctx

* update that assign check
2025-01-30 22:45:50 -05:00
qazal
530961f7d5 realized only exists on base (#8815)
* realized only exists on base [pr]

* shorter

* update that too
2025-01-30 23:02:25 +02:00
qazal
5643429c17 give BUFFER UOp a ShapeTracker [pr] (#8811)
* give BUFFER UOp a ShapeTracker [pr]

* move that

* update contiguous

* test_advancedindex should use movement ops
2025-01-30 22:33:32 +02:00
qazal
ba17786068 do not construct unmasked VALID (#8759)
* new lines that exist in codegen/ops

* update tests

* update sops.gz (13071 -> 13070 asts)

* fix viz too

* remove that TODO

* diff pruning

* mask assert + device

* work

* diff pruning

* re: fix viz too

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-01-28 20:51:21 +02:00
qazal
3417bc1814 fix ShapeTracker spec for const [pr] (#8791) 2025-01-28 19:53:36 +02:00
George Hotz
96bff0b4f7 contiguous is no longer needed in SGD [pr] (#8760)
* contiguous is no longer needed in SGD [pr]

* add allow condition
2025-01-27 15:19:11 +09:00
qazal
ac70f63d4b tensor_map cleanups [pr] (#8754)
* tensor_map cleanups [pr]

* update test_schedule too
2025-01-26 11:41:54 +02:00
George Hotz
b4bf6a7dea switch backward to use gradient [pr] (#8235)
* switch backward to use gradient [pr]

* set device correctly, dedup

* why does that fail?

* add noop cast

* simple backward

* fix beautiful_mnist

* touchups

* set in compute_gradient

* uop_count

* uop_count was wrong

* collections

* no note

* skip that test

* update sched kernel counts

* train mnist is 65

* fix metadata and gc

* fixes

* materialize_grads

* no pathlib stuff

* add contiguous_backward, fix bugs

* add some realize

* fix multi
2025-01-26 09:12:16 +09:00
qazal
8e5bd0cd7a fix buffer init and skip test_swizzle_failure_permute [pr] (#8732)
* fix buffer init and skip test_swizzle_failure_permute [pr]

* replace preload with just load

* add
2025-01-23 17:21:38 +02:00
qazal
07ec99001a keep VIEW in big_sink + copy of buffer view spec [pr] (#8727)
* keep views in sink [pr]

* tests

* things from the gpt2 bug
2025-01-23 11:29:30 +02:00
qazal
e3d1464ba4 move assign preload out of schedule item [pr] (#8710)
* move assign preload out of schedule item [pr]

* fix that
2025-01-22 12:43:57 +02:00
qazal
d6bf1feaab remove the "no copy" line from copy_to_device (#8702)
* delete the no copy one

* add tests
2025-01-21 17:09:33 +02:00
qazal
f0d424ecdf Tensor UOps can become a buffer or const after scheduling (#8698)
* spec

* work

* update test_viewed_consts_do_not_realize

* remove
2025-01-21 12:33:19 +02:00
qazal
e2008c98c3 allow symbolic shape in tensor const parents [pr] (#8699) 2025-01-21 12:01:25 +02:00
qazal
66ac0087e8 more high level contiguous tests + scheduler deletions [pr] (#8695)
* delete those

* move the upat too

* rename ops_folding to just sym

* keep that
2025-01-21 01:52:58 +02:00
qazal
08eb1f1f56 simplify tensors before scheduling [pr] (#8580)
* delete forced_realize

* put that back

* work

* remove forced_realize

* expectedFailures

* contiguous(buffer)

* multi

* expectedFailures

* cleaner create_subbuffer

* more comments

* remove that

* note

* realizes

* work

* one upat and image is back

* remove

* cleaner

* fix test_complex_backward for now

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2025-01-20 23:42:42 +02:00
chenyu
679b1ad058 move softmax upcast to after subtracting max (#8684)
* move softmax upcast to after subtracting max

max can always be done in the same dtype without any numerical loss, so this is better when explicitly upcasting in softmax

* skipUnless half
2025-01-20 12:16:32 -05:00
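The commit above argues that the max can be computed and subtracted in the input dtype without numerical loss, so the upcast only needs to happen before the exp/sum. A minimal NumPy sketch of that idea (not tinygrad's implementation; `softmax_upcast_after_max` is a hypothetical name for illustration):

```python
import numpy as np

def softmax_upcast_after_max(x: np.ndarray) -> np.ndarray:
    # The max of a tensor is exact in any dtype (it just selects an element),
    # so subtracting it in the input dtype (e.g. float16) loses no precision.
    shifted = x - x.max(axis=-1, keepdims=True)
    # Upcast only afterwards, where low precision would actually hurt:
    # exp and the reduction sum accumulate rounding error in float16.
    shifted32 = shifted.astype(np.float32)
    e = np.exp(shifted32)
    return e / e.sum(axis=-1, keepdims=True)

x = np.array([1.0, 2.0, 3.0], dtype=np.float16)
out = softmax_upcast_after_max(x)
```

Since `x - max(x)` is always <= 0, the exponentials stay in [0, 1] and never overflow, even when the subtraction itself runs in half precision.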
qazal
9e55495b4d fold double contiguous [pr] (#8687) 2025-01-20 14:38:33 +02:00