* move cast to before softmax in attention
saved some memory because exp (whose output is kept for the backward pass) is now done in half precision. training BERT seems fine and can fit BS=78 now (from 66). see the attention sketch after this list
* test
* boom
* fix webgpu
* use exact variable names in tests so that AI can read them more easily
* add tags for specific test names, e.g. to test a specific dtype
* fix ruff
* astype everything
* dtype in array creation
* just arange
* is 67% considered fixed?
* move test up
* small cleanups
* share function
* add qgemm as well
* add qgemm too
* make sure qgemm comes out as int
* take out qgemm for now
* fixed test
* add correct qgemm
* addressing feedback here too, early naive fix for now
* simplify bias and c to be minimal while still testing correctness
* refactored qlinearops
* maybe these asserts aren't the best...
* fix test
* updated tests to cover new ops
* try to add to CI
* move test_onnx_ops into testextra/
* more attention tests
* qlinear_add needs atol=1 (integer outputs can be off by one after rounding; see the requantize sketch after this list)
* attention is still not fully correct
* it is what it is
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
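
A minimal sketch of the reordering in the first commit (the function and scaling here are hypothetical, not the actual attention code): casting the logits to half before softmax means the exp output that autograd saves for backward is stored in half precision, which is where the memory saving comes from.

```python
from tinygrad import Tensor, dtypes

def attention_scores(q: Tensor, k: Tensor) -> Tensor:
  scores = q.matmul(k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)
  # cast BEFORE softmax: exp's output (kept for the backward pass) is now
  # half instead of float32, roughly halving that activation's footprint
  return scores.cast(dtypes.half).softmax(-1)
```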
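The qlinear ops follow the usual dequantize → compute → requantize pattern. A hypothetical NumPy sketch of QLinearAdd-style math (the names and the uint8 range are assumptions), which also shows why integer outputs can be off by one and need atol=1:

```python
import numpy as np

def qlinear_add(a, a_scale, a_zp, b, b_scale, b_zp, c_scale, c_zp):
  # dequantize both inputs to float, add, then requantize to uint8;
  # the final round() is where off-by-one differences vs a reference appear
  x = (a.astype(np.float32) - a_zp) * a_scale + (b.astype(np.float32) - b_zp) * b_scale
  return np.clip(np.round(x / c_scale) + c_zp, 0, 255).astype(np.uint8)
```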
* different way to write torch backend
* both backends
* more work
* simpler code
* more work
* test both
* imply unwrap/wrap
* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works
* ready to start making test_ops work in torch backend
* backward pass, TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works
* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_simple_conv2d works
* matmul backward is broken with as_strided
* add `Tensor.isclose()`
* support `equal_nan`
so as to match PyTorch's behavior (see the sketch after this list)
* update unit tests
* remove some tests temporarily
* re-enable one test
* re-enable other test
* try to fix failing tests during CI
* save one line of code
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
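
A sketch of the `Tensor.isclose()` semantics being matched (this is the PyTorch formula, not necessarily the exact tinygrad implementation):

```python
from tinygrad import Tensor

def isclose(a: Tensor, b: Tensor, rtol=1e-5, atol=1e-8, equal_nan=False) -> Tensor:
  # PyTorch's rule: |a - b| <= atol + rtol * |b|; the a == b term also
  # catches matching infinities, where the subtraction would give NaN
  close = (a == b) | ((a - b).abs() <= atol + rtol * b.abs())
  # equal_nan=True additionally treats NaN as equal to NaN
  if equal_nan: close = close | ((a != a) & (b != b))  # NaN is the only value != itself
  return close
```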
* Prevent const folding in test_payne_hanek_reduction
* Do not use a list as a default parameter (see the sketch after this list)
* Bitcast constant folding
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
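
The default-parameter commit fixes the classic Python pitfall; a minimal illustration (names are hypothetical):

```python
# the pitfall: a mutable default is created once, at function definition
# time, and shared across every call
def bad(x, acc=[]):
  acc.append(x)
  return acc

bad(1)  # [1]
bad(2)  # [1, 2]  -- the list leaked state between calls

# the usual fix: default to None and allocate inside the body
def good(x, acc=None):
  acc = [] if acc is None else acc
  acc.append(x)
  return acc
```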
* Solve "dims too large" errors on WebGPU (see the divisor sketch after this list)
* Simplify the divisor search
* Test square root divisor
* Fix lint
* Refactor into group_dims and split_dims
* Refactor
* Fix lint
* Add back max check in _group_dims
* Prefer grouping over split
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
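
A hypothetical sketch of the split idea (not the actual `group_dims`/`split_dims` code): a dispatch dimension over the backend limit, commonly 65535 per dimension on WebGPU, is factored into two smaller dimensions, searching for a divisor near the square root first:

```python
def split_dim(n: int, limit: int) -> tuple[int, int]:
  # search downward from ~sqrt(n) for a divisor so both factors fit
  for d in range(int(n ** 0.5), 0, -1):
    if n % d == 0 and n // d <= limit:
      return (n // d, d)
  raise ValueError(f"cannot split {n} under limit {limit}")

print(split_dim(100000, 65535))  # -> (400, 250)
```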
* Make logcumsumexp numerically stable (see the sketch after this list)
* Refactor
* Refactor for special case ndim=0
* Refactor
* Use the correct device for mask
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
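
A sketch of one standard stabilization (not necessarily the committed implementation): subtract the max along the axis before exponentiating, so exp cannot overflow, then add it back after the log:

```python
from tinygrad import Tensor

def logcumsumexp(x: Tensor, axis: int = -1) -> Tensor:
  # logcumsumexp(x)_i = m + log(sum_{j<=i} exp(x_j - m)) with m = max(x);
  # x - m <= 0, so exp stays in (0, 1] and never overflows
  m = x.max(axis=axis, keepdim=True)
  return (x - m).exp().cumsum(axis=axis).log() + m
```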
* pytorch scatter -> scatter_reduce (see the usage sketch after this list)
* WIP scatter_reduce implementation
* _pre_scatter return type hint
* split out src, mask to satisfy linter
* Add src cast back in
* dict of lambdas instead of ifs
* sum and prod reduction ops with include_self
* add reduce arg error message
* add amax and amin reduction ops
* Fix include_self for higher dims
* Simplify
* Simplify amax and amin too
* Pull include_self logic out into _inv_mask function
* reduce arg cannot be None for scatter_reduce
* Fix self-mask issue
* Add mean reduce op
* Add tests
* any() not needed here
* remove comment
* Drop support for Tensor src with the reduce arg in tinygrad scatter
* Process index, dim inside actual functions
* Add scatter_reduce to onnx
* Add excluded onnx ScatterElements reduction tests back in
* Save 2 lines on the mask helpers
* Update docs
* Add include_self=False tests
* cleanup
* Remove unneeded helper function
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
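
A usage sketch assuming PyTorch-matching semantics (the values here are illustrative):

```python
from tinygrad import Tensor

t     = Tensor([1.0, 2.0, 3.0, 4.0])
src   = Tensor([10.0, 20.0, 30.0])
index = Tensor([0, 0, 2])

# src[0] and src[1] both land at position 0, src[2] at position 2;
# include_self=True (the default) also counts the original values of t
print(t.scatter_reduce(0, index, src, reduce="sum").numpy())
# -> [31.  2. 33.  4.]
```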