Commit Graph

3451 Commits

nimlgen
9bd13de44c lower test_gemv_4096_16384 to 750 for red (#9367) 2025-03-05 22:44:48 +03:00
uuuvn
b75f307234 amd: autogen ip bases (#9360) 2025-03-05 22:30:38 +03:00
chenyu
2cb2fce8d9 lower test_gemm_8192 amd_tflops to 65 (#9364) 2025-03-05 14:06:11 -05:00
nimlgen
14c88abf27 add some options to allreduce bench (#9348) 2025-03-04 23:46:36 +03:00
Anish Umale
bafa40fe12 Tiny backend test_ops fix part1 (#9338)
* extract name methods from https://github.com/tinygrad/tinygrad/pull/9302

* t.grad.numpy() -> t.grad.cpu().numpy()

* revert TORCH_DEBUG change

* revert dtype change in aten.sum
2025-03-03 12:36:51 -05:00
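The `.numpy()` -> `.cpu().numpy()` change follows torch's rule that only CPU tensors convert to numpy. A minimal sketch of the pattern (run on CPU here, where both forms work; on a non-CPU backend only the second does):

```python
import torch

t = torch.ones(3, requires_grad=True)
t.sum().backward()
grad = t.grad.cpu().numpy()  # t.grad.numpy() fails for tensors on a non-CPU device
```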
George Hotz
0d4ba7dd87 import tinygrad.frontend.torch (#9337)
* import tinygrad.frontend.torch

* type ignore
2025-03-04 00:15:29 +08:00
qazal
23084fd850 merge merge_views and remove_movement_ops [pr] (#9333)
* merge merge_views and remove_movement_ops [pr]

* fix that assert
2025-03-03 12:38:59 +01:00
George Hotz
ece0a0f305 use empty for test instead of rand (#9332) 2025-03-03 16:19:06 +08:00
George Hotz
2cc4cb74f0 reorder binops (#9328)
* reorder binops

* test improvements + fix string tests

* ugh, okay this
2025-03-03 14:58:18 +08:00
chenyu
146eb73790 fix Tensor.view with a tuple arg (#9330) 2025-03-02 23:35:23 -05:00
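A sketch of the fixed calling convention, assuming view mirrors reshape in accepting both splat and tuple shapes:

```python
from tinygrad import Tensor

t = Tensor.arange(6)
assert t.view(2, 3).shape == (2, 3)    # splat args already worked
assert t.view((2, 3)).shape == (2, 3)  # tuple arg, the case this fix covers
```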
chenyu
ba4b8c2c23 Tensor.copysign (#9329) 2025-03-02 21:33:49 -05:00
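A minimal sketch, assuming copysign follows torch/numpy semantics (magnitude of the first argument, sign of the second):

```python
from tinygrad import Tensor

a = Tensor([1.0, -2.0, 3.0])
b = Tensor([-1.0, 5.0, -2.0])
print(a.copysign(b).numpy())  # expected: [-1.  2. -3.]
```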
nimlgen
8cae00833c flaky test in ci (#9321) 2025-03-02 16:27:22 +03:00
Ali Ladjevardi
00028e87bb Failing test for not realizing intermediate expand in multi-GPU (#9320) 2025-03-02 12:54:48 +01:00
George Hotz
ba97fd0b9c hotfix: add test/external/external_benchmark_disk_raw 2025-03-02 02:32:15 +00:00
chenyu
cc2bbb0bf1 Tensor.isfinite (#9316) 2025-03-01 19:58:56 -05:00
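A minimal sketch, assuming the usual elementwise semantics (False for inf, -inf, and nan):

```python
from tinygrad import Tensor

t = Tensor([1.0, float("inf"), float("-inf"), float("nan")])
print(t.isfinite().numpy())  # expected: [ True False False False]
```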
geohotstan
d9ec05cea6 Test Onnx quantization behavior (#9301)
* add DynamicDequantizeLinear and corresponding tests

* wow, qlinearops round away from zero

* this passes locally...

* again

* try

* try separate test

* round to even again

* also add QLinearMul

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-01 19:21:58 -05:00
chenyu
fe0f860209 update test_ops for tensors from torch (#9308)
a few detach().numpy() -> detach().cpu().numpy()
2025-02-28 15:57:25 -05:00
chenyu
38d7aae3b7 onnx fmod (#9307) 2025-02-28 14:09:22 -05:00
chenyu
7c7db78feb support float mod (#9306)
also added a spec check that Ops.MOD is ints-only
2025-02-28 13:33:58 -05:00
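A minimal sketch of float mod at the Tensor level; since the low-level Ops.MOD spec stays int-only, floats presumably lower through other ops. Result values assume Python-style modulo (sign of the divisor):

```python
from tinygrad import Tensor

a = Tensor([5.5, -5.5])
print((a % 2.0).numpy())  # Python-style mod gives [1.5 0.5]
```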
chenyu
90808e2dd0 div rounding_mode (#9304) 2025-02-28 11:38:25 -05:00
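A minimal sketch, assuming the torch-style "trunc"/"floor" rounding modes:

```python
from tinygrad import Tensor

a, b = Tensor([7, -7]), Tensor([2, 2])
print(a.div(b).numpy())                         # true division, promotes to float
print(a.div(b, rounding_mode="trunc").numpy())  # round toward zero: [ 3 -3]
print(a.div(b, rounding_mode="floor").numpy())  # round toward -inf: [ 3 -4]
```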
chenyu
3ae66e59a3 least_upper_float is at least default_float (#9303)
* least_upper_float is at least default_float

en route to div rounding_mode: the dtype of true int division changes from int32 to default_float, matching torch.

* fix bert acc
2025-02-28 10:41:56 -05:00
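A sketch of the dtype change described above (assuming default_float is float32):

```python
from tinygrad import Tensor, dtypes

a = Tensor([1, 3], dtype=dtypes.int32)
b = Tensor([2, 2], dtype=dtypes.int32)
print((a / b).dtype)    # true int division now promotes: dtypes.float32
print((a / b).numpy())  # expected: [0.5 1.5]
```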
Eitan Turok
d657d5f754 [Bounty] Vectorize Transcendental (#9058)
* init

* cast everything right

* more casting

* install pillow in test

* quick tests

* simplify

* quick tests

* delete test

* tests

* fix import error

* add vec to ldexp3k

* vec for bitcast

* some helper tests

* high level tests

* clean tests

* change tolerance so cuda passes

* ruff passes

* remove tests for transcendental helpers

* ruff passes

* make exponent in power vectorized

* fix pow test

* add newline

* add vec dtype to ilogb2k

* comment + clean up

* ruff

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-28 15:47:25 +08:00
qazal
cdf66cc67f test: recompute expanded CAST (#9286)
* those views should merge

* diff cleanup

* gpu

* put it behind CAST_AFTER_EXPAND
2025-02-27 19:22:17 +01:00
chenyu
4342300eff lower test_gemm_8192 amd to 70 (#9277)
flaky
2025-02-26 16:32:08 -05:00
Francis Lata
86b737a120 leakyrelu to leaky_relu (#9270) 2025-02-26 13:22:08 -05:00
chenyu
cd822bbe11 hotfix torch_grad.detach().cpu().numpy() in test_ops (#9268) 2025-02-26 12:27:35 -05:00
chenyu
49ca90df75 update test_ops backward tests (#9267)
instead of `(out+1).square().mean().backward()`, use forward.sum().gradient to check the gradients more directly
2025-02-26 12:09:24 -05:00
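A sketch of the new pattern, assuming Tensor.gradient takes the tensors to differentiate with respect to and returns their gradients:

```python
from tinygrad import Tensor

x = Tensor([1.0, 2.0, 3.0], requires_grad=True)
out = x * x
(gx,) = out.sum().gradient(x)  # gradient of sum(out) w.r.t. x
print(gx.numpy())  # d/dx x^2 = 2x -> expected [2. 4. 6.]
```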
chenyu
aaf0a8069f xor -> bitwise_xor (#9264) 2025-02-26 10:21:14 -05:00
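The rename in use (values illustrative; the `^` operator form is assumed unchanged):

```python
from tinygrad import Tensor

a, b = Tensor([0, 1, 1, 0]), Tensor([0, 0, 1, 1])
print(a.bitwise_xor(b).numpy())  # formerly a.xor(b); expected [0 1 0 1]
print((a ^ b).numpy())           # operator form
```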
qazal
e162aa862d is_realized only if buffer is allocated (#9253)
* is_realized only if the buffer is allocated

* fix the image check too

* assert test_lil_model after ExecItems run
2025-02-26 08:58:08 +01:00
George Hotz
3f4eb9006a test for device mismatch [pr] (#9250)
* test for device mismatch [pr]

* fix bert
2025-02-26 13:06:33 +08:00
Sieds Lykles
9c4d9d9f10 Acc first (#9232)
* put acc in front of the add chain

* handle the other case

* Make loop collapse more generic

* Remove mulacc_unrolled

* Actually remove it

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-25 22:10:15 -05:00
nimlgen
70db8c3003 hcq: dyn alloc signals (#9238)
* hcq: dyn alloc signals

* types and unique devs

* typing

* mypy

* mypy one more time

* test

* make fds not intersect between drivers in mockgpu
2025-02-25 17:22:24 +03:00
nimlgen
b4c3780df0 hotfix: interop example (#9237)
* hotfix: interop example

* rm this

* fix

* fix ci mps

* atol rtol

* no uaf
2025-02-25 10:32:00 +03:00
Sieds Lykles
990c240b82 Stable pow gradient (#9226)
* Stable gradient

* More efficient

* Fix and test for +-inf

* cleaner

* skip webgpu test

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-24 20:54:26 -05:00
qazal
cbfe95d306 bring cast before view back (#9230)
* bring cast before view back

* tune it to only trigger on expands

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-25 01:50:39 +02:00
chenyu
90c3ed17c5 move cast to before softmax in attention (#9213)
* move cast to before softmax in attention

saves some memory because exp (which is used for backward) is done in half. training bert seems fine and can now fit BS=78 (up from 66)

* test
2025-02-24 17:24:59 -05:00
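Roughly the reordering being described, as a hypothetical sketch (shapes illustrative, not the actual attention code):

```python
from tinygrad import Tensor, dtypes

qk = Tensor.rand(4, 64, 64)  # attention scores
old = qk.softmax(-1).cast(dtypes.half)  # exp saved for backward in float32
new = qk.cast(dtypes.half).softmax(-1)  # exp saved for backward in half
```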
geohotstan
f0b24d230c add test_onnx_ops.py (#8569)
* boom

* fix webgpu

* use exact variable names in test so that AI can read it more easily

* add tag for specific test name like test a specific dtype

* fix ruff

* astype everything

* dtype in array creation

* just arange

* is 67% considered fixed?

* move test up

* small cleanups

* share function

* add qgemm as well

* add qgemm too

* make sure qgemm comes out as int

* take out qgemm for now

* fixed test

* add correct qgemm

* addressing feedback here too, early naive fix for now

* simplify bias and c to be minimalistic enough to test correctness

* refactored qlinearops

* maybe these asserts aren't the best..

* fix test

* updated tests to cover new ops

* try to add to CI

* move test_onnx_ops into testextra/

* more attention tests

* qlinear_add atol=1

* attention still not fullllllly correct

* it is what it is

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-24 16:15:22 -05:00
George Hotz
c9493e41a6 reorder expand (#9051)
* reorder expand

* symbolic ops needs resolve here

* s/arg/st + whitespace

* viz

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2025-02-24 13:55:47 +01:00
qazal
14aa2395d0 allow VIEW(BUFFER) in Tensor UOps [pr] (#9210)
* allow VIEW(BUFFER) in Tensor UOps [pr]

* still reshapes

* update becomes_map tests

* bring copy folder to the scheduler

* lint

* only sgd left

* optimizer assign

* 13 kernels

* rename to test_reorder_expand + assert VIEW
2025-02-24 13:06:15 +01:00
qazal
d12efc95d4 support custom name function in viz [pr] (#9219)
* support custom name function in viz [pr]

* title case

* assert name count in test_track_rewrites_name_fxn
2025-02-24 03:03:25 +02:00
chenyu
b3ae664d5d fix gradient of pow(t, int) (#9217)
semi-revert some pow logic back to tensor. added a direct gradient check because the backward in test_ops passed by luck
2025-02-23 17:42:09 -05:00
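A direct gradient check of the kind the commit adds, sketched with Tensor.gradient (values illustrative):

```python
from tinygrad import Tensor

x = Tensor([2.0, 3.0], requires_grad=True)
(gx,) = (x ** 3).sum().gradient(x)  # integer exponent, the fixed case
print(gx.numpy())  # d/dx x^3 = 3x^2 -> expected [12. 27.]
```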
qazal
9db0ec46a7 simpler buf_uop [pr] (#9215)
* simpler buf_uop [pr]

* assert after realize it's buffer
2025-02-23 19:23:14 +01:00
qazal
81a71ae0f6 hotfix: skip test_exclude_const_metadata (#9208) 2025-02-22 23:26:04 +02:00
qazal
4578c3e8fd simpler tensor metadata mapping + tests [pr] (#9203)
* simpler tensor metadata mapping + tests [pr]

* remove kernel metadata

* don't map nones
2025-02-22 20:18:46 +01:00
George Hotz
4e6665bda5 different way to write torch backend (#9197)
* different way to write torch backend

* both backends

* more work

* simpler code

* more work

* test both

* imply unwrap/wrap

* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works

* ready to start making test_ops work in torch backend

* backward pass, TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works

* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_simple_conv2d works

* matmul backward is broken with as_strided
2025-02-22 14:42:26 +08:00
qazal
2eab8021fb remove inputs+outputs attributes from ScheduleItem [pr] (#9192)
* remove inputs/outputs from ScheduleItem

* fix test_linearizer

* fix test_conv_shapetracker

* fix test_schedule + lint

* test_image_dtype + multitensor + search
2025-02-21 13:48:11 +01:00
chenyu
2e7c2780a9 CLANG -> CPU (#9189) 2025-02-20 18:03:09 -05:00
chenyu
3e22747799 run unit test on windows ci (#9187)
* factor out testing_minimal in setup.py [pr]

* testing_unit + windows
2025-02-20 14:40:41 -05:00
chenyu
287de4ecc6 use torch in test_gradient (#9186)
used torch.autograd.grad, but not sure if it can be made a template the way jax's grad can
2025-02-20 12:26:11 -05:00
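torch.autograd.grad as a reference for gradient values looks roughly like:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
(gx,) = torch.autograd.grad((x * x).sum(), [x])
print(gx)  # tensor([2., 4., 6.])
```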
George Hotz
caee42e8a6 Revert "name from uops [pr] (#9151)" (#9154)
This reverts commit 28897be9a2.
2025-02-18 16:06:44 +08:00