tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-21 04:47:56 -05:00

Author	SHA1	Message	Date
geohotstan	cea5853cfa	add Tensor.scatter (#7737 ) * working I think * where are my onnx scatter tests?? * forward_only for now * try if nan hack fix NV * looks like issue is different... CUDA WHY * oops that was wrong. Try if this fixes CUDA * simpler multiply * actually finish this up tmrw morning :x * fix tests? * improve tests * improve test and implementation * fix ruff * complete but lots of expected failure... * reviewed tests * add onnx tests * is this a processing op? * add return type to indicate that it's not in-place * final cleanups * use or and improve tests a little * add masked_index_select * call it masked_setitem instead * try * FIXED --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-11-27 10:52:04 -05:00
geohotstan	753f07e193	add circular pad mode to Tensor.pad (#7918 ) * start * send it * no more neg circular pads * quick fix onnx too --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-11-27 10:30:51 -05:00
Ahmed Harmouche	10618aba98	Bring back WebGPU (#7063 ) * Start from andredaprato:webgpu-clean * Fix infs * inf wgsl function is not needed * Emulated ulong for threefry, more tests passing * Randomness tests passing * Update model export to support new changes in webgpu, efficientnet export works again * Simplify shift emulation in wgsl * Delete test file * Fix bigger than u32 u32 literal * Why was skip copies added here? * Python3.12 for webgpu tests * Fix model export syntax error * Get test ops passing with some skips * Fix lint * Much simpler shift * Run more tests * Timestamp queries are not supported in CI, so skip search tests * All fancy indexing passing * r is ctx * Run more dtype tests by using is_dtype_supported * Cleanup ulong shift rendering * UPat -> Pat, UOps -> Ops * Pat -> UPat * Refactor render_ushift if-else * Pattern to avoid ulong mul * Remove vals_dtype * is_nan trick + rewrite, test_isnan passing * Rewrite a * select(1, nan, gate) -> select(a, nan, gate) * No arg, just op * Support char, uchar, short, ushort * Run test_index_mnis now that we have uint8 * Fix pyling * Save 3 lines by using base Compiler * No more long emulation * Remove fixup_binops * No more external_local_bufx wgsl specific cstyle modif, use base extra_pm * Simpler, faster copyin/out * Skip some new tests that use long * Fix typo * copyout touchup * Save lines by using render_cast * WebGL is not supported in core, delete it from is_dtype_supported * More narrow test skips for some unary tests * TernaryOps, UnaryOps -> Ops * TinyGrad supports WebGPU * StableDiffusion demo: f16tof32 gpu is a lib, update UI * Packed load/store, no more scale_size, no core tinygrad changes * Rename copyin, copyout * Device -> dev * Fix lint * Pattern matcher rule for packed load/store * Refactor * Shorter packed load/store * this should fix lint * Fix mypy * SD compile script working * New SD webgpu UI * New default prompt * New SD weights * Fix title when webgpu not available * Run symbolic tests, simplify is_nan, use round_up * Show step time on UI * Bump minimum wgpu version to v0.19 * Fix latent --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-11-26 12:26:40 +08:00
chenyu	3b26e51fce	Tensor.cummax (#7854 ) generalized the existing cumsum and take Ops.MAX in addition to Ops.ADD	2024-11-22 15:55:02 -05:00
geohotstan	cf1ec90ad4	add inverse trig functions to Tensor (#7805 ) * implement inverse trig functions * guess we should still test nans? * magnitude as variable name :D * reorder onnx_ops ops * approximation -> x for consistency * address feedback * simpler acos * improvement? * actually just have asin depend on atan * actually this is nicer * remove a comment --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-11-21 09:13:36 -05:00
geohotstan	66a069ee25	add replicate mode to Tensor.pad (#7802 ) * base implementation * add tests * actually remove the assertionerror test * good	2024-11-20 08:39:58 -05:00
geohotstan	8100109c9d	Add replicate mode to Tensor.pad (#7608 ) * base implementation * add tests * actually remove the assertionerror test * actually only have reflect for this pr * change the 4 if-else one liner * maybe use a lambda * fix * maybe a lil cleaner * fix tests * complete * small change --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-11-18 10:55:38 -05:00
chenyu	df817297b6	fix passing acc_dtype="" to Tensor.prod should fail (#7750 ) similar to sum	2024-11-17 11:38:13 -05:00
chenyu	55707fd00d	fix passing sum_acc_dtype="" to Tensor.sum should fail (#7748 )	2024-11-17 10:58:41 -05:00
chenyu	a15a900415	fix Tensor.meshgrid for 1D input and check indexing (#7740 )	2024-11-16 23:39:30 -05:00
geohotstan	72a41095bc	add Tensor.meshgrid (#7714 ) * initial implementation and test * some other places that can use meshgrid * revert the onnx_ops change * add to docs * revert interpolate too * update * improve edge case test * might as well test grad * add to test can improve docs --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-11-16 23:06:47 -05:00
chenyu	f1efd84c92	fix repeat_interleave with negative dim (#7734 )	2024-11-16 10:15:29 -05:00
chenyu	22da31b223	clean up Tensor.dot (#7728 ) more docs (similar to numpy) and removed many confusing `-min(n2, 2)`	2024-11-15 18:21:15 -05:00
chenyu	4338c450ac	fix max_pool2d for int tensor with padding (#7726 ) padding inf messed output dtype	2024-11-15 16:22:11 -05:00
chenyu	9fb396f660	test_ops maxpool2d -> max_pool2d (#7696 ) and avgpool2d -> avg_pool2d for better grepping the tests	2024-11-14 10:39:12 -05:00
geohotstan	f8056a74d6	combine pad2d with pad (#7677 ) * I have pad2d, I have pad, uuh~, pad2dpad~ * fix some small things * strategically placed cast hack * fix more * fix more more * tests * periods	2024-11-14 17:56:02 +08:00
chenyu	333f5f9f8b	Tensor.bitwise_not (#7688 ) implemented with xor in tensor for now to not add another op. also used it in Tensor.min to fix dtype int on -2**31	2024-11-13 16:31:52 -05:00
chenyu	fb933b79a6	add test case for nll_loss with input > 2D (#7685 ) * failed test case for nll_loss with input > 2D * fixed * add more	2024-11-13 14:34:07 -05:00
geohotstan	9c41c376d3	add Tensor.nll_loss (#7683 ) * move nll_loss to new branch * make nll_loss examples practical * self is * add to docs * small	2024-11-13 13:12:13 -05:00
chenyu	3c6fe4b79a	fix Tensor.bitwise_and and Tensor.bitwise_or to support bool (#7684 )	2024-11-13 13:10:39 -05:00
James	d4e4a084a1	fix: Tensor min function for unsigned ints (#7675 ) * add failing tests for uint8 `min()` * fix unsigned data type min() * fix test data * fix whitespace --------- Co-authored-by: rezaarezvan <reza@rezvan.xyz> Co-authored-by: Jamesb <experimentallearning0@gmail.com>	2024-11-13 11:04:27 -05:00
Reza Rezvan	23363dee55	Add: failing tests for uint8 `min()` (#7669 ) * add failing tests for uint8 `min()` * mark as expected failure	2024-11-13 22:12:53 +08:00
chenyu	c06a5a9c72	Tensor.linspace raises for dtype.bool (#7649 ) also fixed an assert when passing str dtype to randint	2024-11-11 23:05:14 -05:00
geohotstan	5eef59d732	add Tensor.linspace (#7609 ) * add linspace * shave off tests and forgot to add to docs crap * WHOOPS * better tests	2024-11-12 10:29:36 +08:00
George Hotz	745316493c	hotfix: add test_simple_conv2d_bias	2024-11-10 18:36:42 +08:00
George Hotz	205befa788	move is_dtype_supported to device [pr] (#7575 )	2024-11-07 20:38:03 +08:00
geohotstan	934fb73994	fix test_schedule conv2d bug (#7549 ) * tests tests tests * slap a resolve on it * fix comment	2024-11-05 09:07:25 -05:00
Ahmed Harmouche	36488a2a43	Use is_dtype_supported in more places in tests (#7529 )	2024-11-04 09:21:15 -05:00
geohotstan	b1866cbfd9	failure test case for pool ops (#7483 ) * add failure test case * minimum case	2024-11-02 12:13:38 -04:00
geohotstan	585f3a0f24	Add isinf and isnan ops to Tensor (#7484 ) * move isinf and isnan to new branch * sneak a roll documentation fix in * add to docs * update test coverage for detect_positive and detect_negative * add types to isinf args	2024-11-02 12:12:52 -04:00
geohotstan	6513690223	Add Tensor.hardsigmoid (#7433 ) * move hardsigmoid to new branch * add to test * add NOTE to mention differing values for alpha and beta that match torch * shift from relu6 * correct shift implementation * or we just use relu? no more 666	2024-11-01 08:36:52 -04:00
chenyu	fb694a63eb	Tensor.erf (#7419 ) the same one used in onnx and the one in bert.	2024-10-30 18:12:28 -04:00
George Hotz	f3bd5cbf78	simplest migration of indexing [pr] (#7402 ) * simplest migration of indexing [pr] * fix locals/barrier	2024-10-30 20:58:18 +08:00
chenyu	f389e1a8a0	test more special values for sin/cos/tan [pr] (#7386 )	2024-10-29 21:13:37 -04:00
George Hotz	3989bd2682	idiv + reciprocal [pr] (#7354 ) * idiv + reciprocal * remove upcast from div * fix docs	2024-10-29 15:54:19 +08:00
George Hotz	d9d4dd6756	faster ci [pr] (#7348 )	2024-10-29 14:01:44 +08:00
chenyu	0843734927	clean up nan handling in transcendental (#7332 ) * clean up nan handling in transcendental * skip remu crash	2024-10-28 16:21:49 -04:00
chenyu	cb5702f170	tiny cleanup to transcendental xexp2 (#7326 ) also added test for exp and log of nan and inf	2024-10-27 21:54:20 -04:00
George Hotz	3c31497f55	instant isn't actually used [pr] (#7299 ) * instant isn't actually used [pr] * tolerance bump	2024-10-25 21:01:29 +08:00
chenyu	13575f080a	remove bitcast backward in function.py (#7031 ) bitcast cannot backward	2024-10-13 10:08:27 -04:00
Markiian Novosad	8831c691e2	Add slice parameter type checking to disallow Tensor usage for slices (#6967 ) * add support for single el tensors for slices * rm trailing spaces * cleanup long lines * remove tensor in slice support, add comprehensive err msg * cleanup getitem, add slice type check * Edit err message	2024-10-11 16:20:21 -04:00
chenyu	e4c0743188	failed example for logcumsumexp (#6936 ) need cummax for numerical stability	2024-10-07 10:55:45 -04:00
jeffzh4ng	19a7e41113	implement logcumsumexp (#6921 ) * implement logcumsumexp * change axis=None to axis=0	2024-10-06 10:45:36 -04:00
George Hotz	c178dc1071	faster uops ci [run_process_replay] (#6774 )	2024-09-26 20:15:01 +08:00
George Hotz	e945fa9c5c	put local on the PtrDtype [run_process_replay] (#6656 ) * put local on the PtrDtype [run_process_replay] * those are local too	2024-09-23 10:29:17 +08:00
Gaétan Lepage	f214bb140d	test: relax tolerance of test_broadcastdot (#6560 )	2024-09-17 03:26:39 -04:00
chenyu	b2c286f567	fix typing for test_ops (#6520 ) mostly passed TYPED=1 python3 -m pytest -n=auto test/test_ops.py. one last test specifically set an invalid value to test the exception, and to ignore that we need to import typeguard. And to get a working version of typeguard, we would need to get rid of dependency on tensorflow_addons because it requires a very old version of typeguard	2024-09-15 06:18:36 -04:00
chenyu	7df4373fd9	tensor reduction touchup (#6402 ) - fixing spacing - use get_args to get valid Literal values and raise ValueError to match, and a test for that - use `Y` to be consistent	2024-09-08 03:55:51 -04:00
Irakli Salia	2e01efc35f	tensor roll (#6375 ) * tensor roll function and tests * fix type annotations * reduce line count * more readable	2024-09-07 05:14:28 +08:00
Tim Becker	dfb818788e	Support `reduction` parameter in more loss functions (#6302 )	2024-09-07 05:11:20 +08:00


623 Commits