tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-24 22:38:16 -05:00

Author	SHA1	Message	Date
chenyu	9fb396f660	test_ops maxpool2d -> max_pool2d (#7696 ) and avgpool2d -> avg_pool2d for better grepping the tests	2024-11-14 10:39:12 -05:00
geohotstan	f8056a74d6	combine pad2d with pad (#7677 ) * I have pad2d, I have pad, uuh~, pad2dpad~ * fix some small things * strategically placed cast hack * fix more * fix more more * tests * periods	2024-11-14 17:56:02 +08:00
qazal	0914c2fec9	add TestLinearizerFailures test_failure_56 and test_failure_57 (#7682 ) * add test_failure_56 and test_failure_57 * so it's only METAL=1	2024-11-14 12:00:33 +08:00
chenyu	333f5f9f8b	Tensor.bitwise_not (#7688 ) implemented with xor in tensor for now to not add another op. also used it in Tensor.min to fix dtype int on -2**31	2024-11-13 16:31:52 -05:00
chenyu	fb933b79a6	add test case for nll_loss with input > 2D (#7685 ) * failed test case for nll_loss with input > 2D * fixed * add more	2024-11-13 14:34:07 -05:00
geohotstan	9c41c376d3	add Tensor.nll_loss (#7683 ) * move nll_loss to new branch * make nll_loss examples practical * self is * add to docs * small	2024-11-13 13:12:13 -05:00
chenyu	3c6fe4b79a	fix Tensor.bitwise_and and Tensor.bitwise_or to support bool (#7684 )	2024-11-13 13:10:39 -05:00
chenyu	3d82f8e340	simpler rand_like (#7680 )	2024-11-13 12:28:41 -05:00
James	d4e4a084a1	fix: Tensor min function for unsigned ints (#7675 ) * add failing tests for uint8 `min()` * fix unsigned data type min() * fix test data * fix whitespace --------- Co-authored-by: rezaarezvan <reza@rezvan.xyz> Co-authored-by: Jamesb <experimentallearning0@gmail.com>	2024-11-13 11:04:27 -05:00
chenyu	d1dfd598a2	assert specifying device to rand_like a multi tensor (#7678 ) * assert specifying device to rand_like a multi tensor raise RuntimeError instead of dropping it silently * fix that	2024-11-13 10:24:40 -05:00
chenyu	51432bfbff	add rand_like test case with device specified (#7663 ) in single device or copied multi case, device is applied. but for sharded case the device is silently ignored now. maybe similar to rand we just don't allow tuple device in rand_like	2024-11-13 09:32:55 -05:00
Reza Rezvan	23363dee55	Add: failing tests for uint8 `min()` (#7669 ) * add failing tests for uint8 `min()` * mark as expected failure	2024-11-13 22:12:53 +08:00
qazal	e84d089ef1	delete ReduceOps, only use REDUCE_AXIS (#7667 )	2024-11-13 19:04:27 +08:00
chenyu	1884f021e3	add conv3x3 to speed_v_theoretical (#7658 ) * add conv3x3 to speed_v_theoretical * show test duration	2024-11-12 16:41:56 -05:00
chenyu	962dafb467	use randn in speed_v_theoretical instead of rand (#7656 ) * use randn in speed_v_theoretical instead of rand this made green gemv 20% faster... but why? * update threshold	2024-11-12 15:00:32 -05:00
chenyu	6159790ab8	add gemv to speed_v_theoretical (#7654 ) * add gemv to speed_v_theoretical getting ~300GB/s if we just count the memory of inputs and output * better green numbers * flip	2024-11-12 11:19:35 -05:00
George Hotz	4f1f823021	add tiny test for randomness + remove ulong buffers (#7648 ) * add tiny test for randomness * Tensor._device_seeds is a Tuple * no tuple, just a 2 element tensor * no more longs * fix tests, and maybe ocelot works now * NV still doesn't work. cleanup rules * test + two more rules	2024-11-12 12:45:52 +08:00
chenyu	c06a5a9c72	Tensor.linspace raises for dtype.bool (#7649 ) also fixed an assert when passing str dtype to randint	2024-11-11 23:05:14 -05:00
geohotstan	5eef59d732	add Tensor.linspace (#7609 ) * add linspace * shave off tests and forgot to add to docs crap * WHOOPS * better tests	2024-11-12 10:29:36 +08:00
chenyu	99f29e50b2	update speed_v_theoretical numbers (#7647 ) better amd after set compute profile	2024-11-11 20:05:13 -05:00
chenyu	773d5b60bf	beam benchmark tests (#7638 ) * beam benchmark tests * lower AMD number somehow * less flaky	2024-11-11 18:11:18 -05:00
chenyu	bfab03288d	fix HALF=1 in test_speed_v_torch (#7642 ) * fix HALF=1 in test_speed_v_torch "operation cache defeats" adds 1 to all args, which were centered around 0. adding 1 makes big matmul and matvec go inf. fixed by subtracting 1 after and bumped tolerance for half input * bigger tol for BIG=2, update CI too * bigger tol	2024-11-11 14:29:37 -05:00
nimlgen	4d81b7952a	qcom match texture/sampler descriptors to OpenCL (#7622 ) * qcom ioctl compare more regs * bug fix	2024-11-11 21:56:51 +03:00
George Hotz	d40673505f	new cloud is cloudy [pr] (#7631 ) * new cloud is cloudy [pr] * waste lines to add security * safety, with speed and less lines * timing and del * lines * cleanups * restore CloudSession * bump to 3.10 * quotes * renderer security	2024-11-11 20:18:04 +08:00
George Hotz	bbc64bf305	x|(x&y) -> x (#7629 ) * x|(x&y) -> x * fix tests	2024-11-11 10:00:18 +08:00
uuuvn	94a484542b	Hook memoryview via class instead of a function (#7627 )	2024-11-11 09:07:06 +08:00
qazal	a8da84cce0	recursive swizzle with just graph_rewrite [pr] (#7626 )	2024-11-10 20:14:21 +02:00
qazal	092a441748	test swizzle post permute (#7623 ) * test swizzle post permute * add st_fixup assert	2024-11-10 16:18:22 +02:00
George Hotz	745316493c	hotfix: add test_simple_conv2d_bias	2024-11-10 18:36:42 +08:00
George Hotz	0a411b4f68	replace llvm with new llvm (#7616 ) * replace llvm with new llvm * fix test_linearizer * minor fixups * fix alloca * don't use alloca * fix DEFINE_ACC * lines * comments and lines * a little tighter	2024-11-10 11:28:52 +08:00
qazal	b61266eb97	late fusion spec for big graph [pr] (#7613 )	2024-11-09 23:43:11 +08:00
qazal	9d6b03d691	early assert swizzle in kernel [pr] (#7610 ) * early assert swizzle in kernel [pr] * better * note changes * TestIndexing 2	2024-11-09 21:54:43 +08:00
chenyu	8ca422e21a	script to compare kernel opt with BEAM (#7604 ) interesting that on m1 max hcopt wins BEAM 2 about 20% of the time	2024-11-08 17:40:28 -05:00
chenyu	573f145dcf	METAL raise RuntimeError with no compiler and bad src (#7603 ) fixed BEAM if src is invalid on METAL. it currently only accepts RuntimeError in `_time_program`	2024-11-08 17:09:12 -05:00
chenyu	74b4d1c1e1	rewrite idx again in real_strides after uop_given_valid (#7600 ) uop_given_valid does not guarantee output to be flat. fixed one last real_strides test.	2024-11-08 14:30:32 -05:00
chenyu	c6189e38c1	simplify_valid in real_strides (#7599 ) improved one more real_strides. after finishing the last one will think about always applying these in to_indexed_uops	2024-11-08 10:45:22 -05:00
Ahmed Harmouche	e35226e698	Remove Ops.ALU (#7595 )	2024-11-08 19:52:14 +08:00
Harald Schäfer	e7cbc29f48	openpilot benchmark: add cast from numpy to benchmark (#7593 ) * openpilot benchmark: add cast from numpy to benchmark * whitespace * comment	2024-11-08 19:31:00 +08:00
chenyu	a1dfd288bb	different valid order (#7589 ) in simplify_valid, we start with valids that are in others' parent so the others are more likely to be simplified	2024-11-07 20:27:56 -05:00
chenyu	4378b100ad	make UOp.range arg a tuple [pr] (#7583 ) * make UOp.range arg a tuple [pr] so render works on output of ShapeTracker.to_indexed_uops * fix	2024-11-07 11:58:09 -05:00
chenyu	bb7b5362be	uop_given_valid in real_strides (#7231 ) simplified idx allows deriving more strides	2024-11-07 09:41:16 -05:00
uuuvn	c846dd70b2	Increase test tolerance for probabilistic test (#7580 )	2024-11-07 09:35:11 -05:00
George Hotz	205befa788	move is_dtype_supported to device [pr] (#7575 )	2024-11-07 20:38:03 +08:00
qazal	1f5ea1e412	late fusion tests, early merge view GroupOp.Buffer [pr] (#7577 ) * test_late_fusion_double_transpose * early merge view buffer ops	2024-11-07 20:04:57 +08:00
qazal	f0fc34e594	swizzle tests from the delete_fuse branch [pr] (#7576 ) * swizzle tests from the delete branch [pr] * actually test torch * atol	2024-11-07 18:29:06 +08:00
chenyu	a011562450	fix view add with symbolic shape (#7569 ) the issue is that the symbolic shape is not greedily simplified and canonicalized before reshape	2024-11-06 11:39:20 -05:00
qazal	6a19ca81c9	failing test for View.__add__ RecursionError (#7567 ) * failing test for View.__add__ RecursionError * move to test_symbolic_shapetracker	2024-11-06 23:46:47 +08:00
qazal	a9a040398c	don't print the entire schedule on assert [pr] (#7565 ) * don't print the entire schedule on assert [pr] * extra	2024-11-06 18:29:50 +08:00
chenyu	c805e3fff5	skip test_jit_batch_split if JIT >= 2 (#7561 ) * skip test_jit_batch_split if JIT >= 2 only test graphs * 1600	2024-11-05 14:59:04 -05:00
chenyu	f2fa183651	increase threshold test_strongly_connected_DAG (#7560 ) it should test some other properties. flaky with time test https://github.com/chenyuxyz/tinygrad/actions/runs/11688403523/job/32548762512	2024-11-05 11:44:39 -05:00


4618 Commits