Commit Graph

3100 Commits

chenyu
dad4ee4539 use least_upper_dtype to upcast the output type in mlops (#2788)
* InterpretedFlopCounter uses least_upper_dtype for output dtype

* fix target dtype check

* fix that
2023-12-15 23:46:57 -05:00
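A minimal sketch of what a least-upper-bound dtype promotion for an op's output might look like; the priority lattice and names below are illustrative assumptions, not tinygrad's actual promotion table.

```python
# Illustrative least-upper-bound dtype promotion for an op's output dtype.
# The priority lattice here is an assumption for demonstration only.
from functools import reduce

PRIORITY = {"bool": 0, "int32": 1, "int64": 2, "float16": 3, "float32": 4, "float64": 5}

def least_upper_dtype(*dtypes: str) -> str:
    """Return the highest-priority dtype among the inputs (the least upper bound
    in this simple totally-ordered lattice)."""
    return reduce(lambda a, b: a if PRIORITY[a] >= PRIORITY[b] else b, dtypes)

# an op mixing int32 and float16 inputs gets a float16 output dtype here
assert least_upper_dtype("int32", "float16") == "float16"
```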
chenyu
1bc378c3d6 _broadcasted handles the python number types (#2785)
* _broadcasted handles the python number types

* disable that test
2023-12-15 22:43:27 -05:00
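A hedged sketch of the idea above: a bare python number passed to a binary op follows the tensor's dtype instead of forcing a promotion. The Tensor class below is a stand-in for illustration, not tinygrad's implementation.

```python
# Stand-in Tensor showing python-number handling in a broadcast helper.
from __future__ import annotations
from typing import Union

class Tensor:
    def __init__(self, data, dtype: str = "float32"):
        self.data, self.dtype = data, dtype

    def _broadcasted(self, y: Union["Tensor", float, int]) -> tuple["Tensor", "Tensor"]:
        if isinstance(y, (int, float)):
            # a python number adopts the tensor's dtype instead of forcing a cast
            y = Tensor(y, dtype=self.dtype)
        # ...shape broadcasting of self and y would happen here...
        return self, y

x, two = Tensor([1.0, 2.0], dtype="float16")._broadcasted(2)
assert two.dtype == "float16"
```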
chenyu
0703075357 bf16 is float (#2786)
* add bfloat16 to is_float check

* and test
2023-12-15 21:41:30 -05:00
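A tiny sketch of an is_float-style check that counts bfloat16 as a float dtype; the string dtype names are purely illustrative.

```python
# Illustrative is_float check including bfloat16.
FLOAT_DTYPES = {"float16", "bfloat16", "float32", "float64"}

def is_float(dtype: str) -> bool:
    return dtype in FLOAT_DTYPES

assert is_float("bfloat16") and not is_float("int32")
```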
chenyu
e4bbbc5bc3 Revert "Use the reduceop dtype to define the acc in linearizer (#2625)" (#2783)
This reverts commit f3ed96a929.
2023-12-15 16:29:10 -05:00
qazal
f3ed96a929 Use the reduceop dtype to define the acc in linearizer (#2625)
* upcast the other way

* Revert "upcast the other way"

This reverts commit 355692ba79.

* remove uop cast, this should have never been there

* add regression test

* now fuzz it

correct test

* the accumulator is always the output type

lint

* fuzz all reduce ops

* MULACC upcast_dtype could be half too

opencl supports it https://man.opencl.org/mad.html

* cast to the same dtype is a noop

* internal casting support for MULACC

* fuzz test mulacc internal casting

* get_reduce_dtype

handle vectorized acc

update get_reduce_acc calls with the correct dtype

update tests

* pending _complete_ implementation of a function that gets the dtype based on self.reduceop

+more failing tests

* get_reduce_dtype try 2

add TODO

* get_lazyop_info already does it

* cleanup

* bring back internal casting support for mulacc

* use the scalar version of the acc dtype

* conceptual diff cleanup

* one extra line to a cleaner linearizer

* correct test assumptions - these should promote?

* rm mulacc cast, the cast of vins happens with the acc dtype promotion

linearizer hacks

* Revert "rm mulacc cast, the cast of vins happens with the acc dtype promotion"

This reverts commit afdd540733.

Revert "correct test assumptions - these should promote?"

This reverts commit 49ae2206ed.

* skip tests blocked by MULACC->lazyop cleanup

* final changes to add back internal casting for MULACC and update skip test logic, upcast works but downcast does not

* only test the linearizer abstraction layer

we wanna ensure that linearizer matches whatever lazy is returning

* remove unused hypothesis module

* remove mulacc related changes, those will move to the lazy pr

* remove midcast test

* move to helpers

* Revert "remove midcast test"

This reverts commit 86e74d7960.

add TODO with skip

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2023-12-15 16:14:32 -05:00
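A toy sketch of the idea behind the (reverted) change above: the accumulator of a reduce is created in the reduce op's output dtype rather than a hardcoded float. The helper below is illustrative, not the linearizer code.

```python
# Toy reduce where the accumulator follows the output dtype.
def reduce_sum(values, out_dtype=float):
    acc = out_dtype(0)            # accumulator starts in the output dtype
    for v in values:
        acc = acc + out_dtype(v)
    return acc

assert reduce_sum([1, 2, 3], int) == 6                 # int accumulator stays int
assert isinstance(reduce_sum([1, 2, 3], float), float) # float accumulator
```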
chenyu
765f8b05e5 TernaryOps.WHERE has vin[0] as bool and BinaryOps.CMPLT always outputs bool (#2782)
* vin[0] to where is always bool

* due to better hack

* update test

* fix test_uops
2023-12-15 14:51:51 -05:00
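An illustrative sketch of the invariant in the commit above: a comparison always produces a bool, and the first input of a where/select must be that bool. The UOp class below is a stand-in, not tinygrad's.

```python
# Stand-in UOp showing bool outputs of CMPLT feeding vin[0] of WHERE.
from dataclasses import dataclass

@dataclass(frozen=True)
class UOp:
    op: str
    dtype: str
    vin: tuple = ()

def cmplt(a: UOp, b: UOp) -> UOp:
    return UOp("CMPLT", "bool", (a, b))      # CMPLT always outputs bool

def where(cond: UOp, x: UOp, y: UOp) -> UOp:
    assert cond.dtype == "bool"              # vin[0] of WHERE must be bool
    return UOp("WHERE", x.dtype, (cond, x, y))

x, y = UOp("CONST", "float32"), UOp("CONST", "float32")
assert where(cmplt(x, y), x, y).dtype == "float32"
```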
George Hotz
96a276cc7c hotfix: add test_reduce_permute_nofuse to master 2023-12-15 09:39:47 -08:00
qazal
66f07d97e2 don't auto-cast half to float in unary functions (#2776)
* least upper float

* dont cast to the same thing

* tests for least_upper_float

* add regression tests to test_dtype_alu

* the call is pretty cheap; a cache is probably too much overhead
2023-12-15 10:11:47 -05:00
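A sketch of what a least_upper_float-style rule might look like: float inputs keep their dtype in unary ops such as exp and log, and only non-float inputs are promoted to a default float. The names and the default below are assumptions.

```python
# Illustrative least_upper_float rule for unary-op output dtypes.
FLOATS = ("float16", "bfloat16", "float32", "float64")

def least_upper_float(dtype: str, default_float: str = "float32") -> str:
    return dtype if dtype in FLOATS else default_float

assert least_upper_float("float16") == "float16"   # half is no longer auto-cast to float32
assert least_upper_float("int32") == "float32"     # ints still become float
```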
George Hotz
c6eb618013 tests from new lazy branch (#2774)
* tests from new lazy branch

* fix lin 11

* that was needed

* doesn't fail

* mark

* meant that

* llvm passes
2023-12-14 23:06:39 -08:00
chenyu
a044125c39 validate stable diffusion for seed 0 (#2773)
* validate stable diffusion for seed 0

the closest false positive I can get is with the same setup and one less step: dist = 0.0036.
the same setup with fp16 has dist = 5e-6, so setting the validation threshold to 1e-4 should be good.

* run with --seed 0
2023-12-15 00:07:09 -05:00
chenyu
9afa8009c1 hot fix explicitly set arange dtype to float (#2772) 2023-12-14 23:14:38 -05:00
chenyu
c0f76ed4ea transformer kvcache and mask have same dtype as input (#2771)
* transformer kvcache and mask have same dtype as input

* don't use `=0` in cstyle ternary where

* (bool)

* where float16 test
2023-12-14 22:41:51 -05:00
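A hedged sketch (using numpy as a stand-in) of keeping the kv cache and attention mask in the activations' dtype, so a float16 model doesn't silently allocate float32 buffers.

```python
# Stand-in buffers that follow the model's activation dtype.
import numpy as np

def make_kv_cache(batch: int, max_seq: int, n_heads: int, head_dim: int, dtype=np.float16):
    # the cache follows the activation dtype instead of defaulting to float32
    return np.zeros((2, batch, max_seq, n_heads, head_dim), dtype=dtype)

def causal_mask(seq_len: int, dtype=np.float16):
    # the mask is built directly in the activation dtype
    return np.triu(np.full((seq_len, seq_len), -np.inf, dtype=dtype), k=1)

assert make_kv_cache(1, 8, 4, 16).dtype == np.float16
assert causal_mask(8).dtype == np.float16
```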
chenyu
2dd0dd4ae0 cleanup llvmir (#2770) 2023-12-14 18:13:22 -05:00
chenyu
66d9eb10b6 arange default dtype to int and zeros/ones default to float (#2769) 2023-12-14 17:53:00 -05:00
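A usage-level sketch of those defaults, assuming tinygrad's Tensor API behaves as the commit describes (arange defaulting to int, zeros/ones to the default float).

```python
# Checking the default dtypes described in the commit above (assumed behavior).
from tinygrad.tensor import Tensor

print(Tensor.arange(4).dtype)   # expected: an int dtype
print(Tensor.zeros(4).dtype)    # expected: the default float dtype
print(Tensor.ones(4).dtype)     # expected: the default float dtype
```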
qazal
3cf4376ce2 test_linearizer cleanup (#2766)
* test_linearizer cleanup

* use unittest.skipIf

* update msg
2023-12-14 17:20:09 -05:00
chenyu
57017c87e9 remove duplicated dtype in DEFINE_GLOBAL args (#2768)
now that DEFINE_GLOBAL's uop.arg[1] is always the same as uop.dtype, we can remove the copy in arg and just use uop.dtype
2023-12-14 15:42:36 -05:00
chenyu
5235cdee3d remove _arg_int32 internal type (#2767)
in DEFINE_GLOBAL, PtrDType(int32) means a buffer and a plain int32 means an int
2023-12-14 14:17:14 -05:00
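A toy sketch of the duplication removed by the two commits above: once a uop carries its dtype, repeating the dtype inside its arg is redundant. The UOp layout below is illustrative, not the actual linearizer structure.

```python
# Stand-in UOp illustrating the removal of the duplicated dtype in arg.
from dataclasses import dataclass
from typing import Any

@dataclass
class UOp:
    uop: str
    dtype: str
    arg: Any = None

# before: the buffer dtype stored twice, on the uop and inside its arg
before = UOp("DEFINE_GLOBAL", "ptr<int32>", arg=("data0", "ptr<int32>"))
# after: arg keeps only the name; the dtype lives on uop.dtype alone
after = UOp("DEFINE_GLOBAL", "ptr<int32>", arg="data0")
```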
chenyu
8a2a2257b4 minor onnx_op cleanups to prep dtype changes (#2764)
* minor onnx_op cleanups to prep dtype changes

read through it and clean some minor stuff

* revert embedding - is it really being tested
2023-12-14 13:01:27 -05:00
geohotstan
0398288b79 Getitem round3 .... (#2760)
* refactor round 3

* comment

* oops

* oops

* oops2

* factored out multiple condition

* add a comment for type

* wooaah roundup is cool, thanks chenyu lol

* add another walrus for symmetry and some spaces

* lol wtf useless listcompre
2023-12-14 12:22:37 -05:00
chenyu
0ae22b0f81 restore Tensor.default_type in test_hip_rdna3 (#2763)
might cause flaky tests
2023-12-14 11:35:38 -05:00
qazal
746cb5de21 Test coverage for matvec (#2762)
* add test coverage for matvec

* skip devices that don't support locals
2023-12-14 11:34:56 -05:00
chenyu
64fea9ff4a Revert "minor onnx_op cleanups to prep dtype changes (#2758)" (#2759)
This reverts commit 38da001b64.
2023-12-14 03:12:14 -05:00
chenyu
38da001b64 minor onnx_op cleanups to prep dtype changes (#2758)
read through it and clean some minor stuff
2023-12-14 03:05:59 -05:00
jaredeh
d8952fc575 updating to work with new internal apis (#2755) 2023-12-13 21:54:47 -08:00
chenyu
2c6814ba28 insert_before is None means insert at the end (#2757) 2023-12-13 21:05:10 -05:00
chenyu
aad005e220 set default str for CStyleLanguage.arg_int_prefix (#2756)
it's the same `const int` for clang, opencl, cuda and hip;
metal overrides it with `constant int&`, and webgl has its own thing
2023-12-13 20:23:27 -05:00
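A sketch of a shared default with per-backend overrides as described above; the class layout below is illustrative rather than the actual renderer code.

```python
# Illustrative default attribute with a per-backend override.
class CStyleLanguage:
    arg_int_prefix: str = "const int"    # shared default: clang, opencl, cuda, hip

class MetalLanguage(CStyleLanguage):
    arg_int_prefix = "constant int&"     # metal overrides the default
```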
chenyu
107dd8f3d7 fix a typo in test_dtype_alu (#2754) 2023-12-13 19:23:21 -05:00
chenyu
fc6bca7ba8 update type annotation of _broadcasted (#2753)
input can be Tensor, float, int.
also updated scaled_dot_product_attention that might add a None to a Tensor
2023-12-13 19:03:14 -05:00
Maksym Sobolyev
bf4165ccac Fix double exception in __del__() when __init__() raises exception. (#2738) 2023-12-13 15:46:11 -08:00
chenyu
81a747fc63 more test cases in test_slice_fancy_indexing_with_idx (#2751) 2023-12-13 17:52:26 -05:00
chenyu
22feb7330e simplify fancy index with negative Tensor entries (#2749) 2023-12-13 14:45:50 -05:00
chenyu
b229879613 refactor _broadcasted (#2747)
also moved the expand noop check to .expand.
2023-12-13 13:36:25 -05:00
George Hotz
7e5b3e53fe changes to prep for new lazy (#2748)
* changes to prep for new lazy

* put those back
2023-12-13 10:28:22 -08:00
Umut Zengin
8ad7cfeeb1 More simplification in to_image_idx and symbolic (#2679)
* less valid

* add test

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2023-12-13 12:30:44 -05:00
Ahmed Harmouche
e7248b677c Remove wgsl custom render_for (#2729)
* Generic for

* remove custom render_if

* Simplify for loop

* 150 line-length constraint

* Put custom render_if back
2023-12-13 09:04:17 -08:00
tomtom-95
6b0f07e94a add decorator to preserve info about original function (#2743) 2023-12-13 09:03:50 -08:00
chenyu
aa4a0de287 simpler Tensor.pow to integer (#2746) 2023-12-13 11:39:20 -05:00
chenyu
26f49869f4 minor tensor type annotation and cleanup (#2742) 2023-12-13 01:53:59 -05:00
chenyu
2ef33abd20 some unary functions cast int input into float (#2740)
* some unary functions cast int input into float

* precision

* image dtype
2023-12-13 00:10:29 -05:00
George Hotz
3e778fcc52 hotfix: *** 2023-12-12 19:44:31 -08:00
Shawn Hagler
51afe938f1 update onnx model links (#2737) 2023-12-12 19:11:11 -08:00
George Hotz
431fae5ed3 hotfix: update_stats cleanup, yellow is nicer than red 2023-12-12 17:50:22 -08:00
chenyu
0869e7a301 update onnx benchmark urls (#2735)
onnx is remapping the models, old ones are in archive/
2023-12-12 20:46:01 -05:00
George Hotz
6d6eb9302d ruff checks the max line length is 150 (#2734)
* ruff checks the max line length is 150

* fix tensor.py

* a lot more

* done
2023-12-12 17:34:47 -08:00
George Hotz
3635540ddb shorter line (#2733) 2023-12-12 15:34:17 -08:00
nimlgen
ede7971ada save some lines (#2731)
* remove unused mem_cached var

* one more
2023-12-12 15:26:27 -08:00
chenyu
00b611c156 simplify type promotion - remove weak types (#2730) 2023-12-12 16:12:57 -05:00
Nguyen Nguyen Phuong
07cf45e133 fix cuda matmul (#2725) 2023-12-12 07:59:31 -08:00
chenyu
ef6e942a23 dtype promotion helpers (#2724)
* dtype promotion helpers

* better tests

* space
2023-12-11 23:14:23 -05:00
Christopher Mauri Milan
0232db294d fix tolist issue (#2723) 2023-12-11 19:14:00 -08:00