* invert (broken)
* decent invert
* shapetracker invert works
* plus is meh, invert is good
* support invert mask
* a few more invert tests
* shapetracker math invert test
* remove match_type in ops_torch and ops_cpu
input dtypes are aligned and cast in mlops
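A hedged sketch of what "aligned and cast" means here (promote both operands to a common dtype before the op); the helper below is illustrative, not the actual mlops code:

```python
import numpy as np

def aligned_add(x: np.ndarray, y: np.ndarray) -> np.ndarray:
  # illustrative: promote both inputs to a common dtype up front,
  # so the backend op never sees mixed-dtype operands
  out_dtype = np.promote_types(x.dtype, y.dtype)
  return x.astype(out_dtype) + y.astype(out_dtype)

assert aligned_add(np.ones(2, np.float16), np.ones(2, np.float32)).dtype == np.float32
```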
* dict union only after python3.9
* fix that
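For reference, a minimal sketch of the compatibility issue and the fix (variable names are illustrative):

```python
defaults = {"device": "CPU", "dtype": "float32"}
overrides = {"device": "GPU"}

# `defaults | overrides` raises TypeError before Python 3.9;
# dict unpacking is the backwards-compatible equivalent (later keys win)
merged = {**defaults, **overrides}
assert merged == {"device": "GPU", "dtype": "float32"}
```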
* fix Sigmoid forward cast
* Uncripple dtype tests; TestBFloat16DType never actually ran.
* Fix conversion from/to bfloat16.
Call cast() recursively, so that it works for any type combo.
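A hedged sketch of the recursive-cast idea: numpy has no native bfloat16, so any from/to-bfloat16 conversion can hop through float32 and recurse for the remaining leg. Helper names and the uint16 bit representation are illustrative, not the exact code from the diff:

```python
import numpy as np

def f32_to_bf16_bits(x: np.ndarray) -> np.ndarray:
  # bfloat16 is the top 16 bits of an IEEE float32 (truncating round)
  return (x.astype(np.float32).view(np.uint32) >> 16).astype(np.uint16)

def bf16_bits_to_f32(bits: np.ndarray) -> np.ndarray:
  return (bits.astype(np.uint32) << 16).view(np.float32)

def cast(x: np.ndarray, src: str, dst: str) -> np.ndarray:
  # any combo involving bfloat16 routes through float32, then recurses
  if src == "bfloat16": return cast(bf16_bits_to_f32(x), "float32", dst)
  if dst == "bfloat16": return f32_to_bf16_bits(cast(x, src, "float32"))
  return x.astype(np.dtype(dst))

assert cast(np.array([1.5], np.float64), "float64", "bfloat16").dtype == np.uint16
```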
* Run this test on torch backend as well.
* Add torch.bfloat16.
* Add support for ushort and uint.
* Convert np.uint32 to np.int32 when loading.
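A one-liner sketch of the load-time conversion; whether to reinterpret bits or value-cast depends on the data, so both spellings are shown:

```python
import numpy as np

raw = np.arange(4, dtype=np.uint32)
as_i32_view = raw.view(np.int32)    # reinterpret the same bits, no copy (values >= 2**31 wrap negative)
as_i32_cast = raw.astype(np.int32)  # value-preserving copy for values that fit in int32
```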
* Fix warning.
* the universe is flat as a 2D tensor
* try this
* TESTS
* fewer lines in test
* don't change all_int since other places use it
* add tests and drop the noqa by making the spacing non-aesthetic
* some reordering
* fix empty list handling and add tests
* more tests
* add support for tensors from lists of bools
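A usage sketch of what this enables (import paths have moved between tinygrad versions, so treat them as approximate):

```python
from tinygrad.tensor import Tensor
from tinygrad.helpers import dtypes

t = Tensor([True, False, True])  # a list of bools now builds a bool tensor
assert t.dtype == dtypes.bool
```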
* clearer with the fewest lines added
* added bool
* oops
* more tests
* improved tests
* oops
* assert UnaryOps has same dtype as input in uop
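The invariant, as a sketch; the check itself is one assert in the uop builder (names here are illustrative):

```python
def assert_unary_uop_dtype(op: str, in_dtype, out_dtype):
  # UnaryOps (NEG, EXP2, LOG2, SIN, ...) must not change dtype;
  # any dtype change has to go through an explicit CAST uop instead
  assert out_dtype == in_dtype, f"{op}: dtype changed {in_dtype} -> {out_dtype}"
```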
* fallback to float on images
* just unary ops for now
* pass amt
* noqa on the temp line
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* upcast the other way
* Revert "upcast the other way"
This reverts commit 355692ba79.
* remove uop cast; this should never have been there
* add regression test
* now fuzz it
correct test
* the accumulator is always the output type
lint
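A sketch of the rule with a plain sum-reduce; the point is that the accumulator is allocated in the output dtype, not the input's:

```python
import numpy as np

def reduce_sum(x: np.ndarray, out_dtype) -> np.ndarray:
  acc = np.zeros((), dtype=out_dtype)  # accumulator in the *output* dtype
  for v in x.ravel():
    acc += v  # in-place add keeps the accumulator's dtype
  return acc

assert reduce_sum(np.ones(10, np.float16), np.float32).dtype == np.float32
```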
* fuzz all reduce ops
* MULACC upcast_dtype could be half too
OpenCL supports it: https://man.opencl.org/mad.html
* cast to the same dtype is a noop
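As a sketch (the method and helper names here are hypothetical):

```python
def cast(self, dtype):
  if dtype == self.dtype: return self  # casting to the same dtype is a noop
  return self._real_cast(dtype)        # hypothetical: emits an actual CAST op
```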
* internal casting support for MULACC
* fuzz test mulacc internal casting
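A hedged sketch of what "internal casting" means for MULACC: the operands are cast up to the accumulator's dtype before the fused multiply-add, rather than accumulating in their own dtype. Names are illustrative:

```python
import numpy as np

def mulacc(a: np.ndarray, b: np.ndarray, acc: np.ndarray) -> np.ndarray:
  # cast the operands up to the accumulator dtype before a*b+acc,
  # e.g. half inputs accumulating into a float32 buffer
  return acc + a.astype(acc.dtype) * b.astype(acc.dtype)

out = mulacc(np.ones(3, np.float16), np.ones(3, np.float16), np.zeros(3, np.float32))
assert out.dtype == np.float32
```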
* get_reduce_dtype
handle vectorized acc
update get_reduce_acc calls with the correct dtype
update tests
* pending _complete_ implementation of a function that gets the dtype based on self.reduceop
+more failing tests
* get_reduce_dtype try 2
add TODO
* get_lazyop_info already does it
* cleanup
* bring back internal casting support for mulacc
* use the scalar version of the acc dtype
* conceptual diff cleanup
* one extra line to a cleaner linearizer
* correct test assumptions - these should promote?
* rm mulacc cast, the cast of vins happens with the acc dtype promotion
linearizer hacks
* Revert "rm mulacc cast, the cast of vins happens with the acc dtype promotion"
This reverts commit afdd540733.
Revert "correct test assumptions - these should promote?"
This reverts commit 49ae2206ed.
* skip tests blocked by MULACC->lazyop cleanup
* final changes to add back internal casting for MULACC and update the skip-test logic; upcast works but downcast does not
* only test the linearizer abstraction layer
we want to ensure that the linearizer matches whatever lazy returns
* remove unused hypothesis module
* remove mulacc related changes, those will move to the lazy pr
* remove midcast test
* move to helpers
* Revert "remove midcast test"
This reverts commit 86e74d7960.
add TODO with skip
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* least upper float
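A minimal sketch of the promotion rule the name suggests (floats pass through, everything else promotes to the default float); the actual tinygrad promotion lattice may differ:

```python
import numpy as np

def least_upper_float(dt: np.dtype) -> np.dtype:
  # already a float: keep it; int/bool: promote to the default float32
  return dt if np.issubdtype(dt, np.floating) else np.dtype(np.float32)

assert least_upper_float(np.dtype(np.float16)) == np.float16
assert least_upper_float(np.dtype(np.int64)) == np.float32
```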
* don't cast to the same thing
* tests for least_upper_float
* add regression tests to test_dtype_alu
* the call is pretty cheap; a cache is probably too much overhead
* validate stable diffusion for seed 0
the closest false positive I can get is the same setup with one fewer step: dist = 0.0036.
the same setup with fp16 has dist = 5e-6,
so setting the validation threshold to 1e-4 should be good
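A sketch of the validation check described above; the distance metric and the known-good array are stand-ins for whatever the test actually stores:

```python
import numpy as np

def validate(output: np.ndarray, known_good: np.ndarray, threshold: float = 1e-4):
  # closest observed false positive (one fewer step) sits at dist ~ 3.6e-3 and
  # the matching fp16 run at ~ 5e-6, so 1e-4 separates them with margin
  dist = float(np.linalg.norm(output.astype(np.float64) - known_good.astype(np.float64)))
  assert dist < threshold, f"validation failed: dist={dist}"
```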
* run with --seed 0