Commit Graph

153 Commits

Author SHA1 Message Date
wozeparrot
2b899164c6 no numpy (#6751) 2024-09-26 16:40:18 +08:00
George Hotz
cb22ef379a truncate consts early (#6741)
* truncate consts early

* ptx still fails

* Update dtype.py
2024-09-25 16:49:51 +08:00
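A minimal sketch of what truncating a const early means, assuming two's-complement wraparound for int32; the helper name is illustrative, not tinygrad's API:

```
import ctypes

def truncate_int32(x: int) -> int:
    # wrap a Python int into the int32 range, the way a backend would at codegen time
    return ctypes.c_int32(x).value

print(truncate_int32(2**31))  # -2147483648: wrapped when the const is built, not at render time
```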
George Hotz
1b4d1823b7 add pyint to DTYPES_DICT [run_process_replay] (#6477)
* add pyint to DTYPES_DICT [run_process_replay]

* also fix uop alu bug

* exclude pyint there too

* ne ne

* force explicit dtype
2024-09-11 17:31:59 +08:00
chenyu
002303c145 fix output of truncate_fp16 (#6381)
make sure the non-inf path returns the truncated value
2024-09-05 22:55:43 -04:00
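A hedged sketch of the fixed behavior: the non-inf path must return the value rounded to the nearest representable half, not the original input. struct's "e" format does the float16 round-trip; the function name follows the commit, the body is an assumption:

```
import math, struct

def truncate_fp16(x: float) -> float:
    try:
        # round-trip through IEEE-754 half: return the truncated value, not x itself
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)

print(truncate_fp16(65519.0))  # 65504.0, the float16 max
print(truncate_fp16(65520.0))  # inf
```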
chenyu
590c0922b6 Tensor.prod (#6250)
* Tensor.prod

a new reduce op!

* onnx ReduceProd
2024-08-23 10:06:32 -04:00
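A usage sketch of the new reduce op, assuming the stock tinygrad Tensor API:

```
from tinygrad import Tensor

t = Tensor([1.0, 2.0, 3.0, 4.0])
print(t.prod().item())                       # 24.0
print(t.reshape(2, 2).prod(axis=0).numpy())  # [3. 8.]
```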
wozeparrot
0c5189de25 threefry half (#6154) 2024-08-18 15:23:12 -07:00
samm393
2dc586ffe5 Shape change bitcast for more dtypes (#6047)
* bitcast & tests

* use to_dtype

* put disk tensor tests back

* tests

* bitmask

* no bitmask

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-08-14 10:03:34 -07:00
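A hedged sketch of a shape-changing bitcast: when source and target itemsizes differ, the last axis is rescaled by the size ratio:

```
from tinygrad import Tensor, dtypes

t = Tensor([[1.0, 2.0]])     # shape (1, 2), float32: 4 bytes per element
u = t.bitcast(dtypes.uint8)  # shape (1, 8): last axis grows by the 4/1 itemsize ratio
print(u.shape, u.dtype)
```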
chenyu
4a65010de8 remove CUDACPU flag in tests [run_process_replay] (#5902)
no longer used
2024-08-04 16:06:38 -04:00
chenyu
c67e9887f7 support using str to specify dtype (#5897)
* support using str to specify dtype

in Tensor creation, in args to `cast` and `bitcast`, and in acc_dtype

* more tests
2024-08-04 12:56:28 -04:00
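A usage sketch, assuming str dtypes are accepted wherever a DType is:

```
from tinygrad import Tensor

t = Tensor([1, 2, 3], dtype="float16")  # str in the constructor
u = t.cast("int32")                     # and as an arg to cast/bitcast
print(t.dtype, u.dtype)
```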
samm393
2c94316bd2 ull literal support and test (#5789)
* ull literal support and test

* missing .numpy()
2024-07-29 11:50:49 -04:00
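A hypothetical sketch (not tinygrad's actual renderer code) of why the suffix matters: a 64-bit unsigned constant needs ull in generated C so it parses as unsigned long long rather than overflowing an int literal:

```
def render_uint64_const(x: int) -> str:
    # hypothetical helper: tag the literal so the C compiler reads it as unsigned long long
    return f"{x}ull"

print(render_uint64_const(2**64 - 1))  # 18446744073709551615ull
```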
chenyu
600a39771d fix Tensor.arange if (stop-start) and step have different signs (#5775) 2024-07-28 14:34:10 -04:00
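A sketch of the fixed behavior, assuming tinygrad's Tensor.arange(start, stop, step):

```
from tinygrad import Tensor

print(Tensor.arange(5, 0, -1).numpy())  # [5 4 3 2 1]
print(Tensor.arange(0, 5, -1).numpy())  # []: (stop-start) and step disagree in sign, so empty
```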
kormann
2c4add6844 pretty print lazy op per default (#5505)
* pretty lop

* min diff

* walrus

* fix

* min diff

* simplify

* pretty helper function

* ws

* pretty uop upat

* tests

* stricter tests

* test passes

* ws

* stronger upat test

* delete print_tree

* min diff

* stricter exp test

* fix merge

* stronger uops eval test

* +readable and deep upat test

* +readable and deep upat test

* sort inv fix

* fix

* revert allowed_len
2024-07-18 09:34:08 -07:00
chenyu
f8a47608cc test dtype.min and dtype.max (#5479)
compared with np.iinfo for integer dtype
2024-07-14 15:31:37 -04:00
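A sketch of what the test checks, assuming tinygrad exposes dtypes.min/dtypes.max helpers:

```
import numpy as np
from tinygrad import dtypes

assert dtypes.min(dtypes.int8) == np.iinfo(np.int8).min      # -128
assert dtypes.max(dtypes.uint16) == np.iinfo(np.uint16).max  # 65535
```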
chenyu
ca021229e4 fix attention to always return in the same dtype as input (#5100)
the mid cast to default_float does not work as intended when the default is float32 and qkv are in half
2024-06-22 10:34:57 -04:00
chenyu
cc2be9064f fix out of bound python list into numpy array (#5043)
numpy 2.0 does not allow out-of-bound python consts and recommends writing them as `np.array(value).astype(dtype)`
2024-06-18 18:05:21 -04:00
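The recommended spelling from the commit message, runnable as-is:

```
import numpy as np

# numpy 2.0 raises for np.array(300, dtype=np.int8); construct first, then cast:
a = np.array(300).astype(np.int8)
print(a)  # 44: the value wraps instead of erroring
```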
chenyu
acaf9a490d RECIP(-0.0) should be -inf (#5024)
* RECIP(-0.0) should be -inf

added test_dtype_alu for PYTHON backend

* catch that

* fix those two
2024-06-17 22:26:58 -04:00
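A sketch of the IEEE-754 behavior being enforced; note that plain Python raises ZeroDivisionError here, so numpy stands in:

```
import numpy as np

with np.errstate(divide="ignore"):
    print(np.float32(1.0) / np.float32(-0.0))  # -inf: the sign of zero is preserved
```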
chenyu
03b367c014 handle float16 overflow in PYTHON (#5022)
* handle float16 overflow in PYTHON

use `truncate` when constructing a tensor from a list to make sure all values are packable (might be slow, but should be correct). add truncate_fp16 to cast overflowed values to inf/-inf.

* all valid fmt supports truncate
2024-06-17 21:12:52 -04:00
chenyu
4296507021 Tensor.sum returns in acc_dtype if specified (#5012)
* Tensor.sum returns in acc_dtype if specified

* skip PYTHON for now

* revert that

* relax that
2024-06-17 16:35:52 -04:00
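A usage sketch; the int32 default shown in the comment is an assumption about the era's upcast rule:

```
from tinygrad import Tensor, dtypes

t = Tensor([1, 2, 3], dtype=dtypes.int8)
print(t.sum().dtype)                        # upcast default (int32 here)
print(t.sum(acc_dtype=dtypes.int64).dtype)  # int64: the result stays in the given acc_dtype
```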
chenyu
2b07847f2b matmul returns in acc_dtype if specified (#4994)
more flexible to not automatically downcast, can fix bert mixed precision training with this
2024-06-16 12:56:15 -04:00
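The same idea for matmul, as a hedged sketch:

```
from tinygrad import Tensor, dtypes

a, b = Tensor.rand(4, 4, dtype=dtypes.half), Tensor.rand(4, 4, dtype=dtypes.half)
c = a.matmul(b, acc_dtype=dtypes.float32)
print(c.dtype)  # float32: no automatic downcast back to half
```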
chenyu
67e8df4969 remove numpy from dtype (#4969)
replaced all dtype.np with _to_np_dtype defined in tensor.py.

after this, the only numpy usages are (1) Tensor(np.ndarray), (2) construct .numpy() output, (3) numpy random buffer
2024-06-14 15:38:45 -04:00
chenyu
287d3c3b84 support list, tuple input in dtypes.from_py (#4945)
* support list, tuple input in dtypes.from_py

and used it to infer dtype from python list and tuple in Tensor constructor.

* fix tests
2024-06-13 13:38:06 -04:00
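A usage sketch; the exact returned dtypes are assumptions about the defaults:

```
from tinygrad import dtypes

print(dtypes.from_py(True))       # bool
print(dtypes.from_py([1, 2, 3]))  # default int
print(dtypes.from_py((1, 2.0)))   # default float: a float anywhere wins over ints
```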
qazal
637f482588 configure derandomizing CI tests (#4793) 2024-05-31 17:06:58 +03:00
Szymon Ożóg
de5c69c4c9 Unify test_dtype naming conventions (#4730) 2024-05-25 10:12:40 -04:00
chenyu
47aba47f64 update Torch.gather api (#4692)
* update Torch.gather api

gather(self, dim, index) to match torch

* fix that
2024-05-22 21:54:06 -04:00
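A usage sketch of the new argument order:

```
from tinygrad import Tensor

t = Tensor([[1, 2], [3, 4]])
idx = Tensor([[0, 0], [1, 0]])
print(t.gather(0, idx).numpy())  # [[1 2] [3 2]]: dim first, then index, like torch.gather
```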
chenyu
286b4dbdf2 compile raise CompileError and skip only RuntimeError in multiprocess beam (#4646)
* compile raise CompileError and skip only RuntimeError in multiprocess beam

renderer error with multiprocess should not be skipped by beam

* use `==` for dtype to dtype comparison

* that needs to be is

* typo
2024-05-19 00:25:25 -04:00
chenyu
04f2327ca3 fix abs of diff of uint (#4411) 2024-05-15 18:39:11 -04:00
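A numpy sketch of the bug class this fixes, independent of tinygrad:

```
import numpy as np

a, b = np.uint8(3), np.uint8(5)
# naive a - b wraps to 254 in uint8, so abs(a - b) is wrong;
# the well-defined spelling is max - min:
print(np.maximum(a, b) - np.minimum(a, b))  # 2
```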
nimlgen
eb9689336e nv mockgpu (#4600)
* mockgpu nv

* works

* comment that out

* fix merge

* setup gpuocelot

* install packages

* not run all of them

* passes

* fix ci

* almost

* should pass

* linter

* linter 2

* try this?

* ugh, not supported

* ci

* remove ticket from description

* better descs
2024-05-15 23:46:08 +03:00
chenyu
3c11ca452e skip CLANG test casts between double and half for now (#4609)
started breaking after the GitHub CI image update
2024-05-15 16:17:06 -04:00
chenyu
7eb035e7c5 stronger test case for half mean overflow (#4470) 2024-05-07 22:40:09 -04:00
chenyu
ca7300c783 fix half mean and its backward (#4469)
* fix half mean and its backward

cast to sum_acc_type, sum, div, then cast back

* mean dtype tests
2024-05-07 21:46:41 -04:00
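A hedged sketch of the fix described in the message (accumulate wide, then cast back); the helper name is illustrative:

```
from tinygrad import Tensor, dtypes

def mean_half(t: Tensor) -> Tensor:
    # cast to the accumulation dtype, sum, divide, then cast back to the input dtype
    return (t.cast(dtypes.float32).sum() / t.numel()).cast(t.dtype)
```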
qazal
35dfbc6354 rand_for_dtype helper (#4459) 2024-05-07 00:03:42 +03:00
chenyu
826cccd54d fix mean underflow for half tensor (#4377)
* fix mean underflow for half tensor

divide only the reduce factor. added unit test and non-nan assertion in resnet training. also added a failing test case for symbolic shape var

* skip for python backend
2024-05-01 13:38:57 -04:00
chenyu
077ea6926c remove downcast_half in sum (#4376)
breaks boolean mean and other stuff
2024-05-01 11:46:44 -04:00
chenyu
93abcd3113 fix function.py sum backward without downcast_half (#4353)
without downcast_half, sum output dtype can be different from input dtype. cast back to input dtype in function.py
2024-04-29 17:53:02 -04:00
chenyu
c1d8d425eb fix mean of half tensor if sum is greater than half.max (#4327)
sum of half already accumulates in float32; add an arg to not downcast to half and use that in mean
2024-04-28 18:04:54 -04:00
qazal
23445db2b9 no skipped tests in RHIP (#4337)
* delete skip

* delete split skip

* remu dev

* compiler fails here

* Revert "remu dev"

This reverts commit 28b933d4eb.
2024-04-28 12:23:05 -04:00
chenyu
63eb0a68af fix return dtype of gather (#4159) 2024-04-12 16:25:12 -04:00
chenyu
d9c5a2b1bb fix return dtype of getitem Tensor indexing (#4158)
the use of sum can auto-upcast the result. fixed by using the data dtype as the acc_dtype
2024-04-12 15:55:02 -04:00
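A sketch of the invariant after the fix, assuming tensor-index getitem:

```
from tinygrad import Tensor, dtypes

t = Tensor([10, 20, 30], dtype=dtypes.int8)
print(t[Tensor([0, 2])].dtype)  # int8: the internal sum accumulates in the data dtype
```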
chenyu
380f27d629 move sum acc_dtype into lazy so it applies to backward (#4149)
* move sum acc_dtype into lazy so it applies to backward

* unit test
2024-04-11 14:43:56 -04:00
chenyu
7bc560ec49 remove outdated bf16 comments in test_dtype (#3987) 2024-03-29 00:56:18 -04:00
uuuvn
8a40d7d423 Shape changing bitcast and assert bitcast in disk (#3973)
* Shape changing bitcast

* only support it on disk

* basic test

* more tests

* RuntimeError instead of assert

* create unique temp files

* move tests that use disk to test_disk_tensor

* linter

* remove assert on error messages

* that's RuntimeError now

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-03-28 21:49:10 -07:00
chenyu
793ab0512e use ctypes to truncate float64 and float32 in uops (#3986)
this fixed the softmax.argmax bug for ops_python as the float is truncated to float32
2024-03-28 23:56:50 -04:00
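A minimal sketch of the ctypes trick, assuming float32 truncation is the goal:

```
import ctypes

def truncate_fp32(x: float) -> float:
    # round-trip through a C float to get float32 rounding of a Python double
    return ctypes.c_float(x).value

print(truncate_fp32(1 / 3))  # 0.3333333432674408, not the float64 value
```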
chenyu
4ecd5789ab #include <tgmath.h> in ops_clang (#3927)
* different clang sqrt/log2/exp2/sin function based on dtype

fixed softmax_argmax issue in #3552 for clang.

* tgmath.h

* revert those
2024-03-25 17:48:57 -04:00
chenyu
83f39a8ceb env var to change default float (#3902)
* env var to change default float to fp16 or bf16

looking for standard names for these. we already have FLOAT16 (which affects IMAGE) and HALF (which converts weights).

working on default bf16 too.
```
RuntimeError: compile failed: <null>(6): error: identifier "__bf16" is undefined
    __bf16 cast0 = (nv_bfloat16)(val0);
```

remove that in cifar

* DEFAULT_FLOAT

* default of default

* unit test

* don't check default

* tests work on linux
2024-03-24 20:33:57 -04:00
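A usage sketch; DEFAULT_FLOAT is the env var this commit adds, while the accepted value spelling is an assumption:

```
import os
os.environ["DEFAULT_FLOAT"] = "HALF"  # must be set before tinygrad is imported

from tinygrad import Tensor, dtypes
assert Tensor([1.0]).dtype == dtypes.half
```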
chenyu
2c69888654 include negative float in test_dtype (#3884)
* include negative float in test_dtype

* that is ub

* too annoying

* pack can overflow
2024-03-24 02:39:15 -04:00
chenyu
2d3ce53348 touchup test_dtype.test_gradient_dtype (#3887)
add back changes lost in the bad merge from #3613 and add float.double and float.bfloat16 to the test
2024-03-22 20:56:45 -04:00
David Hou
fc11808a79 initialize Tensor grad same type as self (#3613)
* initialize Tensor grad same type as self

* also test different default float

* check dtype + try/finally

* don't test_gradient_dtype if f16 is not supported

* fix bad merge

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-03-22 20:33:18 -04:00
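A sketch of the invariant this commit establishes:

```
from tinygrad import Tensor, dtypes

t = Tensor([1.0, 2.0], dtype=dtypes.half, requires_grad=True)
t.sum().backward()
assert t.grad.dtype == t.dtype  # the grad is created in the tensor's own dtype
```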
chenyu
c5467e5bd6 diverse test value in test_dtype DATA based on dtype (#3864)
* diverse test value in test_dtype DATA based on dtype

* eh fix typo

* that too?

* PTX does not support i8 and s8

* skip that

* unused line

* put the hack back

* remove that
2024-03-22 14:22:06 -04:00
chenyu
d17900bc45 use int32 instead of default_int in simplify_phi_loops (#3828)
* use int32 instead of default_int in simplify_phi_loops

indices are in int32 now and are separated from the buffer dtype. fix #3823

* return early if not supported

* it's not that

* why is it failing for RHIP
2024-03-19 17:49:58 -04:00
chenyu
99cbc24390 use dtypes.int32 as return dtype for functions that return indices (#3827)
behavior matches jax. It's fine to have a tensor longer than the int8 max even if we set the default int to int8
2024-03-19 17:06:57 -04:00
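A sketch of the rule, using argmax as one of the index-returning functions:

```
from tinygrad import Tensor, dtypes

t = Tensor([3, 1, 2], dtype=dtypes.int8)
print(t.argmax().dtype)  # int32, independent of the default int dtype
```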