tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-25 23:08:06 -05:00

Author	SHA1	Message	Date
Ahmed Harmouche	10618aba98	Bring back WebGPU (#7063 ) * Start from andredaprato:webgpu-clean * Fix infs * inf wgsl function is not needed * Emulated ulong for threefry, more tests passing * Randomness tests passing * Update model export to support new changes in webgpu, efficientnet export works again * Simplify shift emulation in wgsl * Delete test file * Fix bigger than u32 u32 literal * Why was skip copies added here? * Python3.12 for webgpu tests * Fix model export syntax error * Get test ops passing with some skips * Fix lint * Much simpler shift * Run more tests * Timestamp queries are not supported in CI, so skip search tests * All fancy indexing passing * r is ctx * Run more dtype tests by using is_dtype_supported * Cleanup ulong shift rendering * UPat -> Pat, UOps -> Ops * Pat -> UPat * Refactor render_ushift if-else * Pattern to avoid ulong mul * Remove vals_dtype * is_nan trick + rewrite, test_isnan passing * Rewrite a * select(1, nan, gate) -> select(a, nan, gate) * No arg, just op * Support char, uchar, short, ushort * Run test_index_mnis now that we have uint8 * Fix pyling * Save 3 lines by using base Compiler * No more long emulation * Remove fixup_binops * No more external_local_bufx wgsl specific cstyle modif, use base extra_pm * Simpler, faster copyin/out * Skip some new tests that use long * Fix typo * copyout touchup * Save lines by using render_cast * WebGL is not supported in core, delete it from is_dtype_supported * More narrow test skips for some unary tests * TernaryOps, UnaryOps -> Ops * TinyGrad supports WebGPU * StableDiffusion demo: f16tof32 gpu is a lib, update UI * Packed load/store, no more scale_size, no core tinygrad changes * Rename copyin, copyout * Device -> dev * Fix lint * Pattern matcher rule for packed load/store * Refactor * Shorter packed load/store * this should fix lint * Fix mypy * SD compile script working * New SD webgpu UI * New default prompt * New SD weights * Fix title when webgpu not available * Run symbolic tests, simplify is_nan, use round_up * Show step time on UI * Bump minimum wgpu version to v0.19 * Fix latent --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-11-26 12:26:40 +08:00
chenyu	40d7535eeb	clean up DTYPES_DICT [pr] (#7845 )	2024-11-22 10:01:34 -05:00
George Hotz	205befa788	move is_dtype_supported to device [pr] (#7575 )	2024-11-07 20:38:03 +08:00
Ahmed Harmouche	36488a2a43	Use is_dtype_supported in more places in tests (#7529 )	2024-11-04 09:21:15 -05:00
George Hotz	76a41a1083	don't compare with pointer dtype (#7394 ) * don't compare with pointer dtype * more cleanup * images are pointers * handle IMAGE better * cleaner test_image * this work * pr match * cleanup	2024-10-30 17:48:27 +08:00
George Hotz	4e2895f8d2	safe changes from new dtype branch [pr] (#7397 ) * safe changes from new dtype branch [pr] * only image test on GPU	2024-10-30 17:18:48 +08:00
George Hotz	27995a2a04	vcount + cleanups (#7393 ) * Revert "Revert "Restore vcount [pr] (#7390)" (#7392)" This reverts commit `4ca53db604`. * ugh bugfix [pr] * uops_to_dtypes function * fixups * varnames * fix mypy * just 4,8 * tests	2024-10-30 12:50:15 +08:00
George Hotz	4ca53db604	Revert "Restore vcount [pr] (#7390 )" (#7392 ) This reverts commit `1058f9c9ff`.	2024-10-30 11:40:25 +08:00
George Hotz	1058f9c9ff	Restore vcount [pr] (#7390 ) * Revert "Revert "add vcount to PtrDtype (#7388)"" This reverts commit `399a5219dd`. * Revert "Revert "add tests to vcount stuff [pr] (#7389)"" This reverts commit `cc8d6dbdf3`. * no ptr	2024-10-30 11:27:55 +08:00
George Hotz	cc8d6dbdf3	Revert "add tests to vcount stuff [pr] (#7389 )" This reverts commit `1b7084899b`.	2024-10-30 10:56:49 +08:00
George Hotz	1b7084899b	add tests to vcount stuff [pr] (#7389 )	2024-10-30 10:54:54 +08:00
George Hotz	4cb236a495	index in cstyle (#7328 ) * index only in cstyle * fix prefix dtypes * fix tests * global indexing * Revert "global indexing" This reverts commit `4d507e8abb`. * fix image * fix image * ptx tests * fix CUDA dtype rendering	2024-10-29 13:06:26 +08:00
chenyu	f511ad9103	No pyint again (#7156 ) * Revert "bring back pyint (#7150)" This reverts commit `37e83ca6fc`. * remove truncate in const folding * truncate_output=False	2024-10-19 13:48:59 -04:00
chenyu	37e83ca6fc	bring back pyint (#7150 ) fixed test_failure_52 and resnet. need to understand this better	2024-10-18 14:54:37 -04:00
Bhavya Gada	b7b2017cb9	only ignore warnings not errors (#7146 )	2024-10-18 07:41:11 -04:00
Bhavya Gada	534597e753	fix all test warnings (#7024 ) * fix pytorch warning in nn.conv2d for same padding * fix future warning in torch load * fix overflow warning in tensor list test: https://github.com/numpy/numpy/issues/23606#issuecomment-1512752172 * fix floating point warnings in dtype tests using docs https://numpy.org/doc/stable/reference/generated/numpy.errstate.html and a neat solution https://stackoverflow.com/questions/53634965/change-np-seterr-behavior-inside-a-function-only * put err state in one place; comment taken care of by function hover * enter np errstate context manager on test setup * put decorator on class	2024-10-18 08:56:40 +08:00
George Hotz	ded1b38b84	minor dtype cleanup [pr] (#7124 ) * minor dtype cleanup [pr] * use ptr() function	2024-10-17 17:41:23 +08:00
George Hotz	f85c9ba00a	rewrite max to use cmplt + where (#7037 )	2024-10-14 20:00:51 +08:00
George Hotz	85a45164fb	remove pyint [pr] (#7016 ) * remove pyint * bump time on tp [pr] * dont truncate in const fold * remove dead code * Revert "dont truncate in const fold" This reverts commit `29c81db0f7`. * remove define_var	2024-10-12 22:36:24 +08:00
chenyu	75d9dcf000	support dtype in softmax and log_softmax (#6914 ) matches torch. for mixed precision training, we would want to use float for softmax	2024-10-06 07:18:15 -04:00
wozeparrot	2b899164c6	no numpy (#6751 )	2024-09-26 16:40:18 +08:00
George Hotz	cb22ef379a	truncate consts early (#6741 ) * truncate consts early * ptx still fails * Update dtype.py	2024-09-25 16:49:51 +08:00
George Hotz	1b4d1823b7	add pyint to DTYPES_DICT [run_process_replay] (#6477 ) * add pyint to DTYPES_DICT [run_process_replay] * also fix uop alu bug * exclude pyint there too * ne ne * force explicit dtype	2024-09-11 17:31:59 +08:00
chenyu	002303c145	fix output of truncate_fp16 (#6381 ) make sure the non-inf path returns the truncated value	2024-09-05 22:55:43 -04:00
chenyu	590c0922b6	Tensor.prod (#6250 ) * Tensor.prod a new reduce op! * onnx ReduceProd	2024-08-23 10:06:32 -04:00
wozeparrot	0c5189de25	threefry half (#6154 )	2024-08-18 15:23:12 -07:00
samm393	2dc586ffe5	Shape change bitcast for more dtypes (#6047 ) * bitcast & tests * use to_dtype * put disk tensor tests back * tests * bitmask * no bitmask --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-08-14 10:03:34 -07:00
chenyu	4a65010de8	remove CUDACPU flag in tests [run_process_replay] (#5902 ) no longer used	2024-08-04 16:06:38 -04:00
chenyu	c67e9887f7	support using str to specify dtype (#5897 ) * support using str to specify dtype in Tensor creation and args into `cast` and `bitcast`, and acc_dtype * more tests	2024-08-04 12:56:28 -04:00
samm393	2c94316bd2	ull literal support and test (#5789 ) * ull literal support and test * missing .numpy()	2024-07-29 11:50:49 -04:00
chenyu	600a39771d	fix Tensor.arange if (stop-start) and step have different signs (#5775 )	2024-07-28 14:34:10 -04:00
kormann	2c4add6844	pretty print lazy op per default (#5505 ) * pretty lop * min diff * walrus * fix * min diff * simplify * pretty helper function * ws * pretty uop upat * tests * stricter tests * test passes * ws * stronger upat test * delete print_tree * min diff * stricter exp test * fix merge * stronger uops eval test * +readable and deep upat test * +readable and deep upat test * sort inv fix * fix * revert allowed_len	2024-07-18 09:34:08 -07:00
chenyu	f8a47608cc	test dtype.min and dtype.max (#5479 ) compared with np.iinfo for integer dtype	2024-07-14 15:31:37 -04:00
chenyu	ca021229e4	fix attention to always return in the same dtype as input (#5100 ) mid cast to default_float does not work as intended when default is float32 and qkv is in half	2024-06-22 10:34:57 -04:00
chenyu	cc2be9064f	fix out of bound python list into numpy array (#5043 ) numpy 2.0 does not allow oob python const and recommends writing as `np.array(value).astype(dtype)`	2024-06-18 18:05:21 -04:00
chenyu	acaf9a490d	RECIP(-0.0) should be -inf (#5024 ) * RECIP(-0.0) should be -inf added test_dtype_alu for PYTHON backend * catcht that * fix those two	2024-06-17 22:26:58 -04:00
chenyu	03b367c014	handle float16 overflow in PYTHON (#5022 ) * handle float16 overflow in PYTHON use `truncate` when constructing tensor from list to make sure all values are packable (might be slow, but should be correct). add truncate_fp16 to cast overflowed values to inf/-inf. * all valid fmt supports truncate	2024-06-17 21:12:52 -04:00
chenyu	4296507021	Tensor.sum returns in acc_dtype if specified (#5012 ) * Tensor.sum returns in acc_dtype if specified * skip PYTHON for now * revert that * relax that	2024-06-17 16:35:52 -04:00
chenyu	2b07847f2b	matmul returns in acc_dtype if specified (#4994 ) more flexible to not automatically downcast, can fix bert mixed precision training with this	2024-06-16 12:56:15 -04:00
chenyu	67e8df4969	remove numpy from dtype (#4969 ) replaced all dtype.np with _to_np_dtype defined in tensor.py. after this, the only numpy usages are (1) Tensor(np.ndarray), (2) construct .numpy() output, (3) numpy random buffer	2024-06-14 15:38:45 -04:00
chenyu	287d3c3b84	support list, tuple input in dtypes.from_py (#4945 ) * support list, tuple input in dtypes.from_py and used it to infer dtype from python list and tuple in Tensor constructor. * fix tests	2024-06-13 13:38:06 -04:00
qazal	637f482588	configure derandomizing CI tests (#4793 )	2024-05-31 17:06:58 +03:00
Szymon Ożóg	de5c69c4c9	Unify test_dtype naming conventions (#4730 )	2024-05-25 10:12:40 -04:00
chenyu	47aba47f64	update Torch.gather api (#4692 ) * update Torch.gather api gather(self, dim, index) to match torch * fix that	2024-05-22 21:54:06 -04:00
chenyu	286b4dbdf2	compile raise CompileError and skip only RuntimeError in multiprocess… (#4646 ) * compile raise CompileError and skip only RuntimeError in multiprocess beam renderer error with multiprocess should not be skipped by beam * use `==` for dtype to dtype comparison * that needs to be is * typo	2024-05-19 00:25:25 -04:00
chenyu	04f2327ca3	fix abs of diff of uint (#4411 )	2024-05-15 18:39:11 -04:00
nimlgen	eb9689336e	nv mockgpu (#4600 ) * mockgpu nv * works * comment that out * fix merge * setup gpuocelot * install packages * not run all of them * passes * fix ci * almost * should pass * linter * linter 2 * try this? * ugn, not supported * ci * remove ticket from description * better descs	2024-05-15 23:46:08 +03:00
chenyu	3c11ca452e	skip CLANG test casts between double and half for now (#4609 ) start breaking after github CI image update	2024-05-15 16:17:06 -04:00
chenyu	7eb035e7c5	stronger test case for half mean overflow (#4470 )	2024-05-07 22:40:09 -04:00
chenyu	ca7300c783	fix half mean and its backward (#4469 ) * fix half mean and its backward cast to sum_acc_type, sum, div, then cast back * mean dtype tests	2024-05-07 21:46:41 -04:00

173 Commits