nimlgen
f986e12f91
metal: choose compile spec based on macos ( #9188 )
* metal: choose compile spec based on macos
* correction
2025-02-21 00:43:39 +03:00
chenyu
3e22747799
run unit test on windows ci ( #9187 )
* factor out testing_minimal in setup.py [pr]
* testing_unit + windows
2025-02-20 14:40:41 -05:00
chenyu
287de4ecc6
use torch in test_gradient ( #9186 )
uses torch.autograd.grad as the reference; not sure if it can be templated like the jax version
2025-02-20 12:26:11 -05:00
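A minimal sketch of the reference-comparison pattern this entry describes, assuming torch and numpy are installed; `check_gradient_against_torch` and the tolerances are illustrative, not the test's actual helpers.

```python
# Hedged sketch: compare a tinygrad gradient against torch.autograd.grad as a reference.
import numpy as np
import torch
from tinygrad import Tensor

def check_gradient_against_torch(f_tiny, f_torch, x_np):
  # tinygrad side: backward through a scalar loss
  xt = Tensor(x_np, requires_grad=True)
  f_tiny(xt).sum().backward()
  grad_tiny = xt.grad.numpy()
  # torch reference gradient
  xp = torch.tensor(x_np, requires_grad=True)
  (grad_torch,) = torch.autograd.grad(f_torch(xp).sum(), xp)
  np.testing.assert_allclose(grad_tiny, grad_torch.numpy(), atol=1e-5, rtol=1e-5)

check_gradient_against_torch(lambda x: (x * x).relu(), lambda x: (x * x).relu(),
                             np.random.randn(4, 4).astype(np.float32))
```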
qazal
574a905291
Fix running VIZ=1 after package installation + test ( #9183 )
* test running viz from pip install
* add pkg
* do 10 connection attempts
* include assets in package_data
* quiet curl
* better print
2025-02-20 15:02:00 +01:00
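The package_data change is the piece that makes VIZ=1 work from a pip install. A hypothetical setup.py fragment along those lines; the package name and asset paths here are illustrative, not tinygrad's actual layout.

```python
# Hypothetical sketch: ship the viz server's static assets with the wheel so it
# still works after `pip install`, when the repo checkout isn't around.
from setuptools import setup, find_packages

setup(
  name="mypkg",
  packages=find_packages(),
  # non-.py files are dropped from the wheel unless listed here
  package_data={"mypkg.viz": ["index.html", "assets/*"]},
)
```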
chenyu
1692087db5
_one_hot_along_dim input needs to be int ( #9179 )
* _one_hot_along_dim input needs to be int
indexing and one_hot compare against an arange, so a non-int dtype is likely a bug
2025-02-20 09:00:43 -05:00
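A small sketch of the arange-comparison pattern the message refers to (illustrative, not the private `_one_hot_along_dim` implementation): the indices are compared against an integer arange, which is why they need an int dtype.

```python
from tinygrad import Tensor, dtypes

idx = Tensor([2, 0, 1], dtype=dtypes.int32)                    # indices must be int
one_hot = (idx.unsqueeze(-1) == Tensor.arange(4)).where(1, 0)  # compare against arange
print(one_hot.numpy())  # [[0 0 1 0] [1 0 0 0] [0 1 0 0]]
```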
George Hotz
bf36967883
cuda hooking ( #9180 )
* cuda hooking
* progress
* more hook cuda
* fix params
* compile + cuMemHostAlloc hook
* work
* revert that
2025-02-20 19:20:01 +08:00
chenyu
3b37cc898b
add bert tiny config ( #9177 )
set with BERT_SIZE=tiny; makes it easier to study embedding and fusion
2025-02-19 14:57:03 -05:00
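A hypothetical sketch of what such a size switch tends to look like; the env var name comes from the commit, the tiny numbers follow the standard BERT-tiny shape (2 layers, hidden 128), and the dict layout is illustrative.

```python
import os

BERT_CONFIGS = {
  "tiny":  dict(layers=2,  hidden=128,  heads=2),    # small enough to inspect kernels
  "large": dict(layers=24, hidden=1024, heads=16),   # the usual training default
}
cfg = BERT_CONFIGS[os.getenv("BERT_SIZE", "large")]
```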
qazal
5662c898f1
correctly step through bottom_up_rewrites in viz [pr] ( #9176 )
2025-02-19 19:20:57 +01:00
peppingdore
b1ddb2a1a6
fix win32 CPUProgram missing cache flush ( #9171 )
* win32: fix missing inst cache flush, rename ptr->self.mem for consistency with posix code
* fix types, remove assert
* fix memory leak
* rm whitespace
2025-02-19 21:38:51 +08:00
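The missing step was flushing the instruction cache after copying machine code into an executable buffer. A hedged ctypes sketch of that idea (illustrative, not CPUProgram verbatim):

```python
import ctypes

def load_code_win32(code: bytes):
  k32 = ctypes.windll.kernel32
  k32.VirtualAlloc.restype = ctypes.c_void_p
  k32.FlushInstructionCache.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
  # MEM_COMMIT|MEM_RESERVE = 0x3000, PAGE_EXECUTE_READWRITE = 0x40
  mem = k32.VirtualAlloc(None, len(code), 0x3000, 0x40)
  ctypes.memmove(mem, code, len(code))
  # without this flush, stale instructions may be executed after the memcpy
  k32.FlushInstructionCache(k32.GetCurrentProcess(), mem, len(code))
  return ctypes.CFUNCTYPE(None)(mem)
```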
qazal
1bb9d78c7a
hotfix: add output buffer back to kernel parents + comment [pr] ( #9174 )
2025-02-19 14:22:01 +01:00
chenyu
975c318dbc
bert use int32 for input ids ( #9173 )
the original data was int32 for these; float might have caused precision issues
2025-02-19 08:17:27 -05:00
qazal
e4a8bf28ea
scheduler cleanups + better cycle assert [pr] ( #9172 )
* scheduler cleanups + better cycle assert [pr]
* type_verify after assign fixup
* don't need base
* always realize sink parents
2025-02-19 13:30:58 +01:00
qazal
cf315d544b
rename can_pad arg to cache [pr] ( #9170 )
2025-02-19 12:24:59 +01:00
qazal
2fc8bf115d
remove support for VIEW with two sources in ops [pr] ( #9168 )
* only 1 src views can exist [pr]
* views can still exist without a base, this is a separate project
2025-02-19 11:10:18 +01:00
Ahmed Harmouche
a2afa523a0
Only add enable f16 directive if ShaderF16 is supported ( #9163 )
* F16 in check in wgsl renderer
* Default in renderer to fix pickle
* Refactor f16 check
2025-02-19 17:20:03 +08:00
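The gist: WGSL's `enable f16;` extension directive is only legal when the adapter reports the shader-f16 feature, so the renderer should emit it conditionally. A minimal sketch of that guard (function and flag names are illustrative):

```python
def wgsl_prelude(device_supports_f16: bool, kernel_uses_half: bool) -> str:
  # the directive must come before any declarations, and only if the feature exists
  return "enable f16;\n" if kernel_uses_half and device_supports_f16 else ""
```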
Ahmed Harmouche
0f94b98646
Force WebGPU backend type [pr] ( #9164 )
* Force webgpu backend type
* Mypy fix
* Rename to WEBGPU_BACKEND
* Add it to env_vars docs
* Remove link
2025-02-19 17:19:39 +08:00
qazal
4bc708a9b0
do not create buffers we never realize in scheduler ( #9165 )
* work
* delete
* fix
* works
* FUSE_CONV_BW
* FUSE_ARANGE
* becomes_map
* fix assign p1
* fix assign (diamond) - 2
* fix test_assign_double_diamond_reduce
* fix subbuffer
* faster rewrite
* fix simple_pads
* start metadata work
* do some diff cleanups
* make things that can't be images not images
* openpilot fix
* fix linter
* diff
* minimal diff
* more work on the diff
* metadata
2025-02-19 10:11:47 +01:00
George Hotz
1c4e9bc363
image fixup tensor map [pr] ( #8611 )
Co-authored-by: qazal <qazal.software@gmail.com>
2025-02-19 10:11:06 +02:00
qazal
2a5fe3e700
whitespace changes from the map_tensors branch [pr] ( #9167 )
2025-02-19 09:52:59 +02:00
qazal
a773ff73e3
match image cast folding on the cast itself [pr] ( #9166 )
2025-02-19 09:31:34 +02:00
qazal
9a20063837
create subbuffer immediately before constructing ScheduleItem [pr] ( #9162 )
2025-02-18 21:07:52 +01:00
qazal
1c92534bff
hotfix: viz should show if there's a rewrite [pr] ( #9161 )
2025-02-18 19:11:03 +01:00
George Hotz
a330f3338c
save applied opts in ProgramSpec [pr] ( #9150 )
2025-02-19 00:40:03 +08:00
chenyu
ff05bff221
put bert data shard inside jit ( #9160 )
python time 45ms -> 9ms; it was spending that time scheduling the shard
also init bert data on CLANG since it comes from numpy, so we don't create the tensor on the default device and then shard it onto GPUS
2025-02-18 10:36:54 -05:00
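A rough sketch of the pattern described here, with placeholder devices and a stand-in step function: the numpy-backed batch is created on CLANG, and the shard onto the GPUs happens inside the jitted function so its scheduling cost is captured once instead of paid every step.

```python
import numpy as np
from tinygrad import Tensor, TinyJit

GPUS = ("NV:0", "NV:1")  # placeholder device list

@TinyJit
def step(x: Tensor) -> Tensor:
  xs = x.shard(GPUS, axis=0)     # shard inside the jit
  return (xs.float() * 2).sum()  # stand-in for the real bert step

# numpy data goes straight onto CLANG instead of the default device
batch = Tensor(np.zeros((8, 128), dtype=np.int32), device="CLANG").realize()
print(step(batch).item())
```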
qazal
679291e26a
assert only base maps to buffer [pr] ( #9159 )
2025-02-18 15:46:47 +01:00
qazal
4f592eeea6
hotfix: remove extra matcher for copy/buffer_view [pr] ( #9157 )
2025-02-18 13:21:24 +01:00
George Hotz
ff9b985d9f
hotfix: View Base AST
2025-02-18 18:48:34 +08:00
George Hotz
30f470eaa3
UNIQUE UOp for buffer instead of arg ( #9156 )
* UNIQUE UOp for buffer instead of arg
* factor out buffer spec
2025-02-18 16:59:59 +08:00
qazal
38f5ea2132
increment writable buffers refcount from the kernel graph [pr] ( #9153 )
2025-02-18 10:20:02 +02:00
George Hotz
ddddcc165b
colors back in DEBUG=2 [pr] ( #9155 )
2025-02-18 16:17:57 +08:00
George Hotz
6d62966bf7
add support for named rewrites [pr] ( #9152 )
2025-02-18 16:07:04 +08:00
George Hotz
caee42e8a6
Revert "name from uops [pr] ( #9151 )" ( #9154 )
This reverts commit 28897be9a2.
2025-02-18 16:06:44 +08:00
George Hotz
28897be9a2
name from uops [pr] ( #9151 )
2025-02-18 15:52:03 +08:00
George Hotz
a4dab3ec3f
add name uop ( #9149 )
* add name uop, TODO: refactor renderer to use
* renderer uses name uop
* fix tests
* render
* ptx
2025-02-18 15:26:58 +08:00
George Hotz
2db8b4046a
minor linearizer refactor to finalize in rewrite [pr] ( #9148 )
2025-02-18 12:42:22 +08:00
George Hotz
df3b320f46
rewriter -> devectorizer [pr] ( #9147 )
2025-02-18 12:42:08 +08:00
chenyu
5dc1257ce0
clean up bert fake data iterator [pr] ( #9145 )
reuse the same get_data_bert path in setup and real run
2025-02-17 20:03:38 -05:00
qazal
751c517b6c
cancel viz request after the kernel clicked away [pr] ( #9144 )
2025-02-17 20:19:09 +01:00
chenyu
465421b525
fix Tensor.isclose ( #9143 )
many corner cases around inf and nan
2025-02-17 12:03:12 -05:00
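The corner cases are the usual isclose ones: the comparison is |a - b| <= atol + rtol * |b|, nan is not close to nan unless equal_nan is set, and inf is only close to an equal inf. A small illustrative check:

```python
from tinygrad import Tensor

a = Tensor([1.0, float("inf"), float("-inf"), float("nan")])
b = Tensor([1.0 + 1e-9, float("inf"), float("inf"), float("nan")])
print(a.isclose(b).numpy())                  # [ True  True False False]
print(a.isclose(b, equal_nan=True).numpy())  # [ True  True False  True]
```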
qazal
36741cbbc1
enable real_size assert for test_conv_2x2_backward_one_view [pr] ( #9142 )
2025-02-17 17:53:44 +01:00
qazal
e9ff4ef4f7
s/ScheduleContext/GrouperContext [pr] ( #9141 )
* refactor to kernel context [pr]
* s/ScheduleContext/GrouperContext [pr]
2025-02-17 17:14:17 +01:00
qazal
96cc9f59e0
refactor to kernel context [pr] ( #9140 )
2025-02-17 16:57:14 +01:00
qazal
df6781332e
remove var_vals from the scheduler context [pr] ( #9139 )
* remove var_vals from the scheduler context [pr]
* maps to int
2025-02-17 16:43:50 +01:00
Ali Ladjevardi
35e9c4657b
Use proper units when printing beam time ( #9103 )
* use proper units when printing beam time
* refactor DEBUG=2
2025-02-17 23:41:38 +08:00
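A hypothetical helper in the spirit of the change, scaling a duration to a readable unit instead of always printing one fixed unit:

```python
def fmt_time(seconds: float) -> str:
  # pick the largest unit that keeps the number readable
  if seconds < 1e-3: return f"{seconds * 1e6:6.2f} us"
  if seconds < 1.0:  return f"{seconds * 1e3:6.2f} ms"
  return f"{seconds:6.2f} s"
```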
Clément Verrier
a7f91224eb
add Tensor.isclose() ( #8844 )
* add `Tensor.isclose()`
* support `equal_nan`
so as to match PyTorch's behavior
* update unit tests
* remove some tests temporarily
* re-enable one test
* re-enable other test
* try to fix failing tests during CI
* save one line of code
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-17 10:11:40 -05:00
qazal
2b787c3b17
hotfix: lower ul.disabled opacity for viz [pr] ( #9138 )
2025-02-17 15:16:48 +01:00
qazal
660c034da6
KERNEL op try 3 ( #9061 )
* work
* tolerate shape, maybe this is ASSIGN(RESHAPE(BUF), KERNEL)
* err, it's not ASSIGN(BUF, KERNEL), it's ASSIGN(VIEW(BUF), KERNEL)
* burn the boats
* assign slightly works
* assign works
* cleanup + var_vals can exist
* fine image + fix metadata
* metadata, without making everything 30% slower
* diff pruning
* faster assign schedule
* add_buffer_ops stage
* add kernel_spec back
* add viz display
* more strict kernel_spec
2025-02-17 14:47:54 +01:00
qazal
ec80df5115
add PROGRAM renderer to viz [pr] ( #9137 )
2025-02-17 14:46:08 +01:00
qazal
7b09a72682
don't display void dtype in viz nodes [pr] ( #9136 )
* don't display void dtype in viz nodes [pr]
* extra
2025-02-17 13:49:36 +01:00
George Hotz
4dd10d03b7
move is_increasing to ops [pr] ( #9134 )
2025-02-17 19:27:48 +08:00