tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-24 22:38:16 -05:00

Author	SHA1	Message	Date
George Hotz	0514594083	fix openpilot test	2022-10-20 11:56:26 -07:00
George Hotz	b7f748c15a	Fix GPU 2*31 virtual size limit (#392 ) in progress * big conv test works * that's unneeded * fix opencl with reduce * rewrite contiguous_view_constant_fold * clean up mids in loop code * subidx * print cl kernel before run * no reduce, no loop * Revert "no reduce, no loop" This reverts commit `92777e40e9`.	2022-10-05 00:55:20 -04:00
George Hotz	8382c51c12	always MATMUL, test the ops in OPENCL	2022-10-01 13:31:29 -04:00
Ollin Boer Bohan	3b1767e013	Fix OpenCL Metal texture issues (#378 ) * Fix OpenCL Metal texture issues Tile CL images when needed, to fit into the 16384 max Metal image size; gets me to ~4.8s/iteration for SD on M1 Pro with OPENCL=1 FLOAT16=1. * Minor cleanup * Fix mish in CI, or no-op? * Is mish being framed? * It would help if any of this reproduced locally * ??? * OPT is reverted; use original mish * Cleanup post-review * Fix some shape usage * Tiler tests, shouldn't oom or overflow either * Can't CL if there's no CL? * Run tiler tests even if GPU=1 * relu6 segfault binary chop; revert test * relu6 segfault binary chop; revert accel * relu6 segfault binary chop; revert . (???) * end relu6 segfault binary chop; repo's haunted	2022-09-29 01:21:54 -04:00
Comma Device	75f937227a	add barrier	2022-09-13 11:39:48 -04:00
George Hotz	3c3534736e	fix matmul kernel and tests	2022-09-13 08:31:04 -07:00
Comma Device	62e9419206	fix test failure on MATMUL=1 backward pass	2022-09-13 11:18:52 -04:00
George Hotz	0516359af8	fix stupid OPENCL=1 OOM	2022-09-06 14:29:23 -07:00
George Hotz	f215534a64	1100 lines, but sane linter rules	2022-09-06 13:47:45 -07:00
George Hotz	f683b26eef	bring back native exp log	2022-09-06 07:59:04 -07:00
George Hotz	d6f499fd69	improve opencl, why is it OOMing	2022-09-05 20:14:31 -07:00
Comma Device	c07bf72d6e	save free 200ms	2022-08-31 20:31:42 -04:00
Comma Device	a734df98fa	TEST_ENET for openpilot compiler	2022-08-31 13:23:36 -04:00
George Hotz	e194ae0c1d	typos	2022-08-30 19:52:21 -07:00
George Hotz	5efab7cf1d	add reciprocal	2022-08-29 18:00:24 -07:00
George Hotz	dc7af8c3ac	thneed run float32	2022-08-28 11:03:35 -07:00
Comma Device	9678cb8a1a	hmm, the native exp/log breaks it too much	2022-08-22 17:13:08 -07:00
George Hotz	2162cd3383	fix typing	2022-08-22 16:25:15 -07:00
Comma Device	e0a8d0f836	image input works	2022-08-22 16:04:17 -07:00
George Hotz	18340e7d30	remove from_image	2022-08-22 15:52:26 -07:00
Comma Device	1b5f4e52d9	refactor getters	2022-08-22 13:29:08 -07:00
George Hotz	a8734df030	add openpilot tests to tinygrad	2022-08-21 12:03:37 -07:00
George Hotz	b132de677d	tinygrad.nn (#367 ) * tinygrad.nn * flake8 * working on pylint * more pylint * more pylint * pylint passes * networkx * mypy can't infer that type * junk	2022-08-18 07:41:00 -07:00
George Hotz	5d45c6e516	Fold reduce (#362 ) * folding reduce * fold through movementops * fixup shapes * was too aggressive * i knew we needed that * don't recompute reduce * working * fix openpilot compile * prunegraph openpilot * types and reduce_shape * refactor * cleanups * neater * 1009 * 1004 * clean up reduce for 998	2022-07-19 09:24:02 -07:00
George Hotz	5e96ed523a	fix opencl bug, no training on opencl	2022-07-17 12:55:26 -07:00
George Hotz	608e2431f7	test opencl, commit to removing the crap conv code from GPU	2022-07-17 11:55:37 -07:00
George Hotz	3c4565fa21	SLICE -> PAD,SHRINK	2022-07-17 11:33:59 -07:00
George Hotz	bcf422dfdd	Device2 (#358 ) * option for matmul * fixups * fast like a nascar * running * thneed runner * no buffer id makes no backing buffer * move constant folding to the top * runs on mac * folded biases * was v slow * maybe just that * elu touchup * speed and float32 Co-authored-by: Comma Device <device@comma.ai>	2022-07-16 07:26:19 -07:00
George Hotz	817b64f5e5	A conv is a reduce op (#356 ) * universal strided conv * more correct * hmm, CPU works * cleaner cl code output * make noconv a flag * cleanup __getitem__ * refactor broadcasting * put that back * unneeded reshape in getitem * fix strided for torch	2022-07-10 19:58:50 -07:00
George Hotz	68959be05d	precompute weights for opencl	2022-07-08 10:56:48 -07:00
George Hotz	d8e7f1f8bc	opencl type ignore	2022-07-08 10:33:55 -07:00
George Hotz	ae335b6d3e	opencl works, but tons of kernels	2022-07-08 10:22:04 -07:00
George Hotz	5b66d1bb0b	begin fixing up opencl	2022-07-08 10:20:14 -07:00
George Hotz	e3c2579537	flip stride to match canonical	2022-06-26 19:19:53 -07:00
George Hotz	3e13e3330a	UNSAFE_FLOAT4 env	2022-06-22 08:20:29 -07:00
George Hotz	73415e20ab	this fixes 2 of the conv recomputes...but it's ugh	2022-06-22 08:18:12 -07:00
George Hotz	b2d5df6049	3 convs are being recomputed	2022-06-22 07:54:52 -07:00
George Hotz	ba2defcdef	elif False	2022-06-21 23:54:09 -07:00
George Hotz	9cb0522574	noargs	2022-06-21 23:48:58 -07:00
George Hotz	1074dfbb71	unstrided	2022-06-21 23:42:21 -07:00
George Hotz	9ae01290ba	pass in shorts	2022-06-21 23:33:23 -07:00
George Hotz	18d74c01b1	float4 opt	2022-06-21 21:27:51 -07:00
George Hotz	ff3d5fe962	debugging while we compile	2022-06-21 21:12:04 -07:00
George Hotz	9d06a86f7f	CL class, debugging	2022-06-21 20:16:29 -07:00
George Hotz	1ebc2b5545	lazy opencl works	2022-06-21 19:41:08 -07:00
George Hotz	c53c91f949	opencl tests passed (#347 )	2022-06-21 18:57:09 -07:00
George Hotz	77f5cef8a6	First batch from lazy branch (#332 ) * test and helpers from lazy * lazy pt2	2022-06-18 17:26:59 -07:00
George Hotz	52505faaf4	minor	2022-06-16 15:53:45 -07:00
George Hotz	d5b3e18540	Accelerate with CL (#325 ) * accelerated opencl * it's running, it's just wrong * bugfix * model is correct in opencl * lazy image convert * add padding support to convolution * that stuff was all upstreamed * remove HEAD * oops * test_simple_conv2d_4 passes, add dilation support * put logic in ops_opencl * fix crash * hmm, stride seems okay * padding for batched inputs * just an issue now with cout%4 * op model still passes * fix startPackedInputChannel * pre and post processing ops for graph * don't break other llops * shapetrackering * reshapes are free * lazy movement ops	2022-06-16 15:40:52 -07:00

49 Commits