tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-09 15:08:02 -05:00

Author	SHA1	Message	Date
George Hotz	49c6e6d472	Latest attempt to add image (#462) * add image * load + store + boring stuff: * image tests pass * thneed print GFLOPS * op conv test * more debugging * hack for multiview image * shapetracker creates less views * disable image tests * working better * ugh, lkey not key * print in DEBUG, and allow views * works * simple padding conv2d * use index for image * that was bad code * debug print * fix types * less lines * save lines	2023-01-12 17:36:30 -08:00
George Hotz	4885fce56e	shapetracker from newgpu (#456) * shapetracker from newgpu * touchup ops * test * testst * thneed deletes unused inputs * test * bugfix	2023-01-09 12:40:01 -08:00
George Hotz	2cc1d970c6	updates from the chonker branch	2022-11-07 21:12:08 -08:00
George Hotz	db2da22a04	stop blowing up floats	2022-10-30 16:47:16 -07:00
George Hotz	8afc643bb1	fix bug in ops test, it was cheating somehow	2022-10-30 16:43:24 -07:00
George Hotz	2f602a92ff	seperate STRIDED and EXPAND	2022-10-30 13:23:58 -07:00
George Hotz	52bfbc31be	vectorization	2022-10-29 12:47:52 -07:00
George Hotz	e473d35f90	llvm doesn't vectorize	2022-10-29 11:59:48 -07:00
George Hotz	b65b70812a	Exec AST (#404) * working exec ast * exec_ast is staticmethod * GenericExecAST * fold that sometimes * ExplicitExecAST * exec_ast for GPU * gpu working * get_lazyop_shape * now gpubuffer is ExplicitExecAST * dedup * add a type * RESHAPE in opencl code * fix linter * that too for linter * cleanups * remove dead code * GenericShape is less lines * add ALLOWED_KERNEL_COUNT to tests * fix mypy * that's gotta be recursive * fix opencl shape processing * remove unneeded lambda	2022-10-28 08:27:03 -07:00
George Hotz	10921a60c4	more imports from llvm branch	2022-10-26 18:02:36 -07:00
Drew Hintz	a4ad1d774a	enable tests in test_ops.py that are disabled but now work. (#396) remove custom tolerances that don't appear to be needed.	2022-10-13 09:58:53 -07:00
George Hotz	b7f748c15a	Fix GPU 2**31 virtual size limit (#392) * in progress * big conv test works * that's unneeded * fix opencl with reduce * rewrite contiguous_view_constant_fold * clean up mids in loop code * subidx * print cl kernel before run * no reduce, no loop * Revert "no reduce, no loop" This reverts commit `92777e40e9`.	2022-10-05 00:55:20 -04:00
George Hotz	7a61dc7ee9	test_sd_big_conv	2022-10-01 13:26:05 -04:00
George Hotz	271446e3eb	set requires_grad to None (#387) * set requires_grad to None * some things need gradients * hmm, why was get_parameters filtering	2022-09-21 11:16:02 -04:00
George Hotz	29ae21bb0d	import tests from CL metal texture fix	2022-09-19 20:01:47 -04:00
George Hotz	57e804a9bf	add min support	2022-09-18 20:39:41 -04:00
George Hotz	3c3534736e	fix matmul kernel and tests	2022-09-13 08:31:04 -07:00
Comma Device	62e9419206	fix test failure on MATMUL=1 backward pass	2022-09-13 11:18:52 -04:00
Comma Device	3b82afc6a0	simple on device failing test	2022-09-13 10:59:15 -04:00
George Hotz	4efde1ba0a	test_matmul	2022-09-13 07:51:33 -07:00
George Hotz	790af99a48	fix slice one multi, and linear can be simpler with new broadcasting	2022-09-06 19:51:33 -07:00
YassineYousfi	5aad460c7a	broadcast from right to left (#375) * broadcast from right to left * add another broadcasted add test	2022-09-06 16:36:13 -07:00
George Hotz	bcb867cdd6	better idea for numbers, do the division in python	2022-09-03 16:23:39 -07:00
George Hotz	033a3ecccf	found tinygrad bug	2022-09-03 12:32:43 -07:00
George Hotz	5d45c6e516	Fold reduce (#362) * folding reduce * fold through movementops * fixup shapes * was too aggressive * i knew we needed that * don't recompute reduce * working * fix openpilot compile * prunegraph openpilot * types and reduce_shape * refactor * cleanups * neater * 1009 * 1004 * clean up reduce for 998	2022-07-19 09:24:02 -07:00
George Hotz	f93e297804	fix bug caused by rounding	2022-07-17 12:49:58 -07:00
George Hotz	bcf422dfdd	Device2 (#358) * option for matmul * fixups * fast like a nascar * running * thneed runner * no buffer id makes no backing buffer * move constant folding to the top * runs on mac * folded biases * was v slow * maybe just that * elu touchup * speed and float32 Co-authored-by: Comma Device <device@comma.ai>	2022-07-16 07:26:19 -07:00
George Hotz	5e46561f7e	no_grad = NOT backward	2022-07-10 20:54:57 -07:00
George Hotz	b34ae7876f	lol chr(10) not chr(13)	2022-07-10 20:03:11 -07:00
George Hotz	93c378dffc	add test for slice_one	2022-07-03 12:14:20 -07:00
George Hotz	dffde3de5a	support both asymmetric and negative padding	2022-06-26 17:59:25 -07:00
George Hotz	49c954b389	comments	2022-06-26 17:20:25 -07:00
George Hotz	8c483fbdc9	maxpool lazy fix	2022-06-26 17:07:03 -07:00
George Hotz	6b652dafb2	touchups	2022-06-19 16:57:14 -07:00
George Hotz	d5b3e18540	Accelerate with CL (#325) * accelerated opencl * it's running, it's just wrong * bugfix * model is correct in opencl * lazy image convert * add padding support to convolution * that stuff was all upstreamed * remove HEAD * oops * test_simple_conv2d_4 passes, add dilation support * put logic in ops_opencl * fix crash * hmm, stride seems okay * padding for batched inputs * just an issue now with cout%4 * op model still passes * fix startPackedInputChannel * pre and post processing ops for graph * don't break other llops * shapetrackering * reshapes are free * lazy movement ops	2022-06-16 15:40:52 -07:00
George Hotz	2a14befb74	support padding	2022-06-15 14:46:44 -07:00
George Hotz	fef6c82491	wow dilation support was simple	2022-06-15 11:38:23 -07:00
George Hotz	0b182029dd	support dilated convolution in torch	2022-06-14 18:03:35 -07:00
George Hotz	a690ba4588	add test for padding	2022-06-14 17:41:22 -07:00
George Hotz	e057ca23bb	add flip	2022-06-14 17:28:43 -07:00
George Hotz	dcbca4fdf1	Expand Operator (#327) * replace broadcasting with expand * Tensor, not self * remove broadcasting from mlops * delete useless A operator * expand, not repeat * remove A op * expand on gpu * binary_op doesn't broadcast anymore * expand is still total junk, but the tests should pass	2022-06-12 12:31:48 -07:00
George Hotz	33f18c61a1	test_broadcasted_add	2022-06-12 10:19:58 -07:00
George Hotz	85d17a2acd	running resnet onnx	2022-06-11 13:17:15 -07:00
George Hotz	db5a632e8c	multicat + test onnx is generic onnx	2022-06-11 11:50:47 -07:00
George Hotz	08de1aa636	add flatten to tinygrad	2022-06-11 11:15:16 -07:00
George Hotz	d061ce8d5e	add ELU support	2022-06-11 10:47:23 -07:00
George Hotz	8864b37333	fix torch convdw	2022-06-10 15:04:39 -07:00
George Hotz	aac1a9b419	this breaks tests	2022-06-10 12:20:42 -07:00
George Hotz	a1dff4061b	minor cleanups	2022-06-06 08:14:52 -07:00
George Hotz	58ed46963e	fix broadcastdot	2021-11-29 18:54:57 -05:00

638 Commits