Commit Graph

10417 Commits

George Hotz
1290e01e2c all ops supported on GPU now 2020-12-03 10:43:11 -08:00
George Hotz
621a93b777 ane in readme 2020-12-03 10:40:31 -08:00
George Hotz
1dcaecacc4 Support for Apple Neural Engine (#130)
* ane query is success

* cite and build instructions

* low level access, need to disable AMFI

* coreml_ane works

* coreml fun

* more work

* compiled example

* progress

* compiler works

* model flow

* TODOs in the readme

* put some real weights in

* we are learning objc

* much progress i think

* signed model still doesn't work

* working example

* there are float16

* clean up: part 1

* h11ane header, more cleanup

* cleanup DeviceController creation

* remove the stupid sleep

* notes

* start a hwx parser

* no tabs

* compare stuff

* hmm, why don't inputs work

* cache doesn't seem to fix it

* hmm, the issue was the compiler

* fix the compiler, guess i didn't put in weights

* logging for compiler

* uselessness in plist

* remove hwx before compile, weights are converted to float16

* better compare

* better compare

* last line in compare

* opcodes from compiler

* notes
2020-12-03 10:32:26 -08:00
baplou
c83cebccda Made the readme more consistent (#136) 2020-11-28 08:20:02 -06:00
Marcel Bischoff
541330c42a Update README.md (#133)
Should we put `ipython3`? Otherwise the path doesn't work, or we have to add the env; not sure which is nicer.
2020-11-25 07:53:54 -08:00
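
A note on the path issue above: running an example with plain `python3` doesn't put the repo root on `sys.path` unless the package is installed, while `ipython3` effectively does. A minimal workaround sketch, assuming the script sits one directory below the repo root:

```python
# Top of an example script: make the repo root importable without installing.
import os, sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from tinygrad.tensor import Tensor  # now resolves without pip install
```
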
Mufeed VH
0bbf66627c Define ProfileOp class once (#131)
* define `ProfileOp` class once

* clean `ProfileOp` class

* removed `else: pass`
2020-11-24 19:39:13 -08:00
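
The fix above addresses a common Python pitfall: defining a class inside a frequently called function creates a fresh class object on every call. A sketch of the define-once pattern, with illustrative timing fields rather than tinygrad's exact class:

```python
import time

# Defined once at module level instead of inside every profiled call.
class ProfileOp:
  def __init__(self, name): self.name = name
  def __enter__(self):
    self.start = time.time()
    return self
  def __exit__(self, *exc):
    print(f"{self.name} took {(time.time() - self.start) * 1000:.2f} ms")

# usage: with ProfileOp("conv2d"): out = conv(x)
```
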
George Hotz
03994e0011 load torch files without torch 2020-11-21 13:43:53 -08:00
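
Loading a torch checkpoint without torch comes down to driving `pickle` directly and intercepting the torch-specific globals. A minimal sketch of the idea (the stub is illustrative; real legacy checkpoints additionally need a `persistent_load` hook to pull in the raw storage bytes):

```python
import pickle

class FakeTorchUnpickler(pickle.Unpickler):
  # Resolve torch globals to stand-ins so unpickling never imports torch.
  def find_class(self, module, name):
    if module.startswith("torch"):
      return lambda *args: ("stub", module, name, args)
    return super().find_class(module, name)

def fake_torch_load(path):
  with open(path, "rb") as f:
    return FakeTorchUnpickler(f).load()
```
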
Marcel Bischoff
26899869a2 Update tensor.py (#128)
Otherwise `.cpu()` is broken if the default device is GPU
2020-11-21 09:16:03 -08:00
adamritter
f190ca446d Detach (#123)
* Detach

* Torch.detach reuses the buffer in the

* Fix test

* wakey wakey GitHub Actions

Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-19 19:03:42 -08:00
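
Detach returns a tensor that reuses the underlying buffer but is cut out of the autograd graph, so gradients stop flowing through it. A self-contained sketch of the semantics (field names are illustrative, not tinygrad's exact internals):

```python
class Tensor:
  def __init__(self, data, requires_grad=True):
    self.data = data                # raw buffer, e.g. a numpy array
    self.requires_grad = requires_grad
    self._ctx = None                # link to the op that produced this tensor

  def detach(self):
    # Same buffer, no graph link: backward() stops here.
    return Tensor(self.data, requires_grad=False)
```
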
Colin Manko
8383ff40ad fix pyopencl (#125) 2020-11-19 19:03:04 -08:00
adamritter
5797e63d9b Train efficientnet should respect NUM environment variable (#122)
Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-16 20:02:31 -08:00
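
"Respect NUM" presumably means reading the knob from the environment the way the other example scripts do. A one-line sketch (the default value is an assumption):

```python
import os
NUM = int(os.getenv("NUM", "2"))  # overridable per run: NUM=5 python train_efficientnet.py
```
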
dustcollector12
ee99d016e9 tensor implementation for rmsprop and adam (#121)
* tensor implementation for rmsprop and adam

* test_mnist.py extended to cover sgd, rmsprop and adam on cpu and gpu

* number of steps reduced for adam from 1000 to 200
2020-11-16 15:07:49 -08:00
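
For reference, the update rules these two optimizers implement, sketched with numpy on plain arrays (hyperparameter defaults are the conventional ones, not necessarily tinygrad's):

```python
import numpy as np

def rmsprop_step(w, g, v, lr=0.001, decay=0.9, eps=1e-8):
  # v: running average of squared gradients.
  v[:] = decay * v + (1 - decay) * g * g
  w -= lr * g / (np.sqrt(v) + eps)

def adam_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
  # m, v: first and second moment estimates; t: 1-based step count.
  m[:] = b1 * m + (1 - b1) * g
  v[:] = b2 * v + (1 - b2) * g * g
  mhat = m / (1 - b1 ** t)
  vhat = v / (1 - b2 ** t)
  w -= lr * mhat / (np.sqrt(vhat) + eps)
```
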
George Hotz
17bf90dbe4 unbroadcasting works on the GPU 2020-11-16 09:16:55 -08:00
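
Unbroadcasting is the backward half of broadcasting: the gradient flowing into a broadcast input must be summed back down to that input's original shape. A numpy sketch of the rule:

```python
import numpy as np

def unbroadcast(grad, shape):
  # Collapse extra leading dims, then sum every axis that was stretched from 1.
  while grad.ndim > len(shape):
    grad = grad.sum(axis=0)
  for i, s in enumerate(shape):
    if s == 1 and grad.shape[i] != 1:
      grad = grad.sum(axis=i, keepdims=True)
  return grad

assert unbroadcast(np.ones((4, 3)), (1, 3)).shape == (1, 3)
```
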
George Hotz
17eab716b6 unbroadcast GPU template 2020-11-16 08:16:36 -08:00
George Hotz
2ffb8de1ea move efficientnet to extra 2020-11-16 08:08:07 -08:00
George Hotz
13d34373d1 move gradcheck to extra, clean up unbroadcast 2020-11-16 08:03:31 -08:00
George Hotz
ed4c35e2e9 channels on the inside 2020-11-15 21:19:59 -08:00
adamritter
fb1df81c7d Fix train_efficientnet (#120)
Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-15 20:50:31 -08:00
George Hotz
1207fe4c7d cleanup LogSoftmax 2020-11-15 20:49:57 -08:00
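
LogSoftmax is usually written with the log-sum-exp trick so large logits don't overflow exp(). A numpy sketch of the numerically stable form:

```python
import numpy as np

def logsoftmax(x, axis=-1):
  # log softmax(x) = x - logsumexp(x); subtracting the max keeps exp() finite.
  m = x.max(axis=axis, keepdims=True)
  return x - (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True)))
```
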
George Hotz
d1441de3a6 minor cleanups 2020-11-15 20:39:19 -08:00
George Hotz
37a210f868 touchups and lines 2020-11-15 20:26:52 -08:00
adamritter
5ea3d76dfb Topological sort, zero_grads (#119)
* Topological sort, zero_grads

* Bug fix, add test

* Add zero_grads

* Put deepwalk function in backward

* Move zero_grad to optim

* Fix gradcheck hack

Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-15 20:25:29 -08:00
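
Backward needs the graph in reverse topological order, so every node's gradient is complete before being propagated to its parents. A sketch of the deepwalk idea; the `_ctx.parents` field is an assumption matching a typical define-by-run autograd:

```python
def deepwalk(node, visited=None, order=None):
  # Post-order DFS: parents land in `order` before the node itself,
  # so reversed(order) is a valid order for backprop.
  if visited is None: visited, order = set(), []
  visited.add(node)
  if node._ctx is not None:
    for p in node._ctx.parents:
      if p not in visited:
        deepwalk(p, visited, order)
  order.append(node)
  return order
```
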
George Hotz
a35425189d binop fast path for no broadcast 2020-11-15 19:12:14 -08:00
Marcel Bischoff
c7b7f8ccc8 Backwards ops supporting broadcasting (#118)
* streamlined numerical_jacobian

* Got rid of the g loop in Conv2D.forward

* erased stupid line

* nothing

* no loops in Conv2D forward

* Conv2D backprop improved

* stupid things in examples

* alternative to einsum

* Conv2D backward einsum alternative

* tidying up

* tidied up

* no ravel

* got rid of print

* Update efficientnet.py

* Update efficientnet.py

* Update efficientnet.py

* only tensordot

* 255.0

* whitespace

* aspect ratio error in efficientnet

* noprint

* efficientnet wrong strides

* broadcasting for backward ops

* Update ops.py

* Update ops.py

- was wrong

* broadcast test for backward enabled

* function adBC + not summing over axes that are already 1

* spacing

Co-authored-by: Marcel Bischoff <marcel@Marcels-iMac.local>
2020-11-15 15:21:10 -08:00
adamritter
55d93017e4 Simplify more (#117)
Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-14 06:15:31 -08:00
dustcollector12
28474949b8 refactoring of forward in reshape (#115)
* refactoring of forward in reshape

* test case for reshape added
2020-11-13 13:20:43 -08:00
dustcollector12
6f033ea30a enable local images for efficientnet.py (#116) 2020-11-13 07:00:12 -08:00
pb1729
420af82888 General broadcasting of binary operations (#114)
* Allow for general broadcasting of binary operations: any case is handled where corresponding dimensions between the tensors match, or at least one of them is of size 1. If a tensor has fewer dimensions than the other, its shape is padded with 1s until the two have the same number of dimensions (see the sketch after this entry). Also refactored buffer_zeros() by creating a function buff() that makes a buffer from a numpy array.

* remove extra tabs

Co-authored-by: phillip <phillip_bement@reedbement.com>
2020-11-12 22:27:48 -08:00
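
The rule described in this entry is the standard numpy broadcasting rule. A small sketch of just the shape computation:

```python
def broadcast_shape(a, b):
  # Left-pad the shorter shape with 1s, then each dim pair must match
  # or contain a 1.
  a, b = list(a), list(b)
  while len(a) < len(b): a.insert(0, 1)
  while len(b) < len(a): b.insert(0, 1)
  out = []
  for x, y in zip(a, b):
    if x != y and 1 not in (x, y):
      raise ValueError(f"cannot broadcast {a} with {b}")
    out.append(max(x, y))
  return tuple(out)

assert broadcast_shape((3, 1, 5), (4, 5)) == (3, 4, 5)
```
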
damianzim
2b1286eef6 Don't wrap np.int32 in a function, use an alias (#113) 2020-11-12 19:32:19 -08:00
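
The change above illustrates a small idiom: a function that only forwards its argument to a constructor can just be an alias for the constructor itself.

```python
import numpy as np

# Instead of:  def int32(x): return np.int32(x)
int32 = np.int32

assert int32(7) == 7
```
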
adamritter
08aa60d9d0 broadcasting 1s at the start, 1 kernel/4 divs version (#110)
* Pad2d backward pass on GPU

* Faster Pad2D GPU backward pass (no zeroing needed)

* Fix out of bounds error

* Don't save prg

* Let compiler optimize division by 1

* More generic broadcasting (1s at the start)

* Bug fix

* Add comment

* Try to fix flaky test with other method

* Add mixed broadcast support

* 1kernel

* Separate broadcast tests

Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-12 13:33:35 -08:00
NeuralLink
f773ef3996 tanh non first class op (#111)
* tanh non first class op

* tanh test with 1e-6 tol

Co-authored-by: Kartik Sharma <kartik.sharma@claimgenius.com>
2020-11-12 13:32:50 -08:00
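
Making tanh a non-first-class op means composing it from existing ops instead of writing a dedicated forward/backward pair, so its gradient comes for free. One standard identity, sketched with numpy (tinygrad's exact composition may differ):

```python
import numpy as np

def sigmoid(x): return 1 / (1 + np.exp(-x))

def tanh(x):
  # tanh(x) = 2*sigmoid(2x) - 1
  return 2 * sigmoid(2 * x) - 1

xs = np.linspace(-3, 3, 7)
assert np.allclose(tanh(xs), np.tanh(xs), atol=1e-6)
```
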
Ryan Neph
608bdd4872 adds broadcasting test cases (#106)
refs: #80, #90, #104, #105
2020-11-12 07:08:28 -08:00
adamritter
f1d21afe88 Somewhat more generic broadcasting (#105)
* Somewhat more generic broadcasting

* Add TODO

* Set Torch to deterministic in test

Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-11 20:33:00 -08:00
Ryan Neph
8827a536e0 GPU MaxPool2D.backward(); TinyConvNet train passes (#103)
* no trailing whitespace

* GPU MaxPool2D.backward(); TinyConvNet train passes!

* Fix GPU avgpool.forward() init_val

Doesn't change the result but is simpler.

* Fix MaxPool GPU init_val

Tests only cover random non-negative inputs. This fixes issues if negative inputs are fed to GPU MaxPool2D. Test update to follow.
2020-11-11 07:58:43 -08:00
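
The init_val fix is the classic reduction-identity mistake: a max accumulator must start at -inf (or the dtype minimum), not 0, or all-negative pooling windows get clamped to 0. A tiny sketch:

```python
import numpy as np

def naive_max(xs):
  acc = -np.inf  # init_val; starting at 0 silently clamps negative inputs
  for x in xs:
    acc = max(acc, x)
  return acc

assert naive_max([-3.0, -1.5, -2.0]) == -1.5  # with init_val=0 this would be 0.0
```
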
Marcel Bischoff
a3989f9e18 Supporting .png files in efficientnet (#102)
* to make it work locally

* definitely not working

* Conv2D GPU passes some of the tests

* Conv2D GPU passes more of the tests

* passes some tests and mnist

* removed unnecessary code

* Conv2D Backpass works

* wrong test_ops.py

* white space + test backward

* erased useless code

* removed default argument

* long lines

* works also with 4 channel .png files

* commenting out

* track
2020-11-10 20:06:24 -08:00
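
Supporting .png inputs mostly means handling the alpha channel, since the model expects 3-channel RGB. A sketch with PIL; the 224x224 resize and /255.0 scaling are assumptions about the preprocessing, not the commit's exact code:

```python
import numpy as np
from PIL import Image

def load_image(path):
  # convert("RGB") drops the alpha channel of 4-channel PNGs.
  img = Image.open(path).convert("RGB").resize((224, 224))
  return np.asarray(img, dtype=np.float32) / 255.0
```
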
George Hotz
d93cd945aa reshape makes copies 2020-11-10 16:18:59 -08:00
George Hotz
d1284fa817 stride tests and i32 2020-11-10 16:10:14 -08:00
Marcel Bischoff
7bb803c5e0 Conv2D backward on GPU (#93)
* to make it work locally

* definitely not working

* Conv2D GPU passes some of the tests

* Conv2D GPU passes more of the tests

* passes some tests and mnist

* removed unnecessary code

* Conv2D Backpass works

* wrong test_ops.py

* white space + test backward

* erased useless code

* removed default argument

* long lines
2020-11-10 16:07:33 -08:00
George Hotz
5577b9d3a0 clean up imports 2020-11-10 15:53:05 -08:00
George Hotz
db755fa103 promote swish to a tensor op 2020-11-10 15:48:11 -08:00
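
Swish is simple enough to state inline: swish(x) = x * sigmoid(x), the activation EfficientNet uses. A numpy sketch:

```python
import numpy as np

def swish(x):
  # x * sigmoid(x) == x / (1 + exp(-x))
  return x / (1 + np.exp(-x))
```
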
George Hotz
5f4b76a21b touch ups 2020-11-10 15:44:47 -08:00
George Hotz
52ee913c98 move the mnist loader out of tinygrad proper 2020-11-10 15:37:39 -08:00
George Hotz
498b4d2f27 i32 and reduce line count a bit 2020-11-10 15:35:30 -08:00
George Hotz
df64658a2c weee, opencl tests in CI 2020-11-10 10:04:45 -08:00
George Hotz
d47a128812 pocl 2020-11-10 10:02:13 -08:00
George Hotz
c05401a9ca sudo maybe 2020-11-10 09:53:49 -08:00
George Hotz
09bc8eddfe clinfo 2020-11-10 09:51:38 -08:00
George Hotz
58e703d099 fix tests 2020-11-10 09:49:19 -08:00
George Hotz
23405cec43 intel opencl 2020-11-10 09:41:40 -08:00
George Hotz
33090c4b0d install more 2020-11-10 09:34:56 -08:00