* Update all devices to be tested
ANE, CPU and OCL all now support all tests.
However, tests are not currently passing on GPU, and I cannot test on CPU.
The failing GPU tests are not an issue caused by this update; they have
not been passing because the required "six" package was missing.
OpenCL tests have not been run since commit 1a1c63a08b.
Devices have 3 types and are handled by a new DeviceTypes enum. (The goal
is to revert to Tensor.<type>, but this setup allows for keyword argument
defaults: `device=DeviceType.CPU`; see the sketch below.)
All references to Tensor.GPU/CPU/ANE have been converted to the
corresponding `DeviceTypes` enum.
Refactored the conversion code to allow for any-device-to-any-device
conversion.
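A minimal sketch of the idea, assuming the enum is spelled `DeviceTypes` (a later commit below renames it to `Device`); the `Tensor` constructor here is a hypothetical stand-in, not the real one:

```python
from enum import Enum
import numpy as np

class DeviceTypes(Enum):
    # The three backends referenced above; the values are arbitrary.
    CPU = 0
    GPU = 1
    ANE = 2

class Tensor:
    # Hypothetical constructor: an enum member works as a keyword-argument
    # default, which Tensor.CPU/GPU/ANE cannot, since the Tensor class does
    # not exist yet while its own methods are being defined.
    def __init__(self, data, device=DeviceTypes.CPU):
        self.data = np.asarray(data, dtype=np.float32)
        self.device = device

t = Tensor([1.0, 2.0, 3.0])                          # defaults to DeviceTypes.CPU
g = Tensor([1.0, 2.0, 3.0], device=DeviceTypes.GPU)  # explicit device
```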
* Add six dependency in requirements.txt
* Resolve failure to run tests
Move six into the GPU required installs and remove it from the standard
installation.
* Remove repeated data conversion
* Refactor method names
Also reduce code with `.to` and `.to_` (see the sketch below).
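A hedged sketch of the `.to`/`.to_` pairing: an out-of-place conversion plus an in-place variant. `convert` is a placeholder for the any-device-to-any-device conversion from the first commit above, and none of these bodies are the real implementations:

```python
from enum import Enum

class DeviceTypes(Enum):
    CPU = 0
    GPU = 1
    ANE = 2

def convert(data, src, dst):
    # Placeholder for the real any-device-to-any-device conversion.
    return data

class Tensor:
    def __init__(self, data, device=DeviceTypes.CPU):
        self.data, self.device = data, device

    def to(self, device):
        # Out-of-place: returns a new Tensor on the target device.
        return Tensor(convert(self.data, self.device, device), device)

    def to_(self, device):
        # In-place: converts this Tensor's buffer and retags its device.
        self.data = convert(self.data, self.device, device)
        self.device = device
```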
* Dynamic device handlers
* Refactor DeviceTypes -> Device
* Add mem copy profiling back
* test_backward_pass_diamond_model passing
* Resolve Sum issue on GPU
* Revert batchnorm2d tests
* Update README with updated API
* ANE testing with
* Last minute line gains
* 2serious
* load/save
* fixing GPU
* added DEBUG
* needs BatchNorm or doesn't learn anything
* old file not needed
* added conv biases
* added extra/training.py and checkpoint
* assert in test only
* save
* padding
* num_classes
* checkpoint
* checkpoints for padding
* training was broken
* merge
* rotation augmentation
* more aug
* needs testing
* streamline augment; augment is fast, thus bicubic
* tidying up
* 🎉 effort to generate mnist data with tinygrad.
* dropout added
* working gan
* minor bug fixes
* more bug fixes
* todo reg l2
* detach
* logsoftmax twice
* allow for general broadcasting of binary operations. can handle any situation where corresponding dimensions between the tensors match, or at least one of them is of size 1. if a tensor has fewer dimensions than the other, its shape is padded with 1s until both have the same number of dimensions (see the sketch below). also refactored buffer_zeros() by creating a function buff() that makes a buffer from a numpy array
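A small sketch of the broadcasting rule described above, written against plain shape tuples rather than the actual GPU buffers; the helper name is made up:

```python
def broadcast_shape(shape_x, shape_y):
    # Left-pad the shorter shape with 1s until both have the same rank.
    ndim = max(len(shape_x), len(shape_y))
    x = (1,) * (ndim - len(shape_x)) + tuple(shape_x)
    y = (1,) * (ndim - len(shape_y)) + tuple(shape_y)
    out = []
    for dx, dy in zip(x, y):
        # Corresponding dims must match, or at least one of them must be 1.
        assert dx == dy or dx == 1 or dy == 1, f"incompatible dims {dx} and {dy}"
        out.append(max(dx, dy))
    return tuple(out)

print(broadcast_shape((3, 1, 5), (4, 5)))   # -> (3, 4, 5)
```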
* remove extra tabs
* messy loop unrolling
* fix loop unrolling bugs
* revert loop unrolling changes, new plan here
* binary_op(): avoid having a loop in the GPU C code; instead compute indices with nested expressions. simple broadcasts should have a similar level of performance to the simple-broadcast-specific code that was there before. broke out codegen and compilation into get_binop_prg(), which has a larger cache and depends only on the operation type and complist (this avoids doing a bunch of python string ops every time we want to compile something we've already compiled). the larger cache is needed since there will end up being quite a few possible types of broadcasts (sum_i^N 3**i is a loose upper bound, N being the maximum number of dimensions). I assumed 5 kinds of binary operations when sizing the cache here: +, -, *, /, and **. more may be needed in the future.
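A hedged sketch of the caching idea only: the generated program is keyed on the operation and the broadcast description ("complist"), so the Python string work happens once per distinct pattern. `functools.lru_cache`, the function name, and the kernel text below are stand-ins, not the real `get_binop_prg()`:

```python
import functools

@functools.lru_cache(maxsize=2048)  # roomy: ~5 op kinds times many broadcast patterns
def get_binop_prg_source(code, complist):
    # complist must be hashable (e.g. a tuple) because it is part of the cache key.
    # In the real codegen it drives the nested index expressions that replace the
    # loop; here the indexing is left as the trivial same-shape case.
    return (
        "__kernel void binop(__global const float *a_g,"
        " __global const float *b_g, __global float *res_g) {\n"
        "  int gid = get_global_id(0);\n"
        "  float a = a_g[gid]; float b = b_g[gid];\n"
        f"  res_g[gid] = {code};\n"
        "}\n"
    )

# Repeated calls with the same (code, complist) pair reuse the cached source.
src = get_binop_prg_source("a+b", ((False, True), (True, False)))
```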
* add .cl to binop arguments
* solved edge case where len(dimlist)==0. still problems when len(dimlist) > CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS
* pyopencl can't handle more than 3 gids, so we just use 1 gid and compute the indices into the returned tensor in the kernel. this means more computation for the individual indices, but less for the index into the flattened tensor (last line of kernel), since it's just gid0
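The index arithmetic described above, shown in Python rather than in the generated OpenCL: the single flat gid is decomposed into one index per output dimension, from which the (possibly broadcast) input offsets are then built, while the output offset stays just the gid itself:

```python
def unflatten(gid, shape):
    # Decompose a flat output index into per-dimension indices, as the kernel
    # does from the single global id it is launched with.
    idxs = []
    for dim in reversed(shape):
        idxs.append(gid % dim)
        gid //= dim
    return tuple(reversed(idxs))

print(unflatten(17, (3, 4, 5)))   # -> (0, 3, 2), since 17 = 0*20 + 3*5 + 2
```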
* trim some lines
Co-authored-by: phillip <phillip_bement@reedbement.com>
* number of categories for efficientnet
* need layer_init_uniform
* merge fail
* merge fail
* batchnorms
* needs work
* needs work: how to determine training
* pow
* needs work
* reshape was needed
* sum with axis
* sum with axis and tests
* broken
* works again
* clean up
* Update test_ops.py
* using sum
* don't always update running_stats
* space
* self
* default return running_stats
* passes test
* need to use mean
* merge
* testing
* fixing pow
* test_ops had a line dropped
* undo pow
* rebase
* Consistent GPU classes
Convert the existing GPU classes into one standard format.
Remove duplicated functions in `test_mnist` and create a TestMNISTGPU
class. This reduces line count and ensures consistency.
Use `@unittest.skipUnless(GPU, "Requires GPU")` instead of `if GPU:` to
skip GPU testing. This will ensure that skipped tests are displayed
accordingly in the pytest output.
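For reference, the pattern being adopted (`unittest.skipUnless` is the real stdlib API; the `GPU` flag and class below are stand-ins for the actual test code):

```python
import os
import unittest

GPU = os.getenv("GPU") is not None  # assumption: flag derived from the environment

@unittest.skipUnless(GPU, "Requires GPU")
class TestMNISTGPU(unittest.TestCase):
    def test_mnist_gpu(self):
        ...  # runs only when GPU is truthy; otherwise pytest reports it as skipped
```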
* Optim Testing now supports GPU
* Tensor testing now supports GPU
jacobian and gradcheck are auto-skipped until GPU float64 support is added.
* GPU support for custom constructor methods
* Remove GPU flag from Model constructors
It was requested that the `gpu` kwarg be removed from the model
constructor. GPU conversion is now handled in the train function.
This also required the conversion of Optimizer parameters, as they are
constructed prior to execution of the `train` function and are dependent
on the model GPU state.
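A minimal sketch of the shape this takes, assuming the `Device` enum from the first sketch above (after its rename) and the `get_parameters` helper sketched under the commit below; the signature is an assumption, not the one in the tree:

```python
def train(model, optim, steps, gpu=False):
    # GPU conversion now happens here instead of via a `gpu` kwarg on the model
    # constructor. The optimizer was built from the model's parameter tensors
    # before train() runs, so moving those tensors in place keeps the
    # optimizer's references pointing at the same (now GPU-resident) buffers.
    if gpu:
        for p in get_parameters(model) + get_parameters(optim):
            p.to_(Device.GPU)   # in-place move; see the .to_ sketch above
    for _ in range(steps):
        ...  # sample a batch, forward pass, loss, backward, optim.step()
```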
* Fix typo: float32->float64
* Clean `get_parameters` utility
Just a quick refactor w/ the new support for optimizers.
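A hedged sketch of what a `get_parameters`-style utility typically does: recursively walk an object's attributes and containers and collect every Tensor, so the same call works for a model and for an optimizer that holds parameters. The `Tensor` class here is a stand-in:

```python
class Tensor:
    # Stand-in for tinygrad's Tensor, just so the sketch is self-contained.
    def __init__(self, data):
        self.data = data

def get_parameters(obj):
    # Collect Tensors from attributes, lists, and tuples, recursively.
    params = []
    if isinstance(obj, Tensor):
        params.append(obj)
    elif isinstance(obj, (list, tuple)):
        for x in obj:
            params.extend(get_parameters(x))
    elif hasattr(obj, "__dict__"):
        for v in obj.__dict__.values():
            params.extend(get_parameters(v))
    return params
```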
* Remove GPU kwarg from TinyNet
Remove the `gpu` kwarg from TinyNet to match the test_mnist `train` function.