tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-22 21:38:10 -05:00

Author	SHA1	Message	Date
George Hotz	c2eeb6950b	add support for sign. technically relu can be second class now	2021-01-03 08:29:57 -08:00
NeuralLink	0825cf7f79	⚡ Added softplus and mish non stable (#220) * ⚡ Added softplus and mish CPU * 🔨 refactor * 🔨 second class softplus and mish * 🔨 test fix * no need of device in testing	2021-01-03 08:08:41 -08:00
Liam	ebd72ff437	Test split (#231) * Split tests Split tests into "Test CPU" and "Test GPU". Add test flag "TEST_DEVICES" which is a comma separated list of devices: CPU,GPU,ANE * Run tests based on provided TEST_DEVICES flag By default will run all "CPU,GPU,ANE" * fix bad quote * Revert changes and use GPU=1 This is done through setting the default Tensor Device to Device.CPU of GPU=1 is set. Run GPU tests: GPU=1 pytest -s -v	2021-01-01 09:19:03 -05:00
George Hotz	4291002881	reorder GPU ops	2020-12-31 09:46:39 -05:00
Marcel Bischoff	e2f833f58f	max to behave on ties like torch (#229) * checkpoint * fixing pow * undo pow * backward max on GPU and CPU rewrite * indentation * changing seed for curiosity * max replaced equality * undo seed * rebase * fixed tests * merge error	2020-12-30 18:52:50 -05:00
George Hotz	fcfe3dae01	write slice for CPU	2020-12-30 10:32:53 -05:00
George Hotz	f9170505b3	if you like your transformers twice as slow, use the GPU	2020-12-29 17:14:23 -05:00
George Hotz	6a6a82e999	support multidot on GPU	2020-12-29 16:56:30 -05:00
George Hotz	27208d729b	add GPU max thanks to marcelbischoff	2020-12-29 16:44:14 -05:00
George Hotz	02655c07d5	break maxpool2d on GPU	2020-12-29 13:05:57 -05:00
George Hotz	061e37de39	touchups	2020-12-29 12:41:21 -05:00
George Hotz	a2e6562330	fix max op, less lines	2020-12-29 10:47:04 -05:00
Marcel Bischoff	dc8fa7999c	Transpose on GPU (#221) * 2serious * load/save * fixing GPU * added DEBUG * needs BatchNorm or doesn't learn anything * old file not needed * added conv biases * added extra/training.py and checkpoint * assert in test only * save * padding * num_classes * checkpoint * checkpoints for padding * training was broken * merge * rotation augmentation * more aug * needs testing * streamline augment, augment is fast thus bicubic * tidying up * transformer eval * axis=-1 * transpose * test for permutation using torch.movedims * another test * line	2020-12-29 10:40:11 -05:00
George Hotz	36579f66bf	max op	2020-12-28 23:54:52 -05:00
George Hotz	fafece9db7	avgpool2d is a second class op	2020-12-28 10:41:59 -05:00
George Hotz	593233b668	log and exp are first class ops	2020-12-28 10:00:30 -05:00
George Hotz	a361ef6861	fixup training loop	2020-12-27 18:35:56 -05:00
George Hotz	f15bec6dbc	make multidot work on CPU	2020-12-27 17:25:37 -05:00
George Hotz	131e04c90c	cpu only decorator	2020-12-27 17:18:55 -05:00
George Hotz	2f1b2c0a3b	add transpose, start on transformer	2020-12-27 16:59:12 -05:00
iainwo	56d44637f3	fixed pylint, formatted python files iwth cblack on localhost (#204) * fixed pylint, formatted python files iwth cblack on localhost * Revert "fixed pylint, formatted python files iwth cblack on localhost" This reverts commit `07e2b88466`. * dedented 4-spaces added linter Co-authored-by: Iain Wong <iainwong@outlook.com>	2020-12-17 14:37:31 -08:00
Liam	bcf1518309	All devices are equal! (#196) * Update all devices to be tested ANE, CPU and OCL all now support all tests. However tests are not currently passing on GPU and I cannot test on CPU. Failing GPU test are not an issue caused by this update. Tests have not been passing due to a missing "six" required installation. OpenCL Tests have not been run since commit: `1a1c63a08b` devices have 3 types and are handle by a new DeviceTypes enum. (The goal is to revert to Tensor.<type>, but this current setup allows for keyword argument defaults: `device=DeviceType.CPU`) All references to Tensor.GPU/CPU/ANE as been converted to the corresponding `DeviceTypes` enum. Refactor of the conversion code to allow for any device to any device conversion. * Add six dependency in requirements.txt * Resolve failure to run tests Move six into gpu required installs. Remove six from standard installation. * Remove repeated data conversion * Refactor method names Also reduce code with .to and .to_ * Dynamic device handlers * Refactor DeviceTypes -> Device * Add mem copy profiling back * test_backward_pass_diamond_model passing * Resolve Sum issue on GPU * Revert batchnorm2d tests * Update README with upadated API * ANE testing with * Last minute line gains	2020-12-15 23:44:08 -08:00
Marcel Bischoff	da72a0eed4	Big MNIST model with PIL augmentation and load/save (#160) * 2serious * load/save * fixing GPU * added DEBUG * needs BatchNorm or doesn't learn anything * old file not needed * added conv biases * added extra/training.py and checkpoint * assert in test only * save * padding * num_classes * checkpoint * checkpoints for padding * training was broken * merge * rotation augmentation * more aug * needs testing * streamline augment, augment is fast thus bicubic * tidying up	2020-12-13 20:45:55 -08:00
George Hotz	1d10559d1d	tinygrad.utils -> extra.utils	2020-12-12 15:26:07 -08:00
James Roberts	8e8cbc74b3	Minor clean up (#184) * Removes unused imports * Minor clean up	2020-12-11 14:25:29 -08:00
Daulet	c7e95ddb21	Add diamond model test (#181) * add backward pass test for diamond model * fix train_efficientnet example	2020-12-11 09:21:36 -08:00
Marcel Bischoff	5d46df638a	abs as non-first class operation using relu (#171) * abs (non-first class) * whitespace	2020-12-09 12:20:34 -08:00
George Hotz	ffb96b2d0b	batchnorm by marcelbischoff	2020-12-09 03:23:04 -08:00
NeuralLink	00e376f36c	leaky relu as geohot suggested (#167)	2020-12-09 02:58:35 -08:00
George Hotz	c225e62dd2	touchups	2020-12-09 02:52:28 -08:00
Liam	89d0ff6989	Consistent testing (#137) * Consistent GPU classes Convert the existing GPU classes into one standard format. Remove duplicated functions in `test_mnist` and create a TestMNISTGPU class. This reduces line count and ensures consistency. Use `@unittest.skipUnless(GPU, "Requires GPU")` instead of `if GPU:` to skip GPU testing. This will ensure that skipped tests are displayed accordingly in the pytest output. * Optim Testing now supports GPU * Tensor testing now supports GPU jacobian and gradcheck auto skipped until GPU float64 support added. * GPU support for custom constructor methods * Remove GPU flag from Model constructors It was requested that the `gpu` kwarg be removed from the model constructor. GPU conversion is now handled in the train function. This also required the conversion of Optimizer parameters as they are constructed prior to execution of the `train` function and are dependant on the model GPU state. * Fix typo: float32->float64 * Clean `get_parameters` utility Just a quick refactor w/ the new support for optimizers. * Remove GPU kwarg from TinyNet Remove `gpu` kwarg from tiny net to match test_mnist `train` function.	2020-12-09 02:25:27 -08:00
Daulet	24d688c184	win more lines for core library (#158) ...and sacrifice test speed	2020-12-08 14:18:45 -08:00
George Hotz	4e1a0de392	fix rsub	2020-12-08 10:05:21 -08:00
George Hotz	c4540f1b8c	Support scalars by kartik4949	2020-12-08 09:52:07 -08:00
George Hotz	97fd9c1237	zero_grad there to match readme	2020-12-07 23:12:18 -08:00
George Hotz	b355cd2571	Mean axis (doesn't work) (#154) * mean axis * fixed	2020-12-07 22:58:34 -08:00
Marcel Bischoff	58ccebd7cd	Sum with axis (#153) * sum with axis and tests * broken * works again * clean up * Update test_ops.py	2020-12-07 21:49:18 -08:00
George Hotz	3b982f2f7a	get_parameters	2020-12-06 13:47:28 -08:00
George Hotz	102e6356e9	replace layer_init_uniform with .uniform	2020-12-06 13:44:31 -08:00
George Hotz	51daaa43d4	fix memory leaks, add gc test	2020-12-06 10:34:40 -08:00
George Hotz	17659f7dd7	gpu speedup, tests work on M1	2020-12-06 09:05:49 -08:00
adamritter	f190ca446d	Detach (#123) * Detach * Torch.detach reuses the buffer in the * Fix test * wakey wakey GitHub Actions Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>	2020-11-19 19:03:42 -08:00
dustcollector12	ee99d016e9	tensor implementation for rmsprop and adam (#121) * tensor implementation for rmsprop and adam * test_mnist.py extended to cover sgd, rmsprop and adam on cpu and gpu * number of steps reduced for adam from 1000 to 200	2020-11-16 15:07:49 -08:00
George Hotz	17bf90dbe4	unbroadcasting works on the GPU	2020-11-16 09:16:55 -08:00
George Hotz	17eab716b6	unbroadcast GPU template	2020-11-16 08:16:36 -08:00
George Hotz	13d34373d1	move gradcheck to extra, clean up unbroadcast	2020-11-16 08:03:31 -08:00
adamritter	5ea3d76dfb	Topological sort, zero_grads (#119) * Topological sort, zero_grads * Bug fix, add test * Add zero_grads * Put deepwalk function in backward * Move zero_grad to optim * Fix gradcheck hack Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>	2020-11-15 20:25:29 -08:00
Marcel Bischoff	c7b7f8ccc8	Backwards ops supporting broadcasting (#118) * streamlined numerical_jacobian * Got rid of the g loop in Conv2D.forward * ereased stupid line * nothing * no loops in Conv2D forward * Conv2D backprop improved * stupid things in examples * alternative to einsum * Conv2D backward einsum alternative * tidying up * tidied up * no ravel * got rid of print * Update efficientnet.py * Update efficientnet.py * Update efficientnet.py * only tensordot * 255.0 * whitespace * aspect ratio error in efficientnet * noprint * efficient net wrong strides * broadcasting for backward ops * Update ops.py * Update ops.py - was wrong * broadcast test for backward enabled * function adBC + not summing over already 1 axis * spacing Co-authored-by: Marcel Bischoff <marcel@Marcels-iMac.local>	2020-11-15 15:21:10 -08:00
dustcollector12	28474949b8	refactoring of forward in reshape (#115) * refactoring of forward in reshape * test case for reshape added	2020-11-13 13:20:43 -08:00
pb1729	420af82888	General broadcasting of binary operations (#114) * allow for general broadcasting of binary operations. can handle any situation where corresponding dimensions between the tensors match, or at least one of them is of size 1. if a tensor has fewer dimensions than the other, then its size is padded with 1s until they match have the same number. also refactored buffer_zeros() by creating a function buff() that makes a buffer from a numpy array * remove extra tabs Co-authored-by: phillip <phillip_bement@reedbement.com>	2020-11-12 22:27:48 -08:00


4505 Commits