Commit Graph

10417 Commits

gamwe6
dad061dafb Added Python 3 style super() without arguments (#200)
Co-authored-by: gamwe6 <gamwe6@users.noreply.github.com>
2020-12-16 20:50:16 -08:00
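A minimal sketch of the change this commit describes: Python 3 allows `super()` with no arguments inside a method, replacing the older explicit form. Class names below are illustrative, not the actual tinygrad modules.

```python
class Optimizer:
    def __init__(self, params):
        self.params = params

class SGD(Optimizer):
    def __init__(self, params, lr=0.01):
        # Python 2 style: super(SGD, self).__init__(params)
        # Python 3 style, as adopted in this commit:
        super().__init__(params)
        self.lr = lr
```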
Liam
bcf1518309 All devices are equal! (#196)
* Update all devices to be tested

ANE, CPU and OCL all now support all tests.

However, tests are not currently passing on GPU and I cannot test on CPU.

Failing GPU tests are not an issue caused by this update. Tests have not
been passing because the required "six" package was not installed.

OpenCL Tests have not been run since commit: 1a1c63a08b

Devices have 3 types and are handled by a new DeviceTypes enum. (The goal
is to revert to Tensor.<type>, but the current setup allows for keyword
argument defaults: `device=DeviceType.CPU`)

All references to Tensor.GPU/CPU/ANE have been converted to the
corresponding `DeviceTypes` enum.

Refactored the conversion code to allow conversion from any device to any
other device.

* Add six dependency in requirements.txt

* Resolve failure to run tests

Move six into gpu required installs. Remove six from standard
installation.

* Remove repeated data conversion

* Refactor method names

Also reduce code with .to and .to_

* Dynamic device handlers

* Refactor DeviceTypes -> Device

* Add mem copy profiling back

* test_backward_pass_diamond_model passing

* Resolve Sum issue on GPU

* Revert batchnorm2d tests

* Update README with updated API

* ANE testing with

* Last minute line gains
2020-12-15 23:44:08 -08:00
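A rough sketch of the device-enum and conversion API this commit describes; the `Device` enum name (renamed from DeviceTypes per the later bullet), `to`, and `to_` follow the commit text, but everything else, including the `convert_buffer` helper, is illustrative rather than the actual tinygrad implementation.

```python
from enum import Enum
import numpy as np

class Device(Enum):
    CPU = 0
    GPU = 1
    ANE = 2

def convert_buffer(buf, src, dst):
    # hypothetical any-device-to-any-device hook; real code would move data
    # between numpy arrays, OpenCL buffers, and ANE buffers here
    return buf

class Tensor:
    def __init__(self, data, device=Device.CPU):
        self.data, self.device = np.asarray(data, dtype=np.float32), device

    def to_(self, device):
        # in-place move to another device
        self.data = convert_buffer(self.data, self.device, device)
        self.device = device

    def to(self, device):
        # out-of-place move: return a new Tensor on the target device
        t = Tensor(self.data, self.device)
        t.to_(device)
        return t
```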
James Roberts
78210b5e40 less lines (#197) 2020-12-14 13:53:00 -08:00
George Hotz
b86bbd2e72 readmes 2020-12-13 21:32:20 -08:00
Marcel Bischoff
da72a0eed4 Big MNIST model with PIL augmentation and load/save (#160)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up
2020-12-13 20:45:55 -08:00
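A hedged sketch of the kind of PIL rotation augmentation described above; the function name, angle range, and bicubic resampling choice are illustrative assumptions, not lifted from the actual extra/ code.

```python
import numpy as np
from PIL import Image

def augment_img(X, rotate_max=10):
    # X: batch of 28x28 uint8 MNIST images; rotate each by a small random angle
    out = np.empty_like(X)
    for i in range(len(X)):
        angle = np.random.uniform(-rotate_max, rotate_max)
        im = Image.fromarray(X[i])
        out[i] = np.asarray(im.rotate(angle, resample=Image.BICUBIC))
    return out
```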
George Hotz
f50dcc12ac 1k lines 2020-12-13 20:37:58 -08:00
George Hotz
4d8235d5f7 readme update 2020-12-13 20:24:33 -08:00
NeuralLink
1a1c63a08b Gan is real...Look what tiny just generated! (#192)
* mode collapse solved

* info add

* delete unnecessary imports

* readme
2020-12-13 20:23:12 -08:00
Marcel Bischoff
6785614239 tinygrad.utils to extra.utils fix in mnist_gan (#190) 2020-12-12 20:52:36 -08:00
NeuralLink
d901ef6b23 🎉 effort to generate mnist data using GAN with tinygrad. [WIP] (#166)
* 🎉 effort to generate mnist data with tinygrad.

* dropout added

* working gan

* minor bug fixes

* more bug fixes

* todo reg l2

* detach

* logsoftmax twice
2020-12-12 17:58:04 -08:00
Mufeed VH
e6a5c6c93e Added indentation linter (#187)
* Added indentation linter

* pylint package latest
2020-12-12 17:15:09 -08:00
George Hotz
f95e79dab7 update readme 2020-12-12 17:14:10 -08:00
George Hotz
a5aced8d47 30 MEGAReLUs. we need to lose 12 lines 2020-12-12 17:07:34 -08:00
WillemKauf
49da969d25 Fixed a typo. (#189) 2020-12-12 16:25:33 -08:00
George Hotz
bc5df477de readme and .ane() 2020-12-12 16:15:38 -08:00
George Hotz
da873cd556 Single ReLU in ANE (#188)
* aneworks

* cleanup
2020-12-12 16:11:34 -08:00
George Hotz
07ece2105e actually move it 2020-12-12 15:26:58 -08:00
George Hotz
1d10559d1d tinygrad.utils -> extra.utils 2020-12-12 15:26:07 -08:00
George Hotz
59358304a3 ane 2020-12-12 15:23:21 -08:00
George Hotz
36d4eee323 fix compiler segfault 2020-12-12 15:10:47 -08:00
George Hotz
abb7b74208 relu in python 2020-12-12 14:50:05 -08:00
George Hotz
d3886035dd ane dylib 2020-12-12 13:41:09 -08:00
George Hotz
cf66d549c1 fix example ane 2020-12-12 13:32:49 -08:00
George Hotz
566045cefc uint8 nope 2020-12-12 13:14:06 -08:00
pb1729
8c25431619 Faster but still general binop broadcasting (#159)
* Allow for general broadcasting of binary operations. Can handle any situation where corresponding dimensions between the tensors match, or at least one of them is of size 1. If a tensor has fewer dimensions than the other, its shape is padded with 1s until both have the same number of dimensions. Also refactored buffer_zeros() by creating a function buff() that makes a buffer from a numpy array.

* remove extra tabs

* messy loop unrolling

* fix loop unrolling bugs

* revert loop unrolling changes, new plan here

* binary_op(): avoid having a loop in the GPU C code; instead compute indices with nested expressions. Simple broadcasts should have a similar level of performance to the simple-broadcast-specific code that was there before. Broke out codegen and compilation into get_binop_prg(), which has a larger cache and depends only on the operation type and complist (this avoids doing a bunch of python string ops every time we want to compile something we've already compiled). The larger cache is needed since there will end up being quite a few possible types of broadcasts (sum_i^N 3**i is a loose upper bound, N being the maximum number of dimensions). I assumed 5 kinds of binary operations when sizing the cache here: +, -, *, /, and **. More may be needed in the future.

* add .cl to binop arguments

* solved edge case where len(dimlist)==0. still problems when len(dimlist) > CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS

* pyopencl can't handle more than 3 gids, so we just use 1 gid and compute the indices into the returned tensor in the kernel. this means more computation for the individual indices, but less for the index into the flattened tensor (last line of kernel), since it's just gid0

* trim some lines

Co-authored-by: phillip <phillip_bement@reedbement.com>
2020-12-12 12:19:46 -08:00
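The broadcasting rule described in the first bullet of the commit above, sketched as plain shape logic (numpy-style leading-1 padding assumed); the OpenCL codegen and program caching are left out.

```python
def broadcast_shape(shape_x, shape_y):
    # pad the shorter shape with leading 1s so both have the same number of dims
    n = max(len(shape_x), len(shape_y))
    shape_x = (1,) * (n - len(shape_x)) + tuple(shape_x)
    shape_y = (1,) * (n - len(shape_y)) + tuple(shape_y)
    out = []
    for a, b in zip(shape_x, shape_y):
        # corresponding dims must match, or at least one of them must be 1
        assert a == b or a == 1 or b == 1, f"cannot broadcast {shape_x} with {shape_y}"
        out.append(max(a, b))
    return tuple(out)

assert broadcast_shape((3, 1, 5), (4, 5)) == (3, 4, 5)
```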
Liam
bf9ba8718a Profile GPU and CPU copying. (#182)
Moving memory is slow, and therefore monitoring the time spent converting
and limiting the number of copy operations can improve performance.
2020-12-12 12:15:47 -08:00
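A hedged sketch of how data-movement cost might be surfaced; the DEBUG environment variable and helper name are assumptions for illustration, not the exact profiling hooks this PR added.

```python
import os, time

DEBUG = int(os.getenv("DEBUG", "0"))

def profiled_copy(name, fn, *args, **kwargs):
    # wrap a host<->device copy and report its duration when DEBUG is enabled
    st = time.monotonic()
    ret = fn(*args, **kwargs)
    if DEBUG:
        print(f"copy {name}: {(time.monotonic() - st) * 1000.0:.2f} ms")
    return ret

# usage (hypothetical): cpu_data = profiled_copy("GPU->CPU", gpu_buffer.get)
```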
James Roberts
8e8cbc74b3 Minor clean up (#184)
* Removes unused imports

* Minor clean up
2020-12-11 14:25:29 -08:00
Skosh
f4faf401bc require_init_gpu() function selects GPU as device and falls back to CPU if none are available (#180)
* require_init_gpu() function selects GPU as device and falls back to CPU if none are available

* Small fix for CPU specific code

* Should work...
2020-12-11 09:21:59 -08:00
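An illustrative sketch, assuming pyopencl (which the nearby dependency commits add), of a lazy GPU init that falls back to the CPU when no GPU device is found; the real require_init_gpu() may differ in detail.

```python
import pyopencl as cl

cl_ctx, cl_queue = None, None

def require_init_gpu():
    global cl_ctx, cl_queue
    if cl_ctx is not None:
        return
    devices = []
    for platform in cl.get_platforms():
        devices += platform.get_devices(device_type=cl.device_type.GPU)
    if not devices:
        # no GPU found: fall back to whatever OpenCL device is available (usually CPU)
        devices = cl.get_platforms()[0].get_devices(device_type=cl.device_type.CPU)
    cl_ctx = cl.Context(devices=devices[:1])
    cl_queue = cl.CommandQueue(cl_ctx)
```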
Daulet
c7e95ddb21 Add diamond model test (#181)
* add backward pass test for diamond model

* fix train_efficientnet example
2020-12-11 09:21:36 -08:00
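A minimal sketch of what such a diamond test checks: one tensor feeds two branches that are later recombined, so its gradient must accumulate contributions from both paths. The Tensor API here is assumed from the tinygrad of that period and may not match the merged test exactly.

```python
import numpy as np
from tinygrad.tensor import Tensor

def test_backward_pass_diamond_model():
    x = Tensor(np.random.randn(3, 3).astype(np.float32))
    a = x.relu()            # branch 1
    b = x.mul(x)            # branch 2 reuses the same node
    out = a.add(b).sum()    # the diamond closes here
    out.backward()
    assert x.grad is not None  # gradient accumulated from both branches
```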
Marcel Bischoff
38b29f49dd abs (#172) 2020-12-10 09:24:35 -08:00
Liam
e79cda6dad Add pyopencl to dependency installs (#174)
* Add pyopencl to dependency installs

OpenCL was not actually being tested as pyopencl was not installed.

* Reduce installation to 1 liner
2020-12-10 09:24:08 -08:00
NeuralLink
8ab8a71d5d refactor (#178) 2020-12-10 09:23:36 -08:00
Marcel Bischoff
d204f09316 some progress on batchnorms (draft) (#147)
* no of categories for efficientnet

* need layer_init_uniform

* merge fail

* merge fail

* batchnorms

* needs work

* needs work: how to determine training

* pow

* needs work

* reshape was needed

* sum with axis

* sum with axis and tests

* broken

* works again

* clean up

* Update test_ops.py

* using sum

* don't always update running_stats

* space

* self

* default return running_stats

* passes test

* need to use mean

* merge

* testing

* fixing pow

* test_ops had a line dropped

* undo pow

* rebase
2020-12-09 22:14:27 -08:00
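A compact numpy sketch of the batchnorm behaviour being worked out in this draft: normalize with batch statistics while training, with running statistics at evaluation, and only update running_mean/running_var in training. Names follow common convention; this is not the merged tinygrad code.

```python
import numpy as np

class BatchNorm2D:
    def __init__(self, channels, eps=1e-5, momentum=0.1):
        self.eps, self.momentum = eps, momentum
        self.weight, self.bias = np.ones(channels), np.zeros(channels)
        self.running_mean, self.running_var = np.zeros(channels), np.ones(channels)

    def __call__(self, x, training=True):
        # x has shape (N, C, H, W); statistics are per channel
        if training:
            mean = x.mean(axis=(0, 2, 3))
            var = x.var(axis=(0, 2, 3))
            # don't always update running stats: only while training
            self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mean
            self.running_var = (1 - self.momentum) * self.running_var + self.momentum * var
        else:
            mean, var = self.running_mean, self.running_var
        xn = (x - mean.reshape(1, -1, 1, 1)) / np.sqrt(var.reshape(1, -1, 1, 1) + self.eps)
        return self.weight.reshape(1, -1, 1, 1) * xn + self.bias.reshape(1, -1, 1, 1)
```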
Marcel Bischoff
5d46df638a abs as non-first class operation using relu (#171)
* abs (non-first class)

* whitespace
2020-12-09 12:20:34 -08:00
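The identity behind expressing abs without a dedicated kernel: |x| = relu(x) + relu(-x). A tiny numpy check, since the exact tinygrad method added in the PR isn't shown here.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def abs_via_relu(x):
    # |x| = relu(x) + relu(-x): exactly one term is nonzero for any x
    return relu(x) + relu(-x)

x = np.array([-3.0, -0.5, 0.0, 2.0])
assert np.allclose(abs_via_relu(x), np.abs(x))
```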
George Hotz
4c55c7208f no pow if mul will do 2020-12-09 08:19:29 -08:00
George Hotz
b85f17f247 more optim cleanup 2020-12-09 08:18:10 -08:00
George Hotz
9a64d13b94 add conv biases and max pool 2020-12-09 08:01:20 -08:00
George Hotz
99fa65f057 enable batchnorm in serious mnist 2020-12-09 03:29:40 -08:00
George Hotz
ffb96b2d0b batchnorm by marcelbischoff 2020-12-09 03:23:04 -08:00
NeuralLink
00e376f36c leaky relu as geohot suggested (#167) 2020-12-09 02:58:35 -08:00
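Leaky ReLU can likewise be expressed through plain ReLU instead of a dedicated op; a tiny numpy check of one common formulation (the slope value is an illustrative default).

```python
import numpy as np

def leaky_relu(x, neg_slope=0.01):
    # relu(x) - neg_slope * relu(-x): positives pass through, negatives are scaled
    return np.maximum(x, 0) - neg_slope * np.maximum(-x, 0)

x = np.array([-2.0, -0.5, 0.0, 3.0])
assert np.allclose(leaky_relu(x), np.where(x > 0, x, 0.01 * x))
```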
George Hotz
c225e62dd2 touchups 2020-12-09 02:52:28 -08:00
Liam
89d0ff6989 Consistent testing (#137)
* Consistent GPU classes

Convert the existing GPU classes into one standard format.

Remove duplicated functions in `test_mnist` and create a TestMNISTGPU
class. This reduces line count and ensures consistency.

Use `@unittest.skipUnless(GPU, "Requires GPU")` instead of `if GPU:` to
skip GPU testing. This will ensure that skipped tests are displayed
accordingly in the pytest output.

* Optim Testing now supports GPU

* Tensor testing now supports GPU

jacobian and gradcheck auto skipped until GPU float64 support added.

* GPU support for custom constructor methods

* Remove GPU flag from Model constructors

It was requested that the `gpu` kwarg be removed from the model
constructor. GPU conversion is now handled in the train function.

This also required the conversion of Optimizer parameters as they are
constructed prior to execution of the `train` function and are dependent
on the model GPU state.

* Fix typo: float32->float64

* Clean `get_parameters` utility

Just a quick refactor w/ the new support for optimizers.

* Remove GPU kwarg from TinyNet

Remove `gpu` kwarg from tiny net to match test_mnist `train` function.
2020-12-09 02:25:27 -08:00
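An illustration of the skip pattern quoted in the commit above, assuming a module-level GPU flag; class names are placeholders rather than the actual test_mnist classes.

```python
import unittest

GPU = False  # would normally be set by probing for a usable OpenCL device

class TestMNIST(unittest.TestCase):
    def test_shapes(self):
        self.assertEqual((28 * 28,), (784,))

@unittest.skipUnless(GPU, "Requires GPU")
class TestMNISTGPU(TestMNIST):
    # inherits the CPU tests; pytest reports them as skipped instead of hiding them
    gpu = True

if __name__ == "__main__":
    unittest.main()
```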
Liam
34b38dd4d0 Extra install requirements. (#164)
* Testing install requirements

* GPU install requirements
2020-12-09 02:22:47 -08:00
George Hotz
0e02f394ee serious_mnist 2020-12-08 21:43:05 -08:00
Daulet
24d688c184 win more lines for core library (#158)
...and sacrifice test speed
2020-12-08 14:18:45 -08:00
NeuralLink
9f77fd6135 🔨 refactor optim (#156)
* 🔨 refactor optim

* 🔨 refactor optim

* 🔨 more clean up
2020-12-08 14:16:31 -08:00
George Hotz
4e1a0de392 fix rsub 2020-12-08 10:05:21 -08:00
George Hotz
c4540f1b8c Support scalars by kartik4949 2020-12-08 09:52:07 -08:00
George Hotz
97fd9c1237 zero_grad there to match readme 2020-12-07 23:12:18 -08:00
George Hotz
c63f950348 need zero grad now 2020-12-07 23:10:43 -08:00
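A sketch of the README-style training step these last two commits refer to, assuming the tinygrad API of the time (Tensor, optim.SGD); gradients now accumulate, so zero_grad() is needed before each backward pass.

```python
import numpy as np
from tinygrad.tensor import Tensor
import tinygrad.optim as optim

x = Tensor(np.random.randn(4, 3).astype(np.float32))
y = Tensor(np.eye(3, dtype=np.float32)[np.random.randint(0, 3, size=4)])  # one-hot labels
w = Tensor(np.random.randn(3, 3).astype(np.float32))

sgd = optim.SGD([w], lr=0.01)
for _ in range(10):
    loss = x.dot(w).logsoftmax().mul(y).mean()
    sgd.zero_grad()   # clear accumulated gradients before the new backward pass
    loss.backward()
    sgd.step()
```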