Commit Graph

11106 Commits

dustcollector12
ee99d016e9 tensor implementation for rmsprop and adam (#121)
* tensor implementation for rmsprop and adam

* test_mnist.py extended to cover sgd, rmsprop and adam on cpu and gpu

* number of steps reduced for adam from 1000 to 200
2020-11-16 15:07:49 -08:00
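The optimizer math behind this commit, as a minimal numpy sketch; the function names and hyperparameter defaults here are illustrative, not tinygrad's actual optim API:

```python
import numpy as np

# Sketch of the RMSprop and Adam update rules on plain arrays
# (illustrative names, not tinygrad's actual API).
def rmsprop_step(param, grad, cache, lr=0.001, decay=0.9, eps=1e-8):
    # running average of squared gradients scales the step per-weight
    cache[:] = decay * cache + (1 - decay) * grad**2
    param -= lr * grad / (np.sqrt(cache) + eps)

def adam_step(param, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # t is the 1-indexed step count, used for bias correction
    m[:] = b1 * m + (1 - b1) * grad
    v[:] = b2 * v + (1 - b2) * grad**2
    mhat = m / (1 - b1**t)
    vhat = v / (1 - b2**t)
    param -= lr * mhat / (np.sqrt(vhat) + eps)
```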
George Hotz
17bf90dbe4 unbroadcasting works on the GPU 2020-11-16 09:16:55 -08:00
George Hotz
17eab716b6 unbroadcast GPU template 2020-11-16 08:16:36 -08:00
George Hotz
2ffb8de1ea move efficientnet to extra 2020-11-16 08:08:07 -08:00
George Hotz
13d34373d1 move gradcheck to extra, clean up unbroadcast 2020-11-16 08:03:31 -08:00
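Unbroadcast is the reduction that maps a gradient back to the shape of the input it was broadcast from; the GPU template above does the same thing in a kernel. A CPU sketch:

```python
import numpy as np

# Sketch of unbroadcast: sum the gradient over every axis that
# broadcasting expanded, so it matches the original input shape.
def unbroadcast(grad, shape):
    while grad.ndim > len(shape):      # axes added on the left
        grad = grad.sum(axis=0)
    for i, s in enumerate(shape):      # axes stretched from size 1
        if s == 1 and grad.shape[i] != 1:
            grad = grad.sum(axis=i, keepdims=True)
    return grad

g = np.ones((4, 3))                          # upstream grad of x + y
assert unbroadcast(g, (4, 1)).shape == (4, 1)
```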
George Hotz
ed4c35e2e9 channels on the inside 2020-11-15 21:19:59 -08:00
adamritter
fb1df81c7d Fix train_efficientnet (#120)
Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-15 20:50:31 -08:00
George Hotz
1207fe4c7d cleanup LogSoftmax 2020-11-15 20:49:57 -08:00
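For context, LogSoftmax is usually written via the log-sum-exp trick for numerical stability; a sketch, assuming that is the form being cleaned up here:

```python
import numpy as np

# Numerically stable log-softmax: subtract the max before exponentiating
# so exp() never overflows.
def logsoftmax(x, axis=-1):
    m = x.max(axis=axis, keepdims=True)
    lse = m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))
    return x - lse
```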
George Hotz
d1441de3a6 minor cleanups 2020-11-15 20:39:19 -08:00
George Hotz
37a210f868 touchups and lines 2020-11-15 20:26:52 -08:00
adamritter
5ea3d76dfb Topological sort, zero_grads (#119)
* Topological sort, zero_grads

* Bug fix, add test

* Add zero_grads

* Put deepwalk function in backward

* Move zero_grad to optim

* Fix gradcheck hack

Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-15 20:25:29 -08:00
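The pattern from #119, sketched: a depth-first post-order walk (the deepwalk) produces a topological order, and iterating it in reverse guarantees each node's gradient is fully accumulated before it flows to that node's parents. The node attributes used here (parents, grad, backward_op) are illustrative, not tinygrad's actual field names:

```python
# Sketch of the deepwalk + backward pattern (illustrative node attributes).
def deepwalk(node, visited=None, order=None):
    visited = set() if visited is None else visited
    order = [] if order is None else order
    visited.add(node)
    for parent in getattr(node, "parents", []):
        if parent not in visited:
            deepwalk(parent, visited, order)
    order.append(node)   # post-order: parents land before their consumers
    return order

def backward(loss):
    loss.grad = 1.0
    # reversed post-order visits every consumer before its producers
    for node in reversed(deepwalk(loss)):
        if getattr(node, "backward_op", None) is not None:
            node.backward_op(node.grad)
```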
George Hotz
a35425189d binop fast path for no broadcast 2020-11-15 19:12:14 -08:00
Marcel Bischoff
c7b7f8ccc8 Backwards ops supporting broadcasting (#118)
* streamlined numerical_jacobian

* Got rid of the g loop in Conv2D.forward

* erased stupid line

* nothing

* no loops in Conv2D forward

* Conv2D backprop improved

* stupid things in examples

* alternative to einsum

* Conv2D backward einsum alternative

* tidying up

* tidied up

* no ravel

* got rid of print

* Update efficientnet.py

* Update efficientnet.py

* Update efficientnet.py

* only tensordot

* 255.0

* whitespace

* aspect ratio error in efficientnet

* noprint

* fix wrong strides in efficientnet

* broadcasting for backward ops

* Update ops.py

* Update ops.py

- was wrong

* broadcast test for backward enabled

* function adBC + don't sum over axes that are already size 1

* spacing

Co-authored-by: Marcel Bischoff <marcel@Marcels-iMac.local>
2020-11-15 15:21:10 -08:00
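The loop-free Conv2D forward this PR converges on, sketched with stride tricks plus np.tensordot (the "alternative to einsum" the bullets mention); stride-1, no-padding case only:

```python
import numpy as np

# Sketch of a loop-free Conv2D forward: extract sliding windows with
# stride tricks, then contract over (Cin, KH, KW) with tensordot.
def conv2d(x, w):
    # x: (N, Cin, H, W), w: (Cout, Cin, KH, KW)
    N, Cin, H, W = x.shape
    Cout, _, KH, KW = w.shape
    sN, sC, sH, sW = x.strides
    patches = np.lib.stride_tricks.as_strided(
        x, (N, Cin, H - KH + 1, W - KW + 1, KH, KW),
        (sN, sC, sH, sW, sH, sW))
    out = np.tensordot(patches, w, axes=([1, 4, 5], [1, 2, 3]))
    return out.transpose(0, 3, 1, 2)  # (N, Hout, Wout, Cout) -> (N, Cout, Hout, Wout)

x = np.random.randn(1, 2, 5, 5); w = np.random.randn(3, 2, 3, 3)
assert conv2d(x, w).shape == (1, 3, 3, 3)
```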
adamritter
55d93017e4 Simplify more (#117)
Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-14 06:15:31 -08:00
dustcollector12
28474949b8 refactoring of forward in reshape (#115)
* refactoring of forward in reshape

* test case for reshape added
2020-11-13 13:20:43 -08:00
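Reshape is small enough to sketch in full: forward records the input shape, backward just reshapes the gradient back to it (class layout illustrative, not tinygrad's exact code):

```python
import numpy as np

# Sketch of a reshape op with its backward pass.
class Reshape:
    def forward(self, x, shape):
        self.input_shape = x.shape   # remembered for backward
        return x.reshape(shape)
    def backward(self, grad_output):
        return grad_output.reshape(self.input_shape)
```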
dustcollector12
6f033ea30a enable local images for efficientnet.py (#116) 2020-11-13 07:00:12 -08:00
pb1729
420af82888 General broadcasting of binary operations (#114)
* allow for general broadcasting of binary operations. can handle any situation where corresponding dimensions between the tensors match, or at least one of them is of size 1. if a tensor has fewer dimensions than the other, its shape is padded with 1s until both have the same number of dimensions. also refactored buffer_zeros() by creating a function buff() that makes a buffer from a numpy array

* remove extra tabs

Co-authored-by: phillip <phillip_bement@reedbement.com>
2020-11-12 22:27:48 -08:00
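The broadcasting rule #114 describes, as a sketch that computes the output shape (function name illustrative):

```python
# Sketch of the broadcasting rule: pad the shorter shape with leading
# 1s, then each dimension pair must match or contain a 1.
def broadcast_shape(a, b):
    a, b = list(a), list(b)
    while len(a) < len(b): a.insert(0, 1)
    while len(b) < len(a): b.insert(0, 1)
    out = []
    for x, y in zip(a, b):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"shapes not broadcastable: {x} vs {y}")
        out.append(max(x, y))
    return tuple(out)

assert broadcast_shape((4, 1), (3,)) == (4, 3)
```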
damianzim
2b1286eef6 Don't wrap np.int32 in a function, use an alias (#113) 2020-11-12 19:32:19 -08:00
adamritter
08aa60d9d0 broadcasting 1s at the start, 1 kernel/4 divs version (#110)
* Pad2d backward pass on GPU

* Faster Pad2D GPU backward pass (no zeroing needed)

* Fix out of bounds error

* Don't save prg

* Let compiler optimize division by 1

* More generic broadcasting (1s at the start)

* Bug fix

* Add comment

* Try to fix flaky test with other method

* Add mixed broadcast support

* 1kernel

* Separate broadcast tests

Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-12 13:33:35 -08:00
NeuralLink
f773ef3996 tanh non first class op (#111)
* tanh non first class op

* tanh test with 1e-6 tol

Co-authored-by: Kartik Sharma <kartik.sharma@claimgenius.com>
2020-11-12 13:32:50 -08:00
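A "non first class" op is composed from existing primitives instead of getting its own kernel. One standard composition for tanh, as an illustration (the PR may use a different identity):

```python
import numpy as np

# tanh built from an existing primitive: tanh(x) = 2*sigmoid(2x) - 1
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return 2.0 * sigmoid(2.0 * x) - 1.0

x = np.linspace(-3, 3, 7)
assert np.allclose(tanh(x), np.tanh(x), atol=1e-6)  # 1e-6 tol, as in the test
```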
Ryan Neph
608bdd4872 adds broadcasting test cases (#106)
refs: #80, #90, #104, #105
2020-11-12 07:08:28 -08:00
adamritter
f1d21afe88 Somewhat more generic broadcasting (#105)
* Somewhat more generic broadcasting

* Add TODO

* Set Torch to deterministic in test

Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-11 20:33:00 -08:00
Ryan Neph
8827a536e0 GPU MaxPool2D.backward(); TinyConvNet train passes (#103)
* no trailing whitespace

* GPU MaxPool2D.backward(); TinyConvNet train passes!

* Fix GPU avgpool.forward() init_val

Doesn’t change result but is simpler.

* Fix MaxPool GPU init_val

Tests only cover random non-negative inputs. This fixes issues if negative inputs are fed to GPU MaxPool2D. Test update to follow.
2020-11-11 07:58:43 -08:00
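Why the init_val fix matters, sketched on the CPU: a max accumulator seeded with 0 happens to work for non-negative inputs, but for an all-negative window it wrongly returns 0; seeding with -inf is correct in general.

```python
import numpy as np

# Sketch of maxpool2d with a correctly seeded accumulator.
def maxpool2d(x, k=2):
    N, C, H, W = x.shape
    out = np.full((N, C, H // k, W // k), -np.inf)  # not 0!
    for i in range(k):
        for j in range(k):
            out = np.maximum(out, x[:, :, i::k, j::k][:, :, :H//k, :W//k])
    return out

x = -np.ones((1, 1, 4, 4))          # all-negative input
assert (maxpool2d(x) == -1).all()   # a 0-seeded accumulator would return 0
```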
Marcel Bischoff
a3989f9e18 Supporting .png files in efficientnet (#102)
* to make it work locally

* definitely not working

* Conv2D GPU passes some of the tests

* Conv2D GPU passes more of the tests

* passes some tests and mnist

* removed unnecessary code

* Conv2D Backpass works

* wrong test_ops.py

* white space + test backward

* erased useless code

* removed default argument

* long lines

* works also with 4 channel .png files

* commenting out

* track
2020-11-10 20:06:24 -08:00
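The 4-channel .png support amounts to dropping the alpha channel before preprocessing; a sketch with PIL (the file path is a placeholder, and this may not match the PR's exact code):

```python
import numpy as np
from PIL import Image

# Load a local image for efficientnet; RGBA .png files carry 4 channels,
# so keep only RGB before scaling.
img = np.array(Image.open("cat.png"))   # placeholder path
if img.shape[-1] == 4:
    img = img[..., :3]                  # drop the alpha channel
img = img.astype(np.float32) / 255.0    # scale to [0, 1]
```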
George Hotz
d93cd945aa reshape makes copies 2020-11-10 16:18:59 -08:00
George Hotz
d1284fa817 stride tests and i32 2020-11-10 16:10:14 -08:00
Marcel Bischoff
7bb803c5e0 Conv2D backward on GPU (#93)
* to make it work locally

* definitely not working

* Conv2D GPU passes some of the tests

* Conv2D GPU passes more of the tests

* passes some tests and mnist

* removed unnecessary code

* Conv2D Backpass works

* wrong test_ops.py

* white space + test backward

* erased useless code

* removed default argument

* long lines
2020-11-10 16:07:33 -08:00
George Hotz
5577b9d3a0 clean up imports 2020-11-10 15:53:05 -08:00
George Hotz
db755fa103 promote swish to a tensor op 2020-11-10 15:48:11 -08:00
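Swish is cheap to promote because it is expressible with existing tensor ops: swish(x) = x * sigmoid(x). A numpy sketch:

```python
import numpy as np

# Swish as a composition of existing ops (a sketch, not tinygrad's code).
def swish(x):
    return x * (1.0 / (1.0 + np.exp(-x)))
```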
George Hotz
5f4b76a21b touch ups 2020-11-10 15:44:47 -08:00
George Hotz
52ee913c98 move the mnist loader out of tinygrad proper 2020-11-10 15:37:39 -08:00
George Hotz
498b4d2f27 i32 and reduce line count a bit 2020-11-10 15:35:30 -08:00
George Hotz
df64658a2c weee, opencl tests in CI 2020-11-10 10:04:45 -08:00
George Hotz
d47a128812 pocl 2020-11-10 10:02:13 -08:00
George Hotz
c05401a9ca sudo maybe 2020-11-10 09:53:49 -08:00
George Hotz
09bc8eddfe clinfo 2020-11-10 09:51:38 -08:00
George Hotz
58e703d099 fix tests 2020-11-10 09:49:19 -08:00
George Hotz
23405cec43 intel opencl 2020-11-10 09:41:40 -08:00
George Hotz
33090c4b0d install more 2020-11-10 09:34:56 -08:00
George Hotz
a52590e76c cpu opencl maybe 2020-11-10 09:32:54 -08:00
George Hotz
f513302955 refactor profiler 2020-11-10 07:31:16 -08:00
adamritter
f27628b21c No separate pad2d kernel needed (#99)
Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-10 06:47:53 -08:00
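The observation behind #99, sketched: the backward pass of padding is just slicing the padding back off the incoming gradient, so no dedicated kernel is needed (p >= 1 assumed):

```python
import numpy as np

# Pad2d and its backward as pure slicing; zero-padding on H and W only.
def pad2d_forward(x, p):
    return np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)))

def pad2d_backward(grad_output, p):
    return grad_output[:, :, p:-p, p:-p]   # slice the padding back off

x = np.random.randn(1, 1, 3, 3)
g = pad2d_forward(x, 1)                    # stand-in for an upstream grad
assert pad2d_backward(g, 1).shape == x.shape
```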
George Hotz
2d4a5d5950 readme 2020-11-10 01:27:04 -08:00
George Hotz
6e6bcbe5f2 shapes on backward 2020-11-10 01:23:31 -08:00
Ryan Neph
56f71ae8e5 Cleanup (#96)
* init GPU upsample retbuf to 0

* reduce GPU kernel source lines

ref: #94
2020-11-10 01:20:04 -08:00
George Hotz
55012d21bb debug in backward pass too 2020-11-10 01:19:52 -08:00
George Hotz
5d1985312c miniprofiler is real 2020-11-10 01:05:29 -08:00
George Hotz
c76a20b4be 4s and 7s work 2020-11-10 00:54:17 -08:00
George Hotz
ae0cd17c2d debug is env var, and simpler faster pad2d 2020-11-10 00:42:23 -08:00
George Hotz
f7d10d5639 DEBUG flag 2020-11-10 00:36:59 -08:00
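The pattern behind these two commits: read a DEBUG flag from the environment once, then gate profiling output on it. A sketch (the usage site is illustrative):

```python
import os

# DEBUG=1 python example.py enables extra output; defaults to off
DEBUG = int(os.getenv("DEBUG", "0"))

if DEBUG:
    print("profiling enabled")  # e.g. per-op timing in forward/backward
```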