Commit Graph

4667 Commits

Author SHA1 Message Date
George Hotz
89798d2f43 some flags 2021-06-19 11:46:31 -07:00
George Hotz
d3f169b267 move good models to models, add a training step test 2021-06-19 11:24:15 -07:00
Jacky Lee
3a91d5434f Add dropout test (#265)
* Add dropout test

* Remove condition where training is false

* Skip dropout test when on GPU

* Revert changes to tensor.py and fix test case

* Revert change on whitespace

* Convert Tensor to cpu for testing

* Fix whitespace in tensor.py
2021-06-19 08:49:13 -07:00
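A rough numpy-only sketch of the dropout semantics such a test exercises (hypothetical illustration, not the actual tinygrad test — the function name and RNG seeding are mine):

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    # During eval, dropout is the identity; during training, zero out
    # elements with probability p and rescale survivors by 1/(1-p)
    # so the expected value of each entry is unchanged.
    if not training:
        return x
    if rng is None:
        rng = np.random.default_rng(0)
    mask = (rng.random(x.shape) >= p).astype(x.dtype)
    return x * mask / (1 - p)

x = np.ones((1000, 1000), dtype=np.float32)
out = dropout(x, p=0.5)
# Roughly half the entries are zeroed, and the mean stays near 1.
print(abs(out.mean() - 1.0) < 0.05, 0.45 < (out == 0).mean() < 0.55)
```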
George Hotz
2affd226b3 speed up sum 2021-06-17 16:38:34 -07:00
George Hotz
c1d469d440 sum op 2021-06-17 16:19:35 -07:00
George Hotz
2075fdeb4f FPGA Based Accelerator for Tinygrad (#258)
* ops_risk

* risk sim

* guessing is for winners

* minor

* better

* matmul with risk

* conv doesn't work

* closer

* conv2d works

* ops_risk

* opt2 works

* opt1 may not be possible

* opt1 is a mulacc

* arty

* attosoc example building on mac

* minor

* riscv assembler

* gucci gang

* we got C code

* not a scam

* hello

* make risk mergeable into master

* unop support
2021-06-07 17:45:09 -07:00
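The "opt1 is a mulacc" bullet above refers to the multiply-accumulate primitive at the heart of matmul and conv. A minimal pure-Python sketch of matmul expressed as repeated mulaccs, the way a small accelerator datapath would execute it (illustrative only — names are mine, not the PR's):

```python
def mulacc(acc, a, b):
    # One multiply-accumulate step: acc += a * b
    return acc + a * b

def matmul(A, B):
    # Naive matmul built entirely out of mulacc steps.
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for t in range(k):
                C[i][j] = mulacc(C[i][j], A[i][t], B[t][j])
    return C

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19.0, 22.0], [43.0, 50.0]]
```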
Skosh
81bf933a91 Improved __getitem__ (#254)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but it's kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thing

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrads output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…

* Improved __getitem__

* Updated

* Updated __getitem__

* Linebreaks

* Maybe this works?

* Added MNIST locally, tests run now
2021-05-05 22:15:22 -07:00
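A hypothetical sketch of what an improved `__getitem__` has to handle — ints, slices, tuples, negative indices — over a numpy-backed tensor (this is my illustration of the interface, not the actual tinygrad implementation):

```python
import numpy as np

class Tensor:
    def __init__(self, data):
        self.data = np.asarray(data, dtype=np.float32)

    def __getitem__(self, key):
        # Delegate int/slice/tuple indexing to numpy, but wrap the
        # result back into a Tensor so chained ops keep working.
        return Tensor(self.data[key])

t = Tensor([[1, 2, 3], [4, 5, 6]])
print(t[0].data.tolist())      # [1.0, 2.0, 3.0]
print(t[:, 1:].data.tolist())  # [[2.0, 3.0], [5.0, 6.0]]
print(t[-1].data.tolist())     # [4.0, 5.0, 6.0]
```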
Skosh
78aa147b39 [WIP] YOLO working on tinygrad! (#245)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but its kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thing

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrads output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…
2021-04-25 18:06:52 -07:00
George Hotz
62e3a8558c fix tolerance maybe 2021-01-05 07:45:47 -08:00
George Hotz
8a38e0d207 only mish failed 2021-01-03 09:47:11 -08:00
George Hotz
1a4487965a remove negative from things w/o negative 2021-01-03 09:43:34 -08:00
George Hotz
0702e0c763 nah, no sign, it's not what you want. use relu 2021-01-03 09:30:33 -08:00
George Hotz
c2eeb6950b add support for sign. technically relu can be second class now 2021-01-03 08:29:57 -08:00
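"Relu can be second class now" presumably because relu is expressible through sign and mul; a numpy sketch of that identity (my reading of the commit, not the actual tinygrad code):

```python
import numpy as np

def relu_via_sign(x):
    # relu(x) = x * (sign(x) + 1) / 2: the factor is 1 for x > 0,
    # 0 for x < 0, and 1/2 at exactly 0, where x * 1/2 is still 0.
    return x * (np.sign(x) + 1) / 2

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu_via_sign(x))  # [0.  0.  0.  0.5 2. ]
```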
NeuralLink
0825cf7f79 Added softplus and mish non stable (#220)
*  Added softplus and mish CPU

* 🔨 refactor

* 🔨 second class softplus and mish

* 🔨 test fix

* no need of device in testing
2021-01-03 08:08:41 -08:00
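The "non stable" in the title means the naive formulas, which overflow for large inputs (stable versions special-case the regime where softplus(x) ≈ x). A numpy sketch of the two activations:

```python
import numpy as np

def softplus(x):
    # Naive ("non stable") form: exp(x) overflows for large x.
    return np.log(1 + np.exp(x))

def mish(x):
    # mish(x) = x * tanh(softplus(x))
    return x * np.tanh(softplus(x))

x = np.array([-1.0, 0.0, 1.0])
print(softplus(x))  # ~ [0.3133 0.6931 1.3133]
print(mish(x))      # ~ [-0.3034 0.      0.8651]
```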
Liam
ebd72ff437 Test split (#231)
* Split tests

Split tests into "Test CPU" and "Test GPU".

Add test flag "TEST_DEVICES" which is a comma separated list of devices:
CPU,GPU,ANE

* Run tests based on provided TEST_DEVICES flag

By default it will run all "CPU,GPU,ANE"

* fix bad quote

* Revert changes and use GPU=1

This is done by setting the default Tensor Device to Device.GPU if GPU=1 is set.

Run GPU tests: GPU=1 pytest -s -v
2021-01-01 09:19:03 -05:00
George Hotz
4291002881 reorder GPU ops 2020-12-31 09:46:39 -05:00
Marcel Bischoff
e2f833f58f max to behave on ties like torch (#229)
* checkpoint

* fixing pow

* undo pow

* backward max on GPU and CPU rewrite

* indentation

* changing seed for curiosity

* max replaced equality

* undo seed

* rebase

* fixed tests

* merge error
2020-12-30 18:52:50 -05:00
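The "max replaced equality" bullet suggests an equality-mask backward. A numpy sketch of that approach, splitting the gradient evenly among tied maxima — my assumption of what "behave on ties like torch" means here, not the PR's actual code:

```python
import numpy as np

def max_backward(x, grad_out):
    # Equality mask: every entry tied for the maximum receives an
    # equal share of the incoming gradient.
    mask = (x == x.max()).astype(x.dtype)
    return grad_out * mask / mask.sum()

x = np.array([1.0, 3.0, 3.0, 2.0])
print(max_backward(x, 1.0))  # [0.  0.5 0.5 0. ]
```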
George Hotz
fcfe3dae01 write slice for CPU 2020-12-30 10:32:53 -05:00
George Hotz
f9170505b3 if you like your transformers twice as slow, use the GPU 2020-12-29 17:14:23 -05:00
George Hotz
6a6a82e999 support multidot on GPU 2020-12-29 16:56:30 -05:00
George Hotz
27208d729b add GPU max thanks to marcelbischoff 2020-12-29 16:44:14 -05:00
George Hotz
02655c07d5 break maxpool2d on GPU 2020-12-29 13:05:57 -05:00
George Hotz
061e37de39 touchups 2020-12-29 12:41:21 -05:00
George Hotz
a2e6562330 fix max op, less lines 2020-12-29 10:47:04 -05:00
Marcel Bischoff
dc8fa7999c Transpose on GPU (#221)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up

* transformer eval

* axis=-1

* transpose

* test for permutation using torch.movedims

* another test

* line
2020-12-29 10:40:11 -05:00
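The "test for permutation using torch.movedims" bullet can be sketched with numpy's equivalent, `np.moveaxis`, so the snippet stays self-contained (illustrative; the real test compares against torch):

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

# transpose/permute with an explicit axis order...
permuted = x.transpose((2, 0, 1))

# ...should match moving each axis to its new position
# (np.moveaxis is numpy's analogue of torch.movedim).
moved = np.moveaxis(x, source=(0, 1, 2), destination=(1, 2, 0))

print(permuted.shape)                   # (4, 2, 3)
print(np.array_equal(permuted, moved))  # True
```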
George Hotz
36579f66bf max op 2020-12-28 23:54:52 -05:00
George Hotz
fafece9db7 avgpool2d is a second class op 2020-12-28 10:41:59 -05:00
George Hotz
593233b668 log and exp are first class ops 2020-12-28 10:00:30 -05:00
George Hotz
a361ef6861 fixup training loop 2020-12-27 18:35:56 -05:00
George Hotz
f15bec6dbc make multidot work on CPU 2020-12-27 17:25:37 -05:00
George Hotz
131e04c90c cpu only decorator 2020-12-27 17:18:55 -05:00
George Hotz
2f1b2c0a3b add transpose, start on transformer 2020-12-27 16:59:12 -05:00
iainwo
56d44637f3 fixed pylint, formatted python files with cblack on localhost (#204)
* fixed pylint, formatted python files with cblack on localhost

* Revert "fixed pylint, formatted python files with cblack on localhost"

This reverts commit 07e2b88466.

* dedented 4-spaces, added linter

Co-authored-by: Iain Wong <iainwong@outlook.com>
2020-12-17 14:37:31 -08:00
Liam
bcf1518309 All devices are equal! (#196)
* Update all devices to be tested

ANE, CPU and OCL all now support all tests.

However tests are not currently passing on GPU and I cannot test on CPU.

Failing GPU test are not an issue caused by this update. Tests have not
been passing due to a missing "six" required installation.

OpenCL Tests have not been run since commit: 1a1c63a08b

Devices have 3 types and are handled by a new DeviceTypes enum. (The goal
is to revert to Tensor.<type>, but this current setup allows for keyword
argument defaults: `device=DeviceType.CPU`)

All references to Tensor.GPU/CPU/ANE have been converted to the
corresponding `DeviceTypes` enum.

Refactor of the conversion code to allow for any device to any device
conversion.

* Add six dependency in requirements.txt

* Resolve failure to run tests

Move six into gpu required installs. Remove six from standard
installation.

* Remove repeated data conversion

* Refactor method names

Also reduce code with .to and .to_

* Dynamic device handlers

* Refactor DeviceTypes -> Device

* Add mem copy profiling back

* test_backward_pass_diamond_model passing

* Resolve Sum issue on GPU

* Revert batchnorm2d tests

* Update README with updated API

* ANE testing with

* Last minute line gains
2020-12-15 23:44:08 -08:00
Marcel Bischoff
da72a0eed4 Big MNIST model with PIL augmentation and load/save (#160)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up
2020-12-13 20:45:55 -08:00
George Hotz
1d10559d1d tinygrad.utils -> extra.utils 2020-12-12 15:26:07 -08:00
James Roberts
8e8cbc74b3 Minor clean up (#184)
* Removes unused imports

* Minor clean up
2020-12-11 14:25:29 -08:00
Daulet
c7e95ddb21 Add diamond model test (#181)
* add backward pass test for diamond model

* fix train_efficientnet example
2020-12-11 09:21:36 -08:00
Marcel Bischoff
5d46df638a abs as non-first class operation using relu (#171)
* abs (non-first class)

* whitespace
2020-12-09 12:20:34 -08:00
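The identity this PR relies on, as a numpy sketch: one relu keeps the positive part, the other the (negated) negative part, so abs needs no op of its own:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def abs_via_relu(x):
    # abs(x) = relu(x) + relu(-x)
    return relu(x) + relu(-x)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(abs_via_relu(x))  # [3.  0.5 0.  2. ]
```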
George Hotz
ffb96b2d0b batchnorm by marcelbischoff 2020-12-09 03:23:04 -08:00
NeuralLink
00e376f36c leaky relu as geohot suggested (#167) 2020-12-09 02:58:35 -08:00
George Hotz
c225e62dd2 touchups 2020-12-09 02:52:28 -08:00
Liam
89d0ff6989 Consistent testing (#137)
* Consistent GPU classes

Convert the existing GPU classes into one standard format.

Remove duplicated functions in `test_mnist` and create a TestMNISTGPU
class. This reduces line count and ensures consistency.

Use `@unittest.skipUnless(GPU, "Requires GPU")` instead of `if GPU:` to
skip GPU testing. This will ensure that skipped tests are displayed
accordingly in the pytest output.

* Optim Testing now supports GPU

* Tensor testing now supports GPU

jacobian and gradcheck auto skipped until GPU float64 support added.

* GPU support for custom constructor methods

* Remove GPU flag from Model constructors

It was requested that the `gpu` kwarg be removed from the model
constructor. GPU conversion is now handled in the train function.

This also required the conversion of Optimizer parameters as they are
constructed prior to execution of the `train` function and are dependent
on the model GPU state.

* Fix typo: float32->float64

* Clean `get_parameters` utility

Just a quick refactor w/ the new support for optimizers.

* Remove GPU kwarg from TinyNet

Remove `gpu` kwarg from tiny net to match test_mnist `train` function.
2020-12-09 02:25:27 -08:00
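The testing pattern this PR describes — an env flag plus `@unittest.skipUnless` so skips show up in the pytest output — looks roughly like this minimal sketch (class and test names are hypothetical):

```python
import os
import unittest

# A module-level flag decides whether GPU test classes run.
GPU = os.getenv("GPU", "0") == "1"

class TestMNISTGPU(unittest.TestCase):
    @unittest.skipUnless(GPU, "Requires GPU")
    def test_train_gpu(self):
        self.assertTrue(True)  # placeholder for real GPU assertions

# Run the suite programmatically so the skip is visible in the result.
result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestMNISTGPU)
)
print(len(result.skipped))  # 1 when GPU is unset, 0 under GPU=1
```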
Daulet
24d688c184 win more lines for core library (#158)
...and sacrifice test speed
2020-12-08 14:18:45 -08:00
George Hotz
4e1a0de392 fix rsub 2020-12-08 10:05:21 -08:00
George Hotz
c4540f1b8c Support scalars by kartik4949 2020-12-08 09:52:07 -08:00
George Hotz
97fd9c1237 zero_grad there to match readme 2020-12-07 23:12:18 -08:00
George Hotz
b355cd2571 Mean axis (doesn't work) (#154)
* mean axis

* fixed
2020-12-07 22:58:34 -08:00
Marcel Bischoff
58ccebd7cd Sum with axis (#153)
* sum with axis and tests

* broken

* works again

* clean up

* Update test_ops.py
2020-12-07 21:49:18 -08:00
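A numpy sketch of sum-with-axis and the backward pass it implies: the backward of a sum is a broadcast, since every input element folded into an output element receives that element's gradient (illustrative; function names are mine):

```python
import numpy as np

def sum_forward(x, axis):
    return x.sum(axis=axis)

def sum_backward(x, axis, grad_out):
    # Re-insert the reduced axis, then broadcast the gradient back
    # out to the input shape.
    return np.broadcast_to(np.expand_dims(grad_out, axis), x.shape).copy()

x = np.arange(6.0).reshape(2, 3)
out = sum_forward(x, axis=1)
print(out)   # [ 3. 12.]
grad = sum_backward(x, 1, np.array([1.0, 2.0]))
print(grad)  # [[1. 1. 1.]
             #  [2. 2. 2.]]
```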
George Hotz
3b982f2f7a get_parameters 2020-12-06 13:47:28 -08:00