4505 Commits

Author SHA1 Message Date
George Hotz
70561f3d90 way over the line limit 2022-06-08 09:36:31 -07:00
George Hotz
4f7ee235c5 not a real test now 2022-06-08 09:00:59 -07:00
George Hotz
ae33060dae early float4 stuff for binary 2022-06-08 08:59:54 -07:00
George Hotz
82f29b5dbf better GPU block 2022-06-08 08:01:04 -07:00
George Hotz
42ae78241e only run test on GPU 2022-06-08 07:54:40 -07:00
George Hotz
cdf4b5f142 opencl perf test 2022-06-08 07:49:08 -07:00
George Hotz
d8ee8a39ac sgd threestep graph is so pretty 2022-06-06 09:45:37 -07:00
George Hotz
c143c92828 adam threestep 2022-06-06 09:38:28 -07:00
George Hotz
d302049e53 don't use div 2022-06-06 09:25:31 -07:00
George Hotz
a1dff4061b minor cleanups 2022-06-06 08:14:52 -07:00
George Hotz
3dac8fa728 this fix the gc 2022-06-05 17:16:40 -07:00
George Hotz
0ee21ba115 add ViT test and car 2022-06-05 17:12:43 -07:00
George Hotz
1de75b67d5 fix bug in graph with use of id 2022-06-05 16:31:20 -07:00
George Hotz
f0fe37bd34 simpler graph demo 2022-06-05 12:40:12 -07:00
George Hotz
88de42fb6e document graph mode 2022-06-05 12:13:05 -07:00
George Hotz
845bb1fc34 bs 4 -> 2 in training test 2022-01-15 21:34:21 -08:00
George Hotz
c0d1254003 don't run unneeded grads 2022-01-15 21:32:13 -08:00
George Hotz
8ba3d1f803 fix bn test, affine is True 2022-01-15 19:52:15 -08:00
George Hotz
e28cdfb0cf clean up resnet 2021-11-30 16:14:54 -05:00
George Hotz
46bbbcf7f0 model touchups 2021-11-30 11:13:34 -05:00
George Hotz
bd21304e3c linear takes in weight and bias 2021-11-30 00:38:47 -05:00
George Hotz
de938c2d9d vit is now tested 2021-11-30 00:23:06 -05:00
George Hotz
58ed46963e fix broadcastdot 2021-11-29 18:54:57 -05:00
George Hotz
dca076dbf1 remove dumb nn ops 2021-11-29 18:05:31 -05:00
George Hotz
f909ab194f gelu with broken test 2021-11-29 15:00:50 -05:00
George Hotz
c752033283 fix GPU OOM in test 2021-11-29 13:05:59 -05:00
George Hotz
99b6051467 add ff_dim to transformer 2021-11-29 12:40:52 -05:00
George Hotz
29dee59368 cat: forward only not required 2021-11-29 00:14:56 -05:00
George Hotz
3cdc77f526 add cat support 2021-11-28 23:21:49 -05:00
George Hotz
ce3d198bb7 less lines and fix default device 2021-11-27 11:18:49 -05:00
George Hotz
7ae14179d3 refactor ops 2021-11-27 11:12:23 -05:00
George Hotz
c162e748f5 fix float64 warning on training 2021-10-30 20:07:31 -07:00
George Hotz
b0f14b4af8 move datasets into datasets 2021-10-30 19:55:50 -07:00
George Hotz
7472a7ebe2 not forcing 3.9 for a stupid type 2021-10-30 16:52:40 -07:00
George Hotz
fc6597a6d9 only resnet18, it's too slow otherwise 2021-10-30 16:48:39 -07:00
Evan Mays
285621aeda Cherry backprop for conv2d (#281)
* quick math: 0 + x = x.

* gradient w.r.t. x using cherry for conv

* gradient w.r.t. w for conv on cherry but doing vector dot products

* small optimization

* [cherry] optimize conv backpass for large channel count

* get rid of numpy einsum
2021-10-30 16:12:19 -07:00
Sebastian Kreft
8113eec4cf feat: add efficientnet test (#285)
Simple test using the Chicken example from https://upload.wikimedia.org/wikipedia/commons/4/41/Chicken.jpg and the image preprocessing from example/efficientnet.py

Note that EfficientNet loads the weights from the internet so running the tests may be slow the first time. We could speed up the tests by caching the /tmp folder.

Fixes #234
2021-10-30 15:53:51 -07:00
Guglielmo Camporese
2b7589db64 Added ResNet-{18, 34, 50, 101, 152} (#271)
* added resnets

* fix minor

* fix minor

* resnet in models

* added resnet test

* added resnet train test

* added linear, conv2d nn tests

* fix minor in extra/training

* resnet in models

* fix minor

* fix tolerance for linear in nn test

* fix eval, this causes cpu and gpu UT failing

* revert transformer test

* fix minor for CPU test

* improved model get_params for sequential layer

* fix minor for params counting

* commented broken ops tests

* improved train for resnet
2021-06-21 09:37:24 -07:00
George Hotz
89798d2f43 some flags 2021-06-19 11:46:31 -07:00
George Hotz
d3f169b267 move good models to models, add a training step test 2021-06-19 11:24:15 -07:00
Jacky Lee
3a91d5434f Add dropout test (#265)
* Add dropout test

* Remove condition where training is false

* Skip dropout test when on GPU

* Revert changes to tensor.py and fix test case

* Revert change on whitespace

* Convert Tensor to cpu for testing

* Fix whitespace in tensor.py
2021-06-19 08:49:13 -07:00
George Hotz
2affd226b3 speed up sum 2021-06-17 16:38:34 -07:00
George Hotz
c1d469d440 sum op 2021-06-17 16:19:35 -07:00
George Hotz
2075fdeb4f FPGA Based Accelerator for Tinygrad (#258)
* ops_risk

* risk sim

* guessing is for winners

* minor

* better

* matmul with risk

* conv doesn't work

* closer

* conv2d works

* ops_risk

* opt2 works

* opt1 may not be possible

* opt1 is a mulacc

* arty

* attosoc example building on mac

* minor

* riscv assembler

* gucci gang

* we got C code

* not a scam

* hello

* make risk mergeable into master

* unop support
2021-06-07 17:45:09 -07:00
Skosh
81bf933a91 Improved __getitem__ (#254)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but it's kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thing

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrad's output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…

* Improved __getitem__

* Updated

* Updated __getitem__

* Linebreaks

* Maybe this works?

* Added MNIST locally, tests run now
2021-05-05 22:15:22 -07:00
Skosh
78aa147b39 [WIP] YOLO working on tinygrad! (#245)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but it's kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thing

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrad's output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…
2021-04-25 18:06:52 -07:00
George Hotz
62e3a8558c fix tolerance maybe 2021-01-05 07:45:47 -08:00
George Hotz
8a38e0d207 only mish failed 2021-01-03 09:47:11 -08:00
George Hotz
1a4487965a remove negative from things w/o negative 2021-01-03 09:43:34 -08:00
George Hotz
0702e0c763 nah, no sign, it's not what you want. use relu 2021-01-03 09:30:33 -08:00