Commit Graph

11106 Commits

Author SHA1 Message Date
George Hotz
d3f169b267 move good models to models, add a training step test 2021-06-19 11:24:15 -07:00
George Hotz
b48d4bad2e clean up print spam 2021-06-19 10:31:04 -07:00
Jacky Lee
3a91d5434f Add dropout test (#265)
* Add dropout test

* Remove condition where training is false

* Skip dropout test when on GPU

* Revert changes to tensor.py and fix test case

* Revert change on whitespace

* Convert Tensor to cpu for testing

* Fix whitespace in tensor.py
2021-06-19 08:49:13 -07:00
George Hotz
ca0a38f2d5 more tpu notes 2021-06-18 13:28:06 -07:00
George Hotz
027535d0b5 microcoded matmul 2021-06-17 21:03:08 -07:00
George Hotz
d1dd9b46f6 if i was giving systolic arrays thumbs like siskel and ebert (RIP) i would give them 0 2021-06-17 19:48:58 -07:00
George Hotz
026e2ae6a7 three registers and a zero command 2021-06-17 17:09:18 -07:00
George Hotz
2e71ae33f6 max op works 2021-06-17 17:01:21 -07:00
George Hotz
9e12c1bbba cherry binop 2021-06-17 16:50:40 -07:00
George Hotz
fcdabea880 training mnist with cherry ops 2021-06-17 16:45:35 -07:00
George Hotz
2affd226b3 speed up sum 2021-06-17 16:38:34 -07:00
George Hotz
e8eb7d1b7e max op 2021-06-17 16:20:56 -07:00
George Hotz
c1d469d440 sum op 2021-06-17 16:19:35 -07:00
George Hotz
d6517a8a7c ins 2021-06-16 19:31:13 -07:00
George Hotz
29a08ba352 pytorch earlier 2021-06-16 12:24:21 -07:00
George Hotz
4a07b71731 update business model 2021-06-16 12:01:50 -07:00
George Hotz
d29b16e5b4 more business notes 2021-06-16 11:47:57 -07:00
George Hotz
b1000d866e readme, plus reduce ops 2021-06-16 11:21:06 -07:00
George Hotz
ff3fdc58e5 risk -> cherry 2021-06-16 09:59:48 -07:00
George Hotz
2f91c012eb build note 2021-06-15 22:41:41 -07:00
George Hotz
0c02b66259 more 2021-06-15 15:02:32 -07:00
George Hotz
1e62e45d67 better todo 2021-06-15 10:30:16 -07:00
George Hotz
9ca4388695 debug 2021-06-15 10:24:21 -07:00
George Hotz
3d44aab52c more 2021-06-15 10:23:57 -07:00
George Hotz
4850d6eb43 update todo 2021-06-15 10:22:39 -07:00
George Hotz
4e1edb3692 have tinygrad log the loads 2021-06-14 18:35:14 -07:00
George Hotz
93f2e9769d little note 2021-06-14 15:49:41 -07:00
Jacky Lee
611d81dcb4 Add asserts for non-zero indices (#264) 2021-06-13 21:14:46 -07:00
George Hotz
508ced114c readme 2021-06-13 17:17:44 -07:00
Dinesh Kumar Gnanasekaran
2146860307 fixed OpenCL installation while running tests (#262)
Co-authored-by: dinesh <dinesh-GDK>
2021-06-12 11:14:21 -07:00
George Hotz
a89d12d735 wow, way faster 2021-06-10 17:11:39 -07:00
George Hotz
10b1306525 binops 2021-06-10 16:52:37 -07:00
George Hotz
4535d39baa comments and pow 2021-06-10 09:03:40 -07:00
George Hotz
2075fdeb4f FPGA Based Accelerator for Tinygrad (#258)
* ops_risk

* risk sim

* guessing is for winners

* minor

* better

* matmul with risk

* conv doesn't work

* closer

* conv2d works

* ops_risk

* opt2 works

* opt1 may not be possible

* opt1 is a mulacc

* arty

* attosoc example building on mac

* minor

* riscv assembler

* gucci gang

* we got C code

* not a scam

* hello

* make risk mergeable into master

* unop support
2021-06-07 17:45:09 -07:00
George Hotz
77ba198b57 Revert "Update README.md (#259)" (#260)
This reverts commit 5a69c5db6d.
2021-06-04 14:41:41 -07:00
Gabriel Rojas
5a69c5db6d Update README.md (#259) 2021-06-04 14:41:07 -07:00
Josh Smith
ad756f6112 minor optimizations & cleaning (#257)
* use isinstance, some optimizations & whitespace removal

* revert whitespace changes

* revert more whitespace

* some more cleanup

* revert fstring (not a fan of the {{}})

* fix typo

* fix typo
2021-06-02 09:57:15 -07:00
George Hotz
74e874cc0d comment 2021-05-26 18:06:55 -07:00
George Hotz
343c5f13c7 add output shape to DEBUG 2021-05-26 17:42:38 -07:00
George Hotz
b80cacb416 fix GPU efficientnet example 2021-05-26 17:29:35 -07:00
George Hotz
1ae0e88627 nvidia notes 2021-05-26 14:27:00 -07:00
20kdc
2653d33292 vgg7 (image upscaling) implementation - not the best, but it works (#255)
* vgg7 implementation - not the best, but it works

* VGG7 implementation: Spread nansbane to deter NaNs, maybe improved training experience

* VGG7 implementation: Fix training, for real this time

Results actually attempt to approximate the input

* VGG7 implementation: Sample probability management
2021-05-12 23:48:51 -07:00
Skosh
81bf933a91 Improved __getitem__ (#254)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but its kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thing

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrads output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…

* Improved __getitem__

* Updated

* Updated __getitem__

* Linebreaks

* Maybe this works?

* Added MNIST locally, tests run now
2021-05-05 22:15:22 -07:00
Skosh
78aa147b39 [WIP] YOLO working on tinygrad! (#245)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but its kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thing

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrads output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…
2021-04-25 18:06:52 -07:00
ziofil
155ec1f18e saving 50 LOC with automatic @staticmethod for forward and backward (#252)
* automatic @staticmethod for forward and backward

* triggering unit tests
2021-04-25 18:04:16 -07:00
"freedom" Koan-Sin Tan
f0cc2b66f8 add an aneccompile example in Objective-C (#240)
* add an aneccompile example in Objective-C

add a compile.m corresponding to compile.mm

build with
```clang compile.m -F /System/Library/PrivateFrameworks/ -framework ANECompiler -framework Foundation```

CoreFoundation framework is a C library.
Foundation is an Objective-C framework.

CF data structures in CoreFoundation usually have corresponding NS data structures in Foundation, e.g.,
NSDictionary is "toll-free bridged" with its Core Foundation counterpart, CFDictionary.
See [1].

[1] https://developer.apple.com/library/archive/documentation/General/Conceptual/CocoaEncyclopedia/Toll-FreeBridgin/Toll-FreeBridgin.html

* figure out how to use param_3 of ANECCompile

add a simple param_3 blocks callback, which dumps the status
dictionary when status != 0
2021-01-31 08:31:16 -08:00
Göktuğ Karakaşlı
eabe0b9017 remove deepwalk args (#243) 2021-01-31 08:30:17 -08:00
George Hotz
ce77dda805 yolov5 v4 2021-01-05 07:56:17 -08:00
George Hotz
62e3a8558c fix tolerance maybe 2021-01-05 07:45:47 -08:00
Asim
1c148f2fe4 fixed example broken after gpu refactor (#238) 2021-01-05 07:41:54 -08:00