Commit Graph

19 Commits

Author SHA1 Message Date
George Hotz
f5467cfedc Devicebufferless (#708)
* runs one metal kernel

* conv2d works

* ops tests are passing

* const folding

* all ops work

* pre commit always passes

* torch works

* working still

* fix graph test

* tests passing

* image almost works

* image conv works

* most images

* fix custom

* fix assignment

* fix compile enet

* clean up comments

* fix realize return value

* include shapetracker in LB repr

* copy should make a copy

* reenable method cache

* fix lna

* dtypes in graph

* forward only for IMAGE=2

* simple realize

* getting close

* fixup new api, it's good except the kernel count

* back to 197 kernels

* tests should pass

* go to a real float

* no type_on_cpu

* fix the docs

* put shapetracker back in it's proper place
2023-03-18 14:40:23 -07:00
George Hotz
3a8af99adb i understand ClassVar now 2023-03-15 09:00:25 -07:00
Pasan Perera
df48753692 fixed the import error for latest changes in master (#705) 2023-03-15 08:59:42 -07:00
George Hotz
54f499b623 Move rawbuffer (#697)
* move GlobalCounters to helpers

* that's not part of the public api

* move InterpretedBuffer

* remove fromCPU from devicebuffer
2023-03-13 22:30:36 -07:00
George Hotz
cbc5a7222a symbolic is now a 6/10 due to the infinite loop. do better. 2023-03-13 00:07:59 -07:00
George Hotz
c594a0a835 fix flip bug, add new unit tests 2023-03-12 23:55:31 -07:00
George Hotz
ce1564b05e fix shapetracker test 2023-03-12 22:33:25 -07:00
George Hotz
153cce0f7e tutorial 2023-03-12 22:31:46 -07:00
George Hotz
8d16ebaea7 we have docs: 2023-03-12 19:05:44 -07:00
George Hotz
0ba6179de7 stable diffusion in readme 2022-09-05 18:51:56 -07:00
George Hotz
81c9438ea1 keepdim avoids reshapes 2022-06-05 15:56:42 -07:00
George Hotz
7a3fe34db1 GPU llops 2022-06-05 13:49:39 -07:00
George Hotz
2097d814f6 Sum doesn't need to save the tensor 2022-06-05 12:04:51 -07:00
George Hotz
fc6597a6d9 only resnet18, it's too slow otherwise 2021-10-30 16:48:39 -07:00
George Hotz
2075fdeb4f FPGA Based Accelerator for Tinygrad (#258)
* ops_risk

* risk sim

* guessing is for winners

* minor

* better

* matmal with risk

* conv doesn't work

* closer

* conv2d works

* ops_risk

* opt2 works

* opt1 may not be possible

* opt1 is a mulacc

* arty

* attosoc example building on mac

* minor

* riscv assembler

* gucci gang

* we got C code

* not a scam

* hello

* make risk mergeable into master

* unop support
2021-06-07 17:45:09 -07:00
George Hotz
1ae0e88627 nvidia notes 2021-05-26 14:27:00 -07:00
Skosh
78aa147b39 [WIP] YOLO working on tinygrad! (#245)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but its kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thign

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrads output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…
2021-04-25 18:06:52 -07:00
NeuralLink
1a1c63a08b Gan is real...Look what tiny just generated! (#192)
* mode collapse solved

* info add

* delete unnecessary imports

* readme
2020-12-13 20:23:12 -08:00
=
6b44a7f729 adds beautiful and meaningful logo 2020-10-26 18:12:49 +01:00