Commit Graph

1207 Commits

Author SHA1 Message Date
George Hotz
5ea44cefcc llama: add lexie personality 2023-03-11 10:23:33 -08:00
George Hotz
c908f911a7 llama defaults to metal on osx 2023-03-11 09:30:13 -08:00
George Hotz
5e1380df6a profiling llama + cache is_contiguous 2023-03-11 08:23:21 -08:00
George Hotz
f3ac52aee8 Mypyc (#680)
* building shapetracker

* default ENABLE_METHOD_CACHE

* symbolic compiles

* improve types

* tensor compiles

* oops, that's a bug

* best of both worlds

* find legit typing bugs

* pad2d can take list or tuple

* sub 200ms when compiled
2023-03-11 07:33:30 -08:00
George Hotz
b1206bcb18 third try at torch loading (#677)
* third try at torch loading

* numpy fixed

* fix enet compile

* load_single_weight supports empty weights

* oops, CPU wasn't the default

* so many bugs
2023-03-10 19:11:29 -08:00
George Hotz
8bf75a7fdd fix stable diffusion and CI 2023-03-10 17:48:12 -08:00
George Hotz
4780f9a6df llama runs (slowly) in master 2023-03-10 17:36:51 -08:00
jspieler
da7fb4b227 Fixed DDPG example (#667) 2023-03-09 11:49:52 -08:00
George Hotz
c22afc52db move the custom function example to a test 2023-03-08 10:05:04 -08:00
George Hotz
7d3b9d0e95 oops, things relied on that API. the global cache needs access to the ASTRunner class 2023-03-08 08:39:31 -08:00
George Hotz
4f957423c3 jitting custom ops + OPTLOCAL assignment bugfix 2023-03-08 08:30:37 -08:00
George Hotz
7285de41a1 tinygrad supports CUSTOM functions 2023-03-08 07:50:33 -08:00
Pankaj Doharey
9d97d97b26 Opens image in default viewer after saving. (#612) 2023-03-03 17:28:49 -08:00
George Hotz
2e26286294 speed like you wouldn't believe (#626)
* speed like you wouldn't believe

* fix tests
2023-03-02 07:49:19 -08:00
George Hotz
bfcec234a2 Refactor ASTs (#622)
* ugh worst branch name

* compiler refactor continues

* scc -> cloc

* buf -> _buf

* finish _buf, and program -> runtime

* gpu is still working, clang isn't

* clang in new style

* ops_metal

* something broke it

* improve metal

* clean up tons of cl crap

* hack fix sync

* cleaner gpu

* gpu metal clang

* cleanups

* minor refactor

* GPUCodegen

* fix up LLVM

* blind CUDA refactor

* codegen / runtime

* keep ops naming

* linter passes

* woah, llvm was allocing 4x what it needed to

* bugfixes

* fix openpilot compiler

* fix compile_efficientnet

* method cache should fix tests

* deal with duped functions
2023-03-01 18:57:29 -08:00
George Hotz
c4856aa193 fix yolo webcam 2023-02-26 17:24:05 -08:00
Jacky Lee
0f58c4c648 Cleanup yolo and remove stateless classes (#604)
* Add AvgPool2d as a layer

* Clean up a bit

* Remove stateless layers in yolo_nn

* More cleanup

* Save label for test

* Add test for YOLO

* Test without cv2

* Don't fail if cv2 not installed

* Better import

* Fix image read

* Use opencv :)

* Don't download the file

* Fix errors

* Use same version

* Set higher confidence

* Why is the confidence so low?

* Start over

* Remove stateless layers

* Remove extra lines

* Revert changes

* Save a few more lines
2023-02-26 16:55:21 -08:00
voidz
94bec40110 moved extras/jit.py -> tinygrad/jit.py (#599)
* moved extras/jit.py to tinygrad/jit.py

* fixed indent

* removed tinygrad.helpers.DEBUG from jit.py
2023-02-25 08:32:33 -08:00
Benedikt Mandelkow
7348e9a6c6 add restrict qualifier to inputs in c backend (#593)
* add restrict qualifier for clang backend convolution inputs/ outputs
see https://godbolt.org/z/Tb9jMxWfx for generated assembly

* enable more checks

* inline fmax to motivate the compiler to inline some more

* fix if else binding power
2023-02-25 08:32:21 -08:00
George Hotz
2e56a4793e rename log_softmax, support dim, fix onnx Softmax 2023-02-24 10:11:24 -08:00
George Hotz
94ccab941e compile_tensorflow: no cast required 2023-02-22 21:14:21 -08:00
George Hotz
135d0ddb78 compile_tensorflow: read weights from disk 2023-02-22 21:12:35 -08:00
George Hotz
0615dcffe7 compile_tensorflow: save the weights 2023-02-22 21:05:45 -08:00
George Hotz
c537fd0614 compile_tensorflow: add initialize and tests 2023-02-22 20:50:53 -08:00
George Hotz
dc914cde50 compile_tensorflow 2023-02-22 20:08:58 -08:00
George Hotz
76b4d0577d yolov8 works up to the MaxPool 2023-02-22 19:32:13 -08:00
Mischa Untaga
14bb2c40a2 Fix yolov3 example (#577) 2023-02-21 09:24:00 -08:00
George Hotz
d9fa47ecc9 use the TinyJit in the efficientnet runner, 200ms -> 20ms 2023-02-20 19:58:16 -08:00
George Hotz
714bf4b108 clang backend (#572)
* start clang backend

* mostly working

* no group for reduce w clang

* it compiles

* compiles

* a11y

* minor fixups

* formatting

* add a test

* rename test
2023-02-20 18:18:18 -08:00
Jacky Lee
cb679cd051 Fix weight initialization (#566)
* Fix weight initialization

* Use scaled_uniform in serious_mnist
2023-02-19 11:25:29 -08:00
Kirill
7944cfdadc Remove Tensor.data (#565) 2023-02-18 16:36:12 -08:00
Jacky Lee
7e8b0305f3 Fix mnist gan example (#563) 2023-02-18 13:45:37 -08:00
Jacky Lee
9fd41632c6 Import get_parameters from tinygrad.nn (#559)
* get_parameter is in optim

* Update all imports for get_parameters

* Clean up

* use optim.get_paramters
2023-02-17 15:22:26 -08:00
Jacky Lee
e172f0087a BatchNorm2D -> BatchNorm2d (#558)
* BatchNorm2D -> BatchNorm2d

* Fix typo
2023-02-16 12:31:49 -08:00
Jacky Lee
c35fcc6964 Replace phrase for prompt (#555) 2023-02-12 09:04:44 -08:00
George Hotz
191c76cfd7 hlb_cifar10 torch version 2023-02-11 18:04:40 -08:00
George Hotz
9057d98d36 no lr decay in cifar. test this in torch tomorrow 2023-02-11 17:42:54 -08:00
George Hotz
dd7accb9cc decay LR, little bugfix 2023-02-11 17:34:15 -08:00
George Hotz
ba3bf5bdf7 cifar stops learning 2023-02-11 17:21:42 -08:00
George Hotz
7d33f2d659 CL.CACHE is over, GlobalCounters.cache is it 2023-02-11 12:00:14 -08:00
George Hotz
9152bb5b4a momentum support in SGD 2023-02-11 10:22:37 -08:00
George Hotz
031edd01e6 switch openpilot compile to TinyJit 2023-02-11 09:51:44 -08:00
jspieler
8f912c3966 added deep deterministic policy gradient example (#531) 2023-02-11 10:10:46 -06:00
George Hotz
608fd730d3 put the JIT in extra 2023-02-11 00:35:18 -06:00
George Hotz
ed8ae7522a tinyjit 2023-02-11 00:22:36 -06:00
George Hotz
4c90a15689 make the fake data actually learnable 2023-02-10 23:35:21 -06:00
George Hotz
07629d7476 fakedata and move to new cache 2023-02-10 23:32:31 -06:00
George Hotz
63fa7daf30 wrong place for CL 2023-02-10 23:22:24 -06:00
George Hotz
fed95119dc CL.mem_used -> GlobalCounters.mem_used 2023-02-10 23:13:29 -06:00
Kirill
27154db99a Downloads weights in examples/stable_diffusion.py (#537)
* Downloads weights in examples/stable_diffusion.py

* use download_file_if_not_exists in fetch

* make consistent with previous NOCACHE behavior
2023-02-10 14:37:04 -06:00