Commit Graph

242 Commits

Author SHA1 Message Date
George Hotz
c4856aa193 fix yolo webcam 2023-02-26 17:24:05 -08:00
Jacky Lee
0f58c4c648 Cleanup yolo and remove stateless classes (#604)
* Add AvgPool2d as a layer

* Clean up a bit

* Remove stateless layers in yolo_nn

* More cleanup

* Save label for test

* Add test for YOLO

* Test without cv2

* Don't fail if cv2 not installed

* Better import

* Fix image read

* Use opencv :)

* Don't download the file

* Fix errors

* Use same version

* Set higher confidence

* Why is the confidence so low?

* Start over

* Remove stateless layers

* Remove extra lines

* Revert changes

* Save a few more lines
2023-02-26 16:55:21 -08:00
voidz
94bec40110 moved extras/jit.py -> tinygrad/jit.py (#599)
* moved extras/jit.py to tinygrad/jit.py

* fixed indent

* removed tinygrad.helpers.DEBUG from jit.py
2023-02-25 08:32:33 -08:00
Benedikt Mandelkow
7348e9a6c6 add restrict qualifier to inputs in c backend (#593)
* add restrict qualifier for clang backend convolution inputs/outputs
see https://godbolt.org/z/Tb9jMxWfx for generated assembly

* enable more checks

* inline fmax to motivate the compiler to inline some more

* fix if else binding power
2023-02-25 08:32:21 -08:00
George Hotz
2e56a4793e rename log_softmax, support dim, fix onnx Softmax 2023-02-24 10:11:24 -08:00
George Hotz
94ccab941e compile_tensorflow: no cast required 2023-02-22 21:14:21 -08:00
George Hotz
135d0ddb78 compile_tensorflow: read weights from disk 2023-02-22 21:12:35 -08:00
George Hotz
0615dcffe7 compile_tensorflow: save the weights 2023-02-22 21:05:45 -08:00
George Hotz
c537fd0614 compile_tensorflow: add initialize and tests 2023-02-22 20:50:53 -08:00
George Hotz
dc914cde50 compile_tensorflow 2023-02-22 20:08:58 -08:00
George Hotz
76b4d0577d yolov8 works up to the MaxPool 2023-02-22 19:32:13 -08:00
Mischa Untaga
14bb2c40a2 Fix yolov3 example (#577) 2023-02-21 09:24:00 -08:00
George Hotz
d9fa47ecc9 use the TinyJit in the efficientnet runner, 200ms -> 20ms 2023-02-20 19:58:16 -08:00
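For context on the commit above, the TinyJit pattern used in runners like the efficientnet example looks roughly like the sketch below. This is an illustrative sketch, not the actual runner code: the function, shapes, and the `tinygrad.jit` import path (the JIT sits in extra/jit.py at this point in the history and moves to tinygrad/jit.py in #599 above) are assumptions.

```python
# Illustrative sketch of the TinyJit pattern (not the actual efficientnet runner).
# TinyJit records the backend kernels launched by the wrapped function on an
# early call and replays them on later calls, skipping the Python-side graph
# setup that accounts for overhead like the 200ms -> 20ms gap mentioned above.
from tinygrad.tensor import Tensor
from tinygrad.jit import TinyJit  # lives in extra/jit.py before #599; path is version-dependent

w = Tensor.randn(64, 64)

@TinyJit
def step(x: Tensor) -> Tensor:
  # return a realized Tensor so the launched kernels get captured
  return x.matmul(w).relu().realize()

for _ in range(5):
  out = step(Tensor.randn(1, 64))  # later iterations replay the cached kernels
```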
George Hotz
714bf4b108 clang backend (#572)
* start clang backend

* mostly working

* no group for reduce w clang

* it compiles

* compiles

* a11y

* minor fixups

* formatting

* add a test

* rename test
2023-02-20 18:18:18 -08:00
Jacky Lee
cb679cd051 Fix weight initialization (#566)
* Fix weight initialization

* Use scaled_uniform in serious_mnist
2023-02-19 11:25:29 -08:00
Kirill
7944cfdadc Remove Tensor.data (#565) 2023-02-18 16:36:12 -08:00
Jacky Lee
7e8b0305f3 Fix mnist gan example (#563) 2023-02-18 13:45:37 -08:00
Jacky Lee
9fd41632c6 Import get_parameters from tinygrad.nn (#559)
* get_parameter is in optim

* Update all imports for get_parameters

* Clean up

* use optim.get_parameters
2023-02-17 15:22:26 -08:00
Jacky Lee
e172f0087a BatchNorm2D -> BatchNorm2d (#558)
* BatchNorm2D -> BatchNorm2d

* Fix typo
2023-02-16 12:31:49 -08:00
Jacky Lee
c35fcc6964 Replace phrase for prompt (#555) 2023-02-12 09:04:44 -08:00
George Hotz
191c76cfd7 hlb_cifar10 torch version 2023-02-11 18:04:40 -08:00
George Hotz
9057d98d36 no lr decay in cifar. test this in torch tomorrow 2023-02-11 17:42:54 -08:00
George Hotz
dd7accb9cc decay LR, little bugfix 2023-02-11 17:34:15 -08:00
George Hotz
ba3bf5bdf7 cifar stops learning 2023-02-11 17:21:42 -08:00
George Hotz
7d33f2d659 CL.CACHE is over, GlobalCounters.cache is it 2023-02-11 12:00:14 -08:00
George Hotz
9152bb5b4a momentum support in SGD 2023-02-11 10:22:37 -08:00
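For readers who want the math behind the commit above, the standard momentum rule an SGD optimizer adds looks roughly like this; a generic sketch of the textbook update, not tinygrad's actual optim code.

```python
# Generic SGD-with-momentum sketch (textbook rule, not tinygrad's actual code):
#   v <- momentum * v + grad
#   p <- p - lr * v
import numpy as np

def sgd_momentum_step(params, grads, velocities, lr=0.001, momentum=0.9):
  for p, g, v in zip(params, grads, velocities):
    v *= momentum
    v += g        # accumulate the gradient into the velocity buffer
    p -= lr * v   # in-place parameter update
  return params, velocities

# usage: params/grads/velocities are matching lists of float32 numpy arrays
params = [np.zeros((4, 4), dtype=np.float32)]
grads = [np.ones((4, 4), dtype=np.float32)]
vels = [np.zeros_like(p) for p in params]
sgd_momentum_step(params, grads, vels)
```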
George Hotz
031edd01e6 switch openpilot compile to TinyJit 2023-02-11 09:51:44 -08:00
jspieler
8f912c3966 added deep deterministic policy gradient example (#531) 2023-02-11 10:10:46 -06:00
George Hotz
608fd730d3 put the JIT in extra 2023-02-11 00:35:18 -06:00
George Hotz
ed8ae7522a tinyjit 2023-02-11 00:22:36 -06:00
George Hotz
4c90a15689 make the fake data actually learnable 2023-02-10 23:35:21 -06:00
George Hotz
07629d7476 fakedata and move to new cache 2023-02-10 23:32:31 -06:00
George Hotz
63fa7daf30 wrong place for CL 2023-02-10 23:22:24 -06:00
George Hotz
fed95119dc CL.mem_used -> GlobalCounters.mem_used 2023-02-10 23:13:29 -06:00
Kirill
27154db99a Downloads weights in examples/stable_diffusion.py (#537)
* Downloads weights in examples/stable_diffusion.py

* use download_file_if_not_exists in fetch

* make consistent with previous NOCACHE behavior
2023-02-10 14:37:04 -06:00
Jacky Lee
f08187526f Fix examples (#540)
* Fix examples

* Remove training in parameters

* Simplify a bit

* Remove extra import

* Fix linter errors

* factor out Device

* NumPy-like semantics for Tensor.__getitem__ (#506) (see the indexing sketch after this entry)

* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis

* update cpu and torch to hold buffers (#542)

* update cpu and torch to hold buffers

* save lines, and probably faster

* Mypy fun (#541)

* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup

* dyn add of math ops

* refactor ops_cpu and ops_torch to not share code

* nn/optim.py compiles now

* Reorder imports

* call mkdir only if directory doesn't exist

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: Mitchell Goff <mitchellgoffpc@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-02-10 12:09:37 -06:00
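The #506 change folded into the merge above gives Tensor.__getitem__ NumPy-like indexing: negative indices, None/np.newaxis, and Python-style out-of-bounds slicing. A rough sketch of what those semantics mean, assuming the behavior tracks NumPy as the PR bullets state (the shapes here are illustrative):

```python
# Sketch of the NumPy-like __getitem__ semantics described in #506
# (assumes the behavior mirrors NumPy, as the PR bullets state).
import numpy as np
from tinygrad.tensor import Tensor

t = Tensor(np.arange(12, dtype=np.float32).reshape(3, 4))

a = t[-1]        # negative index counts from the end -> shape (4,)
b = t[:, None]   # None / np.newaxis inserts an axis  -> shape (3, 1, 4)
c = t[0:100]     # out-of-bounds slice clamps like plain Python -> shape (3, 4)
print(a.shape, b.shape, c.shape)
```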
George Hotz
a5a55ac19e GlobalCounters cache + assign in optim 2023-02-08 17:10:55 -06:00
George Hotz
3d63934995 refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime

* fix thneed, rename cl to _cl

* bugfix + _cuda

* fix tests

* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
2844482a60 Mypy fun (#541)
* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup
2023-02-08 09:56:51 -06:00
George Hotz
f7291f6ca3 fixes big KOPT, breaks opencl (#505)
* fixes big KOPT, breaks opencl

* fix optimizer

* KernelCache

* oops, broke batchnorm

* hack to fix it

* fix llvm, less hacky gpu

* disable the cache

* cache just breaks things
2023-02-05 10:46:17 -08:00
James Roberts
db0a9b0a2d Refactor CL.time_sum into GlobalCounters (#519) 2023-02-01 20:13:56 -08:00
George Hotz
5e37f084db stable diffusion: clean up constant folding 2023-02-01 12:53:16 -08:00
Jacky Lee
486f023e81 Rename Normalize and move to nn (#513)
* Rename Normalize and move to nn

* Match PyTorch for dim>1
2023-02-01 11:55:03 -08:00
Jacky Lee
799b3f185a Refactor getenv into helpers (#508)
* Refactor getenv into helpers

* Remove unused os

* Fix default value

* Fix more defaults for CI

* Fix bracket

* Revert changes to openpilot/compile.py

* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
George Hotz
21f2af08d5 getenv + graphing 2023-01-30 19:15:03 -08:00
George Hotz
60ccddb58b reenable SWAP 2023-01-30 17:32:02 -08:00
George Hotz
aea55eb196 found failing upcast 2023-01-30 16:12:56 -08:00
George Hotz
7ee0d99c70 CLCACHE 2023-01-30 14:02:06 -08:00
George Hotz
cccfea4b25 factor out KOPT code 2023-01-30 13:13:55 -08:00
George Hotz
de2c419fd4 make_pair and first attempt at hlb_cifar10 2023-01-30 11:07:23 -08:00