70 Commits

Author SHA1 Message Date
George Hotz
d8dda2af3a openpilot fixups 2023-03-06 14:14:44 -08:00
George Hotz
382f346523 clean up opt (#649)
* clean up opt

* don't let global kernels get too small

* 8192 -> 1024

* disable local shape for clang

* fix can_merge

* unroll the 5x5 depthwise convs in op

* load float4 check
2023-03-05 20:49:36 -08:00
George Hotz
c53efb3635 optimize for CL (#633)
* required opt

* simplify

* works

* shift_to_last

* required is fine

* print shape in colored

* better shape

* args was wrong

* debugs

* fix empty shape

* colored shape printer
2023-03-03 22:00:09 -08:00
George Hotz
1a84976d4d fix thneed gflops 2023-03-03 16:52:59 -08:00
George Hotz
b9ce20c374 openpilot test wasn't running, factor out image idx 2023-03-03 07:41:53 -08:00
George Hotz
2e26286294 speed like you wouldn't believe (#626)
* speed like you wouldn't believe

* fix tests
2023-03-02 07:49:19 -08:00
George Hotz
bfcec234a2 Refactor ASTs (#622)
* ugh worst branch name

* compiler refactor continues

* scc -> cloc

* buf -> _buf

* finish _buf, and program -> runtime

* gpu is still working, clang isn't

* clang in new style

* ops_metal

* something broke it

* improve metal

* clean up tons of cl crap

* hack fix sync

* cleaner gpu

* gpu metal clang

* cleanups

* minor refactor

* GPUCodegen

* fix up LLVM

* blind CUDA refactor

* codegen / runtime

* keep ops naming

* linter passes

* woah, llvm was allocing 4x what it needed to

* bugfixes

* fix openpilot compiler

* fix compile_efficientnet

* method cache should fix tests

* deal with duped functions
2023-03-01 18:57:29 -08:00
voidz
94bec40110 moved extras/jit.py -> tinygrad/jit.py (#599)
* moved extras/jit.py to tinygrad/jit.py

* fixed indent

* removed tinygrad.helpers.DEBUG from jit.py
2023-02-25 08:32:33 -08:00
George Hotz
d3029c91c5 no rng for op test 2023-02-24 00:23:20 -08:00
George Hotz
661812ffef don't ignore type 2023-02-23 19:38:52 -08:00
George Hotz
8b0082540b openpilot compile cleanups 2023-02-20 09:16:03 -08:00
George Hotz
de71c13934 test speed v torch uses jit 2023-02-12 07:43:17 -08:00
George Hotz
031edd01e6 switch openpilot compile to TinyJit 2023-02-11 09:51:44 -08:00
George Hotz
3d63934995 refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime

* fix thneed, rename cl to _cl

* bugfix + _cuda

* fix tests

* thneed more correct
2023-02-08 16:46:09 -06:00
Jacky Lee
799b3f185a Refactor getenv into helpers (#508)
* Refactor getenv into helpers

* Remove unused os

* Fix default value

* Fix more defaults for CI

* Fix bracket

* Revert changes to openpilot/compile.py

* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
George Hotz
92001a06e1 openpilot/go.sh 2023-01-28 13:57:43 -08:00
George Hotz
6d7658db12 delete opencl <celebration> 2023-01-24 14:18:35 -08:00
George Hotz
e313c8af20 update openpilot tests from OPENCL to GPU 2023-01-24 14:05:20 -08:00
George Hotz
281b0db773 three from image 2023-01-12 12:26:58 -08:00
George Hotz
4885fce56e shapetracker from newgpu (#456)
* shapetracker from newgpu

* touchup ops

* test

* testst

* thneed deletes unused inputs

* test

* bugfix
2023-01-09 12:40:01 -08:00
George Hotz
e6b65f8e01 fix graph in openpilot/compile.py 2022-10-28 08:55:34 -07:00
George Hotz
ef62db3186 cleanups, remove E701 2022-10-28 08:28:56 -07:00
George Hotz
b65b70812a Exec AST (#404)
* working exec ast

* exec_ast is staticmethod

* GenericExecAST

* fold that sometimes

* ExplicitExecAST

* exec_ast for GPU

* gpu working

* get_lazyop_shape

* now gpubuffer is ExplicitExecAST

* dedup

* add a type

* RESHAPE in opencl code

* fix linter

* that too for linter

* cleanups

* remove dead code

* GenericShape is less lines

* add ALLOWED_KERNEL_COUNT to tests

* fix mypy

* that's gotta be recursive

* fix opencl shape processing

* remove unneeded lambda
2022-10-28 08:27:03 -07:00
George Hotz
6a8fb53304 move ops.py into lazy.py (#402)
* move ops.py into lazy.py

* fix graph and linter

* ugh, didn't add
2022-10-25 13:58:03 -07:00
George Hotz
3b9b7eda48 remove run_thneed dead code 2022-10-20 17:24:18 -07:00
George Hotz
1bec4651b3 fix nonstatic weights 2022-10-20 17:04:14 -07:00
George Hotz
50c95c7d9a add assert to catch issue in attention 2022-10-20 15:13:00 -07:00
George Hotz
26c78ccf7d remove useless buffer 2022-10-20 14:07:28 -07:00
George Hotz
a18c1f3178 zero out the inputs 2022-10-20 13:46:52 -07:00
George Hotz
61ee428e4c rerun 2022-10-20 13:29:14 -07:00
George Hotz
5dae64b7b0 read input shapes and break down the layers 2022-10-20 13:11:24 -07:00
George Hotz
e00601faea fix thneed self test 2022-10-20 12:55:02 -07:00
George Hotz
ace8db29f8 ReduceSum 2022-10-20 12:48:14 -07:00
George Hotz
c400ee0beb refactoring thneed (#400)
* refactoring thneed

* continue

* minor update

* looks like it's working

* big refactor

* confirm thneed got the right output

* code is there but it's broken

* works now

* always OPTWG, input -> dat

* fix type issue
2022-10-20 12:35:59 -07:00
YassineYousfi
ae0f9b17df openpilot: new models and onnx ops (#401)
* ngrl stuff

* fngrl

* fix typo in compile script

* workflow dispatch

* new models in tests

* dont need to up this threshold

Co-authored-by: HaraldSchafer <harald.the.engineer@gmail.com>
2022-10-20 11:49:19 -07:00
George Hotz
d6f499fd69 improve opencl, why is it OOMing 2022-09-05 20:14:31 -07:00
George Hotz
2e9b7637b3 don't save input buffers 2022-08-31 15:37:38 -07:00
George Hotz
a3fc64a585 fix batchnorm folding in openpilot compile 2022-08-31 13:04:49 -07:00
Comma Device
a734df98fa TEST_ENET for openpilot compiler 2022-08-31 13:23:36 -04:00
George Hotz
d919ac32af fix wrong size input 2022-08-31 09:07:34 -07:00
George Hotz
040640a580 fix cl import error 2022-08-31 08:43:44 -07:00
George Hotz
33ac355bcd still broken 2022-08-29 19:08:07 -07:00
George Hotz
5efab7cf1d add reciprocal 2022-08-29 18:00:24 -07:00
George Hotz
880707f2d2 no torch test if no torch 2022-08-29 15:29:19 -07:00
George Hotz
5eba228844 print inputs 2022-08-29 08:56:04 -07:00
George Hotz
dd587d26e3 oops, compare with abs 2022-08-28 11:23:21 -07:00
George Hotz
dc7af8c3ac thneed run float32 2022-08-28 11:03:35 -07:00
Comma Device
f0d11f29c7 float32 in image desc 2022-08-28 08:47:43 -07:00
George Hotz
11626053b0 run_thneed with test 2022-08-22 09:45:46 -07:00
George Hotz
e7a4cd91ba fix cpu thneed running 2022-08-21 12:11:07 -07:00