33 Commits

Author SHA1 Message Date
George Hotz
01f39b19dc move to shapetracker.py 2023-03-11 07:50:07 -08:00
George Hotz
bfcec234a2 Refactor ASTs (#622)
* ugh worst branch name

* compiler refactor continues

* scc -> cloc

* buf -> _buf

* finish _buf, and program -> runtime

* gpu is still working, clang isn't

* clang in new style

* ops_metal

* something broke it

* improve metal

* clean up tons of cl crap

* hack fix sync

* cleaner gpu

* gpu metal clang

* cleanups

* minor refactor

* GPUCodegen

* fix up LLVM

* blind CUDA refactor

* codegen / runtime

* keep ops naming

* linter passes

* woah, llvm was allocing 4x what it needed to

* bugfixes

* fix openpilot compiler

* fix compile_efficientnet

* method cache should fix tests

* deal with duped functions
2023-03-01 18:57:29 -08:00
George Hotz
c4c2c28738 a sustainable approach to float4 (#582)
* a sustainable approach to float4

* can_float4

* fix tests

* fix float4

* delete dead code

* types and minor cleanup
2023-02-22 09:45:08 -08:00
George Hotz
c5e2126d49 move DEBUG to helpers 2023-02-22 06:52:11 -08:00
George Hotz
82c257e8f5 more kernel search 2023-02-12 10:34:56 -08:00
George Hotz
ba3bf5bdf7 cifar stops learning 2023-02-11 17:21:42 -08:00
George Hotz
40f3949742 fancier KOPT 2023-02-11 16:40:25 -08:00
George Hotz
446442dbb3 fix tests symbolic 2023-02-11 15:16:47 -08:00
George Hotz
20a351a3c6 hand optim CONVW 2023-02-11 14:41:08 -08:00
George Hotz
5ed3622965 add dump to kernel_search 2023-02-10 12:13:30 -06:00
George Hotz
3d63934995 refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime

* fix thneed, rename cl to _cl

* bugfix + _cuda

* fix tests

* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
d93563f39f fix KOPT 2023-02-07 06:56:33 -06:00
George Hotz
f7291f6ca3 fixes big KOPT, breaks opencl (#505)
* fixes big KOPT, breaks opencl

* fix optimizer

* KernelCache

* oops, broke batchnorm

* hack to fix it

* fix llvm, less hacky gpu

* disable the cache

* cache just breaks things
2023-02-05 10:46:17 -08:00
Jacky Lee
799b3f185a Refactor getenv into helpers (#508)
* Refactor getenv into helpers

* Remove unused os

* Fix default value

* Fix more defaults for CI

* Fix bracket

* Revert changes to openpilot/compile.py

* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
George Hotz
60ccddb58b reenable SWAP 2023-01-30 17:32:02 -08:00
George Hotz
aea55eb196 found failing upcast 2023-01-30 16:12:56 -08:00
George Hotz
b67f997864 tests pass w/o float4 2023-01-30 15:40:49 -08:00
George Hotz
cccfea4b25 factor out KOPT code 2023-01-30 13:13:55 -08:00
George Hotz
2db272c7f7 Kernel Optimizer (#489)
* kernel optimizer

* 10x faster, but wrong. not good deal

* move test -> extra

* print x speedup

* clcache

* fix clcache + DEBUG

* GFLOPS estimate

* i==3
2023-01-29 17:15:00 -08:00
George Hotz
bb0cdc2442 111.51x speedup for reduce 2023-01-29 03:06:00 -08:00
George Hotz
45c0aa6e2d search with SHIFT, REDUCE 2023-01-29 02:42:20 -08:00
George Hotz
87879cf4b6 improve search more 2023-01-29 02:08:57 -08:00
George Hotz
f6bbd43cb8 improve search 2023-01-29 01:33:47 -08:00
George Hotz
ebdec2b72f fix optimizer 2023-01-29 00:23:06 -08:00
George Hotz
a9cabce791 oops, broke mem estimates 2023-01-28 20:21:31 -08:00
George Hotz
6d5e1a8029 GEMM kernel search 2023-01-27 10:08:57 -08:00
Comma Device
f08e740957 factor out hand coded opt 2023-01-26 14:54:06 -06:00
George Hotz
5e8a36a18b real op kernel 2023-01-26 09:51:32 -08:00
George Hotz
e0600f537a op kernel in kernel search 2023-01-26 09:47:01 -08:00
George Hotz
aafc29484a cleanups 2023-01-25 12:37:10 -08:00
George Hotz
919e943867 decent search 2023-01-25 12:20:53 -08:00
George Hotz
7f3da91f8b kernel_search 2023-01-25 12:05:09 -08:00
George Hotz
e37424424f first little attempt at search 2023-01-25 11:49:29 -08:00