George Hotz
01f39b19dc
move to shapetracker.py
2023-03-11 07:50:07 -08:00
George Hotz
bfcec234a2
Refactor ASTs (#622)
* ugh worst branch name
* compiler refactor continues
* scc -> cloc
* buf -> _buf
* finish _buf, and program -> runtime
* gpu is still working, clang isn't
* clang in new style
* ops_metal
* something broke it
* improve metal
* clean up tons of cl crap
* hack fix sync
* cleaner gpu
* gpu metal clang
* cleanups
* minor refactor
* GPUCodegen
* fix up LLVM
* blind CUDA refactor
* codegen / runtime
* keep ops naming
* linter passes
* woah, llvm was allocing 4x what it needed to
* bugfixes
* fix openpilot compiler
* fix compile_efficientnet
* method cache should fix tests
* deal with duped functions
2023-03-01 18:57:29 -08:00
George Hotz
c4c2c28738
a sustainable approach to float4 (#582)
* a sustainable approach to float4
* can_float4
* fix tests
* fix float4
* delete dead code
* types and minor cleanup
2023-02-22 09:45:08 -08:00
George Hotz
c5e2126d49
move DEBUG to helpers
2023-02-22 06:52:11 -08:00
George Hotz
82c257e8f5
more kernel search
2023-02-12 10:34:56 -08:00
George Hotz
ba3bf5bdf7
cifar stops learning
2023-02-11 17:21:42 -08:00
George Hotz
40f3949742
fancier KOPT
2023-02-11 16:40:25 -08:00
George Hotz
446442dbb3
fix tests symbolic
2023-02-11 15:16:47 -08:00
George Hotz
20a351a3c6
hand optim CONVW
2023-02-11 14:41:08 -08:00
George Hotz
5ed3622965
add dump to kernel_search
2023-02-10 12:13:30 -06:00
George Hotz
3d63934995
refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime
* fix thneed, rename cl to _cl
* bugfix + _cuda
* fix tests
* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
d93563f39f
fix KOPT
2023-02-07 06:56:33 -06:00
George Hotz
f7291f6ca3
fixes big KOPT, breaks opencl (#505)
* fixes big KOPT, breaks opencl
* fix optimizer
* KernelCache
* oops, broke batchnorm
* hack to fix it
* fix llvm, less hacky gpu
* disable the cache
* cache just breaks things
2023-02-05 10:46:17 -08:00
Jacky Lee
799b3f185a
Refactor getenv into helpers (#508)
* Refactor getenv into helpers
* Remove unused os
* Fix default value
* Fix more defaults for CI
* Fix bracket
* Revert changes to openpilot/compile.py
* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
George Hotz
60ccddb58b
reenable SWAP
2023-01-30 17:32:02 -08:00
George Hotz
aea55eb196
found failing upcast
2023-01-30 16:12:56 -08:00
George Hotz
b67f997864
tests pass w/o float4
2023-01-30 15:40:49 -08:00
George Hotz
cccfea4b25
factor out KOPT code
2023-01-30 13:13:55 -08:00
George Hotz
2db272c7f7
Kernel Optimizer (#489)
* kernel optimizer
* 10x faster, but wrong. not good deal
* move test -> extra
* print x speedup
* clcache
* fix clcache + DEBUG
* GFLOPS estimate
* i==3
2023-01-29 17:15:00 -08:00
George Hotz
bb0cdc2442
111.51x speedup for reduce
2023-01-29 03:06:00 -08:00
George Hotz
45c0aa6e2d
search with SHIFT, REDUCE
2023-01-29 02:42:20 -08:00
George Hotz
87879cf4b6
improve search more
2023-01-29 02:08:57 -08:00
George Hotz
f6bbd43cb8
improve search
2023-01-29 01:33:47 -08:00
George Hotz
ebdec2b72f
fix optimizer
2023-01-29 00:23:06 -08:00
George Hotz
a9cabce791
oops, broke mem estimates
2023-01-28 20:21:31 -08:00
George Hotz
6d5e1a8029
GEMM kernel search
2023-01-27 10:08:57 -08:00
Comma Device
f08e740957
factor out hand coded opt
2023-01-26 14:54:06 -06:00
George Hotz
5e8a36a18b
real op kernel
2023-01-26 09:51:32 -08:00
George Hotz
e0600f537a
op kernel in kernel search
2023-01-26 09:47:01 -08:00
George Hotz
aafc29484a
cleanups
2023-01-25 12:37:10 -08:00
George Hotz
919e943867
decent search
2023-01-25 12:20:53 -08:00
George Hotz
7f3da91f8b
kernel_search
2023-01-25 12:05:09 -08:00
George Hotz
e37424424f
first little attempt at search
2023-01-25 11:49:29 -08:00