Commit Graph

15 Commits

Author SHA1 Message Date
George Hotz
5495c7d64e linearizer! (#714)
* linearizer outputs something

* working ish

* cstyle codegen

* clang mostly works

* fix load valid

* fix numberless loop

* fancy gen

* working

* fix enet compiler

* cleanups

* float4 upcasting

* less lines

* supports_float4

* constant folding

* mulacc

* internet tests flaky in CI

* 90% image support

* fix image generic

* bugs exposed with shapetracker and single view

* new llvm

* use vload, remove OLD

* that's really poorly done

* ending up being more lines
2023-03-19 23:43:49 -07:00
George Hotz
f355b02987 remove comments and reorder 2023-03-18 14:48:39 -07:00
George Hotz
f5467cfedc Devicebufferless (#708)
* runs one metal kernel

* conv2d works

* ops tests are passing

* const folding

* all ops work

* pre commit always passes

* torch works

* working still

* fix graph test

* tests passing

* image almost works

* image conv works

* most images

* fix custom

* fix assignment

* fix compile enet

* clean up comments

* fix realize return value

* include shapetracker in LB repr

* copy should make a copy

* reenable method cache

* fix lna

* dtypes in graph

* forward only for IMAGE=2

* simple realize

* getting close

* fixup new api, it's good except the kernel count

* back to 197 kernels

* tests should pass

* go to a real float

* no type_on_cpu

* fix the docs

* put shapetracker back in it's proper place
2023-03-18 14:40:23 -07:00
George Hotz
54f499b623 Move rawbuffer (#697)
* move GlobalCounters to helpers

* that's not part of the public api

* move InterpretedBuffer

* remove fromCPU from devicebuffer
2023-03-13 22:30:36 -07:00
George Hotz
15e0b56e39 compile works (#688)
* compile works

* runtimes

* line count

* fix custom, to tg dtype

* meh, that's fine with lazy import
2023-03-12 11:01:25 -07:00
Cyril Roumégous
3f08613a2a apply flake8 E203 rule (#684) 2023-03-11 11:35:16 -08:00
George Hotz
1826ff6b89 dtypes nice and clean (#673)
* add dtype class

* dtypes

* buffers are lazy

* dtype is tracked by lazybuffer and GenericShape

* fix types in llvm

* llvm store

* dtype tests

* fix tests maybe

* fix flop counter

* fix CI

* CI fix and check format

* fix dtype and dtype check

* fix custom test

* fix test graph
2023-03-10 16:56:07 -08:00
George Hotz
1a039306d2 good changes from llama branch (#671)
* good changes from llama

* transpose behavior changed
2023-03-09 20:51:22 -08:00
George Hotz
4f957423c3 jitting custom ops + OPTLOCAL assignment bugfix 2023-03-08 08:30:37 -08:00
George Hotz
7930c6ab5c CLImage backing bug + test_vec_mul 2023-03-05 16:32:05 -08:00
George Hotz
3072e098c0 local workgroup optimizer 2023-03-05 12:08:12 -08:00
Comma Device
3da56ab41d adreno disassembler 2023-03-05 10:32:03 -06:00
George Hotz
459488bba2 fix linter (#630)
* fix linter

* no imports okay

* explicit bases

* disable in pylintrc
2023-03-02 20:06:20 -08:00
George Hotz
0335cb86b9 refactor comparison. there's a bug in the method cache 2023-03-02 10:10:16 -08:00
George Hotz
bfcec234a2 Refactor ASTs (#622)
* ugh worst branch name

* compiler refactor continues

* scc -> cloc

* buf -> _buf

* finish _buf, and program -> runtime

* gpu is still working, clang isn't

* clang in new style

* ops_metal

* something broke it

* improve metal

* clean up tons of cl crap

* hack fix sync

* cleaner gpu

* gpu metal clang

* cleanups

* minor refactor

* GPUCodegen

* fix up LLVM

* blind CUDA refactor

* codegen / runtime

* keep ops naming

* linter passes

* woah, llvm was allocing 4x what it needed to

* bugfixes

* fix openpilot compiler

* fix compile_efficientnet

* method cache should fix tests

* deal with duped functions
2023-03-01 18:57:29 -08:00