8 Commits

Author SHA1 Message Date
George Hotz
bfcec234a2 Refactor ASTs (#622)
* ugh worst branch name

* compiler refactor continues

* scc -> cloc

* buf -> _buf

* finish _buf, and program -> runtime

* gpu is still working, clang isn't

* clang in new style

* ops_metal

* something broke it

* improve metal

* clean up tons of cl crap

* hack fix sync

* cleaner gpu

* gpu metal clang

* cleanups

* minor refactor

* GPUCodegen

* fix up LLVM

* blind CUDA refactor

* codegen / runtime

* keep ops naming

* linter passes

* woah, llvm was allocing 4x what it needed to

* bugfixes

* fix openpilot compiler

* fix compile_efficientnet

* method cache should fix tests

* deal with duped functions
2023-03-01 18:57:29 -08:00
George Hotz
c5e2126d49 move DEBUG to helpers 2023-02-22 06:52:11 -08:00
George Hotz
693d4b89a4 fixup TRITON backend to use new APIs 2023-02-12 06:57:49 -08:00
George Hotz
b9eae94ae9 move Device back into lazy 2023-02-11 11:26:53 -08:00
Martin Loretz
4ad67b4bbc Refactor triton buffer to use CLBuffer of cuda runtime (#524)
* Refactor triton buffer to use CLBuffer of runtime

* Fix opencl GT0
2023-02-03 20:02:41 -08:00
Martin Loretz
45e847d284 Update triton to work in master (#517)
* Update triton to work in master

* Move mem_estimate out of runner
2023-02-01 12:58:14 -08:00
George Hotz
175c38d1b3 triton: it already was GT0 2023-02-01 12:00:33 -08:00
George Hotz
cd97b036cc A Triton backend for tinygrad (#470)
* triton can add

* print stuff from triton

* write out file

* ops triton working

* reduce ops

* sort of works

* Triton bugfixes & implementation of remaining ops (#490)

* padding

* support pow, max, relu, gt0

* allocate return buffer

* Fix reduce

* Add tests for power op

* Fix triton illegal memory accesses and memory leak (#512)

* Fix mypy issue

* Add triton to setup.py

* Replace torch with pycuda

* Use one cuda stream for data transfer and kernels

* Remove triton submodule

* Fix memory leak by using weakrefs for caching

* Fix memory access by adding valid as mask for load

* Fix invalid kernel launches by flattening the grid (#515)

---------

Co-authored-by: Martin Loretz <20306567+martinloretzzz@users.noreply.github.com>
2023-02-01 11:53:57 -08:00