tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-04-29 03:00:14 -04:00

Author	SHA1	Message	Date
George Hotz	bfcec234a2	Refactor ASTs (#622) * ugh worst branch name * compiler refactor continues * scc -> cloc * buf -> _buf * finish _buf, and program -> runtime * gpu is still working, clang isn't * clang in new style * ops_metal * something broke it * improve metal * clean up tons of cl crap * hack fix sync * cleaner gpu * gpu metal clang * cleanups * minor refactor * GPUCodegen * fix up LLVM * blind CUDA refactor * codegen / runtime * keep ops naming * linter passes * woah, llvm was allocing 4x what it needed to * bugfixes * fix openpilot compiler * fix compile_efficientnet * method cache should fix tests * deal with duped functions	2023-03-01 18:57:29 -08:00
George Hotz	c5e2126d49	move DEBUG to helpers	2023-02-22 06:52:11 -08:00
George Hotz	693d4b89a4	fixup TRITON backend to use new APIs	2023-02-12 06:57:49 -08:00
George Hotz	b9eae94ae9	move Device back into lazy	2023-02-11 11:26:53 -08:00
Martin Loretz	4ad67b4bbc	Refactor triton buffer to use CLBuffer of cuda runtime (#524) * Refactor triton buffer to use CLBuffer of runtime * Fix opencl GT0	2023-02-03 20:02:41 -08:00
Martin Loretz	45e847d284	Update triton to work in master (#517) * Update triton to work in master * Move mem_estimate out of runner	2023-02-01 12:58:14 -08:00
George Hotz	175c38d1b3	triton: it already was GT0	2023-02-01 12:00:33 -08:00
George Hotz	cd97b036cc	A Triton backend for tinygrad (#470) * triton can add * print stuff from triton * write out file * ops triton working * reduce ops * sort of works * Triton bugfixes & implementation of remaining ops (#490) * padding * support pow, max, relu, gt0 * allocate return buffer * Fix reduce * Add tests for power op * Fix triton illegal memory accesses and memory leak (#512) * Fix mypy issue * Add triton to setup.py * Replace torch with pycuda * Use one cuda stream for data transfer and kernels * Remove triton submodule * Fix memory leak by using weakrefs for caching * Fix memory access by adding valid as mask for load * Fix invalid kernel launches by flattening the grid (#515) --------- Co-authored-by: Martin Loretz <20306567+martinloretzzz@users.noreply.github.com>	2023-02-01 11:53:57 -08:00

8 Commits