tinygrad/accel
George Hotz cd97b036cc A Triton backend for tinygrad (#470)
* triton can add

* print stuff from triton

* write out file

* ops triton working

* reduce ops

* sort of works

* Triton bugfixes & implementation of remaining ops (#490)

* padding

* support pow, max, relu, gt0

* allocate return buffer

* Fix reduce

* Add tests for power op

* Fix triton illegal memory accesses and memory leak (#512)

* Fix mypy issue

* Add triton to setup.py

* Replace torch with pycuda

* Use one cuda stream for data transfer and kernels

* Remove triton submodule

* Fix memory leak by using weakrefs for caching

* Fix memory access by adding valid as mask for load

* Fix invalid kernel launches by flattening the grid (#515)

---------

Co-authored-by: Martin Loretz <20306567+martinloretzzz@users.noreply.github.com>
2023-02-01 11:53:57 -08:00

This is where we scope out adding accelerators to tinygrad.

ane -- Apple Neural Engine, in the M1 and newer iPhones
cherry -- Largely defunct custom hardware based on a RISC-V extension
tpu -- Google's TPU, available for rent in Google Cloud