Simple CUDA Runtime (#480)

* factor out opencl runtime

* don't use CL outside the runtime

* cuda runtime adds

* final_dimension

* tests pass with CUDA backend

* more cuda

* cuda simpler

* retain old functionality

* linter and typing

* move globalcounters out of runtimes

* oops, GlobalCounters in cuda

* MAX_OUTPUT_SHAPE=3 is fine for CUDA
George Hotz
2023-01-27 16:26:24 -08:00
committed by GitHub
parent 6d5e1a8029
commit bd8a5c2ced
5 changed files with 137 additions and 100 deletions

@@ -24,6 +24,7 @@ setup(name='tinygrad',
       extras_require={
         'gpu': ["pyopencl", "six"],
         'llvm': ["llvmlite"],
+        'cuda': ["pycuda"],
         'testing': [
             "pytest",
             "torch~=1.11.0",