Mirror of https://github.com/tinygrad/tinygrad.git, synced 2026-01-26 07:18:40 -05:00
* Move ops_triton to runtime and remove errors from deprecated code
* Remove deprecated AST Kernel
* Remove deprecated buffer
* Add TritonProgram
* Triton Buffer
* Use RawCUDABuffer
* triton_compile
* Added new parameter
* pass _buf to program
* remove deprecated include
* Added triton tests
* Deprecated includes removed
* remove double print
* Disable float4 support
* Disable float4 support
* variable load fix
* Track local size
* Add pycuda to triton dependencies
* Merge test.yml
* install cuda packages for testing
* merge double package install
* remove emulated from triton tests
* upscale local index to power of 2 and add masking
* cuda envs
* Add TernaryOps
* ConstOp loading
* proper function name
* remove deprecated variables
* get global program from name
* const ops match local shape
* Enable test_nn
* remove deprecated import
* fix linter error
* Add wait logic
* Add local size override
* accumulate local shapes instead of using max shape
* Merge triton tests into global tests
* fix envs in testing
* Old testing routine
* split file into renderer and program
* remove print and starting whitespace
* pretty ptx print on debug 5
* linter errors
* ignore triton saturation tests
* ignore test example
* remove pytorch cpu extra index
* Add triton to existing testing routine
* use triton tests
* disable cuda backend in triton tests
* use cudacpu in tests
* print used device
* Print device default
* Remove print
* ensure we are running triton backend
* update variable signatures
* update dtypes for load
* infinity render fixed
* limit global size
* negative infinity now properly rendered
* split chain with parentheses for and node
* Add option to disable shared memory, disable for triton
* missing import
* Properly index and mask conditional load
* use mask only if not loading a block pointer
* nan support
* fix symbolic tests to include chain split
* proper masking for stores
* Implemented bool dtype
* Add mod
* fix loads for variables with valid range
* merge triton with cuda runtime
* merge from master
* run triton tests with cuda
* Correct target when running from triton
* conftest with triton compiler config
* use triton nightly
* verbose tests for triton
* capture stdout
* fix function depth when exiting multiple loops
* add render valid function for readability
* fix mask for local loops
* add _arg_int32 datatype
* fix dims for conditional loads
* enable non float stores
* correct variable dtypes
* fix type for arg_int32
* remove junk
* Added get max function for range based var.max
* remove deprecated code
* Fix triton ptxas path
* Fix testing for CI
* clamp local size by max local size instead of always running max
* Disable matmul test in triton cpu
* rerun tests
* Disable broken test in triton cpu
* whitespace removed
* rerun tests again
* Disable TestSymbolicOps for triton
* update to new uops
* linter fix
* ignore test/extra
* linting fix
* Update tinygrad/renderer/triton.py (Co-authored-by: Gijs Koning <gijs-koning@live.nl>)
* remove deprecated line
* quotes type fix
* linter
* Remove unnecessary lines
* UnaryOps.NEG
* don't define constants
* Linting fix
* Disable tests that are broken in ocelot
* remove trailing whitespace
* reduce line count
* linting fix
* update to new uast
* New looping style
* Update to new uast
* make AST runner work with triton
* linting fix
* set renderer var for testing
* disable local for ocelot
* reenable all tests for ocelot
* Pass shared to cuda
* Don't group if the backend doesn't support shared mem
* use working gpuocelot branch
* enable all tests
* enable local for ocelot
* cleanup
* Update test.yml
* update cache key
* reenable test symbolic and extra
* Update test.yml
* Revert "Update test.yml" (rerun tests). This reverts commit 98c0630ee5.
* Revert "fix symbolic tests to include chain split". This reverts commit 22a9a4c9cd.
* Revert "split chain with parentheses for and node". This reverts commit 7499a7004e.
* use global size from linearizer
* rename newvar to dtype to match other renderers
* join program start lines
* simplify code that adds axis to local dims
* assign r[u] in ssa
* We no longer need to replace target in src
* we no longer need to cast indices to int by hand
* Update triton.py (rerun tests)
* Update triton.py (rerun tests)
* Update triton.py (rerun tests)

---------

Co-authored-by: Gijs Koning <gijs-koning@live.nl>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
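The commits above add a Triton renderer and runtime that is selected like tinygrad's other backends, through an environment variable. A minimal smoke-test sketch follows; it assumes the backend switch is named TRITON=1 (the variable name and the tiny matmul are assumptions for illustration, not something stated in this commit message), and it simply forces a kernel through the linearizer, the Triton renderer, and the compile path.

# sketch only: assumes a TRITON=1 env-var backend switch, set before tinygrad is imported
import os
os.environ["TRITON"] = "1"

from tinygrad.tensor import Tensor

# a small matmul is enough to exercise codegen, compilation, and execution
a, b = Tensor.rand(4, 4), Tensor.rand(4, 4)
print((a @ b).numpy())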
59 lines · 1.8 KiB · Python
#!/usr/bin/env python3

from pathlib import Path
from setuptools import setup

directory = Path(__file__).resolve().parent
with open(directory / 'README.md', encoding='utf-8') as f:
  long_description = f.read()

setup(name='tinygrad',
      version='0.7.0',
      description='You like pytorch? You like micrograd? You love tinygrad! <3',
      author='George Hotz',
      license='MIT',
      long_description=long_description,
      long_description_content_type='text/markdown',
      packages = ['tinygrad', 'tinygrad.codegen', 'tinygrad.nn', 'tinygrad.renderer', 'tinygrad.runtime', 'tinygrad.shape'],
      classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License"
      ],
      install_requires=["numpy", "requests", "pillow", "tqdm", "networkx", "pyopencl", "PyYAML",
                        "pyobjc-framework-Metal; platform_system=='Darwin'",
                        "pyobjc-framework-Cocoa; platform_system=='Darwin'",
                        "pyobjc-framework-libdispatch; platform_system=='Darwin'"],
      python_requires='>=3.8',
      extras_require={
        'llvm': ["llvmlite"],
        'cuda': ["pycuda"],
        'arm': ["unicorn"],
        'triton': ["triton-nightly", "pycuda"],
        'webgpu': ["wgpu"],
        'linting': [
            "flake8",
            "pylint",
            "mypy",
            "typing-extensions",
            "pre-commit",
            "ruff",
        ],
        'testing': [
            "torch",
            "pytest",
            "pytest-xdist",
            "onnx",
            "onnx2torch",
            "opencv-python",
            "tabulate",
            "safetensors",
            "types-PyYAML",
            "types-tqdm",
            "cloudpickle",
            "transformers",
            "nevergrad",
            "tiktoken",
        ],
      },
      include_package_data=True)
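The extras_require table above is what lets a targeted install such as pip install -e '.[triton,testing]' pull in triton-nightly, pycuda, torch, and the rest of the test dependencies on top of the base requirements. As a small sketch (assuming tinygrad has been installed and Python 3.8+, where importlib.metadata is in the standard library), you can confirm which extras the built package metadata actually advertises:

# inspect the installed distribution's metadata to see the declared extras
from importlib.metadata import metadata

md = metadata("tinygrad")
# extras from setup.py appear as Provides-Extra entries
print(md.get_all("Provides-Extra"))
# each optional dependency is a Requires-Dist line carrying an "extra == ..." marker
for req in md.get_all("Requires-Dist"):
    if "extra ==" in req:
        print(req)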