Mirror of https://github.com/tinygrad/tinygrad.git, synced 2026-01-23 13:58:00 -05:00
Latest commit (squashed):

* chonker will make llvm fast
* work
* better speed tests, we will make them fast
* with the cache add is the same speed
* relu and neg are fast
* fix sum speed
* maximum maxnum?
* hack for gemm opt
* gemm very slow
* zeros like
* test_permute
* shapetracker returns self
* fix shapetracker factorization
* err, int strides
* permutes are faster now in tinygrad than pytorch
* support -1 in expand
* gemm unrolled
* improve final test case
* WIP GEMM
* why isn't GEMM fast?
* revert cache dim
* ffp contract works on clang, not llvm?
* ignore llvm ir
* this makes fma work at least, but no faster
* USE_4x4
* 63 GFLOPS
* 87 GFLOPS
* that wasn't matmul, 44 GFLOPS now
* 82 GFLOPS permuted
* this permute too
* a little speed for the convs
* 45 GFLOPS
* speed tests pass again
* clean up prints
* fix FMA WHAT A WASTE OF TIME
* colors
* moar fair
* GPU
* useless on chonker
* cleanups
* improve factorized shapetracker
* better threshold
* label conv
* work
* ops test pass again
* hot load the index
* run the last view, no need to create
* ZeroView needs a repr for the key to work
* fix segfault on out of bounds
* one more test
* start amx, and llvm.initialize_native_asmparser
* amx works
* nice AMX class
* nicer AMX class
* refactor get_idxs
* amx working
* is slower...
* useless flip
* cache
* SZ_X
* AMX_SZ_X/Y work alone
* Contiguous mlop
* test gemm packed
* PREPARE in packed
* use_amx factor
* prefetch isn't faster
* loop
* same 3ms
* 2.24 ms
* allow double on store in TG
* amx reduce is the same speed as non amx reduce
* include memory bandwidth
* clean up shapetracker
* flip returns stride
* prepare for upstream
* Update ops_llvm.py (#426)
* permutes are yellow and green now
* faster conv
* llvm cleanups
* Show optimised IR under debug 4 (#428)
* ASTKernel class
* Make tinygrad work with older python version (#427)
* Make tinygrad work with older python version
* Use partialmethod instead of partial
* smiple chonker is chonking
* remove junk from test speed vs torch
* fix linker and types
* AMX is only here now
* add LLVM tests, it's a valid backend now
* oops, run llvm test
* contiguous_op
* fix loadops compare
* dedup reduceops

Co-authored-by: calledit <1573053+calledit@users.noreply.github.com>
38 lines
1.1 KiB
Python
#!/usr/bin/env python3

import os
from setuptools import setup

directory = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(directory, 'README.md'), encoding='utf-8') as f:
  long_description = f.read()

setup(name='tinygrad',
      version='0.4.0',
      description='You like pytorch? You like micrograd? You love tinygrad! ❤️',
      author='George Hotz',
      license='MIT',
      long_description=long_description,
      long_description_content_type='text/markdown',
      packages=['tinygrad'],
      classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License"
      ],
      install_requires=['numpy', 'requests', 'pillow', 'networkx'],
      python_requires='>=3.8',
      extras_require={
        'gpu': ["pyopencl", "six"],
        'llvm': ["llvmlite"],
        'testing': [
          "pytest",
          "torch~=1.11.0",
          "tqdm",
          "protobuf~=3.19.0",
          "onnx",
          "onnx2torch",
          "mypy",
        ],
      },
      include_package_data=True)
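For reference, a minimal sketch of how the optional dependency groups declared in extras_require above would be installed from a local checkout. The extra names ('gpu', 'llvm', 'testing') come from this file; the editable-install invocations are standard pip usage, not something this setup.py itself prescribes:

pip install -e .              # core install: numpy, requests, pillow, networkx
pip install -e '.[gpu]'       # also pulls in pyopencl and six for the GPU backend
pip install -e '.[llvm]'      # also pulls in llvmlite for the LLVM backend
pip install -e '.[testing]'   # pytest, torch, onnx, etc. for running the test suite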