tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-04-29 03:00:14 -04:00


sehaj 775287ed91 Add yolov8 implementation (#806)

* added SPPF module from yolov8

* added conv_block, bottleneck modules

* cleaned modules

* c2f example

* spf changes

* C2f

* fixed and tested bottleneck

* improved detect class

* tested spf and conv

* checked c2f

* DFL structure

* fixed dfl

* added dist2bbox function

* added dist2bbox function

* added and tested make_anchors function for the head

* keeping functions above

* creating the detection head

* fixing head

* untested blocks a. scale_boxes b. clip_boxes c. xywh2xyxy d. box_iou

* head works

* structure fixx

* added darknet (backbone)

* yolov8 neck, and intialize bias function while detection

* fixed spacing

* yolov8 class, init bias, and fixed c2f

* forward pass almost working

* fixed net structure

* init bias not needed, forward pass working

* load weights boilerplate

* load weights done?

* all variants loading!

* post process: clip_boxes, scale_boxes, xywh2xyxy, and box_iou(untested)

* fix scale_boxes

* box_iou fixed and tested

* created the pre nms function

* fix nms

* fixed load weights, apparently the latest commit broke something, excluding num_batches_tracked

* added letterbox and pre_tranform for pre_process function

* fixed letterbox, pre_transform and added preprocess function

* custom NMS done, integrated prepare_boxes and nms, improved box_iou

* added postprocess function till parsing

* added draw_bounding_boxes_and_save function

* testing full flow

* using fetch for class names

* fixed make_anchors + all tinygrad now

* added command line arguments, weight downloading

* single image for now only

* made draw boxes more efficient

* made NMS functions efficient

* made compute_transform better

* v8 working now, inference is done

* prints objects detected in console now

* fixed image loading (pre processing)

* batch post processing

* created initial tests

* fixes bounding box thickness AND added get_detected_classes_with_frequency function

* cleaning for testing

* two tests

* added url option for image, removed need for specifiying arguments

* tests complete, but lots on things are printed on screen by ultralytics

* remove parse arguments

* fixed weight location

* fixed colours of classes, and black font when high brightness

* minor changes

* TODOs for later

* removed use of torch, using .npz weights

* fixed tests

* one path for fetch

* preprocess now in tinygrad, plus test fix for that

* updated tests

* fix tests

* no class labels needed

* Add files via upload

* Update showcase.md

* Update showcase.md

* added safe tensors as weights, and tests fix for that

* safe tensors test

* using safe_load

* using tinygrad functions now to load weights

* update tests

---------

Co-authored-by: r3sist-uniq <amanmatreja@gmail.com>
Co-authored-by: r3sist <72573738+r3sist-uniq@users.noreply.github.com>

2023-06-16 18:55:19 -07:00

.github/workflows

fixes to Onnx ops LayerNormalization/Prelu and added OptionalHasElement/OptionalGetElement (#956)

2023-06-08 16:09:19 -07:00

accel

move to shapetracker.py

2023-03-11 07:50:07 -08:00

cache

add ff_dim to transformer

2021-11-29 12:40:52 -05:00

datasets

imagenet download and prepare (#928)

2023-06-08 12:55:33 -07:00

disassemblers/adreno

fix path linter issue

2023-04-18 19:17:41 -07:00

docs

Add yolov8 implementation (#806)

2023-06-16 18:55:19 -07:00

examples

Add yolov8 implementation (#806)

2023-06-16 18:55:19 -07:00

extra

faster RDNA assembly backend (#990)

2023-06-16 12:06:38 -07:00

models

Refactor LoadOps (#910)

2023-06-03 09:40:43 -07:00

openpilot

jit: TODO, use abstractions

2023-05-05 22:51:30 -07:00

test

Add yolov8 implementation (#806)

2023-06-16 18:55:19 -07:00

tinygrad

faster RDNA assembly backend (#990)

2023-06-16 12:06:38 -07:00

weights

cleanup clip tokenizer

2022-09-12 09:20:12 -07:00

.editorconfig

Basic editorconfig support (#422)

2022-11-08 10:34:25 -08:00

.gitignore

Whisper (#919)

2023-06-03 18:55:14 -07:00

.pre-commit-config.yaml

fix mypy

2023-05-13 21:25:36 -07:00

.pylintrc

Devicebufferless (#708)

2023-03-18 14:40:23 -07:00

.tokeignore

Add a quick start guide (#900)

2023-06-04 08:51:20 -07:00

compile.sh

stop wasting time with the compiler. tinygrad needs to just jit

2023-03-12 12:08:46 -07:00

LICENSE

Updated LICENSE year (#760)

2023-05-01 15:35:23 -07:00

push_pypi.sh

push pypi

2020-10-27 08:13:15 -07:00

README.md

readme updates

2023-06-05 12:20:14 -07:00

rmso.sh

compile works (#688)

2023-03-12 11:01:25 -07:00

run_multibackend.sh

dtypes nice and clean (#673)

2023-03-10 16:56:07 -08:00

setup.py

RDNA assembly backend ($1000 bounty) (#787)

2023-06-16 09:33:18 -07:00

sz.py

move line counter to python

2023-05-29 09:21:40 -07:00

README.md

tinygrad: For something between PyTorch and karpathy/micrograd. Maintained by tiny corp.

Homepage | Documentation | Examples | Showcase | Discord

This may not be the best deep learning framework, but it is a deep learning framework.

Due to its extreme simplicity, it aims to be the easiest framework to add new accelerators to, with support for both inference and training. If XLA is CISC, tinygrad is RISC.

tinygrad is still alpha software, but we raised some money to make it good. Someday, we will tape out chips.

Features

LLaMA and Stable Diffusion

tinygrad can run LLaMA and Stable Diffusion!

Laziness

Try a matmul. See how, despite being written as a broadcast multiply followed by a sum, it is fused into one kernel with the power of laziness.

DEBUG=3 OPTLOCAL=1 python3 -c "from tinygrad.tensor import Tensor;
N = 1024; a, b = Tensor.rand(N, N), Tensor.rand(N, N);
c = (a.reshape(N, 1, N) * b.permute(1,0).reshape(1, N, N)).sum(axis=2);
print((c.numpy() - (a.numpy() @ b.numpy())).mean())"

And we can change DEBUG to 4 to see the generated code.
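To see why the reshape/multiply/sum expression above is a matmul at all, here is a minimal NumPy sketch of the same formulation. It only checks the arithmetic; it says nothing about kernel fusion, which is what the tinygrad one-liner demonstrates. The helper name `broadcast_matmul` is made up for illustration.

```python
import numpy as np

def broadcast_matmul(a, b):
    """Matmul written as broadcast multiply + sum (illustrative helper, not a tinygrad API)."""
    m, k = a.shape
    n = b.shape[1]
    # a -> (m, 1, k) and b.T -> (1, n, k); the product has shape (m, n, k)
    # with prod[i, j, l] == a[i, l] * b[l, j].
    prod = a.reshape(m, 1, k) * b.T.reshape(1, n, k)
    # Summing over the last axis contracts the shared dimension.
    return prod.sum(axis=2)

rng = np.random.default_rng(0)
N = 64  # smaller than the 1024 used above, to keep the (N, N, N) intermediate cheap
a, b = rng.random((N, N)), rng.random((N, N))
c = broadcast_matmul(a, b)
print(np.abs(c - a @ b).max())  # should be close to zero
```

NumPy materializes the full (N, N, N) intermediate here. Avoiding that intermediate is the point of fusing the expression into one kernel.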

Neural networks

As it turns out, 90% of what you need for neural networks is a decent autograd/tensor library. Throw in an optimizer, a data loader, and some compute, and you have all you need.

Neural network example (from test/models/test_mnist.py)

from tinygrad.tensor import Tensor
import tinygrad.nn.optim as optim

class TinyBobNet:
  def __init__(self):
    self.l1 = Tensor.uniform(784, 128)
    self.l2 = Tensor.uniform(128, 10)

  def forward(self, x):
    return x.dot(self.l1).relu().dot(self.l2).log_softmax()

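model = TinyBobNet()
# named `opt` so the imported `optim` module is not shadowed
opt = optim.SGD([model.l1, model.l2], lr=0.001)

# ... complete data loader here (it must provide the input batch `x` and the targets `y`)

out = model.forward(x)
loss = out.mul(y).mean()
opt.zero_grad()
loss.backward()
opt.step()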
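The example leaves the data loader open, so the shape of `y` is not spelled out. One encoding that makes `out.mul(y).mean()` a negative log-likelihood is to put `-num_classes` at each true label and zero elsewhere. The NumPy sketch below checks that arithmetic; the helper names are hypothetical, and it is an assumption that the elided loader encodes targets this way.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def encode_targets(labels, num_classes):
    """Assumed target encoding: -num_classes at the true label, 0 elsewhere."""
    y = np.zeros((len(labels), num_classes), dtype=np.float32)
    y[np.arange(len(labels)), labels] = -float(num_classes)
    return y

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10)).astype(np.float32)
labels = np.array([3, 1, 4, 1])

out = log_softmax(logits)
y = encode_targets(labels, num_classes=10)

loss = (out * y).mean()                  # same form as out.mul(y).mean()
nll = -out[np.arange(4), labels].mean()  # mean negative log-likelihood
print(loss, nll)
```

The mean runs over `batch * num_classes` entries, so the `-num_classes` factor cancels the extra division and the two values agree.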

Accelerators

tinygrad already supports numerous accelerators, including:

CPU
GPU (OpenCL)
C Code (Clang)
LLVM
METAL
CUDA
Triton
PyTorch

And it is easy to add more! Your accelerator of choice only needs to support a total of 26 (optionally 27) low-level ops. More information can be found in the documentation for adding new accelerators.
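As a rough illustration of why a small op set can be enough, the toy below builds `relu` and `log_softmax` from six primitives backed by NumPy. The op names and the dictionary interface are hypothetical. This is not tinygrad's real op list or backend API, which is described in the documentation for adding new accelerators.

```python
import numpy as np

# Hypothetical primitive table (illustrative names only, NOT tinygrad's actual ops).
TOY_OPS = {
    "EXP": np.exp,
    "LOG": np.log,
    "SUB": np.subtract,
    "MAXIMUM": np.maximum,
    "REDUCE_SUM": lambda x, axis: x.sum(axis=axis, keepdims=True),
    "REDUCE_MAX": lambda x, axis: x.max(axis=axis, keepdims=True),
}

def relu(x, ops=TOY_OPS):
    # relu is an elementwise maximum against zero
    return ops["MAXIMUM"](x, 0.0)

def log_softmax(x, ops=TOY_OPS, axis=-1):
    # subtract the row max for stability, then subtract log(sum(exp(.)))
    shifted = ops["SUB"](x, ops["REDUCE_MAX"](x, axis))
    log_norm = ops["LOG"](ops["REDUCE_SUM"](ops["EXP"](shifted), axis))
    return ops["SUB"](shifted, log_norm)

x = np.array([[-1.0, 0.0, 2.0]])
hidden = relu(x)
log_probs = log_softmax(hidden)
print(hidden, log_probs)
```

In this toy, swapping the dictionary for one backed by another device would leave `relu` and `log_softmax` unchanged, which is the appeal of keeping the primitive set small.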

Installation

The current recommended way to install tinygrad is from source.

From source

git clone https://github.com/geohot/tinygrad.git
cd tinygrad
python3 -m pip install -e . # or `py -3 -m pip install -e .` if you are on Windows

Don't forget the . at the end!
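A quick way to confirm the editable install is visible to your interpreter is to ask Python whether the package can be found. This is a small sketch using only the standard library; the helper name `is_installed` is made up for illustration.

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be found on the current import path."""
    return importlib.util.find_spec(module_name) is not None

# Prints True once `pip install -e .` has succeeded in this environment.
print("tinygrad importable:", is_installed("tinygrad"))
```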

Documentation

Documentation along with a quick start guide can be found in the docs/ directory.

Quick example comparing to PyTorch

from tinygrad.tensor import Tensor

x = Tensor.eye(3, requires_grad=True)
y = Tensor([[2.0,0,-2.0]], requires_grad=True)
z = y.matmul(x).sum()
z.backward()

print(x.grad.numpy())  # dz/dx
print(y.grad.numpy())  # dz/dy

The same thing but in PyTorch:

import torch

x = torch.eye(3, requires_grad=True)
y = torch.tensor([[2.0,0,-2.0]], requires_grad=True)
z = y.matmul(x).sum()
z.backward()

print(x.grad.numpy())  # dz/dx
print(y.grad.numpy())  # dz/dy
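Both snippets should print the same gradients. Since z is the sum of all entries of y @ x, dz/dx[k, j] = y[0, k] and dz/dy[0, k] is the sum of row k of x. The NumPy sketch below writes those formulas out and checks them against central finite differences; the helper names are illustrative only.

```python
import numpy as np

def z_of(x, y):
    """z = sum of all entries of y @ x, as in the examples above."""
    return (y @ x).sum()

def numeric_grad(f, arr, eps=1e-6):
    """Central finite-difference gradient of scalar f with respect to arr."""
    grad = np.zeros_like(arr)
    for idx in np.ndindex(arr.shape):
        up, down = arr.copy(), arr.copy()
        up[idx] += eps
        down[idx] -= eps
        grad[idx] = (f(up) - f(down)) / (2 * eps)
    return grad

x = np.eye(3)
y = np.array([[2.0, 0.0, -2.0]])

# Analytic gradients from the formulas in the text.
dz_dx = np.repeat(y.T, x.shape[1], axis=1)  # row k is filled with y[0, k]
dz_dy = x.sum(axis=1, keepdims=True).T      # entry k is the sum of row k of x

print(dz_dx)  # rows: [2, 2, 2], [0, 0, 0], [-2, -2, -2]
print(dz_dy)  # [[1, 1, 1]]
```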

Contributing

There has been a lot of interest in tinygrad lately. Here are some basic guidelines for contributing:

Bug fixes are the best and always welcome! Like this one.
If you don't understand the code you are changing, don't change it!
All code golf PRs will be closed, but conceptual cleanups are great.
Features are welcome, but if you are adding a feature, you need to include tests.
Improving test coverage is great, as long as the tests are reliable and non-brittle.

Additional guidelines can be found in CONTRIBUTING.md.

Running tests

For more examples of how to run the full test suite, please refer to the CI workflow.

Some examples:

python3 -m pip install -e '.[testing]'
python3 -m pytest
python3 -m pytest -v -k TestTrain
python3 ./test/models/test_train.py TestTrain.test_efficientnet
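For a feel of what a reliable, non-brittle test might look like, here is a small pytest-style sketch. It seeds its random inputs and compares with a tolerance instead of exact equality. The `softmax` function and the `TestSoftmax` class are hypothetical stand-ins written in NumPy, not part of the tinygrad test suite.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis (stand-in for code under test)."""
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

class TestSoftmax:
    def test_rows_sum_to_one(self):
        rng = np.random.default_rng(1337)  # fixed seed keeps the test reproducible
        x = rng.normal(size=(8, 10))
        np.testing.assert_allclose(softmax(x).sum(axis=-1), np.ones(8), rtol=1e-6)

    def test_invariant_to_constant_shift(self):
        x = np.array([[1.0, 2.0, 3.0]])
        # a tolerance-based comparison avoids failures from float rounding
        np.testing.assert_allclose(softmax(x), softmax(x + 100.0), rtol=1e-6)
```

Saved in a file pytest collects (for example a hypothetical `test_softmax_sketch.py`), `python3 -m pytest -v -k TestSoftmax` would select just these tests, mirroring the `-k TestTrain` example above.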

Languages

Python 73.3%

C 15.6%

Cuda 6.2%

C++ 2.3%

Metal 1.8%

Other 0.7%