While fusion (#654)

* try this

* readme

* opt comments
George Hotz
2023-03-06 09:13:23 -08:00
committed by GitHub
parent 066a65dad5
commit 4b9bc1615b
2 changed files with 8 additions and 4 deletions


@@ -71,12 +71,14 @@ print(y.grad) # dz/dy
Try a matmul. See how, despite the style, it is fused into one kernel with the power of laziness.
```python
-OPTLOCAL=1 GPU=1 DEBUG=3 python3 -c "from tinygrad.tensor import Tensor;
+DEBUG=3 OPTLOCAL=1 GPU=1 python3 -c "from tinygrad.tensor import Tensor;
N = 1024; a, b = Tensor.randn(N, N), Tensor.randn(N, N);
c = (a.reshape(N, 1, N) * b.permute(1,0).reshape(1, N, N)).sum(axis=2);
print((c.numpy() - (a.numpy() @ b.numpy())).mean())"
```
Change to `DEBUG=4` to see the generated code.
## Neural networks?
It turns out a decent autograd tensor library is 90% of what you need for neural networks. Add an optimizer (SGD, RMSprop, and Adam are implemented) from tinygrad.nn.optim, write some boilerplate minibatching code, and you have all you need.
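
Below is a minimal sketch of what that looks like, not part of the diff above. It assumes the tinygrad API as of this commit: `Tensor.uniform` for weight init, the `dot`/`relu`/`log_softmax` tensor ops, and `SGD` from `tinygrad.nn.optim`; the input batch and labels are made-up random data for illustration.
```python
import numpy as np
from tinygrad.tensor import Tensor
from tinygrad.nn.optim import SGD

# two-layer MLP on 784-dim inputs (e.g. flattened MNIST digits)
l1 = Tensor.uniform(784, 128)
l2 = Tensor.uniform(128, 10)

def forward(x):
  return x.dot(l1).relu().dot(l2).log_softmax()

opt = SGD([l1, l2], lr=0.001)  # plain SGD over the two weight tensors

# fake minibatch: random inputs plus random integer labels turned into a
# negative one-hot mask, so out.mul(y).mean() behaves like an NLL loss
x = Tensor.randn(64, 784)
labels = np.random.randint(0, 10, size=(64,))
y_np = np.zeros((64, 10), dtype=np.float32)
y_np[range(64), labels] = -1.0
y = Tensor(y_np)

out = forward(x)
loss = out.mul(y).mean()
opt.zero_grad()
loss.backward()
opt.step()
print(loss.numpy())
```
Run it with `GPU=1` (or on the default CPU backend) just like the matmul example above.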