Commit Graph

43 Commits

Author SHA1 Message Date
Yvon Manzi
6652003839 Add cumprod to Tensor (#9629)
* probably what cumprod should look like

* update _cumalu to work with MUL

* shorter

* cumprod testing

* clean

* more cleanup

* add cumprod to torch backend.

* make it look like cumsum

* mypy fix

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 21:49:18 -04:00
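
For reference, a minimal sketch of the API this commit adds, assuming cumprod mirrors cumsum's axis argument as the "make it look like cumsum" message above suggests:

    from tinygrad import Tensor

    t = Tensor([1., 2., 3., 4.])
    print(t.cumprod(0).numpy())  # running product along axis 0: [1., 2., 6., 24.]
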
Priyank Patel
4f5e03bd60 better fix inplace detach (#9557) 2025-03-24 22:50:28 +08:00
geohotstan
309afa20b7 add Tensor.max_unpool2d (#9518)
* why does max_unpool2d feel slower than out.gradient ...

* slightly cleaner

* what happened to ruff

* need to think about this some more

* slightly faster now?

* clean up, 1 more failing edge case

* ok good

* working TINY_BACKEND

* nit doc wording

* retry CI
2025-03-22 12:11:33 -04:00
geohotstan
8c0d0a122c Add return_indices to max_pool (#9506)
* wow argmax is so good

* 1 less line

* clean up and better variable names

* is this torch thing right...?

* add more tests

* slap a TODO on it

* clean ups

* prettier looking code and fix ceil mode test

* add return types and some docs

* ok that was a bad example since indices == value, just no example
2025-03-19 15:25:37 -04:00
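
Taken together, this commit and the max_unpool2d one above give the usual pool/unpool round trip. A hedged sketch, assuming the signatures follow torch's max_pool2d/max_unpool2d:

    from tinygrad import Tensor

    x = Tensor.arange(16).float().reshape(1, 1, 4, 4)
    out, idx = x.max_pool2d(kernel_size=2, return_indices=True)  # indices of each window's max
    restored = out.max_unpool2d(idx, kernel_size=2)              # zeros except at the max positions
    print(restored.shape)  # (1, 1, 4, 4)
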
b1tg
a95b489a55 nanoGPT train works with tiny torch backend (#9283)
* train_shakespeare_char.py works

* move aten.where.self_out to tiny_backend_out

* fix memory leak

* corealize in the backward_hook

* Update backend.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-03-19 11:51:02 +08:00
Anish Umale
5e58f4b65b Tiny backend test_ops fix part 3 (#9483)
* extract straightforward things from https://github.com/tinygrad/tinygrad/pull/9302

* pass dtype and device for ones_like
2025-03-17 18:01:51 -04:00
TJ
9fcef4d009 add masked_select to tensor.py (#9468)
* add masked_select to tensor.py

* fix tests

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-03-17 16:05:36 -04:00
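
A minimal sketch of masked_select, assuming it keeps torch's semantics (a flattened 1-D tensor of the elements where the mask is true):

    from tinygrad import Tensor

    t = Tensor([[1, 2], [3, 4]])
    print(t.masked_select(t > 2).numpy())  # [3, 4]
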
geohotstan
53d6f1e1bb Add bitonic cat sort (#9422)
* poc

* repeated values fail, sigh

* is this being timed out?

* fix up down names

* bitonic v2, does this run?

* bitonic v3, faster

* bitonic v3.1, faster

* bitonic v3.1.1, same speed unlucky

* support dim and indices

* bitonic v3.2, simpler code, TODO repeated indices

* bruv gimme green for once cmon

* cat (stack) implementation, slow but maybe one day when cat is fast meow

* revert to v3.2

* bitonic v4, who let the cats out edition

* clean up variable names

* figured out repeated indices :D

* ruff check --fix

* use sort for topk

* add Tensor.sort everywhere

* fix docs and add some types

* slightly better variable names

* am I doing torch inplace correctly?

* delegate sort to values_stable

* add a contig, faster first sort

* maybe don't test_inplace

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-17 12:01:23 -04:00
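
Per the "add Tensor.sort everywhere" message, a hedged usage sketch, assuming the torch-style (values, indices) return:

    from tinygrad import Tensor

    values, indices = Tensor([3., 1., 2.]).sort()         # ascending by default
    print(values.numpy(), indices.numpy())                # [1. 2. 3.] [1 2 0]
    desc, _ = Tensor([3., 1., 2.]).sort(descending=True)  # [3. 2. 1.]
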
Priyank Patel
4714c4f9ad torch backend multigpu - add devices and tests (#9414)
* add multi-device support and tests

* simplify
2025-03-12 11:33:11 +08:00
Priyank Patel
beed00eabe fix torch backend memory leak (#9395)
* fix leak, realize everything on torch optim step

* only realize a subset

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-03-11 10:48:20 +08:00
chenyu
01e8b60911 acc_dtype -> dtype (#9402)
matched numpy and torch
2025-03-10 16:05:30 -04:00
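
In other words, reductions now take dtype= for the accumulator/result, matching the numpy and torch spelling. A sketch:

    from tinygrad import Tensor, dtypes

    t = Tensor([1, 2, 3], dtype=dtypes.uint8)
    print(t.sum(dtype=dtypes.int32).dtype)  # accumulate/return in int32, like numpy's sum(dtype=...)
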
Priyank Patel
796c3bbb23 torch: support in-place operations on views (#9371)
* add torch inplace tests

* first set of tests passing

* wrap all inplace funcs, add more tests

* fixes and wrap more functions

* fix all uint8 tests to avoid slow tests

* fix the one test

* another test, another fix

* and one more, works for ddp now

* something on contiguous, cleanup

---------

Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2025-03-10 23:29:00 +08:00
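
A sketch of the case this commit has to handle; the import path is an assumption, and the backend is assumed to register a "tiny" device as the TINY_BACKEND runs elsewhere in this log suggest:

    import torch
    import tinygrad.frontend.torch  # assumed entry point that registers the "tiny" device

    x = torch.zeros(4, device="tiny")
    v = x[1:3]      # a strided view into x
    v.add_(1)       # the in-place write must land in the base tensor
    print(x.cpu())  # tensor([0., 1., 1., 0.])
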
geohotstan
1d64c12f2b add Topk to tensor (#9343)
* terrible but somewhat working impl

* linux behaves differently than macos?

* slightly better impl

* small clean up; haven't figured this out yet

* better

* torch has different behavior on linux and macos for duplicated values

* add sum docs

* fix test

* add torch return_type test

* add an exception test

* wrap_fxn instead, and move op lower in order

* better repeated values test

* rerun ci
2025-03-09 20:01:42 -04:00
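
A hedged sketch of topk, assuming the torch-style signature and (values, indices) return that the return_type test above checks against:

    from tinygrad import Tensor

    values, indices = Tensor([1., 5., 3., 4.]).topk(2)  # largest=True by default
    print(values.numpy(), indices.numpy())              # [5. 4.] [1 3]
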
Anish Umale
b3ac60ce53 Fix test_ops for tiny backend part 2 (#9358)
* extract functions from https://github.com/tinygrad/tinygrad/pull/9302

* revert gather and add aten.elu_backward

* address review

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-05 13:38:40 -05:00
Priyank Patel
f048256341 fix TORCH_DEBUG=1 sigsegv (#9352) 2025-03-05 12:24:53 +03:00
chenyu
019417743c ruff torch backend (#9341) 2025-03-03 15:15:23 -05:00
Anish Umale
bafa40fe12 Tiny backend test_ops fix part1 (#9338)
* extract name methods from https://github.com/tinygrad/tinygrad/pull/9302

* t.grad.numpy() -> t.grad.cpu().numpy()

* revert TORCH_DEBUG change

* revert dtype change in aten.sum
2025-03-03 12:36:51 -05:00
Friedrich Carl Eichenroth
b4028e48ae Torch Backend Refinement (#9327)
* fix some torch tests

* fixup

* small change

* fixup

* fix test

* use default function

* add todo

* bunch of small changes

* fix tests

* more tests

* fix

* fix

* test fix

* simplify
2025-03-03 10:24:02 -05:00
chenyu
ba4b8c2c23 Tensor.copysign (#9329) 2025-03-02 21:33:49 -05:00
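
Tensor.copysign presumably follows the usual numpy/torch semantics (magnitude from self, sign from the other tensor); a minimal sketch:

    from tinygrad import Tensor

    print(Tensor([1., -2., 3.]).copysign(Tensor([-1., 1., -1.])).numpy())  # [-1.  2. -3.]
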
Friedrich Carl Eichenroth
06ef9cc9f4 aten leaky_relu, div.out_mode, clamp_max, clamp_min, copysign (#9323)
* fix some torch tests

* fixup

* small change

* fixup

* fix test

* use default function

* add todo
2025-03-02 19:12:16 -05:00
Priyank Patel
f4148ac46a torch fix casting and add ops for sd vae(s) (#9297)
* torch fix copy casting and add upsample op

* update cast and add test

* fix lint

* add pad for sdxl vae to work
2025-03-01 08:49:10 -05:00
nimlgen
052722a7bc torch hook: address comments (#9295)
* torch hook: address comments

* failed test
2025-02-28 11:51:52 +03:00
Priyank Patel
8ae215dd3d torch backend fix manual seed warning (#9292) 2025-02-28 13:45:32 +08:00
George Hotz
ac40316692 hotfix: group cpu functions in torch backend 2025-02-28 10:39:00 +08:00
George Hotz
b32595dbbc torch examples (#9290)
* torch, fix examples/mnist

* fix vae torch example

* where out
2025-02-28 10:16:06 +08:00
chenyu
184030168d fix aten.reflection_pad2d (#9289)
tested the torch doc example
2025-02-27 15:53:46 -05:00
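
The torch doc example mentioned is along these lines (running it through the tiny backend would put the tensors on device="tiny"):

    import torch

    m = torch.nn.ReflectionPad2d(2)
    x = torch.arange(9, dtype=torch.float).reshape(1, 1, 3, 3)
    print(m(x))  # 3x3 input reflected out to 7x7
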
chenyu
0de6585df0 fix aten.normal_ arg (#9288)
should be mean and std.
2025-02-27 15:36:25 -05:00
chenyu
8ee2b460ee Tensor.var_mean (#9287) 2025-02-27 15:15:31 -05:00
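
Tensor.var_mean presumably mirrors torch.var_mean, returning both statistics in one call; a sketch:

    from tinygrad import Tensor

    var, mean = Tensor([[1., 2.], [3., 4.]]).var_mean()
    print(var.item(), mean.item())  # 1.666..., 2.5 (correction=1, like torch)
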
George Hotz
387ea41e99 increase speed of torch mnist: use gradient api (#9282) 2025-02-27 11:57:41 +08:00
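
The gradient API referenced here computes grads directly from a scalar loss instead of going through .backward(); a minimal sketch, assuming Tensor.gradient(*targets) returns one grad per target:

    from tinygrad import Tensor

    x = Tensor([3.0])
    loss = (x * x).sum()
    dx, = loss.gradient(x)  # d(x^2)/dx = 2x
    print(dx.numpy())       # [6.]
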
Priyank Patel
a0764f0dc0 (bounty) Make mnist training run with torch backend (#9233)
* yml changes

* torch backend remove meta decomps and add test

* torch backend bump timeout for tests

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-27 11:32:25 +08:00
George Hotz
9088125a6a a lil more torch (#9280) 2025-02-27 11:12:20 +08:00
George Hotz
b6a14911c8 start torch.compile support (#9279) 2025-02-27 10:29:51 +08:00
Francis Lata
86b737a120 leakyrelu to leaky_relu (#9270) 2025-02-26 13:22:08 -05:00
chenyu
aaf0a8069f xor -> bitwise_xor (#9264) 2025-02-26 10:21:14 -05:00
George Hotz
2158dc4849 full fix for as_strided in torch backend (#9257)
* fixes from chargpt for torch backend

* shrink support

* add stride support

* comment cleanup

* a few more

* work

* import the stream hack

* llvm multi auto
2025-02-26 22:34:05 +08:00
George Hotz
7780393460 rig up torch's testing framework [pr] (#9254)
* rig up torch's testing framework [pr]

* support more movement ops

* dec on expand

* fix tests

* work

* fix tests

* a few more

* decomps + opt hook

* installed pytest
2025-02-26 18:46:22 +08:00
George Hotz
b603af373e run some tests from torch [pr] (#9252)
* run some tests from torch [pr]

* yml

* wrap_out

* clean up for the new people

* a lil more
2025-02-26 15:42:22 +08:00
George Hotz
fc32ff80d6 torch and numpy dtype interop [pr] (#9224)
* torch and numpy dtype interop [pr]

* less lines

* order
2025-02-24 18:26:49 +08:00
George Hotz
fd731e740a hotfix: add note on backend2.py 2025-02-24 11:23:03 +08:00
albanD
f2dd9c1562 simplify c++ code (#9221) 2025-02-24 11:04:41 +08:00
George Hotz
97bc723538 torch backend works for ResNet-18 (#9200)
* torch backend progress, a few more functions

* resnet works

* pillow

* tv
2025-02-22 22:16:23 +08:00
George Hotz
4e6665bda5 different way to write torch backend (#9197)
* different way to write torch backend

* both backends

* more work

* simpler code

* more work

* test both

* imply unwrap/wrap

* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works

* ready to start making test_ops work in torch backend

* backward pass, TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works

* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_simple_conv2d works

* matmul backward is broken with as_strided
2025-02-22 14:42:26 +08:00
George Hotz
e87be0131e torch backend start (#9191)
* start torch backend

* progress

* ugh, you need cpp crap

* 1+1 works

* 1+1 works

* becoming a real backend

* ready to merge?
2025-02-21 16:57:28 +08:00
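
The "1+1 works" milestone boils down to something like this; the import path is an assumption (early versions of the backend lived under extra/torch_backend):

    import torch
    import tinygrad.frontend.torch  # assumed to register the "tiny" PrivateUse1 device

    a = torch.ones(1, device="tiny")
    print((a + a).cpu())  # tensor([2.])
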