23 Commits

Author SHA1 Message Date
George Hotz
9366a23eb0 test backward in test_tiny (#11697)
* test backward in test_tiny

* empty
2025-08-16 20:29:39 -07:00
kevvz
e2873a3a41 [bounty] Muon optim (#11414)
* newton schulz

* add muon + move newton schulz to tensor

* compact newton schulz

* better tests

* cleanup

* add comments for muon

* cleanup

* add export with tests

* match muon optim with test optim

* cleanup

* unused import

* correct comment

* whitespace

* move export

* muon test fix

* match reference impl + tests

* remove export by moving muon device

* add credit

* cleanup

* remove print

* spacing

* spacing

* comma

* cleanup

* removal

* fix tests + optim momentum

* consistent is not/ not

* more consistency

* fix test

* cleanup

* fix the nones

* remove comment

* cast

* comment

* comment

* muon teeny test

* muon flag beautiful mnist

* set steps

* steps as hyperparam

* match default test steps

* name

* large cleanup

* dont care about steps

* nesterov false default

* match each other impl

* steps

* switch nest

* swap defaults

* update docstring

* add no nesterov test

* ban fuse_optim

* prints

* classical momentum

* alternative condition

* recon

* pre + post wd

* false default

* detach

* signature changes

* context

* swap order

* big cleanup

* 0 step instead

* parity

* remove fuse

* remove fused

* better paper

* assert message

* correct shape check + eps

* multidim

* add eps

* cleanup

* correct assert message

* lint

* better tests

* naming

* ns_steps,ns_params

* update docstring

* docstring

* match sgd and muon together

* sandwich

* add back fused

* parity

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-08-13 14:27:55 -04:00
chenyu
7338ffead0 small beautiful_mnist update (#11596)
gather is fast now. there's a conv/bw kernel that only gets fast with BEAM, but whole thing runs < 5 seconds now regardless
2025-08-09 19:51:14 -04:00
George Hotz
b3b43a82c4 remove Tensor.no_grad, it's meaningless now [pr] (#10556)
2025-05-28 22:20:02 -07:00
Adam Van Ymeren
a28ca0680f update dead link (#10242)
2025-05-09 19:59:52 -04:00
George Hotz
37fa38d272 Revert "switch beautiful_mnist to use new optimizer [pr] (#8231)" (#8233)
This reverts commit e9ee39df22.
2024-12-13 19:07:09 -08:00
George Hotz
e9ee39df22 switch beautiful_mnist to use new optimizer [pr] (#8231)
* switch beautiful_mnist to use new optimizer [pr]

* fix abstractions3 + docs

* fix OptimizerGroup with schedule_step api
2024-12-13 18:27:16 -08:00
Kinvert
960c495755 added beautiful fashion mnist and example (#6961)
* added beautiful fashion mnist and example

* fixing whitespace

* refactor Fashion MNIST to fewer lines

* fix newline to reduce diff

* Update beautiful_mnist.py

* Update beautiful_mnist.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-10-10 12:01:07 +08:00
George Hotz
17a043edad tensor inference (#6156)
* tensor inference

* test is even better name
2024-08-18 00:19:28 -07:00
George Hotz
14b613e281 add STEPS to beautiful_mnist
2024-08-10 15:23:44 -07:00
George Hotz
a9f5a764dc make BatchNorm work for 2D and 3D (#5477)
* make BatchNorm work for 2D and 3D

* beautiful mnist shouldn't use BatchNorm2d
2024-07-14 11:39:58 -07:00
George Hotz
5232e405ce hotfix: add BS to beautiful_mnist
2024-07-11 10:55:05 -07:00
chenyu
055e616302 cleanup mnist data load in beautiful_mnist (#5106)
2024-06-22 18:31:51 -04:00
chenyu
e356807696 tinytqdm.set_description and tinytrange (#5101)
2024-06-22 14:45:06 -04:00
George Hotz
b683d0f496 hotfix: 100% accuracy is wrong
2024-05-01 08:07:18 -07:00
George Hotz
cd88afc98b datasets isn't a feature + filter docstrings (#4228)
* datasets isn't a feature

* filter docstrings in sz
2024-04-19 16:16:10 +04:00
George Hotz
216eb235e5 hotfix: cast mnist to float
2024-04-09 19:30:03 -07:00
George Hotz
fea774f669 spend 5 lines to bring mnist into the repo (#4122)
2024-04-09 19:24:57 -07:00
chenyu
d651835ef5 verify beautiful_mnist.py eval acc and put into benchmark ci (#3926)
* verify beautiful_mnist and put in ci

* 97.5 for eval verification
2024-03-25 16:47:49 -04:00
chenyu
fb3f8f7597 move sample inside jit for beautiful_mnist (#3115)
also removed .realize() for jit functions since jit does it automatically now. a little more beautiful
2024-01-14 01:36:30 -05:00
chenyu
2f67f1e580 remove obsolete TODO in beautiful_mnist (#2946)
the compiler error was due to `error: call to 'max' is ambiguous` when we have max(int, float) in kernel.
it was first fixed in 4380ccb1 the non fp32 math PR, and further solidified with dtype refactor
2023-12-28 17:09:23 -05:00
George Hotz
c8c5212dce a lil more beautiful_mnist
2023-11-17 19:53:06 -08:00
George Hotz
c7b38b324b A beautiful MNIST training example (#2272)
* beautiful mnist

* beautiful mnist example

* from tinygrad import Tensor

* more beautiful

* the jit is super core tinygrad

* globalcounters reset on jit run

* symlinks and exclude

* beautiful_cartpole

* evaluate is its own function

* no symlinks

* more beautiful

* jit reset for double speed

* type hinting for JIT

* beautiful_mnist gets 98%

* beautiful_mnist < 4s with BEAM=2

* better cartpole

* use actor critic

* zero_grad got lost

* delete double relu

* stable cartpole with PPO

* beautiful_cartpole is more beautiful

* REPLAY_BUFFER

* beautiful stuff typechecks

* None support in shape

* hp tuning
2023-11-17 19:42:43 -08:00