Commit Graph

155 Commits

chenyu
9504db1a57 remove the realize in _rebuild_tensor_v2 (#5347)
no longer needed
2024-07-09 12:28:52 -04:00
chenyu
b2c3a28a5e nn.RMSNorm (#5272)
the norm itself has no significant value to add as a Tensor method, but we would want Tensor.normalize
2024-07-02 21:39:01 -04:00
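
For context, a minimal RMSNorm sketch in tinygrad style — the class name matches the commit, but the body below is a hedged reconstruction, not necessarily the exact #5272 code:

```python
from tinygrad import Tensor

class RMSNorm:
  def __init__(self, dim: int, eps: float = 1e-6):
    self.eps, self.weight = eps, Tensor.ones(dim)

  def __call__(self, x: Tensor) -> Tensor:
    # scale by the reciprocal root-mean-square of the last axis, then by the learned weight
    return x * (x.square().mean(axis=-1, keepdim=True) + self.eps).rsqrt() * self.weight
```
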
chenyu
8080298739 s/tinytqdm/tqdm (#5103)
except in unit tests, where tqdm is imported
2024-06-22 14:18:26 -04:00
George Hotz
ca4ccddcd6 docsfix: nn.Tensor -> Tensor 2024-06-12 09:18:32 +02:00
George Hotz
b9f26eedc9 hotfix: import datasets in nn init 2024-06-10 11:33:50 +02:00
SnakeOnex
b1db2d0094 tqdm replacement (#4846)
* tqdm replacement almost

* formatting

* formatting

* imports

* line len

* fix

* removed set description :(

* removed set description :(

* fix

* fix

* green check?

* rewrote as class, fixed several bugs

* types spacing

* removed imports

* fix

* iterable

* typing

* mypy disagreement

* imports

* more e2e tests vs tqdm

* removed seed setting

* robustness against time.sleep() flakiness

* flaky fix

* automatic bar closing when count==total

* cleanup

* clang error with tqdm

* tqdm back

* use os lib, print to stderr (fixes the clang bug, where the bar was leaking into the generated c program)

* back to shutil

* unit_scale + unit_scale test

* custom unit to tests

* pretty

* clean

* removed flaky test

* less test iters

* empty line

* remove disable
2024-06-09 23:46:03 +02:00
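
A minimal sketch of the approach the bullets above describe — class-based, writing to stderr so the bar cannot leak into generated program output, sizing itself with shutil, and closing automatically when count reaches total. Names and layout here are illustrative, not #4846's actual code:

```python
import shutil, sys

class MiniTqdm:
  def __init__(self, iterable=None, total=None, desc=""):
    self.iterable, self.desc, self.n = iterable, desc, 0
    self.total = total if total is not None else (len(iterable) if hasattr(iterable, "__len__") else None)

  def __iter__(self):
    for x in self.iterable:
      yield x
      self.update(1)

  def update(self, n=1):
    self.n += n
    cols = shutil.get_terminal_size().columns
    # stderr, not stdout: keeps the bar out of any program output being captured
    sys.stderr.write(f"\r{self.desc} {self.n}/{self.total}"[:cols] + ("\n" if self.n == self.total else ""))
    sys.stderr.flush()
```
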
chenyu
0b58203cbe docs: fix nn rendering (#4736)
* docs: fix nn rendering

need to import nn at the start of the page. maybe there's a better way to move it on top?

* print that
2024-05-26 18:55:20 -04:00
wozeparrot
5c6af97436 nn docs (#4735) 2024-05-26 13:08:22 -07:00
George Hotz
5ba611787d move image into tensor.py. delete features (#4603)
* move image into tensor.py

* change setup.py

* openpilot tests need pythonpath now
2024-05-15 10:50:25 -07:00
chenyu
7afca52796 replace pow in LAMB by tracking b1**t and b2**t per step (#4582)
* replace pow in LAMB by tracking b1**t and b2**t per step

* remove t, add [self.b1_t, self.b2_t] to return

* adam has one less kernel
2024-05-14 13:08:22 -04:00
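
The idea, sketched: keep running products b1_t == b1**t and b2_t == b2**t as tensors updated multiplicatively each step, so no pow kernel is ever launched. A hedged sketch; the real LAMB step carries much more state:

```python
from tinygrad import Tensor

class LAMBSketch:
  def __init__(self, b1=0.9, b2=0.999):
    self.b1, self.b2 = b1, b2
    # running b1**t and b2**t, tracked as part of the optimizer's state
    self.b1_t = Tensor.ones(1, requires_grad=False)
    self.b2_t = Tensor.ones(1, requires_grad=False)

  def step(self):
    self.b1_t.assign(self.b1_t * self.b1)  # b1_t == b1 ** t after t steps, no pow needed
    self.b2_t.assign(self.b2_t * self.b2)
    # ... bias correction then uses (1 - self.b1_t) and (1 - self.b2_t) ...
```
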
wozeparrot
d7670f8141 quantized llama multilazybuffer fix (#4557) 2024-05-12 14:19:21 -07:00
ziereis
bcee4743ce fix error message (#4556)
* fix error message

* typo

* add suggestion to fix error

---------

Co-authored-by: Thomas Ziereis <thomas.ziereis@web.de>
2024-05-12 12:35:51 -07:00
George Hotz
17faae091b optimizer shouldn't be run without training (#4460)
* optimizer shouldn't be run without training

* set training in relevant tests

* fix multitensor

* that too
2024-05-06 15:34:12 -07:00
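
Conceptually the change adds a guard like the following (the exact assertion message is an assumption):

```python
from tinygrad import Tensor

class OptimizerSketch:
  def step(self):
    # stepping an optimizer only makes sense in training mode
    assert Tensor.training, "Tensor.training must be set when stepping an optimizer"
    # ... parameter updates ...
```
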
David Hou
c0a048c044 batchnorm d(var)/d(mean) = 0 (#4430)
* d(var)/d(mean) = 0

* drop the number in test_schedule!
2024-05-05 00:25:45 -04:00
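
The observation: at m = E[x], d/dm E[(x - m)^2] = -2 E[x - m] = 0, so the mean can be detached inside the variance computation without changing gradients, saving backward work. A hedged sketch, assuming NCHW reduction axes:

```python
from tinygrad import Tensor

def batch_mean_var(x: Tensor, axis=(0, 2, 3)):
  mean = x.mean(axis=axis, keepdim=True)
  # detach: the gradient of var with respect to mean is analytically zero
  var = (x - mean.detach()).square().mean(axis=axis, keepdim=True)
  return mean, var
```
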
George Hotz
d325be2540 update docs (#4356)
* update docs

* nn.md

* mnist cleanups

* rhip test is very slow
2024-04-30 16:51:42 +09:00
chenyu
5ae252ae83 use at least float32 for optim.lr (#4297)
* use at least float32 for optim.lr

when doing mixed precision training (float32 weights, default_float=half), still use float32 to store lr.
it would have been upcast later in the actual weight update anyway, but would already have lost precision.
this improved resnet convergence significantly

* undo type annotation
2024-04-25 14:42:28 -04:00
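
Sketched, using tinygrad's least_upper_dtype to widen half to float32 while leaving wider defaults alone (a reconstruction of the idea, not the PR's exact line):

```python
from tinygrad import Tensor, dtypes
from tinygrad.dtype import least_upper_dtype

# store lr in at least float32 even when default_float is half
lr = Tensor(1e-4, dtype=least_upper_dtype(dtypes.default_float, dtypes.float32), requires_grad=False)
```
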
George Hotz
967638f0d5 update docs, remove corealize (#4264)
* update docs, remove corealize

* handle 0 line count

* tensor schedule
2024-04-23 12:05:29 +04:00
George Hotz
cd88afc98b datasets isn't a feature + filter docstrings (#4228)
* datasets isn't a feature

* filter docstrings in sz
2024-04-19 16:16:10 +04:00
George Hotz
fa57c3e7ce continue llm.c (#4190)
* continue llm.c

* export more

* progress on llm.c

* simpler optim, names work
2024-04-18 10:57:54 +04:00
David Hou
593c90d7d6 Resnet fp16 training with fp32 master weight copy (#4144)
* add casts to layers

* FLOAT flag

* detach

* no_grad for eval

* whitespace

* explicit fp32 initialization

* oops

* whitespace

* put back config['DEFAULT_FLOAT']

* bad

* live dangerously (don't hide bugs)

* don't bundle changes

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-04-14 11:25:08 -04:00
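
The shape of the technique, sketched on a single layer: the master weight lives in float32 and is cast to the compute dtype per forward pass, so gradients and updates accumulate at full precision. CastedLinear is a hypothetical name, not the PR's API:

```python
from tinygrad import Tensor, dtypes

class CastedLinear:
  def __init__(self, in_features: int, out_features: int):
    bound = in_features ** -0.5
    # master copy explicitly initialized in float32
    self.weight = Tensor.uniform(out_features, in_features, low=-bound, high=bound, dtype=dtypes.float32)

  def __call__(self, x: Tensor) -> Tensor:
    # cast to the compute dtype (e.g. half) on the fly; updates still hit the fp32 master
    return x.linear(self.weight.cast(dtypes.default_float).transpose())
```
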
chenyu
f61ed869f5 Use exec_alu for lazy const folding (#4039) 2024-04-02 20:52:05 -04:00
Patrick Tsai
0147174ad6 Embedding in one kernel (#4036)
* Embedding is in one kernel

* embedding is one kernel

* rm extra line

* newline

* bert test counts state vars?

* add a test?

* move items around

---------

Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>
2024-04-02 11:38:21 -04:00
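
The one-kernel formulation, sketched: compare the indices against an arange to get a one-hot selector, then contract it with the weight matrix so the whole lookup can fuse. A hedged reconstruction of the idea in #4036:

```python
from tinygrad import Tensor

def embedding(idx: Tensor, weight: Tensor) -> Tensor:
  vocab_size, _embed_dim = weight.shape
  counter = Tensor.arange(vocab_size, device=idx.device)
  one_hot = (idx.unsqueeze(-1) == counter).cast(weight.dtype)  # (..., vocab_size)
  return one_hot @ weight  # one contraction -> (..., embed_dim), fusable into a single kernel
```
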
chenyu
c71627fee6 move GlobalCounter to helpers (#4002)
break circular import between ops and buffer
2024-03-30 00:30:30 -04:00
uuuvn
8a40d7d423 Shape changing bitcast and assert bitcast in disk (#3973)
* Shape changing bitcast

* only support it on disk

* basic test

* more tests

* RuntimeError instead of assert

* create unique temp files

* move tests that use disk to test_disk_tensor

* linter

* remove assert on error messages

* that's RuntimeError now

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-03-28 21:49:10 -07:00
wozeparrot
9a9cac58f9 add lars to nn (#3750)
* feat: add lars

* feat: don't remove this comment

* clean: smaller diff

* clean: shorter line

* feat: remove mlperf lars, switch resnet

* fix: fully remove mlperf lars

* clean: comment

* feat: contiguous

* feat: no weight decay on skip params

* feat: optimizergroup

* feat: classic momentum

* fix: pylint

* clean: move comment

* fix: correct algo

* feat: lrschedulergroup

* feat: skip list tests

* feat: :| forgot that params are a thing

* feat: remove skip_list params from main params

* feat: set moment

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-03-24 11:43:12 -04:00
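
A minimal LARS step for orientation: scale each layer's update by the trust ratio ||w|| / (||g|| + wd·||w||). This sketch drops the momentum and skip-list handling that the bullets above mention:

```python
from tinygrad import Tensor

def lars_step(params: list[Tensor], lr=1.0, wd=1e-4, tcoef=0.001):
  for w in params:
    g = w.grad + wd * w.detach()          # weight decay folded into the gradient
    w_norm = w.detach().square().sum().sqrt()
    g_norm = g.square().sum().sqrt()
    # layer-wise trust ratio; fall back to 1 when either norm is zero
    trust = (w_norm > 0).where((g_norm > 0).where(tcoef * w_norm / g_norm, 1.0), 1.0)
    w.assign(w.detach() - lr * trust * g)
```
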
George Hotz
54dc48aa47 fix assign (#3878)
* fix assign

* remove terrible optimizer hack

* oops, not realized assigns
2024-03-22 11:48:48 -07:00
George Hotz
e6d55932ca hotfix: this makes beautiful mnist work again, not okay 2024-03-18 18:22:44 -07:00
chenyu
639bd5dbfc move bf16 cast hack to Tensor.llvm_bf16_cast (#3788) 2024-03-17 18:51:22 -04:00
David Hou
07324b56d5 [experimenting] use contiguous instead of realize in optim (#3770)
* run CI

* comment

* remove t.grad to try

* Revert "remove t.grad to try"

This reverts commit 05ec2d3b89.
2024-03-15 23:06:50 -07:00
George Hotz
3527c5a9d2 add Tensor.replace (#3738)
* add Tensor.replace

* fix dtypes in that test

* should be replace

* and mixtral
2024-03-14 13:34:14 -07:00
Szymon Ożóg
6c36264790 Improve type hints for optimizer (#3583)
* Improve type hints for optimizer

* lint fix
2024-03-02 07:35:44 -08:00
George Hotz
50e1445e60 Revert "allow overriding weight init for Linear (#3569)" (#3576)
This reverts commit 2d0973a852.
2024-03-02 03:17:13 -08:00
David Hou
2d0973a852 allow overriding weight init for Linear (#3569) 2024-03-02 03:16:04 -08:00
George Hotz
b1c0d8c99d remove cpu and torch backends (#3399)
* remove cpu and torch backends

* don't copy to cpu

* use clang instead of cpu

* multitensor gathers on the first device

* clang is cpu + use default

* fixup

* bugfix
2024-02-15 16:55:39 +01:00
chenyu
97275101e9 fix safetensor load uint32 and uint64 (#3315)
the correct keys are U32 and U64.
2024-02-04 10:46:27 -05:00
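
For context, the safetensors header tags dtypes with short strings; a hedged excerpt of such a mapping (the unsigned entries are what the fix touched):

```python
from tinygrad import dtypes

# partial dtype-tag mapping; "U32"/"U64" are the correct safetensors keys
safe_dtypes = {"BOOL": dtypes.bool, "F16": dtypes.float16, "F32": dtypes.float32,
               "I32": dtypes.int32, "U32": dtypes.uint32, "U64": dtypes.uint64}
```
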
Yoshinori Sano
edb74897b2 support safe load bf16 (#3310)
* support safe load bf16

* fix lint error E501

* add test for loading safetensors

* key should be BOOL

* fix lint
2024-02-04 10:08:39 -05:00
George Hotz
a72b1b6d65 sharding for llama (#3151)
* shard llama

* sharding works

* simpler

* simpler

* consume option

* disable that test

* save a line

---------

Co-authored-by: George Hotz <george@tinygrad.org>
2024-01-16 19:28:00 -08:00
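
Usage-wise, tinygrad's multi-device sharding splits a tensor across devices along an axis; a small illustrative example (device names are assumptions):

```python
from tinygrad import Tensor

x = Tensor.rand(8, 1024)
# each device holds half the rows; axis=None would replicate instead
x_sharded = x.shard(("GPU:0", "GPU:1"), axis=0)
```
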
chenyu
c658aa4fbf minor cleanup of test_disk_tensor (#3112) 2024-01-13 20:54:58 -05:00
Yixiang Gao
a686663657 make Embedding device aware for multigpu (#3051)
* make Embedding device aware for multigpu

* split line instead of ignore because that's cheating

* add test incomplete

* add test complete

* remove comment

* fix white space

* remove nn.Embedding
2024-01-08 20:09:26 -08:00
chenyu
4f4e8634b8 use in_features directly in nn.Linear.__init__ bound check (#3050)
* use in_features directly in nn.Linear.__init__ bound check

get rid of the unnecessary isinstance int check

* that is always int

* long lines
2024-01-08 19:32:35 -05:00
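
The relevant init, sketched: the uniform bound is 1/sqrt(in_features), and since in_features is already an int it can be used directly. A hedged reconstruction, not the verbatim source:

```python
from tinygrad import Tensor

class Linear:
  def __init__(self, in_features: int, out_features: int, bias: bool = True):
    bound = 1 / in_features ** 0.5   # in_features is always int: no isinstance check needed
    self.weight = Tensor.uniform(out_features, in_features, low=-bound, high=bound)
    self.bias = Tensor.uniform(out_features, low=-bound, high=bound) if bias else None
```
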
chenyu
3eb3664074 fix nn.Embedding with empty length input (#3048) 2024-01-08 18:08:36 -05:00
chenyu
74a30431b4 replace d[a] if a in d else b with d.get(a, b) (#2997) 2024-01-03 18:10:25 -05:00
George Hotz
a280cfe169 move dtypes to dtype.py (#2964)
* move dtypes to dtype.py

* fix urllib
2024-01-01 14:58:48 -08:00
George Hotz
c81ce9643d move globalcounters to ops (#2960)
* move globalcounters to ops

* missed a few

* sick of that failing
2024-01-01 14:21:02 -08:00
George Hotz
e0ecab3797 touchups from multibuffer branch (#2958) 2024-01-01 11:33:41 -08:00
George Hotz
e1861ab65e remove realize from optimizer (#2880)
* remove realize from optimizer

* one still needed

* opt realize
2023-12-20 16:42:41 -08:00
Oleg Rybalko
42a038c83f More readable torch_load ext check (#2853)
* more readable extension check

* enable tarfile test

* detach tensor if requires grad in torch
2023-12-19 14:53:15 -05:00
George Hotz
7e5b3e53fe changes to prep for new lazy (#2748)
* changes to prep for new lazy

* put those back
2023-12-13 10:28:22 -08:00
George Hotz
6d6eb9302d ruff checks the max line length is 150 (#2734)
* ruff checks the max line length is 150

* fix tensor.py

* a lot more

* done
2023-12-12 17:34:47 -08:00
Guy Leroy
ee9e1d3662 Extend available types for safe_save (#2720)
* Extend available types to save with

* Linter fix
2023-12-11 14:50:35 -08:00