chenyu
9504db1a57
remove the realize in _rebuild_tensor_v2 ( #5347 )
...
no longer needed
2024-07-09 12:28:52 -04:00
chenyu
b2c3a28a5e
nn.RMSNorm ( #5272 )
...
the norm itself does not add significant value as a Tensor method, but we would want Tensor.normalize
2024-07-02 21:39:01 -04:00
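For reference, RMSNorm normalizes by the root mean square of the last axis (no mean subtraction, unlike LayerNorm) and applies a learned scale. A minimal sketch of the computation, not necessarily the merged source:

```python
from tinygrad import Tensor

class RMSNorm:
  def __init__(self, dim:int, eps:float=1e-6):
    self.eps = eps
    self.weight = Tensor.ones(dim)  # learned elementwise scale

  def __call__(self, x:Tensor) -> Tensor:
    # divide by sqrt(mean(x^2) + eps) over the last axis, then scale
    return x * (x.square().mean(axis=-1, keepdim=True) + self.eps).rsqrt() * self.weight
```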
chenyu
8080298739
s/tinytqdm/tqdm ( #5103 )
...
except in unit tests where the real tqdm is imported
2024-06-22 14:18:26 -04:00
George Hotz
ca4ccddcd6
docsfix: nn.Tensor -> Tensor
2024-06-12 09:18:32 +02:00
George Hotz
b9f26eedc9
hotfix: import datasets in nn init
2024-06-10 11:33:50 +02:00
SnakeOnex
b1db2d0094
tqdm replacement ( #4846 )
...
* tqdm replacement almost
* formatting
* formatting
* imports
* line len
* fix
* removed set description :(
* removed set description :(
* fix
* fix
* green check?
* rewrote as class, fixed several bugs
* types spacing
* removed imports
* fix
* iterable
* typing
* mypy disagreement
* imports
* more e2e tests vs tqdm
* removed seed setting
* robustness against time.sleep() flakiness
* flaky fix
* automatic bar closing when count==total
* cleanup
* clang error with tqdm
* tqdm back
* use os lib, print to stderr (fixes the clang bug, where the bar was leaking into the generated C program)
* back to shutil
* unit_scale + unit_scale test
* custom unit to tests
* pretty
* clean
* removed flaky test
* less test iters
* empty line
* remove disable
2024-06-09 23:46:03 +02:00
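The bullets above converge on roughly this shape: an iterator-wrapping class that renders to stderr (so the bar cannot leak into a C program generated on stdout, the clang bug mentioned) and closes automatically when count == total. A toy sketch, not tinygrad's actual tqdm:

```python
import shutil, sys, time

class MiniBar:
  def __init__(self, iterable, total=None, desc=""):
    self.iterable, self.desc = iterable, desc
    self.total = total if total is not None else len(iterable)
    self.count, self.start = 0, time.perf_counter()

  def __iter__(self):
    for item in self.iterable:
      yield item
      self.count += 1
      self._render()

  def _render(self):
    cols = shutil.get_terminal_size().columns
    filled = int(20 * self.count / self.total)
    rate = self.count / (time.perf_counter() - self.start + 1e-9)
    line = f"\r{self.desc} {self.count}/{self.total} [{'#'*filled}{'-'*(20-filled)}] {rate:.0f}it/s"
    sys.stderr.write(line[:cols] + ("\n" if self.count == self.total else ""))  # auto-close at total

for _ in MiniBar(range(50), desc="demo"): time.sleep(0.02)
```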
chenyu
0b58203cbe
docs: fix nn rendering ( #4736 )
...
* docs: fix nn rendering
need to import nn at the start of the page. maybe there's a better way to move it on top?
* print that
2024-05-26 18:55:20 -04:00
wozeparrot
5c6af97436
nn docs ( #4735 )
2024-05-26 13:08:22 -07:00
George Hotz
5ba611787d
move image into tensor.py. delete features ( #4603 )
...
* move image into tensor.py
* change setup.py
* openpilot tests need pythonpath now
2024-05-15 10:50:25 -07:00
chenyu
7afca52796
replace pow in LAMB by tracking b1**t and b2**t per step ( #4582 )
...
* replace pow in LAMB by tracking b1**t and b2**t per step
* remove t, add [self.b1_t, self.b2_t] to return
* adam has one less kernel
2024-05-14 13:08:22 -04:00
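The trick, sketched in plain Python (per the bullets, the optimizer keeps b1_t/b2_t as device Tensors returned from step): a pow kernel per step is replaced by one multiply on a running product.

```python
b1, b2 = 0.9, 0.999
b1_t, b2_t = 1.0, 1.0                   # running b1**t and b2**t, starting at t=0

def step_bias_correction():
  global b1_t, b2_t
  b1_t, b2_t = b1_t * b1, b2_t * b2     # after t calls these equal b1**t and b2**t
  return 1 - b1_t, 1 - b2_t             # Adam/LAMB bias-correction terms, no pow needed
```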
wozeparrot
d7670f8141
quantized llama multilazybuffer fix ( #4557 )
2024-05-12 14:19:21 -07:00
ziereis
bcee4743ce
fix error message ( #4556 )
...
* fix error message
* typo
* add suggestion to fix error
---------
Co-authored-by: Thomas Ziereis <thomas.ziereis@web.de>
2024-05-12 12:35:51 -07:00
George Hotz
17faae091b
optimizer shouldn't be run without training ( #4460 )
...
* optimizer shouldn't be run without training
* set training in relevant tests
* fix multitensor
* that too
2024-05-06 15:34:12 -07:00
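The guard is roughly this (hedged sketch; the real check lives in tinygrad's optimizer step):

```python
from tinygrad import Tensor

class SGDSketch:
  def step(self):
    # weight updates outside training mode are almost certainly a bug, so refuse
    assert Tensor.training, "set Tensor.training (e.g. `with Tensor.train():`) before calling step"
    ...
```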
David Hou
c0a048c044
batchnorm d(var)/d(mean) = 0 ( #4430 )
...
* d(var)/d(mean) = 0
* drop the number in test_schedule!
2024-05-05 00:25:45 -04:00
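The identity being exploited: at the batch mean, the variance's partial derivative with respect to the mean vanishes, so batchnorm's backward can drop that edge of the graph (hence the dropped kernel count in test_schedule):

```latex
\frac{\partial}{\partial\mu}\,\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2
  = -\frac{2}{N}\sum_{i=1}^{N}(x_i-\mu)
  = 0 \qquad \text{when } \mu = \frac{1}{N}\sum_{i=1}^{N} x_i .
```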
George Hotz
d325be2540
update docs ( #4356 )
...
* update docs
* nn.md
* mnist cleanups
* rhip test is very slow
2024-04-30 16:51:42 +09:00
chenyu
5ae252ae83
use at least float32 for optim.lr ( #4297 )
...
* use at least float32 for optim.lr
when doing mixed precision training (float32 weights, default_float=half), still use float32 to store lr.
it would have been upcast later in the actual weight update anyway, but precision would already have been lost.
this improved resnet convergence significantly
* undo type annotation
2024-04-25 14:42:28 -04:00
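One way to express "at least float32", assuming least_upper_dtype from tinygrad.dtype (a sketch of the idea, not necessarily the exact line):

```python
from tinygrad import Tensor, dtypes
from tinygrad.dtype import least_upper_dtype

# store lr in the wider of default_float and float32: with default_float=half,
# small or decayed learning rates would be rounded before the weight update sees them
lr = Tensor(1e-5, dtype=least_upper_dtype(dtypes.default_float, dtypes.float32), requires_grad=False)
```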
George Hotz
967638f0d5
update docs, remove corealize ( #4264 )
...
* update docs, remove corealize
* handle 0 line count
* tensor schedule
2024-04-23 12:05:29 +04:00
George Hotz
cd88afc98b
datasets isn't a feature + filter docstrings ( #4228 )
...
* datasets isn't a feature
* filter docstrings in sz
2024-04-19 16:16:10 +04:00
George Hotz
fa57c3e7ce
continue llm.c ( #4190 )
...
* continue llm.c
* export more
* progress on llm.c
* simpler optim, names work
2024-04-18 10:57:54 +04:00
David Hou
593c90d7d6
Resnet fp16 training with fp32 master weight copy ( #4144 )
...
* add casts to layers
* FLOAT flag
* detach
* no_grad for eval
* whitespace
* explicit fp32 initialization
* oops
* whitespace
* put back config['DEFAULT_FLOAT']
* bad
* live dangerously (don't hide bugs)
* don't bundle changes
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-04-14 11:25:08 -04:00
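The general recipe here (sketched, not the exact resnet code): keep authoritative float32 weights, cast to half inside the graph, and let the optimizer update the float32 copy so tiny steps are not lost to fp16 rounding.

```python
from tinygrad import Tensor, dtypes

master_w = Tensor.uniform(64, 64, dtype=dtypes.float32)  # fp32 master copy
master_w.requires_grad = True

def forward(x:Tensor) -> Tensor:
  # compute in half; backward flows through the cast back to the float32 master copy
  return x.cast(dtypes.half) @ master_w.cast(dtypes.half)
```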
chenyu
f61ed869f5
Use exec_alu for lazy const folding ( #4039 )
2024-04-02 20:52:05 -04:00
Patrick Tsai
0147174ad6
Embedding in one kernel ( #4036 )
...
* Embedding is in one kernel
* embedding is one kernel
* rm extra line
* newline
* bert test counts state vars?
* add a test?
* move items around
---------
Co-authored-by: Patrick Tsai <patosai@users.noreply.github.com>
2024-04-02 11:38:21 -04:00
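A single-kernel embedding can be written as a one-hot compare against arange followed by a matmul, so the lookup fuses into one reduce; a sketch of the technique (shapes illustrative):

```python
from tinygrad import Tensor

vocab_size, embed_dim = 1000, 64
weight = Tensor.glorot_uniform(vocab_size, embed_dim)
idx = Tensor([[1, 2, 3]])                                  # (batch, seqlen) of token ids

one_hot = (idx.unsqueeze(-1) == Tensor.arange(vocab_size)).cast(weight.dtype)  # (batch, seqlen, vocab)
out = one_hot @ weight                                     # (batch, seqlen, embed_dim), one reduce
```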
chenyu
c71627fee6
move GlobalCounter to helpers ( #4002 )
...
break circular import between ops and buffer
2024-03-30 00:30:30 -04:00
uuuvn
8a40d7d423
Shape changing bitcast and assert bitcast in disk ( #3973 )
...
* Shape changing bitcast
* only support it on disk
* basic test
* more tests
* RuntimeError instead of assert
* create unique temp files
* move tests that use disk to test_disk_tensor
* linter
* remove assert on error messages
* that's RuntimeError now
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-03-28 21:49:10 -07:00
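A shape-changing bitcast reinterprets the same bytes under another dtype, so the last dimension scales by the ratio of element sizes; numpy's view shows the semantics (tinygrad restricts this to disk tensors per the commit):

```python
import numpy as np

a = np.zeros((4, 8), dtype=np.uint8)
b = a.view(np.float32)      # 8 uint8 bytes per row become 2 float32s
assert b.shape == (4, 2)
```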
wozeparrot
9a9cac58f9
add lars to nn ( #3750 )
...
* feat: add lars
* feat: don't remove this comment
* clean: smaller diff
* clean: shorter line
* feat: remove mlperf lars, switch resnet
* fix: fully remove mlperf lars
* clean: comment
* feat: contiguous
* feat: no weight decay on skip params
* feat: optimizergroup
* feat: classic momentum
* fix: pylint
* clean: move comment
* fix: correct algo
* feat: lrschedulergroup
* feat: skip list tests
* feat: :| forgot that params are a thing
* feat: remove skip_list params from main params
* feat: set moment
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-03-24 11:43:12 -04:00
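Classic LARS, as usually written (a sketch of the algorithm, not the merged tinygrad code): each layer's step is scaled by a trust ratio built from the weight and gradient norms, with skip-listed params (biases, batchnorm) bypassing it.

```python
import numpy as np

def lars_step(w, g, m, lr=1e-3, momentum=0.9, wd=1e-4, trust_coeff=1e-3):
  w_norm, g_norm = np.linalg.norm(w), np.linalg.norm(g)
  # layer-wise trust ratio; fall back to 1.0 when either norm is zero
  ratio = trust_coeff * w_norm / (g_norm + wd * w_norm) if w_norm > 0 and g_norm > 0 else 1.0
  m[:] = momentum * m + lr * ratio * (g + wd * w)   # classic momentum on the scaled step
  w -= m
```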
George Hotz
54dc48aa47
fix assign ( #3878 )
...
* fix assign
* remove terrible optimizer hack
* oops, not realized assigns
2024-03-22 11:48:48 -07:00
George Hotz
e6d55932ca
hotfix: this makes beautiful mnist work again, not okay
2024-03-18 18:22:44 -07:00
chenyu
639bd5dbfc
move bf16 cast hack to Tensor.llvm_bf16_cast ( #3788 )
2024-03-17 18:51:22 -04:00
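The underlying trick (bfloat16 is the top 16 bits of a float32, so widening is shift-and-bitcast), shown with numpy for clarity:

```python
import numpy as np

bf16_bits = np.array([0x3F80, 0x4000], dtype=np.uint16)       # 1.0 and 2.0 as bfloat16 bit patterns
f32 = (bf16_bits.astype(np.uint32) << 16).view(np.float32)    # -> array([1., 2.], dtype=float32)
```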
David Hou
07324b56d5
[experimenting] use contiguous instead of realize in optim ( #3770 )
...
* run CI
* comment
* remove t.grad to try
* Revert "remove t.grad to try"
This reverts commit 05ec2d3b89.
2024-03-15 23:06:50 -07:00
George Hotz
3527c5a9d2
add Tensor.replace ( #3738 )
...
* add Tensor.replace
* fix dtypes in that test
* should be replace
* and mixtral
2024-03-14 13:34:14 -07:00
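Roughly how it reads in use (hedged sketch; shapes must match):

```python
from tinygrad import Tensor

w = Tensor.ones(2, 2)
w.replace(Tensor.zeros(2, 2))   # w now holds the new data, but the Python object is unchanged
```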
Szymon Ożóg
6c36264790
Improve type hints for optimizer ( #3583 )
...
* Improve type hints for optimizer
* lint fix
2024-03-02 07:35:44 -08:00
George Hotz
50e1445e60
Revert "allow overriding weight init for Linear ( #3569 )" ( #3576 )
...
This reverts commit 2d0973a852.
2024-03-02 03:17:13 -08:00
David Hou
2d0973a852
allow overriding weight init for Linear ( #3569 )
2024-03-02 03:16:04 -08:00
George Hotz
b1c0d8c99d
remove cpu and torch backends ( #3399 )
...
* remove cpu and torch backends
* don't copy to cpu
* use clang instead of cpu
* multitensor gathers on the first device
* clang is cpu + use default
* fixup
* bugfix
2024-02-15 16:55:39 +01:00
chenyu
97275101e9
fix safetensor load uint32 and uint64 ( #3315 )
...
the correct keys are U32 and U64.
2024-02-04 10:46:27 -05:00
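For reference, the safetensors header names its dtypes with short keys; an illustrative subset of the mapping (the bug was reading "UINT32"/"UINT64" where the format writes "U32"/"U64"):

```python
from tinygrad import dtypes

safe_dtypes = {"BOOL": dtypes.bool, "U8": dtypes.uint8, "I8": dtypes.int8,
               "F16": dtypes.float16, "BF16": dtypes.bfloat16, "F32": dtypes.float32,
               "I32": dtypes.int32, "U32": dtypes.uint32, "I64": dtypes.int64, "U64": dtypes.uint64}
```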
Yoshinori Sano
edb74897b2
support safe load bf16 ( #3310 )
...
* support safe load bf16
* fix lint error E501
* add test for loading safetensors
* key should be BOOL
* fix lint
2024-02-04 10:08:39 -05:00
George Hotz
a72b1b6d65
sharding for llama ( #3151 )
...
* shard llama
* sharding works
* simpler
* simpler
* consume option
* disable that test
* save a line
---------
Co-authored-by: George Hotz <george@tinygrad.org>
2024-01-16 19:28:00 -08:00
chenyu
c658aa4fbf
minor cleanup of test_disk_tensor ( #3112 )
2024-01-13 20:54:58 -05:00
Yixiang Gao
a686663657
make Embedding device aware for multigpu ( #3051 )
...
* make Embedding device aware for multigpu
* split line instead of ignore because that's cheating
* add test incomplete
* add test complete
* remove comment
* fix white space
* remove nn.Embedding
2024-01-08 20:09:26 -08:00
chenyu
4f4e8634b8
use in_features directly in nn.Linear.__init__ bound check ( #3050 )
...
* use in_features directly in nn.Linear.__init__ bound check
get rid of the unnecessary isinstance int check
* that is always int
* long lines
2024-01-08 19:32:35 -05:00
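The init in question is a Kaiming-style uniform bound computed straight from in_features; a sketch (approximate, following the commit description):

```python
import math
from tinygrad import Tensor

def linear_weight(in_features:int, out_features:int) -> Tensor:
  bound = 1 / math.sqrt(in_features)   # in_features is always an int, so no isinstance check
  return Tensor.uniform(out_features, in_features, low=-bound, high=bound)
```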
chenyu
3eb3664074
fix nn.Embedding with empty length input ( #3048 )
2024-01-08 18:08:36 -05:00
chenyu
74a30431b4
replace d[a] if a in d else b with d.get(a, b) ( #2997 )
2024-01-03 18:10:25 -05:00
George Hotz
a280cfe169
move dtypes to dtype.py ( #2964 )
...
* move dtypes to dtype.py
* fix urllib
2024-01-01 14:58:48 -08:00
George Hotz
c81ce9643d
move globalcounters to ops ( #2960 )
...
* move globalcounters to ops
* missed a few
* sick of that failing
2024-01-01 14:21:02 -08:00
George Hotz
e0ecab3797
touchups from multibuffer branch ( #2958 )
2024-01-01 11:33:41 -08:00
George Hotz
e1861ab65e
remove realize from optimizer ( #2880 )
...
* remove realize from optimizer
* one still needed
* opt realize
2023-12-20 16:42:41 -08:00
Oleg Rybalko
42a038c83f
More readable torch_load ext check ( #2853 )
...
* more readable extension check
* enable tarfile test
* detach tensor if requires grad in torch
2023-12-19 14:53:15 -05:00
George Hotz
7e5b3e53fe
changes to prep for new lazy ( #2748 )
...
* changes to prep for new lazy
* put those back
2023-12-13 10:28:22 -08:00
George Hotz
6d6eb9302d
ruff checks the max line length is 150 ( #2734 )
...
* ruff checks the max line length is 150
* fix tensor.py
* a lot more
* done
2023-12-12 17:34:47 -08:00
Guy Leroy
ee9e1d3662
Extend available types for safe_save ( #2720 )
...
* Extend available types to save with
* Linter fix
2023-12-11 14:50:35 -08:00