chenyu | f2414c666f | 2024-05-29 12:01:34 -04:00
fix train_gpt2.py (#4771)
added `with Tensor.train():`

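A minimal sketch of the pattern this fix adds, assuming a hypothetical stand-in model and optimizer (not the PR's code); `Tensor.train()` is tinygrad's context manager that sets `Tensor.training` for the duration of the block:

```python
# minimal sketch, assuming a hypothetical model; Tensor.train() sets
# Tensor.training = True so the optimizer (and dropout/batchnorm) behave correctly
from tinygrad import Tensor, nn

model = nn.Linear(10, 2)  # hypothetical stand-in for the GPT-2 model
opt = nn.optim.SGD(nn.state.get_parameters(model), lr=0.01)

with Tensor.train():
  x, y = Tensor.randn(4, 10), Tensor([0, 1, 0, 1])
  loss = model(x).sparse_categorical_crossentropy(y)
  opt.zero_grad()
  loss.backward()
  opt.step()
```
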
chenyu | 7624ad3ddd | 2024-05-28 16:24:44 -04:00
add --timing and --profile to llama3 example (#4767)

chenyu | e614b7c696 | 2024-05-28 11:09:26 -04:00
docs: showcase remove mnist_gan and add conversation.py (#4757)
fixed both examples, and I think it's better to show conversation

chenyu | fd249422f5 | 2024-05-28 00:05:37 -04:00
minor cleanup example stable_diffusion (#4753)

Elias Wahl | c4b0acf095 | 2024-05-27 18:35:27 -04:00
Global norm + small changes (#4749)
* norm
* no empty
* default loss scaler in float

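A generic sketch of the global-norm quantity the title names (not the PR's code), assuming `params` is a list of tinygrad Tensors whose `.grad` is populated after `backward()`:

```python
from tinygrad import Tensor

# generic sketch: L2 norm over all gradients, summed across every parameter
def global_norm(params: list[Tensor]) -> Tensor:
  return sum(p.grad.square().sum() for p in params).sqrt()
```
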
chenyu | 31358cbea5 | 2024-05-24 17:04:19 -04:00
change Tensor.stack to method (#4719)

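The new call style, sketched (shapes illustrative): after #4719, `stack` is invoked as a method on the first tensor rather than in a function style.

```python
from tinygrad import Tensor

a, b = Tensor([1, 2]), Tensor([3, 4])
stacked = a.stack(b, dim=0)  # method form: called on the first tensor
print(stacked.shape)         # (2, 2)
```
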
chenyu | 38bc38cdff | 2024-05-23 15:35:26 -04:00
fix llama example quantize (#4699)
* fix llama example quantize
import quantize layers from new example llama3
add to mac benchmark
* fix that
* save the files

chenyu | 792a494eb8 | 2024-05-22 20:43:21 -04:00
fix various examples (#4691)
* fix examples that used ax1 and ax2 for transpose
* fix that
* update those

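A sketch of what broke, assuming the current signature: examples passed `ax1`/`ax2` keywords to `transpose`, which no longer match its parameter names, so positional axes are the safe form.

```python
from tinygrad import Tensor

t = Tensor.randn(2, 3)
# t.transpose(ax1=1, ax2=0) is the old keyword form the examples used;
# positional axes work across the rename
print(t.transpose(1, 0).shape)  # (3, 2)
```
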
Elias Wahl | acc0039cfc | 2024-05-21 19:38:13 -04:00
Resume fix + scheduler for non weight decay params (#4679)
* move ckpt dir
* fix resume. Add scheduler group

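A generic sketch of splitting parameters into weight-decay and no-decay groups, the kind of grouping the commit adds a scheduler for; the name-matching rule is an assumption, not the PR's code.

```python
from tinygrad import nn

def param_groups(model):
  decay, no_decay = [], []
  for name, p in nn.state.get_state_dict(model).items():
    # assumption: biases and norm parameters are exempt from weight decay
    (no_decay if "bias" in name or "norm" in name else decay).append(p)
  return decay, no_decay
```
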
chenyu | 5e3fbbb33e | 2024-05-20 19:09:57 -04:00
llama3 example add manual seed and log seed (#4667)

chenyu | 704cb1d8a0 | 2024-05-20 17:36:37 -04:00
fix conversation.py quantize (#4663)
it used to be True for int8, now it's a string for int8 or nf4

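A sketch of the interface change described above; the function and names here are illustrative, not conversation.py's actual code.

```python
# hypothetical sketch: quantize was a bool meaning int8, now a scheme name
def build_model(path: str, quantize: str | None = None):
  assert quantize in (None, "int8", "nf4"), f"unknown quantize scheme {quantize!r}"
  ...  # wrap linear layers according to the selected scheme
```
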
chenyu | ae861325ce | 2024-05-20 17:23:39 -04:00
update llama sample for mac 32 input buffer limit (#4662)
set default sampling params in the function call to 0, and top k in llama3 to 25.

Elias Wahl | 993091adfa | 2024-05-20 17:08:35 -04:00
loss scaler + nan fixes (#4661)

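A generic sketch of the loss-scaling-with-NaN-skip technique the title names (names illustrative, not the PR's code), assuming a tinygrad-style optimizer that exposes its `params` list; the earlier "default loss scaler in float" note is why the scale is a Python float.

```python
import math

LOSS_SCALER = 2.0 ** 13  # kept as a float, per the loss-scaler note above

def scaled_step(loss, opt):
  # scale up before backward so low-precision grads don't underflow
  (loss * LOSS_SCALER).backward()
  for p in opt.params: p.grad = p.grad / LOSS_SCALER  # unscale before the update
  # skip the update entirely if the loss went non-finite (NaN/inf)
  if math.isfinite(loss.item()): opt.step()
```
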
wozeparrot | b144d4b460 | 2024-05-19 22:42:23 -07:00
new llama3 example (#4576)

George Hotz | 5ba611787d | 2024-05-15 10:50:25 -07:00
move image into tensor.py. delete features (#4603)
* move image into tensor.py
* change setup.py
* openpilot tests need pythonpath now

George Hotz | 53d082a2aa | 2024-05-15 07:54:20 -07:00
move memory into schedule (#4597)

George Hotz | ff64bcab69 | 2024-05-14 23:12:59 -07:00
move graph/search to engine (#4596)

George Hotz | fd02ab1e8b | 2024-05-14 19:30:02 -07:00
move disassemblers and openpilot (#4592)
* move disassemblers and openpilot
* delete junk
* put that in pre-commit
* fixup readme

chenyu | 2b0ee74bb6 | 2024-05-14 19:16:31 -04:00
lshift and rshift (#4591)

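A sketch of the new shift operators, assuming unsigned integer dtypes (values illustrative):

```python
from tinygrad import Tensor, dtypes

x = Tensor([1, 2, 4], dtype=dtypes.uint32)
print((x << 2).numpy())  # [ 4  8 16]
print((x >> 1).numpy())  # [0 1 2]
```
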
qazal | 9aa5e02229 | 2024-05-14 21:18:38 +03:00
update llmc export (#4584)
* update example
* move train to optim
* rename
* b2

wozeparrot | d7670f8141 | 2024-05-12 14:19:21 -07:00
quantized llama multilazybuffer fix (#4557)

chenyu | 01a0c1a948 | 2024-05-12 14:24:42 -04:00
slightly faster nf4 llama (#4542)

wozeparrot | e07c7668b3 | 2024-05-11 22:22:34 -07:00
nf4 llama (#4540)

chenyu | bed70b130c | 2024-05-11 14:36:56 -04:00
mlperf bert getenv-able EVAL_STEP_FREQ (#4534)

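A sketch of a getenv-able knob; `EVAL_STEP_FREQ` comes from the title, while the default value and usage here are guesses for illustration.

```python
from tinygrad.helpers import getenv

EVAL_STEP_FREQ = getenv("EVAL_STEP_FREQ", 0)  # default here is a guess

for step in range(1, 101):
  ...  # training step would go here
  if EVAL_STEP_FREQ and step % EVAL_STEP_FREQ == 0:
    pass  # run an eval pass every EVAL_STEP_FREQ steps
```
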
chenyu | 04a4980a51 | 2024-05-11 13:02:02 -04:00
touchup bert script (#4531)
small adjustments: remove a duplicated training setting and stop the script once the target is hit

George Hotz | 347a3acb37 | 2024-05-10 21:40:02 -07:00
add renderer class (#4524)
* add renderer class
* tests pass
* fix pylint
* fix tensor cores

chenyu | b00b6b16f0 | 2024-05-11 00:18:36 -04:00
fix TRAIN_BEAM and Tensor.training for mlperf bert (#4525)
also hard-coded the bert model config instead of looking it up from a file

George Hotz | 4eef1ee9bf | 2024-05-10 10:01:51 -07:00
move renderer into options (#4514)
* move renderer into options
* fix tests
* renders are functions

George Hotz | 7c630a9a53 | 2024-05-10 15:10:13 +00:00
hotfix: fix llama spacing + fix hcq

chenyu | b399d98e41 | 2024-05-10 00:49:00 -04:00
fix resnet eval (#4507)

wozeparrot | a602dc67d3 | 2024-05-09 20:50:20 -07:00
feat: more mlperf fixes (#4505)

chenyu | 0e8aa0e288 | 2024-05-09 23:43:50 -04:00
use fake data in beam searching resnet (#4504)

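The idea, sketched: BEAM search only compiles and times kernels, so correctly-shaped random tensors can stand in for the real dataset. The shapes below are the usual resnet-50/imagenet ones, assumed here for illustration.

```python
from tinygrad import Tensor

BS = 256                                  # illustrative batch size
fake_x = Tensor.randn(BS, 3, 224, 224)    # stand-in images
fake_y = Tensor.randint(BS, high=1000)    # stand-in imagenet labels
```
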
wozeparrot | 29daea4e60 | 2024-05-09 19:55:07 -07:00
fix: core count and os (#4503)

George Hotz | 89e119bc58 | 2024-05-09 19:45:56 -07:00
move Allocator to buffer.py (#4502)
* move Allocator to buffer.py
* move those to realize
* memory file
* cleanup

chenyu | ef93e41a15 | 2024-05-09 16:04:15 -04:00
resnet mlperf systems add tinygrad commit and python / runtime versions (#4494)

chenyu | b5afdfbc5b | 2024-05-09 15:51:44 -04:00
first draft resnet mlperf readme (#4493)
* start readme
* something

chenyu | 047c7f3e5b | 2024-05-09 13:04:24 -04:00
polish resnet mlperf logging (#4490)
don't include the time to save the final checkpoint in run time, plus some cosmetic ordering changes

chenyu | d78e159aa3 | 2024-05-09 12:32:32 -04:00
resnet logging move RUN_START to start of the script (#4488)

chenyu | 1bcb58479d | 2024-05-08 23:32:41 -04:00
resnet setup power cap red box gpu to 350W (#4484)
1%-2% faster

chenyu | 0ed755bcf5 | 2024-05-08 22:29:27 -04:00
resnet use EVAL_BS=192 (#4482)
* resnet use EVAL_BS=192
also lower green run BEAM_MIN_PROGRESS from 10 to 5
* BEAM_MIN_PROGRESS 5 is too close to the setup limit

chenyu | 1f6bf9d2f7 | 2024-05-08 19:06:09 -04:00
real diskcache_clear in model_train resnet (#4445)
clear the cache if INITMLPERF is set or when running run_and_time; dev_beam and dev_run do not clear the cache

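A sketch of the gating described above, using tinygrad's `diskcache_clear` helper; the condition is paraphrased from the commit body.

```python
from tinygrad.helpers import getenv, diskcache_clear

# real submission runs (INITMLPERF / run_and_time) start from a cold cache;
# dev_beam / dev_run keep the cache for faster iteration
if getenv("INITMLPERF"):
  diskcache_clear()
```
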
chenyu | 1b4645bea6 | 2024-05-08 19:03:52 -04:00
hotfix resnet move init_start to start of the script (#4481)

wozeparrot | a347ae94d6 | 2024-05-08 15:31:16 -07:00
feat: remove wandb (#4480)

chenyu | db7e15c46f | 2024-05-08 15:14:41 -04:00
hotfix resnet only log epoch start with RUNMLPERF (#4477)

chenyu | 062c6dd65d | 2024-05-08 12:54:02 -04:00
mlperf logging, truncate dir in logs and log seed (#4475)

chenyu | b62a65b617 | 2024-05-08 11:21:43 -04:00
redo faster sparse_categorical_crossentropy (#4461)
also update the default LR and DECAY in resnet, which help convergence

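Illustrative use of the reworked loss (the data here is made up):

```python
from tinygrad import Tensor

logits = Tensor.randn(8, 10)               # batch of 8, 10 classes
labels = Tensor([0, 1, 2, 3, 4, 5, 6, 7])  # integer class labels, not one-hot
loss = logits.sparse_categorical_crossentropy(labels)
print(loss.item())
```
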
George Hotz | 17faae091b | 2024-05-06 15:34:12 -07:00
optimizer shouldn't be run without training (#4460)
* optimizer shouldn't be run without training
* set training in relevant tests
* fix multitensor
* that too

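A sketch of the behavior this enforces, assuming a hypothetical model: stepping an optimizer outside `Tensor.train()` should now fail instead of silently misbehaving.

```python
from tinygrad import Tensor, nn

model = nn.Linear(4, 1)  # hypothetical model
opt = nn.optim.SGD(nn.state.get_parameters(model))

loss = model(Tensor.randn(2, 4)).sum()
loss.backward()
try:
  opt.step()  # Tensor.training is False here
except AssertionError:
  print("optimizer refuses to step outside Tensor.train()")

with Tensor.train():
  opt.step()  # fine inside the training context
```
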
George Hotz | f4e49a7c1a | 2024-05-06 08:01:26 -07:00
resnet 50 opt: correct loop + LARS (#4449)
* correct loop + LARS
* ops

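A generic sketch of the LARS trust ratio the title names (the layer-wise scaling rule from the LARS paper, not the PR's code):

```python
# LARS scales each layer's lr by ||w|| / (||g|| + wd * ||w||)
def lars_trust_ratio(w_norm: float, g_norm: float, wd: float, eps: float = 1e-9) -> float:
  return w_norm / (g_norm + wd * w_norm + eps)
```
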
George Hotz | fc995d4446 | 2024-05-06 06:42:26 -07:00
add backward to handcode_resnet50_opt

wozeparrot | 603d3a351b | 2024-05-05 19:26:48 -07:00
feat: allow keeping multiple cookies (#4440)