chenyu
704cb1d8a0
fix conversation.py quantize ( #4663 )
it used to be true for int8, now it's a string for int8 or nf4
2024-05-20 17:36:37 -04:00
chenyu
ae861325ce
update llama sample for mac 32 input buffer limit ( #4662 )
set the default sampling params in the function call to 0, and top k in llama3 to 25.
2024-05-20 17:23:39 -04:00
Elias Wahl
993091adfa
loss scaler + nan fixes ( #4661 )
2024-05-20 17:08:35 -04:00
wozeparrot
b144d4b460
new llama3 example ( #4576 )
2024-05-19 22:42:23 -07:00
George Hotz
5ba611787d
move image into tensor.py. delete features ( #4603 )
* move image into tensor.py
* change setup.py
* openpilot tests need pythonpath now
2024-05-15 10:50:25 -07:00
George Hotz
53d082a2aa
move memory into schedule ( #4597 )
2024-05-15 07:54:20 -07:00
George Hotz
ff64bcab69
move graph/search to engine ( #4596 )
2024-05-14 23:12:59 -07:00
George Hotz
fd02ab1e8b
move disassemblers and openpilot ( #4592 )
* move disassemblers and openpilot
* delete junk
* put that in pre-commit
* fixup readme
2024-05-14 19:30:02 -07:00
chenyu
2b0ee74bb6
lshift and rshift ( #4591 )
2024-05-14 19:16:31 -04:00
qazal
9aa5e02229
update llmc export ( #4584 )
* update example
* move train to optim
* rename
* b2
2024-05-14 21:18:38 +03:00
wozeparrot
d7670f8141
quantized llama multilazybuffer fix ( #4557 )
2024-05-12 14:19:21 -07:00
chenyu
01a0c1a948
slightly faster nf4 llama ( #4542 )
2024-05-12 14:24:42 -04:00
wozeparrot
e07c7668b3
nf4 llama ( #4540 )
2024-05-11 22:22:34 -07:00
chenyu
bed70b130c
mlperf bert getenv-able EVAL_STEP_FREQ ( #4534 )
2024-05-11 14:36:56 -04:00
chenyu
04a4980a51
touchup bert script ( #4531 )
small adjustments, remove duplicated training setting and stop the script once target is hit
2024-05-11 13:02:02 -04:00
George Hotz
347a3acb37
add renderer class ( #4524 )
* add renderer class
* tests pass
* fix pylint
* fix tensor cores
2024-05-10 21:40:02 -07:00
chenyu
b00b6b16f0
fix TRAIN_BEAM and Tensor.training for mlperf bert ( #4525 )
also hard coded bert model config instead of looking up a file
2024-05-11 00:18:36 -04:00
George Hotz
4eef1ee9bf
move renderer into options ( #4514 )
* move renderer into options
* fix tests
* renders are functions
2024-05-10 10:01:51 -07:00
George Hotz
7c630a9a53
hotfix: fix llama spacing + fix hcq
2024-05-10 15:10:13 +00:00
chenyu
b399d98e41
fix resnet eval ( #4507 )
2024-05-10 00:49:00 -04:00
wozeparrot
a602dc67d3
feat: more mlperf fixes ( #4505 )
2024-05-09 20:50:20 -07:00
chenyu
0e8aa0e288
use fake data in beam searching resnet ( #4504 )
2024-05-09 23:43:50 -04:00
wozeparrot
29daea4e60
fix: core count and os ( #4503 )
2024-05-09 19:55:07 -07:00
George Hotz
89e119bc58
move Allocator to buffer.py ( #4502 )
* move Allocator to buffer.py
* move those to realize
* memory file
* cleanup
2024-05-09 19:45:56 -07:00
chenyu
ef93e41a15
resnet mlperf systems add tinygrad commit and python / runtime versions ( #4494 )
2024-05-09 16:04:15 -04:00
chenyu
b5afdfbc5b
first draft resnet mlperf readme ( #4493 )
* start readme
* something
2024-05-09 15:51:44 -04:00
chenyu
047c7f3e5b
polish resnet mlperf logging ( #4490 )
don't include the time to save the final checkpoint in the run time, and some cosmetic order changes
2024-05-09 13:04:24 -04:00
chenyu
d78e159aa3
resnet logging move RUN_START to start of the script ( #4488 )
2024-05-09 12:32:32 -04:00
chenyu
1bcb58479d
resnet setup power cap red box gpu to 350W ( #4484 )
1%-2% faster
2024-05-08 23:32:41 -04:00
chenyu
0ed755bcf5
resnet use EVAL_BS=192 ( #4482 )
* resnet use EVAL_BS=192
also lower green run BEAM_MIN_PROGRESS from 10 to 5
* BEAM_MIN_PROGRESS 5 is too close to setup limit
2024-05-08 22:29:27 -04:00
chenyu
1f6bf9d2f7
real diskcache_clear in model_train resnet ( #4445 )
clear cache if INITMLPERF is set, or running run_and_time. dev_beam and dev_run do not clear cache
2024-05-08 19:06:09 -04:00
chenyu
1b4645bea6
hotfix resnet move init_start to start of the script ( #4481 )
2024-05-08 19:03:52 -04:00
wozeparrot
a347ae94d6
feat: remove wandb ( #4480 )
2024-05-08 15:31:16 -07:00
chenyu
db7e15c46f
hotfix resnet only log epoch start with RUNMLPERF ( #4477 )
2024-05-08 15:14:41 -04:00
chenyu
062c6dd65d
mlperf logging, truncate dir in logs and log seed ( #4475 )
2024-05-08 12:54:02 -04:00
chenyu
b62a65b617
redo faster sparse_categorical_crossentropy ( #4461 )
also update the default LR and DECAY in resnet, which helps convergence too
2024-05-08 11:21:43 -04:00
George Hotz
17faae091b
optimizer shouldn't be run without training ( #4460 )
* optimizer shouldn't be run without training
* set training in relevant tests
* fix multitensor
* that too
2024-05-06 15:34:12 -07:00
George Hotz
f4e49a7c1a
resnet 50 opt: correct loop + LARS ( #4449 )
* correct loop + LARS
* ops
2024-05-06 08:01:26 -07:00
George Hotz
fc995d4446
add backward to handcode_resnet50_opt
2024-05-06 06:42:26 -07:00
wozeparrot
603d3a351b
feat: allow keeping multiple cookies ( #4440 )
2024-05-05 19:26:48 -07:00
Francis Lam
709410071c
mlperf/resnet: updated BEAM params to increase performance ( #4443 )
2024-05-05 21:49:46 -04:00
chenyu
3b30756cbb
update mlperf submission system ( #4435 )
more required fields.
2024-05-05 13:19:07 -04:00
David Hou
c0a048c044
batchnorm d(var)/d(mean) = 0 ( #4430 )
* d(var)/d(mean) = 0
* drop the number in test_schedule!
2024-05-05 00:25:45 -04:00
qazal
fa17dcaf07
Fix llm.c/export.py ( #4423 )
* fix headers
* add CI
* add stdio
* merge clang tests
* revert llm.c
* revert ci
* Revert "revert llm.c"
This reverts commit 5fd17e3c8b.
2024-05-04 19:37:10 +03:00
George Hotz
cb7289f9c9
remove clang program header ( #4422 )
* remove clang program header
* proper max
* bools are numbers
* fix compile enet
2024-05-04 08:38:01 -07:00
chenyu
473ecb978a
remove SPLIT_REDUCEOP=1 from resnet scripts ( #4404 )
SPLIT_REDUCEOP=1 is default
2024-05-03 12:36:23 -04:00
David Hou
b767d59684
resnet trainer: keep old cookie around until next step has been queued ( #4401 )
* keep old cookie around until next step has been queued (-10ms 6gpu)
* also for eval
* drop cookie before data_get?
* Revert "drop cookie before data_get?"
This reverts commit b01e6aa2b2.
* Revert "Revert "drop cookie before data_get?""
This reverts commit 23464e73d4.
2024-05-03 12:15:21 -04:00
chenyu
2c3b7f8e70
pad resnet training data with training data mean ( #4369 )
update model_train resnet to pad training
2024-05-02 20:26:15 -04:00
Francis Lam
3cf8291f2f
mlperf/resnet: update beam params to increase time and quality ( #4396 )
* mlperf/resnet: update beam params to increase time and quality
* revert upcast 8 in search space and add rocm setup function
* refactor to independent setup.sh script
2024-05-02 20:14:46 -04:00
chenyu
ab01a9433d
resnet eval 4n+3 if epoch < 33 ( #4391 )
the rule is to eval at least as thoroughly as 4n+k, and we can stop the clock as soon as eval hits the target. this can save 24 evals or 12 minutes
2024-05-02 16:52:07 -04:00