tinygrad/extra/datasets
Elias Wahl 27613dd881 MLPerf BERT: Main training loop (#4288)
* BERT language modeling head + trunc normal initializers

* add train loop + helpers

* shuffle in dataloaders + slight changes in main loop

* beam change

* Minor changes

* random.shuffle

* HParam update

* Use deque for dataloader

* wandb bert project name

* half fixes

* BENCHMARK + remove epoch

* cast + print()

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-04-29 14:35:27 -04:00