tinygrad/extra
Elias Wahl 27613dd881 MLPerf BERT: Main training loop (#4288)
* BERT language modeling head + trunc normal initializers

* add train loop + helpers

* shuffle in dataloaders + slight changes in main loop

* beam change

* Minor changes

* random.shuffle

* HParam update

* Use deque for dataloader

* wandb bert project name

* half fixes

* BENCHMARK + remove epoch

* cast + print()

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-04-29 14:35:27 -04:00