tinygrad/test
Elias Wahl 27613dd881 MLPerf BERT: Main training loop (#4288)
* BERT language modeling head + trunc normal initializers

* add train loop + helpers

* shuffle in dataloaders + slight changes in main loop

* beam change

* Minor changes

* random.shuffle

* HParam update

* Use deque for dataloader

* wandb bert project name

* half fixes

* BENCHMARK + remove epoch

* cast + print()

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-04-29 14:35:27 -04:00