Mirror of https://github.com/tinygrad/tinygrad.git (synced 2026-04-07 03:00:26 -04:00)
4h32m with this run: https://wandb.ai/chenyuxyz/MLPerf-BERT/runs/q99frv1l/overview. Loss scaler changed from 2**13 to 2**10. Matched the closest submission; no NaN for ~10 runs. Increased lr and total steps a bit. `PARALLEL=0` after setup, same as resnet.
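
For context on the loss scaler change, here is a minimal sketch of why a smaller static loss scale can avoid NaN/inf in half precision. It is not tinygrad code: the helper names (`to_fp16_range`, `scaled_grads_finite`, `unscale`) and the sample gradient values are hypothetical, and the fp16 cast is simplified to a range check that ignores rounding.

```python
import math

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value

def to_fp16_range(x: float) -> float:
    """Simplified fp16 cast: out-of-range magnitudes become +/-inf (rounding is ignored)."""
    if math.isnan(x):
        return x
    if abs(x) > FP16_MAX:
        return math.copysign(math.inf, x)
    return x

def scaled_grads_finite(grads, loss_scale):
    """Return True if every gradient stays finite after multiplying by loss_scale."""
    return all(math.isfinite(to_fp16_range(g * loss_scale)) for g in grads)

def unscale(grads, loss_scale):
    """Undo the loss scaling before the optimizer step."""
    return [g / loss_scale for g in grads]

# Hypothetical unscaled gradient values, chosen only for illustration.
grads = [0.5, -3.0, 9.0]

overflow_at_2_13 = not scaled_grads_finite(grads, 2**13)  # 9.0 * 8192 = 73728 exceeds FP16_MAX
finite_at_2_10 = scaled_grads_finite(grads, 2**10)        # 9.0 * 1024 = 9216 fits
```

A lower scale leaves more headroom before overflow, at the cost of pushing very small gradients closer to underflow, so the value is usually tuned per model.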