b1tg
0fbc551622
train bert with fp8 (#13874)
* fp8 train
* clean
* lint
* test fix from #13439
* skip first/last layer
* rm __init__, restore unroll <=32 check
* tests
* clean test, remove unused
* multi-gpu test, clean quantize_to_fp8
* remove bert contiguous
* run script
* test: better check
* run script search
* add seed in bert data shuffle
* move script to mi350x folder
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-01-09 09:21:59 -05:00