2 Commits

Author SHA1 Message Date
George Hotz
183d38b128 remove CUSTOM_KERNEL / directly construct it (#14604)
* remove CUSTOM_KERNEL / directly construct it

* clean that up

* simpler multi

* custom kernel spec

* remove Kernel

* fix multi

* use sharded shape

* explicit regression test
2026-02-08 18:43:33 +08:00
b1tg
0fbc551622 train bert with fp8 (#13874)
* fp8 train

* clean

* lint

* test fix from #13439

* skip first/last layer

* rm __init__, restore unroll <=32 check

* tests

* clean test, remove unused

* multi-gpu test, clean quantize_to_fp8

* remove bert contiguous

* run script

* test: better check

* run script search

* add seed in bert data shuffle

* move script to mi350x folder

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2026-01-09 09:21:59 -05:00