mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-04-29 03:00:14 -04:00
* add quantize fp8 in llama3 * don't truncate fp8 alu result * cast to float32 before matmul * --model weights/LLaMA-3/8B-SF-DPO/ --------- Co-authored-by: chenyu <chenyu@fastmail.com>
26 KiB
26 KiB