mirror of https://github.com/nod-ai/AMD-SHARK-Studio.git
synced 2026-02-19 11:56:43 -05:00
* improved sharded performance and fixed issue with lmhead on rocm

* mmap shards + disable sharing of device arrays across devices

* fix device_idx for non-layer vmfbs

* fix time calc for sharded

---------

Co-authored-by: Elias Joseph <elias@nod-labs.com>
Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>