mirror of https://github.com/nod-ai/AMD-SHARK-Studio.git
synced 2026-04-03 03:00:17 -04:00
* improved sharded performance and fixed issue with lmhead on rocm
* mmap shards + disable sharing of device arrays across devices
* fix device_idx for non-layer vmfbs
* fix time calc for sharded

---------

Co-authored-by: Elias Joseph <elias@nod-labs.com>
Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>