Files
ROCm/scripts
Lixun Zhang 1de859df32 [GEMM] [Tuning] Add waves_per_eu to gemm tuning (#362)
* Add waves_per_eu in the tuning space

* Do not allocate tensor on device during kernel compilation step

* Add breakdown elapsed time

* Parallelize the post-processing step

* Parallelize the profile step with --ngpus

* Better timing info printout
2023-10-16 13:50:03 -05:00
..