* [MFMA] FP8 and BF8 support
This PR adds support of fp8 and bf8 in AccelerateMatmul pass and
Introduces generation of float8 mfma instructions in ttg to llvm conversion.
* add tests
* fix tests
* review fix: fix variable naming and dot operand promotion.
* review comments fixes
---------
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>