[BACKEND] Support MMA V3 with register operand (#2375)

MMA V3 support taking operand A from register. This helps for chained matmul operations like in attention. Add an optimization to use this mode when it helps and add the lowering for it.
2026-04-05 03:01:17 -04:00 · 2023-09-25 10:43:54 -07:00
parent 8ae2ae4f40
commit 6bc1d9e1be
13 changed files with 186 additions and 78 deletions
--- a/include/triton/Analysis/Utility.h
+++ b/include/triton/Analysis/Utility.h
@@ -133,6 +133,10 @@ bool isMmaToDotShortcut(RankedTensorType &srcTy, RankedTensorType &dstTy);

 bool isMmaToMmaShortcut(RankedTensorType &srcTy, RankedTensorType &dstTy);

+// Return true if the src and dst layout match.
+bool matchMmaV3AndDotOperandLayout(RankedTensorType srcTy,
+                                   RankedTensorType dstTy);
+
 // TODO: Move utility functions that belong to ConvertLayoutOp to class
 // ConvertLayoutOpHelper in the future
 bool shouldUseDistSmem(Attribute srcLayout, Attribute dstLayout);