[BACKEND] Improve decision of MMA dimension on H100 (#2373)

When there is a chain of MMA ops, we want to pick the same instruction
shape for all of them to avoid layout conversions. This change improves
the detection of such chains by following them through for loops.
It also fixes a crash in the backward attention tutorial.

We might want to change this logic and convert the format to allow more
efficient MMA at some point.
Thomas Raoux
2023-09-22 15:21:56 -07:00
committed by GitHub
parent 1724604bd9
commit 840e7e7b53
4 changed files with 106 additions and 14 deletions


@@ -141,6 +141,16 @@ Value linearize(OpBuilder &b, Location loc, ArrayRef<Value> multiDim,
Value linearize(OpBuilder &b, Location loc, ArrayRef<Value> multiDim,
ArrayRef<unsigned> shape);
// Implement backward and forward slices that go through scf blocks when
// yield or scf results are in the slice.
// Note that, like the existing forward and backward slices, this may add
// operations to the slice that are not actually dependent on the root: when a
// region is added to the slice in the forward slice, all the operations of the
// region are added. We could implement a more accurate slice method by tracking
// value usage across scf regions.
void getBackwardSliceSCFAware(Operation *, SetVector<Operation *> *slices);
void getForwardSliceSCFAware(Value root, SetVector<Operation *> *slices);
} // namespace mlir
#endif // TRITON_DIALECT_TRITONGPU_TRANSFORMS_UTILITY_H_
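
The over-approximation described in the header comment can be illustrated with a small stand-alone sketch. This is not the Triton or MLIR implementation: `ToyOp` and `forwardSliceRegionAware` are hypothetical names, and the toy IR only models result users, a loop-like op that owns a body, and a yield-like terminator. It shows one plausible way a forward slice could continue through a yield into the enclosing op, and why adding a region op pulls in body ops that do not depend on the root.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy IR (an assumption for illustration, not MLIR).
struct ToyOp {
  std::vector<int> users; // indices of ops that consume this op's result
  std::vector<int> body;  // nested ops; non-empty only for loop-like ops
  int parent = -1;        // enclosing loop-like op, or -1 at top level
  bool isYield = false;   // terminator forwarding values to the parent op
};

// Collect a forward slice starting from the users of `root`.
// The root itself is not part of the returned slice.
std::set<int> forwardSliceRegionAware(const std::vector<ToyOp> &ops, int root) {
  assert(root >= 0 && static_cast<std::size_t>(root) < ops.size());
  std::set<int> slice;
  std::vector<int> worklist(ops[root].users.begin(), ops[root].users.end());
  while (!worklist.empty()) {
    int id = worklist.back();
    worklist.pop_back();
    if (!slice.insert(id).second)
      continue; // already visited
    const ToyOp &op = ops[id];
    // Ordinary def-use edges.
    for (int user : op.users)
      worklist.push_back(user);
    // A yield forwards its operands to the results of the enclosing op,
    // so the slice continues from the parent.
    if (op.isYield && op.parent >= 0)
      worklist.push_back(op.parent);
    // Over-approximation: once a region-owning op is in the slice, every op
    // in its body is added and expanded, whether or not it uses the root.
    for (int nested : op.body)
      worklist.push_back(nested);
  }
  return slice;
}
```

As a usage example under the same assumptions: take op 0 as the root, op 1 as a loop with body {2, 3, 4}, op 2 as a user of the root that feeds the yield (op 4), op 3 as an unrelated op in the loop body, op 5 as a user of the loop result, and op 6 as an unrelated top-level op. The slice from op 0 is {1, 2, 3, 4, 5}. Op 3 is included only because the whole loop body is added, which is the imprecision the comment mentions; tracking value usage across regions would exclude it.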