Mirror of https://github.com/ROCm/ROCm.git, synced 2026-04-27 03:01:52 -04:00
[BACKEND] Improve decision of MMA dimension on H100 (#2373)
When there is a chain of MMA ops, we want to pick the same shape for all of them to avoid conversions. This improves the detection by following the chain through for loops, and it fixes a crash in the tutorial bw attention. At some point we might want to change this logic and convert the format to allow more efficient MMA.
@@ -141,6 +141,16 @@ Value linearize(OpBuilder &b, Location loc, ArrayRef<Value> multiDim,
Value linearize(OpBuilder &b, Location loc, ArrayRef<Value> multiDim,
                ArrayRef<unsigned> shape);

// Implement backward and forward slice that will go through scf blocks when
// yield or scf results are in the slice.
// Note that like existing forward and backward slice this may add operations to
// the slice that are not actually dependent on the root because when a region
// is added to the slice in the forward slice all the operations of the region
// are added. We could implement a more accurate slice method by tracking value
// usage across scf regions.
void getBackwardSliceSCFAware(Operation *, SetVector<Operation *> *slices);
void getForwardSliceSCFAware(Value root, SetVector<Operation *> *slices);

} // namespace mlir

#endif // TRITON_DIALECT_TRITONGPU_TRANSFORMS_UTILITY_H_
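
The over-approximation described in the header comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Triton implementation: `ToyOp`, `forwardSliceRegionAware`, and `buildExample` are hypothetical names, every op is assumed to produce exactly one value (identified by the op's index), and region nesting is reduced to a single `parent` index. It shows a forward slice that steps from a yield to its enclosing loop op and, once that loop op is in the slice, adds every op in its body, including one that does not depend on the root.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy IR (hypothetical): each op produces one value, named by the op's index.
struct ToyOp {
  std::string name;
  std::vector<int> operands; // indices of the ops defining the operands
  int parent;                // index of the enclosing region op, or -1
};

// Forward slice of the value produced by `root`. The root itself is only
// included if it is reached again through some edge.
std::set<int> forwardSliceRegionAware(const std::vector<ToyOp> &ops, int root) {
  std::set<int> slice;
  std::vector<int> worklist{root};
  auto visit = [&](int i) {
    if (slice.insert(i).second)
      worklist.push_back(i);
  };
  while (!worklist.empty()) {
    int cur = worklist.back();
    worklist.pop_back();
    bool curInSlice = slice.count(cur) != 0;
    for (int i = 0; i < static_cast<int>(ops.size()); ++i) {
      // Ordinary def-use edge: op i consumes the value produced by cur.
      for (int operand : ops[i].operands)
        if (operand == cur)
          visit(i);
      // A region op in the slice pulls in every op nested directly in it,
      // whether or not that op depends on the root (the over-approximation).
      if (curInSlice && ops[i].parent == cur)
        visit(i);
    }
    // A yield in the slice forwards its value to the enclosing region op.
    if (curInSlice && ops[cur].name == "yield" && ops[cur].parent >= 0)
      visit(ops[cur].parent);
  }
  return slice;
}

// Example: op 3 sits in the loop body but does not depend on op 0.
std::vector<ToyOp> buildExample() {
  return {
      {"a", {}, -1},        // 0: root value, defined outside the loop
      {"for", {}, -1},      // 1: loop op whose result is fed by the yield
      {"mul", {0}, 1},      // 2: uses the root inside the loop body
      {"unrelated", {}, 1}, // 3: independent of the root
      {"yield", {2}, 1},    // 4: yields the mul result out of the loop
      {"use", {1}, -1},     // 5: consumes the loop result
  };
}
```

Calling `forwardSliceRegionAware(buildExample(), 0)` reaches the `mul`, the `yield`, the loop op, and the user of the loop result, and it also picks up the `unrelated` op purely because it lives in the loop's region. Tracking value usage across regions, as the comment suggests, would be one way to exclude it.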