diff --git a/docs/compatibility/ml-compatibility/pytorch-compatibility.rst b/docs/compatibility/ml-compatibility/pytorch-compatibility.rst
index 5fc0c4cb1..3e900ac3e 100644
--- a/docs/compatibility/ml-compatibility/pytorch-compatibility.rst
+++ b/docs/compatibility/ml-compatibility/pytorch-compatibility.rst
@@ -399,18 +399,19 @@ with ROCm.
 
 **Note:** Only official release exists.
 
-Key features and enhancements for PyTorch 2.8 with ROCm 7.1
+Key features and enhancements for PyTorch 2.9 with ROCm 7.1.1
 ================================================================================
 
+- Added OCP Micro-scaling Format (mx-fp8/mx-fp4) support for advanced precision training.
-- MIOpen deep learning optimizations: Further optimized NHWC BatchNorm feature.
+- ``torch.backends.miopen.immediate`` flag to toggle MIOpen Immediate Mode independently of
+  deterministic and benchmark settings, providing finer control over convolution execution.
 
-- Added float8 support for the DeepSpeed extension, allowing for decreased
-  memory footprint and increased throughput in training and inference workloads.
+- rocSOLVER now used for Cholesky inversion operations, providing improved numerical stability
+  and performance for linear algebra workloads.
 
-- ``torch.nn.functional.scaled_dot_product_attention`` now calling optimized
-  flash attention kernel automatically.
+- MI355X GPU testing enabled in CI.
 
-Key features and enhancements for PyTorch 2.7/2.8 with ROCm 7.0
+Key features and enhancements for PyTorch 2.7/2.8 with ROCm 7.1.1
 ================================================================================
 
 - Enhanced TunableOp framework: Introduces ``tensorfloat32`` support for