Mirror of https://github.com/ROCm/ROCm.git, synced 2026-01-08 14:23:55 -05:00
Fix PyTorch Compatibility link and remove incomplete rows (#4195)
* fix pytorch-compatibility filename fix links
* remove incomplete rows in pytorch-compatibility
* fix broken refs
@@ -22,7 +22,7 @@ ROCm Version,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.
,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908
,,,,,,,,,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,
:doc:`PyTorch <../compatibility/pytorch-compatiblity>`,"2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
:doc:`PyTorch <../compatibility/pytorch-compatibility>`,"2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
:doc:`TensorFlow <rocm-install-on-linux:install/3rd-party/tensorflow-install>`,"2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1"
:doc:`JAX <rocm-install-on-linux:install/3rd-party/jax-install>`,0.4.35,0.4.35,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1
@@ -47,7 +47,7 @@ compatibility and system requirements.
,gfx908,gfx908,gfx908
,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix:,,
:doc:`PyTorch <../compatibility/pytorch-compatiblity>`,"2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13"
:doc:`PyTorch <../compatibility/pytorch-compatibility>`,"2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13"
:doc:`TensorFlow <rocm-install-on-linux:install/3rd-party/tensorflow-install>`,"2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1"
:doc:`JAX <rocm-install-on-linux:install/3rd-party/jax-install>`,0.4.35,0.4.35,0.4.26
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.17.3,1.17.3
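To check an installed build against the matrix above, the framework runtime can be queried directly. The following is a minimal sketch assuming a ROCm wheel of PyTorch; ``torch.version.hip`` and the ``gcnArchName`` property are only populated on ROCm builds, so missing values suggest the wheel is not ROCm-enabled.

.. code-block:: python

   import torch

   # The PyTorch build string and the HIP/ROCm runtime it was compiled against.
   print("PyTorch:", torch.__version__)   # e.g. "2.4.0+rocm6.2" on a ROCm wheel
   print("HIP/ROCm:", torch.version.hip)  # None on a CUDA-only build

   if torch.cuda.is_available():
       props = torch.cuda.get_device_properties(0)
       print("Device:", props.name)
       # On ROCm builds, gcnArchName (e.g. "gfx90a") can be compared against the
       # architecture rows of the matrix above.
       print("Architecture:", getattr(props, "gcnArchName", "unknown"))
   else:
       print("No ROCm-visible accelerator detected.")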
@@ -576,14 +576,6 @@ PyTorch interacts with the CUDA or ROCm environment.
- Globally enables or disables the PyTorch C++ implementation within SDPA.
- 2.1
- ❌
* - ``allow_fp16_bf16_reduction_math_sdp``
- Globally enables FP16 and BF16 precision for reduction operations within
SDPA.
- 2.1
-
..
FIXME:
- Partial?
.. Need to validate and extend.
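As a rough illustration of how these SDPA toggles are used, the sketch below enables the C++ ("math") backend and, where the installed PyTorch exposes it, reduced-precision accumulation, before calling scaled dot-product attention. Flag availability varies by PyTorch release, so the ``hasattr`` guard is deliberate.

.. code-block:: python

   import torch
   import torch.nn.functional as F

   q = k = v = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.bfloat16)

   # Globally enable the C++ ("math") implementation of scaled dot-product attention.
   torch.backends.cuda.enable_math_sdp(True)

   # Allow FP16/BF16 accumulation in reductions inside the math backend,
   # if this toggle exists in the installed PyTorch build.
   if hasattr(torch.backends.cuda, "allow_fp16_bf16_reduction_math_sdp"):
       torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True)

   out = F.scaled_dot_product_attention(q, k, v)
   print(out.shape)  # torch.Size([2, 8, 128, 64])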
@@ -671,15 +663,6 @@ of computational resources and scalability for large-scale tasks.
those on separate machines.
- 1.8
- 5.4
* - RPC Device Map Passing
- RPC Device Map Passing in PyTorch refers to a feature of the Remote
Procedure Call (RPC) framework that enables developers to control and
specify how tensors are transferred between devices during remote
operations. It allows fine-grained management of device placement when
sending tensors across nodes in distributed training or execution
scenarios.
- 1.9
-
* - Gloo
- Gloo is designed for multi-machine and multi-GPU setups, enabling
efficient communication and synchronization between processes. Gloo is
@@ -687,24 +670,6 @@ of computational resources and scalability for large-scale tasks.
(DDP) and RPC frameworks, alongside other backends like NCCL and MPI.
- 1.0
- 2.0
* - MPI
- MPI (Message Passing Interface) in PyTorch refers to the use of the MPI
backend for distributed communication in the ``torch.distributed`` module.
It enables inter-process communication, primarily in distributed
training settings, using the widely adopted MPI standard.
- 1.9
-
* - TorchElastic
- TorchElastic is a PyTorch library that enables fault-tolerant and
elastic training in distributed environments. It is designed to handle
dynamically changing resources, such as adding or removing nodes during
training, which is especially useful in cloud-based or preemptible
environments.
- 1.9
-
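To make the RPC device-map row above concrete, the following sketch shows the general shape of TensorPipe device-map configuration; the worker names, ranks, and GPU mapping are placeholders, and ``MASTER_ADDR``/``MASTER_PORT`` must be set in the environment before ``init_rpc`` is called.

.. code-block:: python

   import torch.distributed.rpc as rpc

   # TensorPipe is the default RPC backend; a device map tells it where GPU
   # tensors sent to a peer should land on that peer's devices.
   options = rpc.TensorPipeRpcBackendOptions()
   options.set_device_map("worker1", {0: 1})  # our GPU 0 -> worker1's GPU 1

   rpc.init_rpc("worker0", rank=0, world_size=2, rpc_backend_options=options)

   # ... rpc.rpc_sync() / rpc.rpc_async() calls that move GPU tensors ...

   rpc.shutdown()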
..
FIXME: RPC Device Map Passing "Since ROCm version"
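The Gloo and MPI rows describe backend choices for ``torch.distributed`` process groups, and TorchElastic underlies the ``torchrun`` launcher. A minimal sketch follows, assuming the script is launched with ``torchrun --nproc_per_node=2`` so that the rank and world-size environment variables are populated.

.. code-block:: python

   import torch
   import torch.distributed as dist

   # Backend selection: "gloo" (CPU / multi-node), "mpi" (only if PyTorch was
   # built with MPI support), or "nccl" (served by RCCL on ROCm).
   dist.init_process_group(backend="gloo")

   rank = dist.get_rank()
   tensor = torch.ones(1) * rank

   # Sum the per-rank tensors across all processes.
   dist.all_reduce(tensor, op=dist.ReduceOp.SUM)
   print(f"rank {rank}: reduced value = {tensor.item()}")

   dist.destroy_process_group()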
torch.compiler
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -11,11 +11,14 @@ ROCm provides a comprehensive ecosystem for deep learning development, including
deep learning frameworks and libraries such as PyTorch, TensorFlow, and JAX. ROCm works closely with these
frameworks to ensure that framework-specific optimizations take advantage of AMD accelerator and GPU architectures.
The following guides provide information on compatibility and supported features for ROCm-enabled deep learning frameworks.
The following guides provide information on compatibility and supported
features for these ROCm-enabled deep learning frameworks.
* :doc:`PyTorch compatibility <../compatibility/pytorch-compatibility>`
.. * :doc:`TensorFlow compatibility <../compatibility/tensorflow-compatibility>`
.. * :doc:`JAX compatibility <../compatibility/jax-compatibility>`
The following chart steps through typical installation workflows for installing deep learning frameworks for ROCm.
This chart steps through typical installation workflows for installing deep learning frameworks for ROCm.
.. image:: ../data/how-to/framework_install_2024_07_04.png
:alt: Flowchart for installing ROCm-aware machine learning frameworks
@@ -37,3 +40,4 @@ through the following guides.
* :doc:`rocm-for-ai/index`
* :doc:`llm-fine-tuning-optimization/index`
@@ -399,9 +399,6 @@ Further reading
- To learn how to optimize inference on LLMs, see
:doc:`Fine-tuning LLMs and inference optimization </how-to/llm-fine-tuning-optimization/index>`.
- For a list of other ready-made Docker images for ROCm, see the
:doc:`Docker image support matrix <rocm-install-on-linux:reference/docker-image-support-matrix>`.
- To compare with the previous version of the ROCm vLLM Docker image for performance validation, refer to
`LLM inference performance validation on AMD Instinct MI300X (ROCm 6.2.0) <https://rocm.docs.amd.com/en/docs-6.2.0/how-to/performance-validation/mi300x/vllm-benchmark.html>`_.
@@ -92,7 +92,7 @@ involves configuring tensor parallelism, leveraging advanced features, and
ensuring efficient execution. Here’s how to optimize vLLM performance:
* Tensor parallelism: Configure the
:ref:`tensor-parallel-size parameter <mi300x-vllm-optimize-tp-gemm>` to distribute
:ref:`tensor-parallel-size parameter <mi300x-vllm-multiple-gpus>` to distribute
tensor computations across multiple GPUs. Adjust parameters such as
``batch-size``, ``input-len``, and ``output-len`` based on your workload.
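The same tensor-parallel setting can also be exercised from vLLM's offline Python API rather than the command line. A sketch follows, with a placeholder model name and sizes that should be tuned to your workload and GPU count.

.. code-block:: python

   from vllm import LLM, SamplingParams

   # Shard tensor computations across 8 GPUs; equivalent in spirit to passing
   # --tensor-parallel-size 8 to the vLLM server or benchmark scripts.
   llm = LLM(
       model="meta-llama/Llama-3.1-70B-Instruct",  # placeholder model
       tensor_parallel_size=8,
   )

   # Prompt count and max_tokens stand in for the batch-size / input-len /
   # output-len knobs mentioned above.
   prompts = ["Explain tensor parallelism in one sentence."] * 4
   params = SamplingParams(max_tokens=128, temperature=0.8)

   for output in llm.generate(prompts, params):
       print(output.outputs[0].text)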