mirror of
https://github.com/ROCm/ROCm.git
synced 2026-01-09 14:48:06 -05:00
Docs: frameworks compatibility standardization (#5488)
This commit is contained in:
@@ -34,6 +34,7 @@ AlexNet
|
|||||||
Andrej
|
Andrej
|
||||||
Arb
|
Arb
|
||||||
Autocast
|
Autocast
|
||||||
|
autograd
|
||||||
BARs
|
BARs
|
||||||
BatchNorm
|
BatchNorm
|
||||||
BLAS
|
BLAS
|
||||||
@@ -86,9 +87,11 @@ Conda
|
|||||||
ConnectX
|
ConnectX
|
||||||
CountOnes
|
CountOnes
|
||||||
CuPy
|
CuPy
|
||||||
|
customizable
|
||||||
da
|
da
|
||||||
Dashboarding
|
Dashboarding
|
||||||
Dataloading
|
Dataloading
|
||||||
|
dataflows
|
||||||
DBRX
|
DBRX
|
||||||
DDR
|
DDR
|
||||||
DF
|
DF
|
||||||
@@ -182,7 +185,7 @@ GPT
|
|||||||
GPU
|
GPU
|
||||||
GPU's
|
GPU's
|
||||||
GPUs
|
GPUs
|
||||||
Graphbolt
|
GraphBolt
|
||||||
GraphSage
|
GraphSage
|
||||||
GRBM
|
GRBM
|
||||||
GRE
|
GRE
|
||||||
@@ -212,6 +215,7 @@ Haswell
|
|||||||
Higgs
|
Higgs
|
||||||
href
|
href
|
||||||
Hyperparameters
|
Hyperparameters
|
||||||
|
HybridEngine
|
||||||
Huggingface
|
Huggingface
|
||||||
IB
|
IB
|
||||||
ICD
|
ICD
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
|
|
||||||
.. meta::
|
.. meta::
|
||||||
:description: Deep Graph Library (DGL) compatibility
|
:description: Deep Graph Library (DGL) compatibility
|
||||||
:keywords: GPU, DGL compatibility
|
:keywords: GPU, CPU, deep graph library, DGL, deep learning, framework compatibility
|
||||||
|
|
||||||
.. version-set:: rocm_version latest
|
.. version-set:: rocm_version latest
|
||||||
|
|
||||||
@@ -10,24 +10,42 @@
|
|||||||
DGL compatibility
|
DGL compatibility
|
||||||
********************************************************************************
|
********************************************************************************
|
||||||
|
|
||||||
Deep Graph Library `(DGL) <https://www.dgl.ai/>`_ is an easy-to-use, high-performance and scalable
|
Deep Graph Library (`DGL <https://www.dgl.ai/>`__) is an easy-to-use, high-performance, and scalable
|
||||||
Python package for deep learning on graphs. DGL is framework agnostic, meaning
|
Python package for deep learning on graphs. DGL is framework agnostic, meaning
|
||||||
if a deep graph model is a component in an end-to-end application, the rest of
|
that if a deep graph model is a component in an end-to-end application, the rest of
|
||||||
the logic is implemented using PyTorch.
|
the logic is implemented using PyTorch.
|
||||||
|
|
||||||
* ROCm support for DGL is hosted in the `https://github.com/ROCm/dgl <https://github.com/ROCm/dgl>`_ repository.
|
DGL provides a high-performance graph object that can reside on either CPUs or GPUs.
|
||||||
* Due to independent compatibility considerations, this location differs from the `https://github.com/dmlc/dgl <https://github.com/dmlc/dgl>`_ upstream repository.
|
It bundles structural data and features for better control and provides a variety of functions
|
||||||
* Use the prebuilt :ref:`Docker images <dgl-docker-compat>` with DGL, PyTorch, and ROCm preinstalled.
|
for computing with graph objects, including efficient and customizable message passing
|
||||||
* See the :doc:`ROCm DGL installation guide <rocm-install-on-linux:install/3rd-party/dgl-install>`
|
primitives for Graph Neural Networks.
|
||||||
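The message-passing primitives mentioned above can be illustrated with a minimal plain-Python sketch. This is not the DGL API (DGL expresses the same step as ``g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h'))`` on a graph object); it only shows the idea: copy each source node's feature along its out-edges, then sum the incoming messages at every destination.

```python
# Illustrative sketch, NOT the DGL API: one copy-and-sum message-passing step.

def message_passing_sum(edges, features):
    """edges: list of (src, dst) pairs; features: dict node -> float."""
    inbox = {node: 0.0 for node in features}   # zero-initialized mailboxes
    for src, dst in edges:                     # "copy_u": message = source feature
        inbox[dst] += features[src]            # "sum": aggregate at destination
    return inbox

# A 4-node chain 0 -> 1 -> 2 -> 3 with feature equal to the node id:
edges = [(0, 1), (1, 2), (2, 3)]
features = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}
print(message_passing_sum(edges, features))  # each node receives its predecessor's feature
```

DGL's real primitives run this pattern as fused GPU kernels over tensors rather than Python loops.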
to install and get started.
|
|
||||||
|
|
||||||
|
Support overview
|
||||||
Supported devices
|
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
- **Officially Supported**: TF32 with AMD Instinct MI300X (through hipblaslt)
|
- The ROCm-supported version of DGL is maintained in the official `https://github.com/ROCm/dgl
|
||||||
- **Partially Supported**: TF32 with AMD Instinct MI250X
|
<https://github.com/ROCm/dgl>`__ repository, which differs from the
|
||||||
|
`https://github.com/dmlc/dgl <https://github.com/dmlc/dgl>`__ upstream repository.
|
||||||
|
|
||||||
|
- To get started and install DGL on ROCm, use the prebuilt :ref:`Docker images <dgl-docker-compat>`,
|
||||||
|
which include ROCm, DGL, and all required dependencies.
|
||||||
|
|
||||||
|
- See the :doc:`ROCm DGL installation guide <rocm-install-on-linux:install/3rd-party/dgl-install>`
|
||||||
|
for installation and setup instructions.
|
||||||
|
|
||||||
|
- You can also consult the upstream `Installation guide <https://www.dgl.ai/pages/start.html>`__
|
||||||
|
for additional context.
|
||||||
|
|
||||||
|
Version support
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
DGL is supported on `ROCm 6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__.
|
||||||
|
|
||||||
|
Supported devices
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
- **Officially Supported**: AMD Instinct™ MI300X (through `hipBLASlt <https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/index.html>`__)
|
||||||
|
- **Partially Supported**: AMD Instinct™ MI250X
|
||||||
|
|
||||||
.. _dgl-recommendations:
|
.. _dgl-recommendations:
|
||||||
|
|
||||||
@@ -35,7 +53,7 @@ Use cases and recommendations
|
|||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
DGL can be used for Graph Learning, and building popular graph models like
|
DGL can be used for Graph Learning, and building popular graph models like
|
||||||
GAT, GCN and GraphSage. Using these we can support a variety of use-cases such as:
|
GAT, GCN, and GraphSage. Using these models, a variety of use cases are supported:
|
||||||
|
|
||||||
- Recommender systems
|
- Recommender systems
|
||||||
- Network Optimization and Analysis
|
- Network Optimization and Analysis
|
||||||
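Models such as GCN share a common propagation rule: each layer aggregates a node's neighborhood features and applies a learned transform. A minimal plain-Python sketch of that rule (the fixed ``weight`` stands in for a learned parameter; real GCN layers use weight matrices and a nonlinearity):

```python
# Illustrative sketch of one graph-convolution layer: mean-aggregate the
# neighborhood (with a self-loop) and apply a (here fixed, hypothetical) weight.

def gcn_layer(adj, h, weight=0.5):
    """adj: dict node -> list of neighbors; h: dict node -> float feature."""
    out = {}
    for node, neighbors in adj.items():
        pool = [h[node]] + [h[n] for n in neighbors]   # include the node itself
        out[node] = weight * sum(pool) / len(pool)     # mean-aggregate, then scale
    return out

adj = {0: [1], 1: [0, 2], 2: [1]}                      # a 3-node path graph
h = {0: 2.0, 1: 4.0, 2: 6.0}
print(gcn_layer(adj, h))
```

Stacking such layers lets information flow across multiple hops, which is what makes these models useful for the use cases listed above.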
@@ -62,16 +80,17 @@ Docker image compatibility
|
|||||||
|
|
||||||
<i class="fab fa-docker"></i>
|
<i class="fab fa-docker"></i>
|
||||||
|
|
||||||
AMD validates and publishes `DGL images <https://hub.docker.com/r/rocm/dgl>`_
|
AMD validates and publishes `DGL images <https://hub.docker.com/r/rocm/dgl/tags>`__
|
||||||
with ROCm and Pytorch backends on Docker Hub. The following Docker image tags and associated
|
with ROCm backends on Docker Hub. The following Docker image tags and associated
|
||||||
inventories were tested on `ROCm 6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`_.
|
inventories represent the latest available DGL version from the official Docker Hub.
|
||||||
Click the |docker-icon| to view the image on Docker Hub.
|
Click the |docker-icon| to view the image on Docker Hub.
|
||||||
|
|
||||||
.. list-table:: DGL Docker image components
|
.. list-table:: DGL Docker image components
|
||||||
:header-rows: 1
|
:header-rows: 1
|
||||||
:class: docker-image-compatibility
|
:class: docker-image-compatibility
|
||||||
|
|
||||||
* - Docker
|
* - Docker image
|
||||||
|
- ROCm
|
||||||
- DGL
|
- DGL
|
||||||
- PyTorch
|
- PyTorch
|
||||||
- Ubuntu
|
- Ubuntu
|
||||||
@@ -81,102 +100,106 @@ Click the |docker-icon| to view the image on Docker Hub.
|
|||||||
|
|
||||||
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu24.04_py3.12_pytorch_release_2.6.0/images/sha256-8ce2c3bcfaa137ab94a75f9e2ea711894748980f57417739138402a542dd5564"><i class="fab fa-docker fa-lg"></i></a>
|
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu24.04_py3.12_pytorch_release_2.6.0/images/sha256-8ce2c3bcfaa137ab94a75f9e2ea711894748980f57417739138402a542dd5564"><i class="fab fa-docker fa-lg"></i></a>
|
||||||
|
|
||||||
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`_
|
- `6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__
|
||||||
- `2.6.0 <https://github.com/ROCm/pytorch/tree/release/2.6>`_
|
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
|
||||||
|
- `2.6.0 <https://github.com/ROCm/pytorch/tree/release/2.6>`__
|
||||||
- 24.04
|
- 24.04
|
||||||
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`_
|
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`__
|
||||||
|
|
||||||
* - .. raw:: html
|
* - .. raw:: html
|
||||||
|
|
||||||
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu24.04_py3.12_pytorch_release_2.4.1/images/sha256-cf1683283b8eeda867b690229c8091c5bbf1edb9f52e8fb3da437c49a612ebe4"><i class="fab fa-docker fa-lg"></i></a>
|
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu24.04_py3.12_pytorch_release_2.4.1/images/sha256-cf1683283b8eeda867b690229c8091c5bbf1edb9f52e8fb3da437c49a612ebe4"><i class="fab fa-docker fa-lg"></i></a>
|
||||||
|
|
||||||
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`_
|
- `6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__
|
||||||
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
|
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
|
||||||
|
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`__
|
||||||
- 24.04
|
- 24.04
|
||||||
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`_
|
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`__
|
||||||
|
|
||||||
|
|
||||||
* - .. raw:: html
|
* - .. raw:: html
|
||||||
|
|
||||||
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1/images/sha256-4834f178c3614e2d09e89e32041db8984c456d45dfd20286e377ca8635686554"><i class="fab fa-docker fa-lg"></i></a>
|
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1/images/sha256-4834f178c3614e2d09e89e32041db8984c456d45dfd20286e377ca8635686554"><i class="fab fa-docker fa-lg"></i></a>
|
||||||
|
|
||||||
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`_
|
- `6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__
|
||||||
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
|
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
|
||||||
|
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`__
|
||||||
- 22.04
|
- 22.04
|
||||||
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`_
|
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`__
|
||||||
|
|
||||||
|
|
||||||
* - .. raw:: html
|
* - .. raw:: html
|
||||||
|
|
||||||
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.3.0/images/sha256-88740a2c8ab4084b42b10c3c6ba984cab33dd3a044f479c6d7618e2b2cb05e69"><i class="fab fa-docker fa-lg"></i></a>
|
<a href="https://hub.docker.com/layers/rocm/dgl/dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.3.0/images/sha256-88740a2c8ab4084b42b10c3c6ba984cab33dd3a044f479c6d7618e2b2cb05e69"><i class="fab fa-docker fa-lg"></i></a>
|
||||||
|
|
||||||
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`_
|
- `6.4.0 <https://repo.radeon.com/rocm/apt/6.4/>`__
|
||||||
- `2.3.0 <https://github.com/ROCm/pytorch/tree/release/2.3>`_
|
- `2.4.0 <https://github.com/dmlc/dgl/releases/tag/v2.4.0>`__
|
||||||
|
- `2.3.0 <https://github.com/ROCm/pytorch/tree/release/2.3>`__
|
||||||
- 22.04
|
- 22.04
|
||||||
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`_
|
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`__
|
||||||
|
|
||||||
|
|
||||||
Key ROCm libraries for DGL
|
Key ROCm libraries for DGL
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
DGL on ROCm depends on specific libraries that affect its features and performance.
|
DGL on ROCm depends on specific libraries that affect its features and performance.
|
||||||
Using the DGL Docker container or building it with the provided docker file or a ROCm base image is recommended.
|
Using the DGL Docker container or building it with the provided Dockerfile or a ROCm base image is recommended.
|
||||||
If you prefer to build it yourself, ensure the following dependencies are installed:
|
If you prefer to build it yourself, ensure the following dependencies are installed:
|
||||||
|
|
||||||
.. list-table::
|
.. list-table::
|
||||||
:header-rows: 1
|
:header-rows: 1
|
||||||
|
|
||||||
* - ROCm library
|
* - ROCm library
|
||||||
- Version
|
- ROCm 6.4.0 Version
|
||||||
- Purpose
|
- Purpose
|
||||||
* - `Composable Kernel <https://github.com/ROCm/composable_kernel>`_
|
* - `Composable Kernel <https://github.com/ROCm/composable_kernel>`_
|
||||||
- :version-ref:`"Composable Kernel" rocm_version`
|
- 1.1.0
|
||||||
- Enables faster execution of core operations like matrix multiplication
|
- Enables faster execution of core operations like matrix multiplication
|
||||||
(GEMM), convolutions and transformations.
|
(GEMM), convolutions and transformations.
|
||||||
* - `hipBLAS <https://github.com/ROCm/hipBLAS>`_
|
* - `hipBLAS <https://github.com/ROCm/hipBLAS>`_
|
||||||
- :version-ref:`hipBLAS rocm_version`
|
- 2.4.0
|
||||||
- Provides GPU-accelerated Basic Linear Algebra Subprograms (BLAS) for
|
- Provides GPU-accelerated Basic Linear Algebra Subprograms (BLAS) for
|
||||||
matrix and vector operations.
|
matrix and vector operations.
|
||||||
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`_
|
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`_
|
||||||
- :version-ref:`hipBLASLt rocm_version`
|
- 0.12.0
|
||||||
- hipBLASLt is an extension of the hipBLAS library, providing additional
|
- hipBLASLt is an extension of the hipBLAS library, providing additional
|
||||||
features like epilogues fused into the matrix multiplication kernel or
|
features like epilogues fused into the matrix multiplication kernel or
|
||||||
use of integer tensor cores.
|
use of integer tensor cores.
|
||||||
* - `hipCUB <https://github.com/ROCm/hipCUB>`_
|
* - `hipCUB <https://github.com/ROCm/hipCUB>`_
|
||||||
- :version-ref:`hipCUB rocm_version`
|
- 3.4.0
|
||||||
- Provides a C++ template library for parallel algorithms for reduction,
|
- Provides a C++ template library for parallel algorithms for reduction,
|
||||||
scan, sort and select.
|
scan, sort and select.
|
||||||
* - `hipFFT <https://github.com/ROCm/hipFFT>`_
|
* - `hipFFT <https://github.com/ROCm/hipFFT>`_
|
||||||
- :version-ref:`hipFFT rocm_version`
|
- 1.0.18
|
||||||
- Provides GPU-accelerated Fast Fourier Transform (FFT) operations.
|
- Provides GPU-accelerated Fast Fourier Transform (FFT) operations.
|
||||||
* - `hipRAND <https://github.com/ROCm/hipRAND>`_
|
* - `hipRAND <https://github.com/ROCm/hipRAND>`_
|
||||||
- :version-ref:`hipRAND rocm_version`
|
- 2.12.0
|
||||||
- Provides fast random number generation for GPUs.
|
- Provides fast random number generation for GPUs.
|
||||||
* - `hipSOLVER <https://github.com/ROCm/hipSOLVER>`_
|
* - `hipSOLVER <https://github.com/ROCm/hipSOLVER>`_
|
||||||
- :version-ref:`hipSOLVER rocm_version`
|
- 2.4.0
|
||||||
- Provides GPU-accelerated solvers for linear systems, eigenvalues, and
|
- Provides GPU-accelerated solvers for linear systems, eigenvalues, and
|
||||||
singular value decompositions (SVD).
|
singular value decompositions (SVD).
|
||||||
* - `hipSPARSE <https://github.com/ROCm/hipSPARSE>`_
|
* - `hipSPARSE <https://github.com/ROCm/hipSPARSE>`_
|
||||||
- :version-ref:`hipSPARSE rocm_version`
|
- 3.2.0
|
||||||
- Accelerates operations on sparse matrices, such as sparse matrix-vector
|
- Accelerates operations on sparse matrices, such as sparse matrix-vector
|
||||||
or matrix-matrix products.
|
or matrix-matrix products.
|
||||||
* - `hipSPARSELt <https://github.com/ROCm/hipSPARSELt>`_
|
* - `hipSPARSELt <https://github.com/ROCm/hipSPARSELt>`_
|
||||||
- :version-ref:`hipSPARSELt rocm_version`
|
- 0.2.3
|
||||||
- Accelerates operations on sparse matrices, such as sparse matrix-vector
|
- Accelerates operations on sparse matrices, such as sparse matrix-vector
|
||||||
or matrix-matrix products.
|
or matrix-matrix products.
|
||||||
* - `hipTensor <https://github.com/ROCm/hipTensor>`_
|
* - `hipTensor <https://github.com/ROCm/hipTensor>`_
|
||||||
- :version-ref:`hipTensor rocm_version`
|
- 1.5.0
|
||||||
- Optimizes for high-performance tensor operations, such as contractions.
|
- Optimizes for high-performance tensor operations, such as contractions.
|
||||||
* - `MIOpen <https://github.com/ROCm/MIOpen>`_
|
* - `MIOpen <https://github.com/ROCm/MIOpen>`_
|
||||||
- :version-ref:`MIOpen rocm_version`
|
- 3.4.0
|
||||||
- Optimizes deep learning primitives such as convolutions, pooling,
|
- Optimizes deep learning primitives such as convolutions, pooling,
|
||||||
normalization, and activation functions.
|
normalization, and activation functions.
|
||||||
* - `MIGraphX <https://github.com/ROCm/AMDMIGraphX>`_
|
* - `MIGraphX <https://github.com/ROCm/AMDMIGraphX>`_
|
||||||
- :version-ref:`MIGraphX rocm_version`
|
- 2.12.0
|
||||||
- Adds graph-level optimizations, ONNX models and mixed precision support
|
- Adds graph-level optimizations, ONNX model and mixed-precision support,
|
||||||
and enable Ahead-of-Time (AOT) Compilation.
|
and enables Ahead-of-Time (AOT) compilation.
|
||||||
* - `MIVisionX <https://github.com/ROCm/MIVisionX>`_
|
* - `MIVisionX <https://github.com/ROCm/MIVisionX>`_
|
||||||
- :version-ref:`MIVisionX rocm_version`
|
- 3.2.0
|
||||||
- Optimizes acceleration for computer vision and AI workloads like
|
- Accelerates computer vision and AI workloads like
|
||||||
preprocessing, augmentation, and inferencing.
|
preprocessing, augmentation, and inferencing.
|
||||||
* - `rocAL <https://github.com/ROCm/rocAL>`_
|
* - `rocAL <https://github.com/ROCm/rocAL>`_
|
||||||
@@ -184,25 +207,25 @@ If you prefer to build it yourself, ensure the following dependencies are instal
|
|||||||
- Accelerates the data pipeline by offloading intensive preprocessing and
|
- Accelerates the data pipeline by offloading intensive preprocessing and
|
||||||
augmentation tasks. rocAL is part of MIVisionX.
|
augmentation tasks. rocAL is part of MIVisionX.
|
||||||
* - `RCCL <https://github.com/ROCm/rccl>`_
|
* - `RCCL <https://github.com/ROCm/rccl>`_
|
||||||
- :version-ref:`RCCL rocm_version`
|
- 2.2.0
|
||||||
- Optimizes for multi-GPU communication for operations like AllReduce and
|
- Optimizes multi-GPU communication for operations like AllReduce and
|
||||||
Broadcast.
|
Broadcast.
|
||||||
* - `rocDecode <https://github.com/ROCm/rocDecode>`_
|
* - `rocDecode <https://github.com/ROCm/rocDecode>`_
|
||||||
- :version-ref:`rocDecode rocm_version`
|
- 0.10.0
|
||||||
- Provides hardware-accelerated data decoding capabilities, particularly
|
- Provides hardware-accelerated data decoding capabilities, particularly
|
||||||
for image, video, and other dataset formats.
|
for image, video, and other dataset formats.
|
||||||
* - `rocJPEG <https://github.com/ROCm/rocJPEG>`_
|
* - `rocJPEG <https://github.com/ROCm/rocJPEG>`_
|
||||||
- :version-ref:`rocJPEG rocm_version`
|
- 0.8.0
|
||||||
- Provides hardware-accelerated JPEG image decoding and encoding.
|
- Provides hardware-accelerated JPEG image decoding and encoding.
|
||||||
* - `RPP <https://github.com/ROCm/RPP>`_
|
* - `RPP <https://github.com/ROCm/RPP>`_
|
||||||
- :version-ref:`RPP rocm_version`
|
- 1.9.10
|
||||||
- Speeds up data augmentation, transformation, and other preprocessing steps.
|
- Speeds up data augmentation, transformation, and other preprocessing steps.
|
||||||
* - `rocThrust <https://github.com/ROCm/rocThrust>`_
|
* - `rocThrust <https://github.com/ROCm/rocThrust>`_
|
||||||
- :version-ref:`rocThrust rocm_version`
|
- 3.3.0
|
||||||
- Provides a C++ template library for parallel algorithms like sorting,
|
- Provides a C++ template library for parallel algorithms like sorting,
|
||||||
reduction, and scanning.
|
reduction, and scanning.
|
||||||
* - `rocWMMA <https://github.com/ROCm/rocWMMA>`_
|
* - `rocWMMA <https://github.com/ROCm/rocWMMA>`_
|
||||||
- :version-ref:`rocWMMA rocm_version`
|
- 1.7.0
|
||||||
- Accelerates warp-level matrix-multiply and matrix-accumulate to speed up matrix
|
- Accelerates warp-level matrix-multiply and matrix-accumulate to speed up matrix
|
||||||
multiplication (GEMM) and accumulation operations with mixed precision
|
multiplication (GEMM) and accumulation operations with mixed precision
|
||||||
support.
|
support.
|
||||||
@@ -211,14 +234,14 @@ If you prefer to build it yourself, ensure the following dependencies are instal
|
|||||||
Supported features
|
Supported features
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
Many functions and methods available in DGL Upstream are also supported in DGL ROCm.
|
Many functions and methods available upstream are also supported in DGL on ROCm.
|
||||||
Instead of listing them all, support is grouped into the following categories to provide a general overview.
|
Instead of listing them all, support is grouped into the following categories to provide a general overview.
|
||||||
|
|
||||||
* DGL Base
|
* DGL Base
|
||||||
* DGL Backend
|
* DGL Backend
|
||||||
* DGL Data
|
* DGL Data
|
||||||
* DGL Dataloading
|
* DGL Dataloading
|
||||||
* DGL DGLGraph
|
* DGL Graph
|
||||||
* DGL Function
|
* DGL Function
|
||||||
* DGL Ops
|
* DGL Ops
|
||||||
* DGL Sampling
|
* DGL Sampling
|
||||||
@@ -235,9 +258,9 @@ Instead of listing them all, support is grouped into the following categories to
|
|||||||
Unsupported features
|
Unsupported features
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
* Graphbolt
|
* GraphBolt
|
||||||
* Partial TF32 Support (MI250x only)
|
* Partial TF32 Support (MI250X only)
|
||||||
* Kineto/ ROCTracer integration
|
* Kineto/ROCTracer integration
|
||||||
|
|
||||||
|
|
||||||
Unsupported functions
|
Unsupported functions
|
||||||
|
|||||||
@@ -1,8 +1,8 @@
|
|||||||
:orphan:
|
:orphan:
|
||||||
|
|
||||||
.. meta::
|
.. meta::
|
||||||
:description: FlashInfer deep learning framework compatibility
|
:description: FlashInfer compatibility
|
||||||
:keywords: GPU, LLM, FlashInfer, compatibility
|
:keywords: GPU, LLM, FlashInfer, deep learning, framework compatibility
|
||||||
|
|
||||||
.. version-set:: rocm_version latest
|
.. version-set:: rocm_version latest
|
||||||
|
|
||||||
@@ -11,7 +11,7 @@ FlashInfer compatibility
|
|||||||
********************************************************************************
|
********************************************************************************
|
||||||
|
|
||||||
`FlashInfer <https://docs.flashinfer.ai/index.html>`__ is a library and kernel generator
|
`FlashInfer <https://docs.flashinfer.ai/index.html>`__ is a library and kernel generator
|
||||||
for Large Language Models (LLMs) that provides high-performance implementation of graphics
|
for Large Language Models (LLMs) that provides a high-performance implementation of graphics
|
||||||
processing units (GPUs) kernels. FlashInfer focuses on LLM serving and inference, as well
|
processing unit (GPU) kernels. FlashInfer focuses on LLM serving and inference, as well
|
||||||
as advanced performance across diverse scenarios.
|
as advanced performance across diverse scenarios.
|
||||||
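The central operator FlashInfer accelerates in LLM serving is attention. A tiny plain-Python sketch of scaled dot-product attention shows what the optimized kernels compute; real FlashInfer kernels fuse these steps and run on batched GPU tensors with paged KV caches:

```python
# Illustrative sketch: scaled dot-product attention for one query over a
# handful of key/value vectors. Not the FlashInfer API.
import math

def attention(q, keys, values):
    """q: query vector; keys/values: lists of vectors (one per token)."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)                                  # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    weights = [e / sum(exps) for e in exps]          # softmax over tokens
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
print(out)
```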
|
|
||||||
@@ -25,28 +25,30 @@ offers high-performance LLM-specific operators, with easy integration through Py
|
|||||||
For the latest feature compatibility matrix, refer to the ``README`` of the
|
For the latest feature compatibility matrix, refer to the ``README`` of the
|
||||||
`https://github.com/ROCm/flashinfer <https://github.com/ROCm/flashinfer>`__ repository.
|
`https://github.com/ROCm/flashinfer <https://github.com/ROCm/flashinfer>`__ repository.
|
||||||
|
|
||||||
Support for the ROCm port of FlashInfer is available as follows:
|
Support overview
|
||||||
|
================================================================================
|
||||||
|
|
||||||
- ROCm support for FlashInfer is hosted in the `https://github.com/ROCm/flashinfer
|
- The ROCm-supported version of FlashInfer is maintained in the official `https://github.com/ROCm/flashinfer
|
||||||
<https://github.com/ROCm/flashinfer>`__ repository. This location differs from the
|
<https://github.com/ROCm/flashinfer>`__ repository, which differs from the
|
||||||
`https://github.com/flashinfer-ai/flashinfer <https://github.com/flashinfer-ai/flashinfer>`_
|
`https://github.com/flashinfer-ai/flashinfer <https://github.com/flashinfer-ai/flashinfer>`__
|
||||||
upstream repository.
|
upstream repository.
|
||||||
|
|
||||||
- To install FlashInfer, use the prebuilt :ref:`Docker image <flashinfer-docker-compat>`,
|
- To get started and install FlashInfer on ROCm, use the prebuilt :ref:`Docker images <flashinfer-docker-compat>`,
|
||||||
which includes ROCm, FlashInfer, and all required dependencies.
|
which include ROCm, FlashInfer, and all required dependencies.
|
||||||
|
|
||||||
- See the :doc:`ROCm FlashInfer installation guide <rocm-install-on-linux:install/3rd-party/flashinfer-install>`
|
- See the :doc:`ROCm FlashInfer installation guide <rocm-install-on-linux:install/3rd-party/flashinfer-install>`
|
||||||
to install and get started.
|
for installation and setup instructions.
|
||||||
|
|
||||||
- See the `Installation guide <https://docs.flashinfer.ai/installation.html>`__
|
- You can also consult the upstream `Installation guide <https://docs.flashinfer.ai/installation.html>`__
|
||||||
in the upstream FlashInfer documentation.
|
for additional context.
|
||||||
|
|
||||||
.. note::
|
Version support
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
Flashinfer is supported on ROCm 6.4.1.
|
FlashInfer is supported on `ROCm 6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__.
|
||||||
|
|
||||||
Supported devices
|
Supported devices
|
||||||
================================================================================
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
**Officially Supported**: AMD Instinct™ MI300X
|
**Officially Supported**: AMD Instinct™ MI300X
|
||||||
|
|
||||||
@@ -78,10 +80,9 @@ Docker image compatibility
|
|||||||
|
|
||||||
<i class="fab fa-docker"></i>
|
<i class="fab fa-docker"></i>
|
||||||
|
|
||||||
AMD validates and publishes `ROCm FlashInfer images <https://hub.docker.com/r/rocm/flashinfer/tags>`__
|
AMD validates and publishes `FlashInfer images <https://hub.docker.com/r/rocm/flashinfer/tags>`__
|
||||||
with ROCm and Pytorch backends on Docker Hub. The following Docker image tags and associated
|
with ROCm backends on Docker Hub. The following Docker image tag and associated
|
||||||
inventories represent the FlashInfer version from the official Docker Hub.
|
inventories represent the latest available FlashInfer version from the official Docker Hub.
|
||||||
The Docker images have been validated for `ROCm 6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__.
|
|
||||||
Click |docker-icon| to view the image on Docker Hub.
|
Click |docker-icon| to view the image on Docker Hub.
|
||||||
|
|
||||||
.. list-table::
|
.. list-table::
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
|
|
||||||
.. meta::
|
.. meta::
|
||||||
:description: JAX compatibility
|
:description: JAX compatibility
|
||||||
:keywords: GPU, JAX compatibility
|
:keywords: GPU, JAX, deep learning, framework compatibility
|
||||||
|
|
||||||
.. version-set:: rocm_version latest
|
.. version-set:: rocm_version latest
|
||||||
|
|
||||||
@@ -10,42 +10,38 @@
|
|||||||
JAX compatibility
|
JAX compatibility
|
||||||
*******************************************************************************
|
*******************************************************************************
|
||||||
|
|
||||||
JAX provides a NumPy-like API, which combines automatic differentiation and the
|
`JAX <https://docs.jax.dev/en/latest/notebooks/thinking_in_jax.html>`__ is a library
|
||||||
|
for array-oriented numerical computation (similar to NumPy), with automatic differentiation
|
||||||
|
and just-in-time (JIT) compilation to enable high-performance machine learning research.
|
||||||
|
|
||||||
|
JAX provides an API that combines automatic differentiation and the
|
||||||
Accelerated Linear Algebra (XLA) compiler to achieve high-performance machine
|
Accelerated Linear Algebra (XLA) compiler to achieve high-performance machine
|
||||||
learning at scale.
|
learning at scale. JAX uses composable transformations of Python and NumPy through
|
||||||
|
JIT compilation, automatic vectorization, and parallelization.
|
||||||
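The automatic differentiation mentioned above can be sketched mechanically in plain Python with forward-mode dual numbers, which carry a derivative alongside each value. This is only an illustration of the idea; ``jax.grad`` composes the same concept with XLA compilation and vectorization:

```python
# Illustrative sketch of automatic differentiation via dual numbers.
# Not the JAX implementation.

class Dual:
    def __init__(self, value, deriv):
        self.value, self.deriv = value, deriv
    def __mul__(self, other):                         # product rule
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)
    def __add__(self, other):                         # sum rule
        return Dual(self.value + other.value, self.deriv + other.deriv)

def grad(f):
    """Return a function computing df/dx for a scalar, single-argument f."""
    return lambda x: f(Dual(x, 1.0)).deriv

f = lambda x: x * x + x                               # f(x) = x^2 + x
print(grad(f)(3.0))                                   # f'(3) = 2*3 + 1 = 7.0
```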
|
|
||||||
JAX uses composable transformations of Python and NumPy through just-in-time
|
Support overview
|
||||||
(JIT) compilation, automatic vectorization, and parallelization. To learn about
|
================================================================================
|
||||||
JAX, including profiling and optimizations, see the official `JAX documentation
|
|
||||||
<https://jax.readthedocs.io/en/latest/notebooks/quickstart.html>`_.
|
|
||||||
|
|
||||||
ROCm support for JAX is upstreamed, and users can build the official source code
|
- The ROCm-supported version of JAX is maintained in the official `https://github.com/ROCm/rocm-jax
|
||||||
with ROCm support:
|
<https://github.com/ROCm/rocm-jax>`__ repository, which differs from the
|
||||||
|
`https://github.com/jax-ml/jax <https://github.com/jax-ml/jax>`__ upstream repository.
|
||||||
|
|
||||||
- ROCm JAX release:
|
- To get started and install JAX on ROCm, use the prebuilt :ref:`Docker images <jax-docker-compat>`,
|
||||||
|
which include ROCm, JAX, and all required dependencies.
|
||||||
- Offers AMD-validated and community :ref:`Docker images <jax-docker-compat>`
|
|
||||||
with ROCm and JAX preinstalled.
|
|
||||||
|
|
||||||
- ROCm JAX repository: `ROCm/rocm-jax <https://github.com/ROCm/rocm-jax>`_
|
|
||||||
|
|
||||||
- See the :doc:`ROCm JAX installation guide <rocm-install-on-linux:install/3rd-party/jax-install>`
|
- See the :doc:`ROCm JAX installation guide <rocm-install-on-linux:install/3rd-party/jax-install>`
|
||||||
to get started.
|
for installation and setup instructions.
|
||||||
|
|
||||||
- Official JAX release:
|
- You can also consult the upstream `Installation guide <https://jax.readthedocs.io/en/latest/installation.html#amd-gpu-linux>`__
|
||||||
|
for additional context.
|
||||||
|
|
||||||
- Official JAX repository: `jax-ml/jax <https://github.com/jax-ml/jax>`_
|
Version support
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
- See the `AMD GPU (Linux) installation section
|
AMD releases official `ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax/tags>`_
|
||||||
<https://jax.readthedocs.io/en/latest/installation.html#amd-gpu-linux>`_ in
|
quarterly alongside new ROCm releases. These images undergo full AMD testing.
|
||||||
the JAX documentation.
|
`Community ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax-community/tags>`_
|
||||||
|
follow upstream JAX releases and use the latest available ROCm version.
|
||||||
.. note::
|
|
||||||
|
|
||||||
AMD releases official `ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax>`_
|
|
||||||
quarterly alongside new ROCm releases. These images undergo full AMD testing.
|
|
||||||
`Community ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax-community>`_
|
|
||||||
follow upstream JAX releases and use the latest available ROCm version.
|
|
||||||
|
|
||||||
Use cases and recommendations
|
Use cases and recommendations
|
||||||
================================================================================
|
================================================================================
|
||||||
@@ -71,7 +67,7 @@ Use cases and recommendations
|
|||||||
* The `Distributed fine-tuning with JAX on AMD GPUs <https://rocm.blogs.amd.com/artificial-intelligence/distributed-sft-jax/README.html>`_
|
* The `Distributed fine-tuning with JAX on AMD GPUs <https://rocm.blogs.amd.com/artificial-intelligence/distributed-sft-jax/README.html>`_
|
||||||
outlines the process of fine-tuning a Bidirectional Encoder Representations
|
outlines the process of fine-tuning a Bidirectional Encoder Representations
|
||||||
from Transformers (BERT)-based large language model (LLM) using JAX for a text
|
from Transformers (BERT)-based large language model (LLM) using JAX for a text
|
||||||
classification task. The blog post discuss techniques for parallelizing the
|
classification task. The blog post discusses techniques for parallelizing the
|
||||||
fine-tuning across multiple AMD GPUs and assesses the model's performance on a
|
fine-tuning across multiple AMD GPUs and assesses the model's performance on a
|
||||||
holdout dataset. During the fine-tuning, a BERT-base-cased transformer model
|
holdout dataset. During the fine-tuning, a BERT-base-cased transformer model
|
||||||
and the General Language Understanding Evaluation (GLUE) benchmark dataset were
|
and the General Language Understanding Evaluation (GLUE) benchmark dataset were
|
||||||
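The distributed fine-tuning described above hinges on averaging gradients across data-parallel replicas (in JAX, typically ``jax.lax.pmean`` inside ``pmap`` or ``shard_map``). The following pure-Python sketch illustrates only that averaging step; the function name and values are illustrative, not the blog's code:

```python
# Illustrative all-reduce "mean" across data-parallel replicas (pure Python).
# In real JAX training this is jax.lax.pmean inside pmap/shard_map.

def allreduce_mean(per_device_grads):
    """Average gradients elementwise across device replicas."""
    n = len(per_device_grads)
    length = len(per_device_grads[0])
    return [sum(g[i] for g in per_device_grads) / n for i in range(length)]

# Two replicas computed gradients on different data shards.
grads_dev0 = [0.2, -0.4, 1.0]
grads_dev1 = [0.6, 0.0, -1.0]
synced = allreduce_mean([grads_dev0, grads_dev1])
# Every replica then applies the same averaged update, keeping weights in sync.
```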
@@ -90,9 +86,9 @@ For more use cases and recommendations, see `ROCm JAX blog posts <https://rocm.b
|
|||||||
Docker image compatibility
|
Docker image compatibility
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
AMD provides preconfigured Docker images with JAX and the ROCm backend.
|
AMD validates and publishes `JAX images <https://hub.docker.com/r/rocm/jax/tags>`__
|
||||||
These images are published on `Docker Hub <https://hub.docker.com/r/rocm/jax>`__ and are the
|
with ROCm backends on Docker Hub.
|
||||||
recommended way to get started with deep learning with JAX on ROCm.
|
|
||||||
For ``jax-community`` images, see `rocm/jax-community
|
For ``jax-community`` images, see `rocm/jax-community
|
||||||
<https://hub.docker.com/r/rocm/jax-community/tags>`__ on Docker Hub.
|
<https://hub.docker.com/r/rocm/jax-community/tags>`__ on Docker Hub.
|
||||||
|
|
||||||
@@ -234,7 +230,7 @@ The ROCm supported data types in JAX are collected in the following table.
|
|||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
|
||||||
JAX data type support is effected by the :ref:`key_rocm_libraries` and it's
|
JAX data type support is affected by the :ref:`key_rocm_libraries` and is
|
||||||
collected on :doc:`ROCm data types and precision support <rocm:reference/precision-support>`
|
collected on the :doc:`ROCm data types and precision support <rocm:reference/precision-support>`
|
||||||
page.
|
page.
|
||||||
|
|
||||||
|
|||||||
@@ -1,8 +1,8 @@
|
|||||||
:orphan:
|
:orphan:
|
||||||
|
|
||||||
.. meta::
|
.. meta::
|
||||||
:description: llama.cpp deep learning framework compatibility
|
:description: llama.cpp compatibility
|
||||||
:keywords: GPU, GGML, llama.cpp compatibility
|
:keywords: GPU, GGML, llama.cpp, deep learning, framework compatibility
|
||||||
|
|
||||||
.. version-set:: rocm_version latest
|
.. version-set:: rocm_version latest
|
||||||
|
|
||||||
@@ -20,33 +20,32 @@ to accelerate inference and reduce memory usage. Originally built as a CPU-first
|
|||||||
llama.cpp is easy to integrate with other programming environments and is widely
|
llama.cpp is easy to integrate with other programming environments and is widely
|
||||||
adopted across diverse platforms, including consumer devices.
|
adopted across diverse platforms, including consumer devices.
|
||||||
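Much of llama.cpp's memory savings comes from block-wise weight quantization in GGML formats. The following pure-Python sketch shows only the basic absmax-scaling idea; GGML's actual formats (``Q4_K``, ``Q8_0``, and so on) are more elaborate:

```python
# Simplified block-wise absmax quantization (pure Python) -- an illustration
# of how quantized formats trade precision for memory, not GGML's encoding.

def quantize_block(block):
    """Map a block of floats to int8-range values plus one float scale."""
    m = max(abs(x) for x in block)
    scale = m / 127 if m else 1.0
    return [round(x / scale) for x in block], scale

def dequantize_block(qs, scale):
    return [q * scale for q in qs]

weights = [0.05, -0.8, 0.32, 0.0]
qs, scale = quantize_block(weights)
restored = dequantize_block(qs, scale)
# Storage drops from 4 bytes per weight to ~1 byte (plus a shared scale),
# at the cost of a small rounding error bounded by scale / 2.
```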
|
|
||||||
ROCm support for llama.cpp is upstreamed, and you can build the official source code
|
Support overview
|
||||||
with ROCm support:
|
|
||||||
|
|
||||||
- ROCm support for llama.cpp is hosted in the official `https://github.com/ROCm/llama.cpp
|
|
||||||
<https://github.com/ROCm/llama.cpp>`_ repository.
|
|
||||||
|
|
||||||
- Due to independent compatibility considerations, this location differs from the
|
|
||||||
`https://github.com/ggml-org/llama.cpp <https://github.com/ggml-org/llama.cpp>`_ upstream repository.
|
|
||||||
|
|
||||||
- To install llama.cpp, use the prebuilt :ref:`Docker image <llama-cpp-docker-compat>`,
|
|
||||||
which includes ROCm, llama.cpp, and all required dependencies.
|
|
||||||
|
|
||||||
- See the :doc:`ROCm llama.cpp installation guide <rocm-install-on-linux:install/3rd-party/llama-cpp-install>`
|
|
||||||
to install and get started.
|
|
||||||
|
|
||||||
- See the `Installation guide <https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#hip>`__
|
|
||||||
in the upstream llama.cpp documentation.
|
|
||||||
|
|
||||||
.. note::
|
|
||||||
|
|
||||||
llama.cpp is supported on ROCm 7.0.0 and ROCm 6.4.x.
|
|
||||||
|
|
||||||
Supported devices
|
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
**Officially Supported**: AMD Instinct™ MI300X, MI325X, MI210
|
- The ROCm-supported version of llama.cpp is maintained in the official `https://github.com/ROCm/llama.cpp
|
||||||
|
<https://github.com/ROCm/llama.cpp>`__ repository, which differs from the
|
||||||
|
`https://github.com/ggml-org/llama.cpp <https://github.com/ggml-org/llama.cpp>`__ upstream repository.
|
||||||
|
|
||||||
|
- To get started and install llama.cpp on ROCm, use the prebuilt :ref:`Docker images <llama-cpp-docker-compat>`,
|
||||||
|
which include ROCm, llama.cpp, and all required dependencies.
|
||||||
|
|
||||||
|
- See the :doc:`ROCm llama.cpp installation guide <rocm-install-on-linux:install/3rd-party/llama-cpp-install>`
|
||||||
|
for installation and setup instructions.
|
||||||
|
|
||||||
|
- You can also consult the upstream `Installation guide <https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md>`__
|
||||||
|
for additional context.
|
||||||
|
|
||||||
|
Version support
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
llama.cpp is supported on `ROCm 7.0.0 <https://repo.radeon.com/rocm/apt/7.0/>`__ and
|
||||||
|
`ROCm 6.4.x <https://repo.radeon.com/rocm/apt/6.4/>`__.
|
||||||
|
|
||||||
|
Supported devices
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
**Officially Supported**: AMD Instinct™ MI300X, MI325X, MI210
|
||||||
|
|
||||||
Use cases and recommendations
|
Use cases and recommendations
|
||||||
================================================================================
|
================================================================================
|
||||||
@@ -84,9 +83,9 @@ Docker image compatibility
|
|||||||
|
|
||||||
<i class="fab fa-docker"></i>
|
<i class="fab fa-docker"></i>
|
||||||
|
|
||||||
AMD validates and publishes `ROCm llama.cpp Docker images <https://hub.docker.com/r/rocm/llama.cpp/tags>`__
|
AMD validates and publishes `llama.cpp images <https://hub.docker.com/r/rocm/llama.cpp/tags>`__
|
||||||
with ROCm backends on Docker Hub. The following Docker image tags and associated
|
with ROCm backends on Docker Hub. The following Docker image tags and associated
|
||||||
inventories represent the available llama.cpp versions from the official Docker Hub.
|
inventories represent the latest available llama.cpp versions from the official Docker Hub.
|
||||||
Click |docker-icon| to view the image on Docker Hub.
|
Click |docker-icon| to view the image on Docker Hub.
|
||||||
|
|
||||||
.. important::
|
.. important::
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
|
|
||||||
.. meta::
|
.. meta::
|
||||||
:description: Megablocks compatibility
|
:description: Megablocks compatibility
|
||||||
:keywords: GPU, megablocks, compatibility
|
:keywords: GPU, megablocks, deep learning, framework compatibility
|
||||||
|
|
||||||
.. version-set:: rocm_version latest
|
.. version-set:: rocm_version latest
|
||||||
|
|
||||||
@@ -10,28 +10,42 @@
|
|||||||
Megablocks compatibility
|
Megablocks compatibility
|
||||||
********************************************************************************
|
********************************************************************************
|
||||||
|
|
||||||
Megablocks is a light-weight library for mixture-of-experts (MoE) training.
|
`Megablocks <https://github.com/databricks/megablocks>`__ is a lightweight library
|
||||||
|
for mixture-of-experts `(MoE) <https://huggingface.co/blog/moe>`__ training.
|
||||||
The core of the system is efficient "dropless-MoE" and standard MoE layers.
|
The core of the system is efficient "dropless-MoE" and standard MoE layers.
|
||||||
Megablocks is integrated with `https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`_,
|
Megablocks is integrated with `https://github.com/stanford-futuredata/Megatron-LM
|
||||||
|
<https://github.com/stanford-futuredata/Megatron-LM>`__,
|
||||||
where data and pipeline parallel training of MoEs is supported.
|
where data and pipeline parallel training of MoEs is supported.
|
||||||
|
|
||||||
* ROCm support for Megablocks is hosted in the official `https://github.com/ROCm/megablocks <https://github.com/ROCm/megablocks>`_ repository.
|
Support overview
|
||||||
* Due to independent compatibility considerations, this location differs from the `https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`_ upstream repository.
|
================================================================================
|
||||||
* Use the prebuilt :ref:`Docker image <megablocks-docker-compat>` with ROCm, PyTorch, and Megablocks preinstalled.
|
|
||||||
* See the :doc:`ROCm Megablocks installation guide <rocm-install-on-linux:install/3rd-party/megablocks-install>` to install and get started.
|
|
||||||
|
|
||||||
.. note::
|
- The ROCm-supported version of Megablocks is maintained in the official `https://github.com/ROCm/megablocks
|
||||||
|
<https://github.com/ROCm/megablocks>`__ repository, which differs from the
|
||||||
|
`https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`__ upstream repository.
|
||||||
|
|
||||||
Megablocks is supported on ROCm 6.3.0.
|
- To get started and install Megablocks on ROCm, use the prebuilt :ref:`Docker image <megablocks-docker-compat>`,
|
||||||
|
which includes ROCm, Megablocks, and all required dependencies.
|
||||||
|
|
||||||
|
- See the :doc:`ROCm Megablocks installation guide <rocm-install-on-linux:install/3rd-party/megablocks-install>`
|
||||||
|
for installation and setup instructions.
|
||||||
|
|
||||||
|
- You can also consult the upstream `Installation guide <https://github.com/databricks/megablocks>`__
|
||||||
|
for additional context.
|
||||||
|
|
||||||
|
Version support
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
Megablocks is supported on `ROCm 6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`__.
|
||||||
|
|
||||||
Supported devices
|
Supported devices
|
||||||
================================================================================
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
- **Officially Supported**: AMD Instinct MI300X
|
- **Officially Supported**: AMD Instinct™ MI300X
|
||||||
- **Partially Supported** (functionality or performance limitations): AMD Instinct MI250X, MI210
|
- **Partially Supported** (functionality or performance limitations): AMD Instinct™ MI250X, MI210
|
||||||
|
|
||||||
Supported models and features
|
Supported models and features
|
||||||
================================================================================
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
This section summarizes the Megablocks features supported by ROCm.
|
This section summarizes the Megablocks features supported by ROCm.
|
||||||
|
|
||||||
@@ -41,20 +55,28 @@ This section summarizes the Megablocks features supported by ROCm.
|
|||||||
* Mixture-of-Experts
|
* Mixture-of-Experts
|
||||||
* dropless-Mixture-of-Experts
|
* dropless-Mixture-of-Experts
|
||||||
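To illustrate the difference between the two layer types above: a MoE router sends each token to its best-scoring expert, and a fixed per-expert capacity forces standard MoE to drop overflow tokens, while dropless-MoE keeps them all (Megablocks achieves this with block-sparse kernels). A conceptual pure-Python sketch of top-1 routing, not Megablocks code:

```python
# Conceptual top-1 MoE routing (pure Python): each token goes to the expert
# with the highest gate score. Standard MoE drops tokens past an expert's
# capacity; "dropless" MoE keeps them all. Only the routing decision is shown.

def route_top1(gate_scores, capacity=None):
    """gate_scores[t][e] = score of expert e for token t.
    Returns per-expert token lists plus any dropped tokens."""
    assignments = {}
    dropped = []
    for t, scores in enumerate(gate_scores):
        e = max(range(len(scores)), key=scores.__getitem__)
        bucket = assignments.setdefault(e, [])
        if capacity is not None and len(bucket) >= capacity:
            dropped.append(t)
        else:
            bucket.append(t)
    return assignments, dropped

scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8]]
dense, lost = route_top1(scores, capacity=2)  # standard MoE: one token dropped
dropless, none_lost = route_top1(scores)      # dropless MoE: nothing dropped
```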
|
|
||||||
|
|
||||||
.. _megablocks-recommendations:
|
.. _megablocks-recommendations:
|
||||||
|
|
||||||
Use cases and recommendations
|
Use cases and recommendations
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
The `ROCm Megablocks blog posts <https://rocm.blogs.amd.com/artificial-intelligence/megablocks/README.html>`_
|
* The `Efficient MoE training on AMD ROCm: How-to use Megablocks on AMD GPUs
|
||||||
guide how to leverage the ROCm platform for pre-training using the Megablocks framework.
|
<https://rocm.blogs.amd.com/artificial-intelligence/megablocks/README.html>`__
|
||||||
|
blog post explains how to leverage the ROCm platform for pre-training using the
|
||||||
|
Megablocks framework. It introduces a streamlined approach for training Mixture-of-Experts
|
||||||
|
(MoE) models using the Megablocks library on AMD hardware. Focusing on GPT-2, it
|
||||||
|
demonstrates how block-sparse computations can enhance scalability and efficiency in MoE
|
||||||
|
training. The guide provides step-by-step instructions for setting up the environment,
|
||||||
|
including cloning the repository, building the Docker image, and running the training container.
|
||||||
|
Additionally, it offers insights into utilizing the ``oscar-1GB.json`` dataset for pre-training
|
||||||
|
language models. By leveraging Megablocks and the ROCm platform, you can optimize your MoE
|
||||||
|
training workflows for large-scale transformer models.
|
||||||
|
|
||||||
It shows how to pre-process datasets and how to begin pre-training on AMD GPUs through:
|
It shows how to pre-process datasets and how to begin pre-training on AMD GPUs through:
|
||||||
|
|
||||||
* Single-GPU pre-training
|
* Single-GPU pre-training
|
||||||
* Multi-GPU pre-training
|
* Multi-GPU pre-training
|
||||||
|
|
||||||
|
|
||||||
.. _megablocks-docker-compat:
|
.. _megablocks-docker-compat:
|
||||||
|
|
||||||
Docker image compatibility
|
Docker image compatibility
|
||||||
@@ -64,10 +86,9 @@ Docker image compatibility
|
|||||||
|
|
||||||
<i class="fab fa-docker"></i>
|
<i class="fab fa-docker"></i>
|
||||||
|
|
||||||
AMD validates and publishes `ROCm Megablocks images <https://hub.docker.com/r/rocm/megablocks/tags>`_
|
AMD validates and publishes `Megablocks images <https://hub.docker.com/r/rocm/megablocks/tags>`__
|
||||||
with ROCm and Pytorch backends on Docker Hub. The following Docker image tags and associated
|
with ROCm backends on Docker Hub. The following Docker image tag and associated
|
||||||
inventories represent the latest Megatron-LM version from the official Docker Hub.
|
inventory represent the latest available Megablocks version from the official Docker Hub.
|
||||||
The Docker images have been validated for `ROCm 6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`_.
|
|
||||||
Click |docker-icon| to view the image on Docker Hub.
|
Click |docker-icon| to view the image on Docker Hub.
|
||||||
|
|
||||||
.. list-table::
|
.. list-table::
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
|
|
||||||
.. meta::
|
.. meta::
|
||||||
:description: PyTorch compatibility
|
:description: PyTorch compatibility
|
||||||
:keywords: GPU, PyTorch compatibility
|
:keywords: GPU, PyTorch, deep learning, framework compatibility
|
||||||
|
|
||||||
.. version-set:: rocm_version latest
|
.. version-set:: rocm_version latest
|
||||||
|
|
||||||
@@ -15,40 +15,42 @@ deep learning. PyTorch on ROCm provides mixed-precision and large-scale training
|
|||||||
using `MIOpen <https://github.com/ROCm/MIOpen>`__ and
|
using `MIOpen <https://github.com/ROCm/MIOpen>`__ and
|
||||||
`RCCL <https://github.com/ROCm/rccl>`__ libraries.
|
`RCCL <https://github.com/ROCm/rccl>`__ libraries.
|
||||||
|
|
||||||
ROCm support for PyTorch is upstreamed into the official PyTorch repository. Due
|
PyTorch provides two high-level features:
|
||||||
to independent compatibility considerations, this results in two distinct
|
|
||||||
release cycles for PyTorch on ROCm:
|
|
||||||
|
|
||||||
- ROCm PyTorch release:
|
- Tensor computation (like NumPy) with strong GPU acceleration
|
||||||
|
|
||||||
- Provides the latest version of ROCm but might not necessarily support the
|
- Deep neural networks built on a tape-based autograd system (rapid computation
|
||||||
latest stable PyTorch version.
|
of multiple partial derivatives or gradients)
|
||||||
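The tape-based autograd system mentioned above records each operation during the forward pass and replays the record in reverse to accumulate gradients via the chain rule. A minimal pure-Python sketch of the idea (conceptual only, not PyTorch's implementation):

```python
# Minimal reverse-mode autograd tape (pure Python) -- the concept behind a
# "tape-based autograd system", not PyTorch internals.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # (parent_var, local_gradient) pairs
        self.grad = 0.0

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def backward(self, seed=1.0):
        # Walk the recorded graph in reverse, accumulating chain-rule terms.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x, y = Var(2.0), Var(3.0)
z = x * y + x          # z = 2*3 + 2 = 8
z.backward()           # dz/dx = y + 1 = 4, dz/dy = x = 2
```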
|
|
||||||
- Offers :ref:`Docker images <pytorch-docker-compat>` with ROCm and PyTorch
|
Support overview
|
||||||
preinstalled.
|
================================================================================
|
||||||
|
|
||||||
- ROCm PyTorch repository: `<https://github.com/ROCm/pytorch>`__
|
ROCm support for PyTorch is upstreamed into the official PyTorch repository.
|
||||||
|
ROCm development is aligned with the stable release of PyTorch, while upstream
|
||||||
|
PyTorch testing uses the stable release of ROCm to maintain consistency:
|
||||||
|
|
||||||
|
- The ROCm-supported version of PyTorch is maintained in the official `https://github.com/ROCm/pytorch
|
||||||
|
<https://github.com/ROCm/pytorch>`__ repository, which differs from the
|
||||||
|
`https://github.com/pytorch/pytorch <https://github.com/pytorch/pytorch>`__ upstream repository.
|
||||||
|
|
||||||
|
- To get started and install PyTorch on ROCm, use the prebuilt :ref:`Docker images <pytorch-docker-compat>`,
|
||||||
|
which include ROCm, PyTorch, and all required dependencies.
|
||||||
|
|
||||||
- See the :doc:`ROCm PyTorch installation guide <rocm-install-on-linux:install/3rd-party/pytorch-install>`
|
- See the :doc:`ROCm PyTorch installation guide <rocm-install-on-linux:install/3rd-party/pytorch-install>`
|
||||||
to get started.
|
for installation and setup instructions.
|
||||||
|
|
||||||
- Official PyTorch release:
|
- You can also consult the upstream `Installation guide <https://pytorch.org/get-started/locally/>`__ or
|
||||||
|
`Previous versions <https://pytorch.org/get-started/previous-versions/>`__ for additional context.
|
||||||
- Provides the latest stable version of PyTorch but might not necessarily
|
|
||||||
support the latest ROCm version.
|
|
||||||
|
|
||||||
- Official PyTorch repository: `<https://github.com/pytorch/pytorch>`__
|
|
||||||
|
|
||||||
- See the `Nightly and latest stable version installation guide <https://pytorch.org/get-started/locally/>`__
|
|
||||||
or `Previous versions <https://pytorch.org/get-started/previous-versions/>`__
|
|
||||||
to get started.
|
|
||||||
|
|
||||||
PyTorch includes tooling that generates HIP source code from the CUDA backend.
|
PyTorch includes tooling that generates HIP source code from the CUDA backend.
|
||||||
This approach allows PyTorch to support ROCm without requiring manual code
|
This approach allows PyTorch to support ROCm without requiring manual code
|
||||||
modifications. For more information, see :doc:`HIPIFY <hipify:index>`.
|
modifications. For more information, see :doc:`HIPIFY <hipify:index>`.
|
||||||
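Conceptually, this tooling maps CUDA API identifiers to their HIP equivalents (for example, ``cudaMalloc`` to ``hipMalloc``). A toy pure-Python sketch of that renaming follows; the real hipify tools also translate headers, kernel-launch syntax, and many more APIs:

```python
# Toy illustration of what hipify tooling does: rewrite CUDA runtime API
# names to their HIP equivalents. Only a handful of real mappings are shown.
import re

CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

def hipify(source):
    # Longest names first so e.g. cudaMemcpyHostToDevice wins over cudaMemcpy.
    names = sorted(CUDA_TO_HIP, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(names) + r")\b")
    return pattern.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)

cuda_src = "cudaMalloc(&d_a, n); cudaMemcpy(d_a, a, n, cudaMemcpyHostToDevice);"
hip_src = hipify(cuda_src)
```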
|
|
||||||
ROCm development is aligned with the stable release of PyTorch, while upstream
|
Version support
|
||||||
PyTorch testing uses the stable release of ROCm to maintain consistency.
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
AMD releases official `ROCm PyTorch Docker images <https://hub.docker.com/r/rocm/pytorch/tags>`_
|
||||||
|
quarterly alongside new ROCm releases. These images undergo full AMD testing.
|
||||||
|
|
||||||
.. _pytorch-recommendations:
|
.. _pytorch-recommendations:
|
||||||
|
|
||||||
@@ -78,7 +80,7 @@ Use cases and recommendations
|
|||||||
GPU.
|
GPU.
|
||||||
|
|
||||||
* The :doc:`Inception with PyTorch documentation </conceptual/ai-pytorch-inception>`
|
* The :doc:`Inception with PyTorch documentation </conceptual/ai-pytorch-inception>`
|
||||||
describes how PyTorch integrates with ROCm for AI workloads It outlines the
|
describes how PyTorch integrates with ROCm for AI workloads. It outlines the
|
||||||
use of PyTorch on the ROCm platform and focuses on efficiently leveraging AMD
|
use of PyTorch on the ROCm platform and focuses on efficiently leveraging AMD
|
||||||
GPU hardware for training and inference tasks in AI applications.
|
GPU hardware for training and inference tasks in AI applications.
|
||||||
|
|
||||||
@@ -89,9 +91,8 @@ For more use cases and recommendations, see `ROCm PyTorch blog posts <https://ro
|
|||||||
Docker image compatibility
|
Docker image compatibility
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
AMD provides preconfigured Docker images with PyTorch and the ROCm backend.
|
AMD validates and publishes `PyTorch images <https://hub.docker.com/r/rocm/pytorch/tags>`__
|
||||||
These images are published on `Docker Hub <https://hub.docker.com/r/rocm/pytorch>`__ and are the
|
with ROCm backends on Docker Hub.
|
||||||
recommended way to get started with deep learning with PyTorch on ROCm.
|
|
||||||
|
|
||||||
To find the right image tag, see the :ref:`PyTorch on ROCm installation
|
To find the right image tag, see the :ref:`PyTorch on ROCm installation
|
||||||
documentation <rocm-install-on-linux:pytorch-docker-support>` for a list of
|
documentation <rocm-install-on-linux:pytorch-docker-support>` for a list of
|
||||||
|
|||||||
@@ -1,8 +1,8 @@
|
|||||||
:orphan:
|
:orphan:
|
||||||
|
|
||||||
.. meta::
|
.. meta::
|
||||||
:description: Ray deep learning framework compatibility
|
:description: Ray compatibility
|
||||||
:keywords: GPU, Ray compatibility
|
:keywords: GPU, Ray, deep learning, framework compatibility
|
||||||
|
|
||||||
.. version-set:: rocm_version latest
|
.. version-set:: rocm_version latest
|
||||||
|
|
||||||
@@ -19,36 +19,35 @@ simplifying machine learning computations.
|
|||||||
Ray is a general-purpose framework that runs many types of workloads efficiently.
|
Ray is a general-purpose framework that runs many types of workloads efficiently.
|
||||||
Any Python application can be scaled with Ray, without extra infrastructure.
|
Any Python application can be scaled with Ray, without extra infrastructure.
|
||||||
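Ray's core abstraction wraps ordinary Python functions as remote tasks that return futures (``@ray.remote``, ``f.remote(x)``, ``ray.get``). As a rough standard-library analogy of the pattern, not Ray's API:

```python
# The remote-task pattern Ray generalizes to a cluster, sketched with the
# standard library's futures on a single machine. This is NOT Ray's API.
from concurrent.futures import ThreadPoolExecutor

def preprocess(record):
    return record * 2   # stand-in for real per-record work

records = list(range(8))
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(preprocess, r) for r in records]  # ~ f.remote(r)
    results = [f.result() for f in futures]                  # ~ ray.get(refs)
```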
|
|
||||||
ROCm support for Ray is upstreamed, and you can build the official source code
|
Support overview
|
||||||
with ROCm support:
|
|
||||||
|
|
||||||
- ROCm support for Ray is hosted in the official `https://github.com/ROCm/ray
|
|
||||||
<https://github.com/ROCm/ray>`_ repository.
|
|
||||||
|
|
||||||
- Due to independent compatibility considerations, this location differs from the
|
|
||||||
`https://github.com/ray-project/ray <https://github.com/ray-project/ray>`_ upstream repository.
|
|
||||||
|
|
||||||
- To install Ray, use the prebuilt :ref:`Docker image <ray-docker-compat>`
|
|
||||||
which includes ROCm, Ray, and all required dependencies.
|
|
||||||
|
|
||||||
- See the :doc:`ROCm Ray installation guide <rocm-install-on-linux:install/3rd-party/ray-install>`
|
|
||||||
for instructions to get started.
|
|
||||||
|
|
||||||
- See the `Installation section <https://docs.ray.io/en/latest/ray-overview/installation.html>`_
|
|
||||||
in the upstream Ray documentation.
|
|
||||||
|
|
||||||
- The Docker image provided is based on the upstream Ray `Daily Release (Nightly) wheels <https://docs.ray.io/en/latest/ray-overview/installation.html#daily-releases-nightlies>`__
|
|
||||||
corresponding to commit `005c372 <https://github.com/ray-project/ray/commit/005c372262e050d5745f475e22e64305fa07f8b8>`__.
|
|
||||||
|
|
||||||
.. note::
|
|
||||||
|
|
||||||
Ray is supported on ROCm 6.4.1.
|
|
||||||
|
|
||||||
Supported devices
|
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
**Officially Supported**: AMD Instinct™ MI300X, MI210
|
- The ROCm-supported version of Ray is maintained in the official `https://github.com/ROCm/ray
|
||||||
|
<https://github.com/ROCm/ray>`__ repository, which differs from the
|
||||||
|
`https://github.com/ray-project/ray <https://github.com/ray-project/ray>`__ upstream repository.
|
||||||
|
|
||||||
|
- To get started and install Ray on ROCm, use the prebuilt :ref:`Docker image <ray-docker-compat>`,
|
||||||
|
which includes ROCm, Ray, and all required dependencies.
|
||||||
|
|
||||||
|
- The Docker image provided is based on the upstream Ray `Daily Release (Nightly) wheels
|
||||||
|
<https://docs.ray.io/en/latest/ray-overview/installation.html#daily-releases-nightlies>`__
|
||||||
|
corresponding to commit `005c372 <https://github.com/ray-project/ray/commit/005c372262e050d5745f475e22e64305fa07f8b8>`__.
|
||||||
|
|
||||||
|
- See the :doc:`ROCm Ray installation guide <rocm-install-on-linux:install/3rd-party/ray-install>`
|
||||||
|
for installation and setup instructions.
|
||||||
|
|
||||||
|
- You can also consult the upstream `Installation guide <https://docs.ray.io/en/latest/ray-overview/installation.html>`__
|
||||||
|
for additional context.
|
||||||
|
|
||||||
|
Version support
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
Ray is supported on `ROCm 6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__.
|
||||||
|
|
||||||
|
Supported devices
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
**Officially Supported**: AMD Instinct™ MI300X, MI210
|
||||||
|
|
||||||
Use cases and recommendations
|
Use cases and recommendations
|
||||||
================================================================================
|
================================================================================
|
||||||
@@ -88,15 +87,15 @@ Docker image compatibility
|
|||||||
|
|
||||||
AMD validates and publishes ready-made `ROCm Ray Docker images <https://hub.docker.com/r/rocm/ray/tags>`__
|
AMD validates and publishes ready-made `ROCm Ray Docker images <https://hub.docker.com/r/rocm/ray/tags>`__
|
||||||
with ROCm backends on Docker Hub. The following Docker image tags and
|
with ROCm backends on Docker Hub. The following Docker image tags and
|
||||||
associated inventories represent the latest Ray version from the official Docker Hub and are validated for
|
associated inventories represent the latest Ray version from the official Docker Hub.
|
||||||
`ROCm 6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`_. Click the |docker-icon|
|
Click the |docker-icon| icon to view the image on Docker Hub.
|
||||||
icon to view the image on Docker Hub.
|
|
||||||
|
|
||||||
.. list-table::
|
.. list-table::
|
||||||
:header-rows: 1
|
:header-rows: 1
|
||||||
:class: docker-image-compatibility
|
:class: docker-image-compatibility
|
||||||
|
|
||||||
* - Docker image
|
* - Docker image
|
||||||
|
- ROCm
|
||||||
- Ray
|
- Ray
|
||||||
- PyTorch
|
- PyTorch
|
||||||
- Ubuntu
|
- Ubuntu
|
||||||
@@ -105,6 +104,7 @@ icon to view the image on Docker Hub.
|
|||||||
* - .. raw:: html
|
* - .. raw:: html
|
||||||
|
|
||||||
<a href="https://hub.docker.com/layers/rocm/ray/ray-2.48.0.post0_rocm6.4.1_ubuntu24.04_py3.12_pytorch2.6.0/images/sha256-0d166fe6bdced38338c78eedfb96eff92655fb797da3478a62dd636365133cc0"><i class="fab fa-docker fa-lg"></i> rocm/ray</a>
|
<a href="https://hub.docker.com/layers/rocm/ray/ray-2.48.0.post0_rocm6.4.1_ubuntu24.04_py3.12_pytorch2.6.0/images/sha256-0d166fe6bdced38338c78eedfb96eff92655fb797da3478a62dd636365133cc0"><i class="fab fa-docker fa-lg"></i> rocm/ray</a>
|
||||||
|
- `6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__
|
||||||
- `2.48.0.post0 <https://github.com/ROCm/ray/tree/release/2.48.0.post0>`_
|
- `2.48.0.post0 <https://github.com/ROCm/ray/tree/release/2.48.0.post0>`_
|
||||||
- 2.6.0+git684f6f2
|
- 2.6.0+git684f6f2
|
||||||
- 24.04
|
- 24.04
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
|
|
||||||
.. meta::
|
.. meta::
|
||||||
:description: Stanford Megatron-LM compatibility
|
:description: Stanford Megatron-LM compatibility
|
||||||
:keywords: Stanford, Megatron-LM, compatibility
|
:keywords: Stanford, Megatron-LM, deep learning, framework compatibility
|
||||||
|
|
||||||
.. version-set:: rocm_version latest
|
.. version-set:: rocm_version latest
|
||||||
|
|
||||||
@@ -10,34 +10,50 @@
|
|||||||
Stanford Megatron-LM compatibility
|
Stanford Megatron-LM compatibility
|
||||||
********************************************************************************
|
********************************************************************************
|
||||||
|
|
||||||
Stanford Megatron-LM is a large-scale language model training framework developed by NVIDIA `https://github.com/NVIDIA/Megatron-LM <https://github.com/NVIDIA/Megatron-LM>`_. It is
|
Stanford Megatron-LM is a large-scale language model training framework developed
|
||||||
designed to train massive transformer-based language models efficiently by model and data parallelism.
|
by NVIDIA at `https://github.com/NVIDIA/Megatron-LM <https://github.com/NVIDIA/Megatron-LM>`_.
|
||||||
|
It is designed to train massive transformer-based language models efficiently using model
|
||||||
|
and data parallelism.
|
||||||
|
|
||||||
* ROCm support for Stanford Megatron-LM is hosted in the official `https://github.com/ROCm/Stanford-Megatron-LM <https://github.com/ROCm/Stanford-Megatron-LM>`_ repository.
|
It provides efficient tensor, pipeline, and sequence-based model parallelism for
|
||||||
* Due to independent compatibility considerations, this location differs from the `https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`_ upstream repository.
|
pre-training transformer-based language models such as GPT (Decoder Only), BERT
|
||||||
* Use the prebuilt :ref:`Docker image <megatron-lm-docker-compat>` with ROCm, PyTorch, and Megatron-LM preinstalled.
|
(Encoder Only), and T5 (Encoder-Decoder).
|
||||||
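Tensor parallelism splits individual weight matrices across devices. A pure-Python sketch of a column-parallel matrix multiply, where each hypothetical device holds a contiguous slice of the weight's columns and the partial outputs are concatenated:

```python
# Column-parallel linear layer, the core trick of tensor parallelism:
# each "device" owns a slice of the weight matrix's columns, computes its
# partial output, and the slices are concatenated. Conceptual sketch only.

def matmul(x, w):
    """x and w are lists of rows. Returns x @ w."""
    return [[sum(xi * wij for xi, wij in zip(row, col))
             for col in zip(*w)] for row in x]

def split_columns(w, parts):
    """Split w's columns into `parts` contiguous shards."""
    step = len(w[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]

x = [[1.0, 2.0]]                       # one token, hidden size 2
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]             # hidden 2 -> features 4
shards = split_columns(w, parts=2)     # two "devices"
partials = [matmul(x, shard) for shard in shards]
combined = [sum((p[0] for p in partials), [])]  # concatenate along features
assert combined == matmul(x, w)        # matches the unsharded layer
```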
* See the :doc:`ROCm Stanford Megatron-LM installation guide <rocm-install-on-linux:install/3rd-party/stanford-megatron-lm-install>` to install and get started.
|
|
||||||
|
|
||||||
.. note::
|
Support overview
|
||||||
|
|
||||||
Stanford Megatron-LM is supported on ROCm 6.3.0.
|
|
||||||
|
|
||||||
|
|
||||||
Supported Devices
|
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
- **Officially Supported**: AMD Instinct MI300X
|
- The ROCm-supported version of Stanford Megatron-LM is maintained in the official `https://github.com/ROCm/Stanford-Megatron-LM
|
||||||
- **Partially Supported** (functionality or performance limitations): AMD Instinct MI250X, MI210
|
<https://github.com/ROCm/Stanford-Megatron-LM>`__ repository, which differs from the
|
||||||
|
`https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`__ upstream repository.
|
||||||
|
|
||||||
|
- To get started and install Stanford Megatron-LM on ROCm, use the prebuilt :ref:`Docker image <megatron-lm-docker-compat>`,
|
||||||
|
which includes ROCm, Stanford Megatron-LM, and all required dependencies.
|
||||||
|
|
||||||
|
- See the :doc:`ROCm Stanford Megatron-LM installation guide <rocm-install-on-linux:install/3rd-party/stanford-megatron-lm-install>`
|
||||||
|
for installation and setup instructions.
|
||||||
|
|
||||||
|
- You can also consult the upstream `Installation guide <https://github.com/NVIDIA/Megatron-LM>`__
|
||||||
|
for additional context.
|
||||||
|
|
||||||
|
Version support
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
Stanford Megatron-LM is supported on `ROCm 6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`__.
|
||||||
|
|
||||||
|
Supported devices
|
||||||
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
- **Officially Supported**: AMD Instinct™ MI300X
|
||||||
|
- **Partially Supported** (functionality or performance limitations): AMD Instinct™ MI250X, MI210
|
||||||
|
|
||||||
Supported models and features
|
Supported models and features
|
||||||
================================================================================
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
This section details models & features that are supported by the ROCm version on Stanford Megatron-LM.
|
This section details models & features that are supported by the ROCm version on Stanford Megatron-LM.
|
||||||
|
|
||||||
Models:
|
Models:
|
||||||
|
|
||||||
* Bert
|
* BERT
|
||||||
* GPT
|
* GPT
|
||||||
* T5
|
* T5
|
||||||
* ICT
|
* ICT
|
||||||
@@ -54,13 +70,24 @@ Features:

Use cases and recommendations
================================================================================

The following blog post mentions Megablocks, but you can run Stanford Megatron-LM with the same steps to pre-process datasets on AMD GPUs:

* The `Efficient MoE training on AMD ROCm: How-to use Megablocks on AMD GPUs
  <https://rocm.blogs.amd.com/artificial-intelligence/megablocks/README.html>`__
  blog post shows how to leverage the ROCm platform for pre-training using the
  Megablocks framework. It introduces a streamlined approach for training Mixture-of-Experts
  (MoE) models using the Megablocks library on AMD hardware. Focusing on GPT-2, it
  demonstrates how block-sparse computations can enhance scalability and efficiency in MoE
  training. The guide provides step-by-step instructions for setting up the environment,
  including cloning the repository, building the Docker image, and running the training container.
  Additionally, it offers insights into utilizing the ``oscar-1GB.json`` dataset for pre-training
  language models. By leveraging Megablocks and the ROCm platform, you can optimize your MoE
  training workflows for large-scale transformer models.

  It features how to pre-process datasets and how to begin pre-training on AMD GPUs through:

  * Single-GPU pre-training
  * Multi-GPU pre-training
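The pre-processing step described above can be sketched as a single command. This is an illustrative, unvalidated sketch based on the upstream Megatron-LM ``tools/preprocess_data.py`` script; the input path, output prefix, and tokenizer files are assumptions you should replace with your own:

```shell
# Hypothetical sketch: convert a raw JSON corpus (such as oscar-1GB.json)
# into Megatron-LM's binary format before pre-training. Run from inside a
# Megatron-LM checkout; adjust paths and tokenizer files to your setup.
python tools/preprocess_data.py \
    --input oscar-1GB.json \
    --output-prefix oscar-gpt2 \
    --vocab-file gpt2-vocab.json \
    --merge-file gpt2-merges.txt \
    --tokenizer-type GPT2BPETokenizer \
    --append-eod \
    --workers 8
```

The resulting ``oscar-gpt2_*`` files are then passed to the pre-training scripts for single-GPU or multi-GPU runs.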

.. _megatron-lm-docker-compat:

@@ -71,10 +98,9 @@ Docker image compatibility

      <i class="fab fa-docker"></i>

AMD validates and publishes `Stanford Megatron-LM images <https://hub.docker.com/r/rocm/stanford-megatron-lm/tags>`_
with ROCm and PyTorch backends on Docker Hub. The following Docker image tags and associated
inventories represent the latest Stanford Megatron-LM version from the official Docker Hub.
Click |docker-icon| to view the image on Docker Hub.

.. list-table::
@@ -82,6 +108,7 @@ Click |docker-icon| to view the image on Docker Hub.
    :class: docker-image-compatibility

    * - Docker image
      - ROCm
      - Stanford Megatron-LM
      - PyTorch
      - Ubuntu
@@ -91,6 +118,7 @@ Click |docker-icon| to view the image on Docker Hub.

          <a href="https://hub.docker.com/layers/rocm/stanford-megatron-lm/stanford-megatron-lm85f95ae_rocm6.3.0_ubuntu24.04_py3.12_pytorch2.4.0/images/sha256-070556f078be10888a1421a2cb4f48c29f28b02bfeddae02588d1f7fc02a96a6"><i class="fab fa-docker fa-lg"></i></a>

      - `6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`_
      - `85f95ae <https://github.com/stanford-futuredata/Megatron-LM/commit/85f95aef3b648075fe6f291c86714fdcbd9cd1f5>`_
      - `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
      - 24.04
@@ -2,7 +2,7 @@

.. meta::
   :description: Taichi compatibility
   :keywords: GPU, Taichi, deep learning, framework compatibility

.. version-set:: rocm_version latest

@@ -19,28 +19,52 @@ Taichi is widely used across various domains, including real-time physical simul
numerical computing, augmented reality, artificial intelligence, computer vision, robotics,
visual effects in film and gaming, and general-purpose computing.

Support overview
================================================================================

- The ROCm-supported version of Taichi is maintained in the official `https://github.com/ROCm/taichi
  <https://github.com/ROCm/taichi>`__ repository, which differs from the
  `https://github.com/taichi-dev/taichi <https://github.com/taichi-dev/taichi>`__ upstream repository.

- To get started and install Taichi on ROCm, use the prebuilt :ref:`Docker image <taichi-docker-compat>`,
  which includes ROCm, Taichi, and all required dependencies.

- See the :doc:`ROCm Taichi installation guide <rocm-install-on-linux:install/3rd-party/taichi-install>`
  for installation and setup instructions.

- You can also consult the upstream `Installation guide <https://github.com/taichi-dev/taichi>`__
  for additional context.

Version support
--------------------------------------------------------------------------------

Taichi is supported on `ROCm 6.3.2 <https://repo.radeon.com/rocm/apt/6.3.2/>`__.

Supported devices
--------------------------------------------------------------------------------

- **Officially Supported**: AMD Instinct™ MI250X, MI210X (with the exception of Taichi’s GPU rendering system, CGUI)
- **Upcoming Support**: AMD Instinct™ MI300X

.. _taichi-recommendations:

Use cases and recommendations
================================================================================

* The `Accelerating Parallel Programming in Python with Taichi Lang on AMD GPUs
  <https://rocm.blogs.amd.com/artificial-intelligence/taichi/README.html>`__
  blog highlights Taichi as an open-source programming language designed for high-performance
  numerical computation, particularly in domains like real-time physical simulation,
  artificial intelligence, computer vision, robotics, and visual effects. Taichi
  is embedded in Python and uses just-in-time (JIT) compilation frameworks like
  LLVM to optimize execution on GPUs and CPUs. The blog emphasizes the versatility
  of Taichi in enabling complex simulations and numerical algorithms, making
  it ideal for developers working on compute-intensive tasks. Developers are
  encouraged to follow recommended coding patterns and utilize Taichi decorators
  for performance optimization, with examples available in the `https://github.com/ROCm/taichi_examples
  <https://github.com/ROCm/taichi_examples>`_ repository. Prebuilt Docker images
  integrating ROCm, PyTorch, and Taichi are provided for simplified installation
  and deployment, making it easier to leverage Taichi for advanced computational workloads.

.. _taichi-docker-compat:

@@ -52,9 +76,8 @@ Docker image compatibility

      <i class="fab fa-docker"></i>

AMD validates and publishes ready-made `ROCm Taichi Docker images <https://hub.docker.com/r/rocm/taichi/tags>`_
with ROCm backends on Docker Hub. The following Docker image tag and associated inventories
represent the latest Taichi version from the official Docker Hub.
Click |docker-icon| to view the image on Docker Hub.

.. list-table::
@@ -2,7 +2,7 @@

.. meta::
   :description: TensorFlow compatibility
   :keywords: GPU, TensorFlow, deep learning, framework compatibility

.. version-set:: rocm_version latest

@@ -12,37 +12,33 @@ TensorFlow compatibility

`TensorFlow <https://www.tensorflow.org/>`__ is an open-source library for
solving machine learning, deep learning, and AI problems. It can solve many
problems across different sectors and industries, but primarily focuses on
neural network training and inference. It is one of the most popular deep
learning frameworks and is very active in open-source development.

Support overview
================================================================================

- The ROCm-supported version of TensorFlow is maintained in the official `https://github.com/ROCm/tensorflow-upstream
  <https://github.com/ROCm/tensorflow-upstream>`__ repository, which differs from the
  `https://github.com/tensorflow/tensorflow <https://github.com/tensorflow/tensorflow>`__ upstream repository.

- To get started and install TensorFlow on ROCm, use the prebuilt :ref:`Docker images <tensorflow-docker-compat>`,
  which include ROCm, TensorFlow, and all required dependencies.

- See the :doc:`ROCm TensorFlow installation guide <rocm-install-on-linux:install/3rd-party/tensorflow-install>`
  for installation and setup instructions.

- You can also consult the `TensorFlow API versions <https://www.tensorflow.org/versions>`__ list
  for additional context.

Version support
--------------------------------------------------------------------------------

The `official TensorFlow repository <http://github.com/tensorflow/tensorflow>`__
includes full ROCm support. AMD maintains a TensorFlow `ROCm repository
<http://github.com/rocm/tensorflow-upstream>`__ in order to quickly add bug
fixes, updates, and support for the latest ROCm versions.
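After installing a ROCm build of TensorFlow, a quick sanity check is to list the visible GPUs. This is a minimal sketch, assuming TensorFlow is importable; on a host without a supported GPU the returned list is simply empty:

```python
import tensorflow as tf

# On a working ROCm setup this list contains the detected GPU devices;
# on a CPU-only host it is an empty list rather than an error.
gpus = tf.config.list_physical_devices("GPU")
print("TensorFlow", tf.__version__, "- visible GPUs:", gpus)
```

An empty list on a machine that does have an AMD GPU usually points at a driver or ROCm installation issue rather than a TensorFlow one.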

.. _tensorflow-docker-compat:

@@ -2,7 +2,7 @@

.. meta::
   :description: verl compatibility
   :keywords: GPU, verl, deep learning, framework compatibility

.. version-set:: rocm_version latest

@@ -10,24 +10,58 @@

verl compatibility
*******************************************************************************

Volcano Engine Reinforcement Learning for LLMs (`verl <https://verl.readthedocs.io/en/latest/>`__)
is a reinforcement learning framework designed for large language models (LLMs).
verl offers a scalable, open-source fine-tuning solution by using a hybrid programming model
that makes it easy to define and run complex post-training dataflows efficiently.

Its modular APIs separate computation from data, allowing smooth integration with other frameworks.
It also supports flexible model placement across GPUs for efficient scaling on different cluster sizes.
verl achieves high training and generation throughput by building on existing LLM frameworks.
Its 3D-HybridEngine reduces memory use and communication overhead when switching between training
and inference, improving overall performance.

Support overview
================================================================================

- The ROCm-supported version of verl is maintained in the official `https://github.com/ROCm/verl
  <https://github.com/ROCm/verl>`__ repository, which differs from the
  `https://github.com/volcengine/verl <https://github.com/volcengine/verl>`__ upstream repository.

- To get started and install verl on ROCm, use the prebuilt :ref:`Docker image <verl-docker-compat>`,
  which includes ROCm, verl, and all required dependencies.

- See the :doc:`ROCm verl installation guide <rocm-install-on-linux:install/3rd-party/verl-install>`
  for installation and setup instructions.

- You can also consult the upstream `verl documentation <https://verl.readthedocs.io/en/latest/>`__
  for additional context.

Version support
--------------------------------------------------------------------------------

verl is supported on `ROCm 6.2.0 <https://repo.radeon.com/rocm/apt/6.2/>`__.

Supported devices
--------------------------------------------------------------------------------

**Officially Supported**: AMD Instinct™ MI300X

.. _verl-recommendations:

Use cases and recommendations
================================================================================

* The benefits of verl in large-scale reinforcement learning from human feedback
  (RLHF) are discussed in the `Reinforcement Learning from Human Feedback on AMD
  GPUs with verl and ROCm Integration <https://rocm.blogs.amd.com/artificial-intelligence/verl-large-scale/README.html>`__
  blog. The blog post outlines how the Volcano Engine Reinforcement Learning
  (verl) framework integrates with the AMD ROCm platform to optimize training on
  Instinct™ MI300X GPUs. The guide details the process of building a Docker image,
  setting up single-node and multi-node training environments, and highlights
  performance benchmarks demonstrating improved throughput and convergence accuracy.
  This resource serves as a comprehensive starting point for deploying verl on AMD GPUs,
  facilitating efficient RLHF training workflows.

.. _verl-supported_features:

@@ -61,8 +95,10 @@ Docker image compatibility

      <i class="fab fa-docker"></i>

AMD validates and publishes ready-made `verl Docker images <https://hub.docker.com/r/rocm/verl/tags>`_
with ROCm backends on Docker Hub. The following Docker image tag and associated inventories
represent the latest verl version from the official Docker Hub.
Click |docker-icon| to view the image on Docker Hub.

.. list-table::
    :header-rows: 1