diff --git a/.wordlist.txt b/.wordlist.txt
index 0cb68a4aa..6835092e8 100644
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -805,6 +805,7 @@ reStructuredText
 redirections
 refactorization
 reformats
+reinforcement
 repo
 repos
 representativeness
@@ -812,6 +813,7 @@ req
 resampling
 rescaling
 reusability
+RLHF
 roadmap
 roc
 rocAL
@@ -927,6 +929,7 @@ vectorize
 vectorized
 vectorizer
 vectorizes
+verl
 virtualize
 virtualized
 vjxb
diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst
index b6e12c349..1b0d1c453 100644
--- a/docs/compatibility/compatibility-matrix.rst
+++ b/docs/compatibility/compatibility-matrix.rst
@@ -242,6 +242,8 @@ Expand for full historical view of:
     .. [#mi300_610-past-60] **For ROCm 6.1.0** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4.
     .. [#mi300_602-past-60] **For ROCm 6.0.2** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
     .. [#mi300_600-past-60] **For ROCm 6.0.0** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
+    .. [#verl_compat] verl is only supported on ROCm 6.2.0.
+    .. [#kfd_support-past-60] Starting from ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart (assuming hardware support is available in both). For earlier ROCm releases, the compatibility is provided for +/- 2 releases. These are the compatibility combinations that are currently supported.
     .. [#kfd_support-past-60] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases.
        The tested user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix `_.
     .. [#ROCT-rocr-past-60] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.
diff --git a/docs/compatibility/ml-compatibility/verl-compatibility.rst b/docs/compatibility/ml-compatibility/verl-compatibility.rst
new file mode 100644
index 000000000..deb83274a
--- /dev/null
+++ b/docs/compatibility/ml-compatibility/verl-compatibility.rst
@@ -0,0 +1,85 @@
+:orphan:
+
+.. meta::
+   :description: verl compatibility
+   :keywords: GPU, verl compatibility
+
+.. version-set:: rocm_version latest
+
+*******************************************************************************
+verl compatibility
+*******************************************************************************
+
+Volcano Engine Reinforcement Learning for LLMs (verl) is a reinforcement learning framework designed for large language models (LLMs).
+verl offers a scalable, open-source fine-tuning solution optimized for AMD Instinct GPUs with full ROCm support.
+
+* See the `verl documentation `_ for more information about verl.
+* The official verl GitHub repository is `https://github.com/volcengine/verl `_.
+* Use the AMD-validated :ref:`Docker images ` with ROCm and verl preinstalled.
+* See the :doc:`ROCm verl installation guide ` to get started.
+
+.. note::
+
+   verl is supported on ROCm 6.2.0.
+
+
+.. _verl-recommendations:
+
+Use cases and recommendations
+================================================================================
+
+The benefits of verl in large-scale reinforcement learning from human feedback (RLHF) are discussed in the `Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration `_ blog.
+
+.. _verl-docker-compat:
+
+Docker image compatibility
+================================================================================
+
+.. |docker-icon| raw:: html
+
+
+
+AMD validates and publishes ready-made `ROCm verl Docker images `_
+with ROCm backends on Docker Hub. The following Docker image tag and its associated inventory represent the latest verl version available on the official Docker Hub. The Docker image has been validated for `ROCm 6.2.0 `_.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Docker image
+     - verl
+     - Linux
+     - PyTorch
+     - Python
+     - vLLM
+
+   * - .. raw:: html
+
+          rocm/verl
+     - `0.3.0.post0 `_
+     - Ubuntu 20.04
+     - `2.5.0 `_
+     - `3.9.19 `_
+     - `0.6.4 `_
+
+
+Supported features
+===============================================================================
+
+The following table shows verl and ROCm support for GPU-accelerated modules.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Module
+     - Description
+     - verl version
+     - ROCm version
+   * - ``FSDP``
+     - Training engine
+     - 0.3.0.post0
+     - 6.2
+   * - ``vllm``
+     - Inference engine
+     - 0.3.0.post0
+     - 6.2
+
diff --git a/docs/how-to/deep-learning-rocm.rst b/docs/how-to/deep-learning-rocm.rst
index 5886647f7..3ba645df3 100644
--- a/docs/how-to/deep-learning-rocm.rst
+++ b/docs/how-to/deep-learning-rocm.rst
@@ -17,6 +17,7 @@ features for these ROCm-enabled deep learning frameworks.
 * :doc:`PyTorch compatibility <../compatibility/ml-compatibility/pytorch-compatibility>`
 * :doc:`TensorFlow compatibility <../compatibility/ml-compatibility/tensorflow-compatibility>`
 * :doc:`JAX compatibility <../compatibility/ml-compatibility/jax-compatibility>`
+* :doc:`verl compatibility <../compatibility/ml-compatibility/verl-compatibility>`
 * :doc:`Stanford Megatron-LM compatibility <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`
 * :doc:`DGL compatibility <../compatibility/ml-compatibility/dgl-compatibility>`
 
@@ -31,6 +32,7 @@ See the installation instructions to get started.
 * :doc:`PyTorch for ROCm `
 * :doc:`TensorFlow for ROCm `
 * :doc:`JAX for ROCm `
+* :doc:`verl for ROCm `
 * :doc:`Stanford Megatron-LM for ROCm `
 * :doc:`DGL for ROCm `