diff --git a/README.md b/README.md
index bb2e038ea..da7c39f18 100644
--- a/README.md
+++ b/README.md
@@ -50,7 +50,7 @@ The following example shows how to use the repo tool to download the ROCm source
 ```bash
 mkdir -p ~/ROCm/
 cd ~/ROCm/
-export ROCM_VERSION=6.4.0
+export ROCM_VERSION=6.4.1
 ~/bin/repo init -u http://github.com/ROCm/ROCm.git -b roc-6.4.x -m tools/rocm-build/rocm-${ROCM_VERSION}.xml
 ~/bin/repo sync
 ```
@@ -77,7 +77,7 @@ The Build time will reduce significantly if we limit the GPU Architecture/s agai
 mkdir -p ~/WORKSPACE/ # Or any folder name other than WORKSPACE
 cd ~/WORKSPACE/
-export ROCM_VERSION=6.4.0
+export ROCM_VERSION=6.4.1
 ~/bin/repo init -u http://github.com/ROCm/ROCm.git -b roc-6.4.x -m tools/rocm-build/rocm-${ROCM_VERSION}.xml
 ~/bin/repo sync
diff --git a/RELEASE.md b/RELEASE.md
index fd79a0e59..a9125123f 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -24,8 +24,6 @@ The release notes provide a summary of notable changes since the previous ROCm r
 - [ROCm known issues](#rocm-known-issues)
 
-- [ROCm resolved issues](#rocm-resolved-issues)
-
 - [ROCm upcoming changes](#rocm-upcoming-changes)
 
 ```{note}
@@ -69,7 +67,7 @@ ROCm documentation continues to be updated to provide clearer and more comprehen
 * The [Training a model with LLM Foundry](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/training/benchmark-docker/mpt-llm-foundry.html) performance testing guide has been added. This guide describes how to use the preconfigured [ROCm/pytorch-training](https://hub.docker.com/layers/rocm/pytorch-training/v25.5/images/sha256-d47850a9b25b4a7151f796a8d24d55ea17bba545573f0d50d54d3852f96ecde5) training environment and [https://github.com/ROCm/MAD](https://github.com/ROCm/MAD) to test the training performance of the LLM Foundry framework on AMD Instinct MI325X and MI300X accelerators using the [MPT-30B](https://huggingface.co/mosaicml/mpt-30b) model.
 * The [Training a model with PyTorch](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/training/benchmark-docker/pytorch-training.html) performance testing guide has been updated to feature the latest [ROCm/pytorch-training](https://hub.docker.com/layers/rocm/pytorch-training/v25.5/images/sha256-d47850a9b25b4a7151f796a8d24d55ea17bba545573f0d50d54d3852f96ecde5) Docker image (a preconfigured training environment with ROCm and PyTorch). Support for [Llama 3.3 70B](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) has been added.
 * The [Training a model with JAX MaxText](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/training/benchmark-docker/jax-maxtext.html) performance testing guide has been updated to feature the latest [ROCm/jax-training](https://hub.docker.com/layers/rocm/jax-training/maxtext-v25.5/images/sha256-4e0516358a227cae8f552fb866ec07e2edcf244756f02e7b40212abfbab5217b) Docker image (a preconfigured training environment with ROCm, JAX, and [MaxText](https://github.com/AI-Hypercomputer/maxtext)). Support for [Llama 3.3 70B](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) has been added.
-* The [vLLM inference performance testing](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/inference/vllm-benchmark.html?model=pyt_vllm_qwq-32b) guide has been updated to feature the latest [ROCm/vLLM](https://hub.docker.com/layers/rocm/vllm/instinct_main/images/sha256-ad9062dea3483d59dedb17c67f7c49f30eebd6eb37c3fac0a171fb19696cc845) Docker image (a preconfigured environment for inference with ROCm and [vLLM](https://docs.vllm.ai/en/latest/)). Support for the [QwQ-32B](https://huggingface.co/Qwen/QwQ-32B) model has been added.
+* The [vLLM inference performance testing](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/inference/vllm-benchmark.html?model=pyt_vllm_qwq-32b) guide has been updated to feature the latest [ROCm/vLLM](https://hub.docker.com/layers/rocm/vllm/latest/images/sha256-5c8b4436dd0464119d9df2b44c745fadf81512f18ffb2f4b5dc235c71ebe26b4) Docker image (a preconfigured environment for inference with ROCm and [vLLM](https://docs.vllm.ai/en/latest/)). Support for the [QwQ-32B](https://huggingface.co/Qwen/QwQ-32B) model has been added.
 * The [PyTorch inference performance testing](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/inference/pytorch-inference-benchmark.html?model=pyt_clip_inference) guide has been added, featuring the [ROCm/PyTorch](https://hub.docker.com/layers/rocm/pytorch/latest/images/sha256-ab1d350b818b90123cfda31363019d11c0d41a8f12a19e3cb2cb40cf0261137d) Docker image (a preconfigured inference environment with ROCm and PyTorch) with initial support for the [CLIP](https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K) and [Chai-1](https://huggingface.co/chaidiscovery/chai-1) models.
 
 ## Operating system and hardware support changes
@@ -78,7 +76,8 @@ ROCm 6.4.1 introduces support for the RDNA4 architecture-based [Radeon AI PRO
 R9700](https://www.amd.com/en/products/graphics/workstations/radeon-ai-pro/ai-9000-series/amd-radeon-ai-pro-r9700.html),
 [Radeon RX 9070 XT](https://www.amd.com/en/products/graphics/desktops/radeon/9000-series/amd-radeon-rx-9070xt.html), and
 [Radeon RX 9060 XT](https://www.amd.com/en/products/graphics/desktops/radeon/9000-series/amd-radeon-rx-9060xt.html) GPUs
-for compute workloads. For details, see the full list of [Supported GPUs
+for compute workloads. Currently, these GPUs are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.5, and RHEL 9.4.
+For details, see the full list of [Supported GPUs
 (Linux)](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-gpus).
 
 Operating system support remains unchanged in this release.
@@ -390,7 +389,7 @@ Click {fab}`github` to go to the component's source code on GitHub.
 Compilers
 HIPCC
 1.1.1
-<!-- previous table entry; HTML markup elided -->
+<!-- updated table entry; HTML markup elided -->
@@ -562,12 +561,29 @@ issues related to individual components, review the [Detailed component changes]
 Radeon AI PRO R9700 GPUs might hang when running [Stable Diffusion 2.1](https://huggingface.co/stabilityai/stable-diffusion-2-1)
 with batch sizes greater than four. As a workaround, limit batch sizes to four or fewer. This issue
-will be addressed in a future ROCm release.
+will be addressed in a future ROCm release. See [issue #4770](https://github.com/ROCm/ROCm/issues/4770) on GitHub.
 
 ### RCCL MSCCL initialization failure
 
 When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a
 segmentation fault. The recommended workaround is to disable MSCCL with `export RCCL_MSCCL_ENABLE=0`.
-This issue will be fixed in a future ROCm release.
+This issue will be fixed in a future ROCm release. See [issue #4769](https://github.com/ROCm/ROCm/issues/4769) on GitHub.
+
+### AMD SMI CLI: CPER entries not dumped continuously when using the follow flag
+
+When using the `--follow` flag with `amd-smi ras --cper`, CPER entries are not streamed continuously as intended. This will be fixed in an upcoming ROCm release.
+See [issue #4768](https://github.com/ROCm/ROCm/issues/4768) on GitHub.
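+
+For example, the affected invocation combines the command and flag named above (shown here as a usage sketch, not an additional documented example):
+
+```bash
+amd-smi ras --cper --follow
+```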
+
+### ROCm SMI uninstallation issue on RHEL and SLES
+
+`rocm-smi-lib` is not uninstalled and remains orphaned on RHEL and SLES systems when:
+
+* [Uninstalling ROCm using the AMDGPU installer](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/amdgpu-install.html#uninstalling-rocm) with `amdgpu-install --uninstall`.
+
+* [Uninstalling via package manager](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/install-methods/package-manager/package-manager-rhel.html#uninstall-rocm-packages)
+  with `dnf remove rocm-core` on RHEL or `zypper remove rocm-core` on SLES.
+
+As a workaround, manually remove the `rocm-smi-lib` package using `sudo dnf remove rocm-smi-lib` or `sudo zypper remove rocm-smi-lib`.
+See [issue #4767](https://github.com/ROCm/ROCm/issues/4767) on GitHub.
 
 ## ROCm upcoming changes
 
@@ -638,4 +654,4 @@ There are a number of upcoming changes planned for HIP runtime API in an upcomin
 that are not backward compatible with prior releases. Most of these changes increase alignment between HIP and CUDA APIs or
 behavior. Some of the upcoming changes are to clean up header files, remove namespace collision, and have a clear separation between
-`hipRTC` and HIP runtime. For more information refer to [HIP Upcoming changes](https://rocm.docs.amd.com/en/latest/about/release-notes.html#id15).
+`hipRTC` and HIP runtime. For more information refer to [HIP Upcoming changes](https://rocm.docs.amd.com/en/docs-6.4.0/about/release-notes.html#id15).
diff --git a/default.xml b/default.xml
index efebae568..09816f9a9 100644
--- a/default.xml
+++ b/default.xml
@@ -1,7 +1,7 @@
-<!-- previous manifest line; XML elided -->
+<!-- updated manifest line; XML elided -->
diff --git a/docs/compatibility/compatibility-matrix-historical-6.0.csv b/docs/compatibility/compatibility-matrix-historical-6.0.csv
index 56a3afd06..2e9fcfbe5 100644
--- a/docs/compatibility/compatibility-matrix-historical-6.0.csv
+++ b/docs/compatibility/compatibility-matrix-historical-6.0.csv
@@ -17,8 +17,8 @@ ROCm Version,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.5,
 ,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3
 ,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2
 ,.. _gpu-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,
-:doc:`GPU / LLVM target `,gfx1201,,,,,,,,,,,,,,,
-,gfx1200,,,,,,,,,,,,,,,
+:doc:`GPU / LLVM target `,gfx1201 [#RDNA4-OS-past-60]_,,,,,,,,,,,,,,,
+,gfx1200 [#RDNA4-OS-past-60]_,,,,,,,,,,,,,,,
 ,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100
 ,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030
 ,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942 [#mi300_624-past-60]_,gfx942 [#mi300_622-past-60]_,gfx942 [#mi300_621-past-60]_,gfx942 [#mi300_620-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_611-past-60]_, gfx942 [#mi300_610-past-60]_, gfx942 [#mi300_602-past-60]_, gfx942 [#mi300_600-past-60]_
diff --git a/docs/compatibility/compatibility-matrix.rst b/docs/compatibility/compatibility-matrix.rst
index e9c6bbc7c..f852a18fa 100644
--- a/docs/compatibility/compatibility-matrix.rst
+++ b/docs/compatibility/compatibility-matrix.rst
@@ -42,8 +42,8 @@ compatibility and system requirements.
 ,RDNA3,RDNA3,RDNA3
 ,RDNA2,RDNA2,RDNA2
 ,.. _gpu-support-compatibility-matrix:,,
-:doc:`GPU / LLVM target `,gfx1201,,
-,gfx1200,,
+:doc:`GPU / LLVM target `,gfx1201 [#RDNA4-OS]_,,
+,gfx1200 [#RDNA4-OS]_,,
 ,gfx1100,gfx1100,gfx1100
 ,gfx1030,gfx1030,gfx1030
 ,gfx942,gfx942,gfx942
@@ -156,6 +156,7 @@ compatibility and system requirements.
 .. [#mi300_620] **For ROCm 6.2.0** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
 .. [#kfd_support] Starting from ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart (assuming hardware support is available in both). For earlier ROCm releases, the compatibility is provided for +/- 2 releases. These are the compatibility combinations that are currently supported.
 .. [#ROCT-rocr] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.
+.. [#RDNA4-OS] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), and Radeon RX 9060 XT (gfx1200) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.5, and RHEL 9.4.
 
 .. _OS-kernel-versions:
@@ -234,3 +235,4 @@ Expand for full historical view of:
 .. [#mi300_600-past-60] **For ROCm 6.0.0** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
 .. [#kfd_support-past-60] Starting from ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart (assuming hardware support is available in both). For earlier ROCm releases, the compatibility is provided for +/- 2 releases. These are the compatibility combinations that are currently supported.
 .. [#ROCT-rocr-past-60] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.
+.. [#RDNA4-OS-past-60] Radeon AI PRO R9700, Radeon RX 9070 XT (gfx1201), and Radeon RX 9060 XT (gfx1200) are supported only on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.5, and RHEL 9.4.
diff --git a/tools/autotag/components.xml b/tools/autotag/components.xml
index 5aa3c3ad1..4a4d15557 100644
--- a/tools/autotag/components.xml
+++ b/tools/autotag/components.xml
@@ -1,7 +1,7 @@
-<!-- previous component entry; XML elided -->
+<!-- updated component entry; XML elided -->
diff --git a/tools/rocm-build/rocm-6.4.1.xml b/tools/rocm-build/rocm-6.4.1.xml
new file mode 100644
index 000000000..08433d4a6
--- /dev/null
+++ b/tools/rocm-build/rocm-6.4.1.xml
@@ -0,0 +1,79 @@
+<!-- 79 lines of repo manifest XML elided -->
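The manifest XML in the hunks above (`default.xml` and the new `tools/rocm-build/rocm-6.4.1.xml`) is elided and cannot be reconstructed here. For orientation only, both files are manifests for the `repo` tool invoked in the README changes; the sketch below shows the general shape using element names from the standard repo manifest schema, with hypothetical remote, revision, and project values rather than the actual 6.4.1 pin list:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<manifest>
  <!-- hypothetical remote; values here are placeholders, not the real 6.4.1 entries -->
  <remote name="rocm" fetch="https://github.com/ROCm/" />
  <!-- defaults applied to every <project> that does not override them -->
  <default remote="rocm" revision="roc-6.4.x" />
  <!-- one <project> element per component repository checked out by `repo sync` -->
  <project name="HIP" path="HIP" />
  <project name="llvm-project" path="llvm-project" />
</manifest>
```

Passing `-m tools/rocm-build/rocm-${ROCM_VERSION}.xml` to `repo init`, as in the README changes above, selects this file as the manifest from which `repo sync` checks out each listed project.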