mirror of
https://github.com/ROCm/ROCm.git
synced 2026-01-08 22:28:06 -05:00
Add ROCm 6.2.2 release notes (#178)
* rm extra file * sync release.md w/ public version * update versions to 6.2.2 * update version, release, and ga date * Revert "update versions to 6.2.2" This reverts commit c08e51d1acc8e773deefef96946c31fd368a09dd. * stack 6.2.2 and 6.2.1 * fix word * bump 6.2.1 headings * add explanatory note remove 'please' rm caps * add fixed issue highlight * update mode-2 fix * add clarification * add ubuntu 24.04.1 note * update autotag templates for 6.2.x * rm Ubuntu 24.04.1 from 6.2.2 (have it in 6.2.1 only) * add horizontal rule for visual separation between 6.2.2 and 6.2.1 release notes * remove extra templ * spellcheck * Docs: Add Ubuntu 24.04.1 (#3801) * add ubuntu 24.04.1 * add 24.04.1 to bottom os section * fix heading and template * Update compatibility-matrix.rst for OpenMP version * Update compatibility-matrix-historical-6.0.csv for OpenMP version * rm ubuntu 24.04.1 from 6.2.0 * Update docs/compatibility/compatibility-matrix.rst Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> * rm duplicate ubuntu in historical --------- Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com> --------- Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
This commit is contained in:
@@ -53,6 +53,7 @@ CSC
|
||||
CSE
|
||||
CSV
|
||||
CSn
|
||||
CTest
|
||||
CTests
|
||||
CU
|
||||
CUDA
|
||||
@@ -387,6 +388,7 @@ UAC
|
||||
UC
|
||||
UCC
|
||||
UCX
|
||||
UE
|
||||
UIF
|
||||
UMC
|
||||
USM
|
||||
@@ -653,6 +655,7 @@ quasirandom
|
||||
queueing
|
||||
rccl
|
||||
rdc
|
||||
rdma
|
||||
reStructuredText
|
||||
redirections
|
||||
refactorization
|
||||
|
||||
161
RELEASE.md
161
RELEASE.md
@@ -1,6 +1,43 @@
|
||||
# ROCm 6.2.1 release notes
|
||||
# ROCm 6.2.2 release notes
|
||||
|
||||
The release notes provide a summary of notable changes since the previous ROCm release.
|
||||
These release notes provide a summary of notable changes since the previous ROCm release.
|
||||
|
||||
```{note}
|
||||
As ROCm 6.2.2 was released shortly after 6.2.1, the changes between these versions
|
||||
are minimal. For a comprehensive overview of recent updates, the ROCm 6.2.1 release
|
||||
notes are appended to the end of this document.
|
||||
|
||||
For detailed information about the changes in ROCm 6.2.1, refer to the appended
|
||||
section: [ROCm 6.2.1 release notes](rocm-6-2-1-release-notes).
|
||||
```
|
||||
|
||||
The [Compatibility matrix](https://rocm.docs.amd.com/en/docs-6.2.2/compatibility/compatibility-matrix.html)
|
||||
provides the full list of supported hardware, operating systems, ecosystems, third-party components, and ROCm components
|
||||
for each ROCm release.
|
||||
|
||||
Release notes for previous ROCm releases are available in earlier versions of the documentation.
|
||||
See the [ROCm documentation release history](https://rocm.docs.amd.com/en/latest/release/versions.html).
|
||||
|
||||
## Release highlights
|
||||
|
||||
The following is a significant fix introduced in ROCm 6.2.2.
|
||||
|
||||
### Fixed Instinct MI300X error recovery failure
|
||||
|
||||
Improved the reliability of AMD Instinct MI300X accelerators in scenarios involving
|
||||
uncorrectable errors. Previously, error recovery did not occur as expected,
|
||||
potentially leaving the system in an undefined state. This fix ensures that error
|
||||
recovery functions as expected, maintaining system stability.
|
||||
|
||||
See the [original issue](#instinct-mi300x-gpu-recovery-failure-on-uncorrectable-errors)
|
||||
noted in the ROCm 6.2.1 release notes.
|
||||
|
||||
---
|
||||
|
||||
## ROCm 6.2.1 release notes
|
||||
|
||||
The ROCm 6.2.1 release notes document newly added ecosystem support, ROCm Offline Installer Creator updates,
|
||||
and improvements to several ROCm libraries and tools.
|
||||
|
||||
- [Release highlights](release-highlights)
|
||||
|
||||
@@ -14,55 +51,52 @@ The release notes provide a summary of notable changes since the previous ROCm r
|
||||
|
||||
- [ROCm upcoming changes](rocm-upcoming-changes)
|
||||
|
||||
The [Compatibility matrix](https://rocm.docs.amd.com/en/docs-6.2.1/compatibility/compatibility-matrix.html)
|
||||
provides the full list of supported hardware, operating systems, ecosystems, third-party components, and ROCm components for each ROCm release.
|
||||
|
||||
Release notes for previous ROCm releases are available in earlier versions of the documentation.
|
||||
See the [ROCm documentation release history](https://rocm.docs.amd.com/en/latest/release/versions.html).
|
||||
|
||||
## Release highlights
|
||||
### Release highlights
|
||||
|
||||
The following are notable new features and improvements in ROCm 6.2.1. For changes to individual components, see [Detailed component changes](#detailed-component-changes).
|
||||
|
||||
### rocAL version change
|
||||
#### rocAL major version change
|
||||
|
||||
The version of rocAL has been updated to 2.0.0. Applications built using rocAL 1.0.0 must be recompiled to work with rocAL 2.0.0. See [the rocAL detailed changes](#rocal-2-0-0) for more information.
|
||||
The new version of rocAL introduces many new features, but does not modify any of the existing public API functions. However, the version number was incremented from 1.3 to 2.0.
|
||||
Applications linked to version 1.3 must be recompiled to link against version 2.0.
|
||||
|
||||
### New support for FBGEMM (Facebook General Matrix Multiplication)
|
||||
See [the rocAL detailed changes](#rocal-2-0-0) for more information.
|
||||
|
||||
#### New support for FBGEMM (Facebook General Matrix Multiplication)
|
||||
|
||||
As of ROCm 6.2.1, ROCm supports Facebook General Matrix Multiplication (FBGEMM) and the related FBGEMM_GPU library.
|
||||
|
||||
FBGEMM is a low-precision, high-performance CPU kernel library for convolution and matrix multiplication. It is used for server-side inference and as a back end for PyTorch quantized operators. FBGEMM_GPU includes a collection of PyTorch GPU operator libraries for training and inference. For more information, see the ROCm [Model acceleration libraries guide](https://rocm.docs.amd.com/en/6.2.1/how-to/llm-fine-tuning-optimization/model-acceleration-libraries.html)
|
||||
FBGEMM is a low-precision, high-performance CPU kernel library for convolution and matrix multiplication. It is used for server-side inference and as a back end for PyTorch quantized operators. FBGEMM_GPU includes a collection of PyTorch GPU operator libraries for training and inference. For more information, see the ROCm [Model acceleration libraries guide](https://rocm.docs.amd.com/en/docs-6.2.1/how-to/llm-fine-tuning-optimization/model-acceleration-libraries.html)
|
||||
and [PyTorch's FBGEMM GitHub repository](https://github.com/pytorch/FBGEMM).
|
||||
|
||||
### ROCm Offline Installer Creator changes
|
||||
#### ROCm Offline Installer Creator changes
|
||||
|
||||
The [ROCm Offline Installer Creator 6.2.1](https://rocm.docs.amd.com/projects/install-on-linux/en/6.2.1/install/rocm-offline-installer.html) introduces several new features and improvements including:
|
||||
The [ROCm Offline Installer Creator 6.2.1](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.2.1/install/rocm-offline-installer.html) introduces several new features and improvements including:
|
||||
|
||||
* Logging support for create and install logs
|
||||
* More stringent checks for Linux versions and distributions
|
||||
* Updated prerequisite repositories
|
||||
* Fixed CTest issues
|
||||
|
||||
### ROCm documentation changes
|
||||
#### ROCm documentation changes
|
||||
|
||||
There have been no changes to supported hardware or operating systems from ROCm 6.2.0 to ROCm 6.2.1.
|
||||
|
||||
* The Programming Model Reference and Understanding the Programming Model topics in HIP have been consolidated into one topic,
|
||||
[HIP programming model (conceptual)](https://rocm.docs.amd.com/projects/HIP/en/6.2.1/understand/programming_model.html).
|
||||
* The [HIP virtual memory management](https://rocm.docs.amd.com/projects/HIP/en/6.2.1/how-to/virtual_memory.html) and [HIP virtual memory management API](https://rocm.docs.amd.com/projects/HIP/en/6.2.1/reference/virtual_memory_reference.html) topics have been added.
|
||||
[HIP programming model (conceptual)](https://rocm.docs.amd.com/projects/HIP/en/docs-6.2.1/understand/programming_model.html).
|
||||
* The [HIP virtual memory management](https://rocm.docs.amd.com/projects/HIP/en/docs-6.2.1/how-to/virtual_memory.html) and [HIP virtual memory management API](https://rocm.docs.amd.com/projects/HIP/en/docs-6.2.1/reference/virtual_memory_reference.html) topics have been added.
|
||||
|
||||
```{note}
|
||||
The ROCm documentation, like all ROCm projects, is open source and available on GitHub. To contribute to ROCm documentation, see the [ROCm documentation contribution guidelines](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
|
||||
```
|
||||
|
||||
## Operating system and hardware support changes
|
||||
### Operating system and hardware support changes
|
||||
|
||||
There are no changes to supported hardware or operating systems from ROCm 6.2.0 to ROCm 6.2.1.
|
||||
ROCm 6.2.1 adds support for Ubuntu 24.04.1 (kernel: 6.8 [GA]).
|
||||
|
||||
See the [Compatibility matrix](https://rocm.docs.amd.com/en/docs-6.2.1/compatibility/compatibility-matrix.html) for the full list of supported operating systems and hardware architectures.
|
||||
|
||||
## ROCm components
|
||||
### ROCm components
|
||||
|
||||
The following table lists the versions of ROCm components for ROCm 6.2.1, including any version changes from 6.2.0 to 6.2.1.
|
||||
|
||||
@@ -140,7 +174,7 @@ Click the component's updated version to go to a detailed list of its changes. C
|
||||
<th rowspan="1"></th>
|
||||
<th rowspan="1">Communication</th>
|
||||
<td><a href="https://rocm.docs.amd.com/projects/rccl/en/docs-6.2.1">RCCL</a></td>
|
||||
<td>2.20.5</td>
|
||||
<td>2.20.5 ⇒ <a href="#rccl-2-20-5">2.20.5</a></td>
|
||||
<td><a href="https://github.com/ROCm/rccl/releases/tag/rocm-6.2.1"><i
|
||||
class="fab fa-github fa-lg"></i></a></td>
|
||||
</tr>
|
||||
@@ -417,37 +451,37 @@ Click the component's updated version to go to a detailed list of its changes. C
|
||||
</table>
|
||||
</div>
|
||||
|
||||
## Detailed component changes
|
||||
### Detailed component changes
|
||||
|
||||
The following sections describe key changes to ROCm components.
|
||||
|
||||
### **AMD SMI** (24.6.3)
|
||||
#### **AMD SMI** (24.6.3)
|
||||
|
||||
#### Changes
|
||||
##### Changes
|
||||
|
||||
* Added `amd-smi static --ras` on Guest VMs. Guest VMs can view enabled/disabled RAS features on Host cards.
|
||||
|
||||
#### Removals
|
||||
##### Removals
|
||||
|
||||
* Removed `amd-smi metric --ecc` & `amd-smi metric --ecc-blocks` on Guest VMs. Guest VMs do not support getting current ECC counts from the Host cards.
|
||||
|
||||
#### Resolved issues
|
||||
##### Resolved issues
|
||||
|
||||
* Fixed TypeError in `amd-smi process -G`.
|
||||
* Updated CLI error strings to handle empty and invalid GPU/CPU inputs.
|
||||
* Fixed Guest VM showing passthrough options.
|
||||
* Fixed firmware formatting where leading 0s were missing.
|
||||
|
||||
### **HIP** (6.2.1)
|
||||
#### **HIP** (6.2.1)
|
||||
|
||||
#### Resolved issues
|
||||
##### Resolved issues
|
||||
|
||||
* Soft hang when using `AMD_SERIALIZE_KERNEL`
|
||||
* Memory leak in `hipIpcCloseMemHandle`
|
||||
|
||||
### **HIPIFY** (18.0.0)
|
||||
#### **HIPIFY** (18.0.0)
|
||||
|
||||
#### Changes
|
||||
##### Changes
|
||||
|
||||
* Added CUDA 12.5.1 support
|
||||
* Added cuDNN 9.2.1 support
|
||||
@@ -455,17 +489,32 @@ The following sections describe key changes to ROCm components.
|
||||
* Added `hipBLAS` 64-bit APIs support
|
||||
* Added Support for math constants `math_constants.h`
|
||||
|
||||
### **Omnitrace** (1.11.2)
|
||||
#### **Omnitrace** (1.11.2)
|
||||
|
||||
#### Known Issues
|
||||
##### Known issues
|
||||
|
||||
* Perfetto can no longer open Omnitrace proto files. Loading Perfetto trace output `.proto` files in the latest version of `ui.perfetto.dev` can result in a dialog with the message, "Oops, something went wrong! Please file a bug." The information in the dialog will refer to an "Unknown field type." The workaround is to open the files with the previous version of the Perfetto UI found at [https://ui.perfetto.dev/v46.0-35b3d9845/#!/](https://ui.perfetto.dev/v46.0-35b3d9845/#!/).
|
||||
Perfetto can no longer open Omnitrace proto files. Loading Perfetto trace output `.proto` files in the latest version of `ui.perfetto.dev` can result in a dialog with the message, "Oops, something went wrong! Please file a bug." The information in the dialog will refer to an "Unknown field type." The workaround is to open the files with the previous version of the Perfetto UI found at [https://ui.perfetto.dev/v46.0-35b3d9845/#!/](https://ui.perfetto.dev/v46.0-35b3d9845/#!/).
|
||||
|
||||
### **rocAL** (2.0.0)
|
||||
See [issue #3767](https://github.com/ROCm/ROCm/issues/3767) on GitHub.
|
||||
|
||||
#### Changes
|
||||
#### **RCCL** (2.20.5)
|
||||
|
||||
##### Known issues
|
||||
|
||||
On systems running Linux kernel 6.8.0, such as Ubuntu 24.04, Direct Memory Access (DMA) transfers between the GPU and NIC are disabled and impacts multi-node RCCL performance.
|
||||
This issue was reproduced with RCCL 2.20.5 (ROCm 6.2.0 and 6.2.1) on systems with Broadcom Thor-2 NICs and affects other systems with RoCE networks using Linux 6.8.0 or newer.
|
||||
Older RCCL versions are also impacted.
|
||||
|
||||
This issue will be addressed in a future ROCm release.
|
||||
|
||||
See [issue #3772](https://github.com/ROCm/ROCm/issues/3772) on GitHub.
|
||||
|
||||
#### **rocAL** (2.0.0)
|
||||
|
||||
##### Changes
|
||||
|
||||
* Version updated from 1.0.0 to 2.0.0. Applications built using rocAL 1.0.0 must be recompiled to work with rocAL 2.0.0.
|
||||
* The new version of rocAL introduces many new features, but does not modify any of the existing public API functions.However, the version number was incremented from 1.3 to 2.0.
|
||||
Applications linked to version 1.3 must be recompiled to link against version 2.0.
|
||||
* Added development and test packages.
|
||||
* Added C++ rocAL audio unit test and Python script to run and compare the outputs.
|
||||
* Added Python support for audio decoders.
|
||||
@@ -483,37 +532,37 @@ The following sections describe key changes to ROCm components.
|
||||
* Image to tensor updates
|
||||
* ROCm install - use case graphics removed
|
||||
|
||||
#### Known issues
|
||||
##### Known issues
|
||||
|
||||
* Dependencies are not installed with the rocAL package installer. Dependencies must be installed with the prerequisite setup script provided. See the [rocAL README on GitHub](https://github.com/ROCm/rocAL/blob/docs/6.2.1/README.md#prerequisites-setup-script) for details.
|
||||
|
||||
### **rocBLAS** (4.2.1)
|
||||
#### **rocBLAS** (4.2.1)
|
||||
|
||||
#### Removals
|
||||
##### Removals
|
||||
|
||||
* Removed Device_Memory_Allocation.pdf link in documentation.
|
||||
|
||||
#### Resolved issues
|
||||
##### Resolved issues
|
||||
|
||||
* Fixed error/warning message during `rocblas_set_stream()` call.
|
||||
|
||||
### **rocFFT** (1.0.29)
|
||||
#### **rocFFT** (1.0.29)
|
||||
|
||||
#### Optimizations
|
||||
##### Optimizations
|
||||
|
||||
* Implemented 1D kernels for factorizable sizes greater than 1024.
|
||||
* Implemented 1D kernels for factorizable sizes less than 1024.
|
||||
|
||||
### **ROCm SMI** (7.3.0)
|
||||
#### **ROCm SMI** (7.3.0)
|
||||
|
||||
#### Optimizations
|
||||
##### Optimizations
|
||||
|
||||
* Improved handling of UnicodeEncodeErrors with non UTF-8 locales. Non UTF-8 locales were causing crashes on UTF-8 special characters.
|
||||
|
||||
#### Resolved issues
|
||||
##### Resolved issues
|
||||
|
||||
* Fixed an issue where the Compute Partition tests segfaulted when AMDGPU was loaded with optional parameters.
|
||||
|
||||
#### Known issues
|
||||
##### Known issues
|
||||
|
||||
* When setting CPX as a partition mode, there is a DRM node limit of 64. This is a known limitation when multiple drivers are using the DRM nodes. The `ls /sys/class/drm` command can be used to see the number of DRM nodes, and the following steps can be used to remove unnecessary drivers:
|
||||
|
||||
@@ -521,18 +570,18 @@ The following sections describe key changes to ROCm components.
|
||||
2. Remove any unnecessary drivers using `rmmod`. For example, to remove an AST driver, run `sudo rmmod ast`.
|
||||
3. Reload AMDGPU using `modprobe`: `sudo modprobe amdgpu`.
|
||||
|
||||
### **rocPRIM** (3.2.1)
|
||||
#### **rocPRIM** (3.2.1)
|
||||
|
||||
#### Optimizations
|
||||
##### Optimizations
|
||||
|
||||
* Improved performance of `block_reduce_warp_reduce` when warp size equals block size.
|
||||
|
||||
## ROCm known issues
|
||||
### ROCm known issues
|
||||
|
||||
ROCm known issues are tracked on [GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). Known issues related to
|
||||
individual components are listed in the [Detailed component changes](detailed-component-changes) section.
|
||||
|
||||
### Instinct MI300X GPU recovery failure on uncorrectable errors
|
||||
#### Instinct MI300X GPU recovery failure on uncorrectable errors
|
||||
|
||||
For the AMD Instinct MI300X accelerator, GPU recovery resets triggered by uncorrectable errors (UE) might not complete
|
||||
successfully, which can result in the system being left in an undefined state. A system reboot is needed to recover from
|
||||
@@ -540,14 +589,16 @@ this state. Additionally, error logging might fail in these situations, hinderin
|
||||
|
||||
This issue is under investigation and will be resolved in a future ROCm release.
|
||||
|
||||
## ROCm upcoming changes
|
||||
See [issue #3766](https://github.com/ROCm/ROCm/issues/3766) on GitHub.
|
||||
|
||||
### ROCm upcoming changes
|
||||
|
||||
The following changes to the ROCm software stack are anticipated for future releases.
|
||||
|
||||
### rocm-llvm-alt
|
||||
#### rocm-llvm-alt
|
||||
|
||||
The `rocm-llvm-alt` package will be removed in an upcoming release. Users relying on the functionality provided by the closed-source compiler should transition to the open-source compiler. Once the `rocm-llvm-alt` package is removed, any compilation requesting functionality provided by the closed-source compiler will result in a Clang warning: "*[AMD] proprietary optimization compiler has been removed*".
|
||||
|
||||
### rccl-rdma-sharp-plugins
|
||||
#### rccl-rdma-sharp-plugins
|
||||
|
||||
The RCCL plugin package, `rccl-rdma-sharp-plugins`, will be removed in an upcoming ROCm release.
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
ROCm Version,6.2.1,6.2.0, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
|
||||
:ref:`Operating Systems & kernels <OS-kernel-versions>`,Ubuntu 24.04,Ubuntu 24.04,,,,,
|
||||
:ref:`Operating Systems & kernels <OS-kernel-versions>`,"Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,
|
||||
,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2"
|
||||
,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
|
||||
,"RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2"
|
||||
@@ -106,9 +106,9 @@ ROCm Version,6.2.1,6.2.0, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
|
||||
COMPILERS,.. _compilers-support-compatibility-matrix-past-60:,,,,,,
|
||||
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0
|
||||
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
|
||||
`Flang <https://github.com/ROCm/flang>`_,18.0.0.24332,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
|
||||
:doc:`llvm-project <llvm-project:index>`,18.0.0.24332,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
|
||||
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.24332,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
|
||||
`Flang <https://github.com/ROCm/flang>`_,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
|
||||
:doc:`llvm-project <llvm-project:index>`,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
|
||||
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
|
||||
,,,,,,,
|
||||
RUNTIMES,.. _runtime-support-compatibility-matrix-past-60:,,,,,,
|
||||
:doc:`AMD CLR <hip:understand/amd_clr>`,6.2.41134,6.2.41133,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
|
||||
|
||||
|
@@ -20,7 +20,7 @@ You can also refer to the :ref:`past versions of ROCm compatibility matrix<past-
|
||||
:header: "ROCm Version", "6.2.1", "6.2.0", "6.1.0"
|
||||
:stub-columns: 1
|
||||
|
||||
:ref:`Operating Systems & kernels <OS-kernel-versions>`,Ubuntu 24.04,Ubuntu 24.04,
|
||||
:ref:`Operating Systems & kernels <OS-kernel-versions>`,"Ubuntu 24.04.1, 24.04",Ubuntu 24.04,
|
||||
,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.4, 22.04.3"
|
||||
,,,"Ubuntu 20.04.6, 20.04.5"
|
||||
,"RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4 [#red-hat94]_, 9.3, 9.2"
|
||||
@@ -127,9 +127,9 @@ You can also refer to the :ref:`past versions of ROCm compatibility matrix<past-
|
||||
COMPILERS,.. _compilers-support-compatibility-matrix:,,
|
||||
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,0.5.0
|
||||
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.0.0
|
||||
`Flang <https://github.com/ROCm/flang>`_,18.0.0.24332,18.0.0.24232,17.0.0.24103
|
||||
:doc:`llvm-project <llvm-project:index>`,18.0.0.24332,18.0.0.24232,17.0.0.24103
|
||||
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.24332,18.0.0.24232,17.0.0.24103
|
||||
`Flang <https://github.com/ROCm/flang>`_,18.0.0.24355,18.0.0.24232,17.0.0.24103
|
||||
:doc:`llvm-project <llvm-project:index>`,18.0.0.24355,18.0.0.24232,17.0.0.24103
|
||||
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.24355,18.0.0.24232,17.0.0.24103
|
||||
,,,
|
||||
RUNTIMES,.. _runtime-support-compatibility-matrix:,,
|
||||
:doc:`AMD CLR <hip:understand/amd_clr>`,6.2.41134,6.2.41133,6.1.40091
|
||||
@@ -160,7 +160,8 @@ Use this look up table to confirm which operating system and kernel versions are
|
||||
:widths: 40, 20, 40
|
||||
:stub-columns: 1
|
||||
|
||||
`Ubuntu <https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle>`_, 24.04, "6.8 GA"
|
||||
`Ubuntu <https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle>`_, 24.04.1, "6.8 GA"
|
||||
, 24.04, "6.8 GA"
|
||||
`Ubuntu <https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle>`_, 22.04.05, "5.15 GA, 6.8 HWE"
|
||||
, 22.04.04, "5.15 GA, 6.5 HWE"
|
||||
, 22.04.03, "5.15 GA, 6.2 HWE"
|
||||
|
||||
@@ -30,16 +30,15 @@ if os.environ.get("READTHEDOCS", "") == "True":
|
||||
project = "ROCm Documentation"
|
||||
author = "Advanced Micro Devices, Inc."
|
||||
copyright = "Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved."
|
||||
version = "6.2.1"
|
||||
release = "6.2.1"
|
||||
version = "6.2.2"
|
||||
release = "6.2.2"
|
||||
setting_all_article_info = True
|
||||
all_article_info_os = ["linux", "windows"]
|
||||
all_article_info_author = ""
|
||||
|
||||
# pages with specific settings
|
||||
article_pages = [
|
||||
{"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-09-20"},
|
||||
{"file": "about/changelog", "os": ["linux", "windows"], "date": "2024-09-20"},
|
||||
{"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-09-24"},
|
||||
{"file": "how-to/deep-learning-rocm", "os": ["linux"]},
|
||||
{"file": "how-to/rocm-for-ai/index", "os": ["linux"]},
|
||||
{"file": "how-to/rocm-for-ai/install", "os": ["linux"]},
|
||||
|
||||
@@ -1,3 +1,28 @@
|
||||
The release notes provide a comprehensive summary of changes since the previous ROCm release.
|
||||
|
||||
- [Release highlights](release-highlights)
|
||||
|
||||
- [Operating system and hardware support changes](operating-system-and-hardware-support-changes)
|
||||
|
||||
- [ROCm components versioning](rocm-components)
|
||||
|
||||
- [Detailed component changes](detailed-component-changes)
|
||||
|
||||
- [ROCm known issues](rocm-known-issues)
|
||||
|
||||
- [ROCm upcoming changes](rocm-upcoming-changes)
|
||||
|
||||
The [Compatibility matrix](https://rocm.docs.amd.com/en/docs-6.2.0/compatibility/compatibility-matrix.html)
|
||||
provides an overview of operating system, hardware, ecosystem, and ROCm component support across ROCm releases.
|
||||
|
||||
Release notes for previous ROCm releases are available in earlier versions of the documentation.
|
||||
See the [ROCm documentation release history](https://rocm.docs.amd.com/en/latest/release/versions.html).
|
||||
|
||||
## Release highlights
|
||||
|
||||
This section introduces notable new features and improvements in ROCm 6.2. See the
|
||||
[Detailed component changes](#detailed-component-changes) for individual component changes.
|
||||
|
||||
### New components
|
||||
|
||||
ROCm 6.2.0 introduces the following new components to the ROCm software stack.
|
||||
|
||||
@@ -1 +1,58 @@
|
||||
### Highlights will go here
|
||||
The release notes provide a summary of notable changes since the previous ROCm release.
|
||||
|
||||
- [Release highlights](release-highlights)
|
||||
|
||||
- [Operating system and hardware support changes](operating-system-and-hardware-support-changes)
|
||||
|
||||
- [ROCm components versioning](rocm-components)
|
||||
|
||||
- [Detailed component changes](detailed-component-changes)
|
||||
|
||||
- [ROCm known issues](rocm-known-issues)
|
||||
|
||||
- [ROCm upcoming changes](rocm-upcoming-changes)
|
||||
|
||||
The [Compatibility matrix](https://rocm.docs.amd.com/en/docs-6.2.1/compatibility/compatibility-matrix.html)
|
||||
provides the full list of supported hardware, operating systems, ecosystems, third-party components, and ROCm components for each ROCm release.
|
||||
|
||||
Release notes for previous ROCm releases are available in earlier versions of the documentation.
|
||||
See the [ROCm documentation release history](https://rocm.docs.amd.com/en/latest/release/versions.html).
|
||||
|
||||
## Release highlights
|
||||
|
||||
The following are notable new features and improvements in ROCm 6.2.1. For changes to individual components, see [Detailed component changes](#detailed-component-changes).
|
||||
|
||||
### rocAL major version change
|
||||
|
||||
The new version of rocAL introduces many new features, but does not modify any of the existing public API functions. However, the version number was incremented from 1.3 to 2.0.
|
||||
Applications linked to version 1.3 must be recompiled to link against version 2.0.
|
||||
|
||||
See [the rocAL detailed changes](#rocal-2-0-0) for more information.
|
||||
|
||||
### New support for FBGEMM (Facebook General Matrix Multiplication)
|
||||
|
||||
As of ROCm 6.2.1, ROCm supports Facebook General Matrix Multiplication (FBGEMM) and the related FBGEMM_GPU library.
|
||||
|
||||
FBGEMM is a low-precision, high-performance CPU kernel library for convolution and matrix multiplication. It is used for server-side inference and as a back end for PyTorch quantized operators. FBGEMM_GPU includes a collection of PyTorch GPU operator libraries for training and inference. For more information, see the ROCm [Model acceleration libraries guide](https://rocm.docs.amd.com/en/docs-6.2.1/how-to/llm-fine-tuning-optimization/model-acceleration-libraries.html)
|
||||
and [PyTorch's FBGEMM GitHub repository](https://github.com/pytorch/FBGEMM).
|
||||
|
||||
### ROCm Offline Installer Creator changes
|
||||
|
||||
The [ROCm Offline Installer Creator 6.2.1](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.2.1/install/rocm-offline-installer.html) introduces several new features and improvements including:
|
||||
|
||||
* Logging support for create and install logs
|
||||
* More stringent checks for Linux versions and distributions
|
||||
* Updated prerequisite repositories
|
||||
* Fixed CTest issues
|
||||
|
||||
### ROCm documentation changes
|
||||
|
||||
There have been no changes to supported hardware or operating systems from ROCm 6.2.0 to ROCm 6.2.1.
|
||||
|
||||
* The Programming Model Reference and Understanding the Programming Model topics in HIP have been consolidated into one topic,
|
||||
[HIP programming model (conceptual)](https://rocm.docs.amd.com/projects/HIP/en/docs-6.2.1/understand/programming_model.html).
|
||||
* The [HIP virtual memory management](https://rocm.docs.amd.com/projects/HIP/en/docs-6.2.1/how-to/virtual_memory.html) and [HIP virtual memory management API](https://rocm.docs.amd.com/projects/HIP/en/docs-6.2.1/reference/virtual_memory_reference.html) topics have been added.
|
||||
|
||||
```{note}
|
||||
The ROCm documentation, like all ROCm projects, is open source and available on GitHub. To contribute to ROCm documentation, see the [ROCm documentation contribution guidelines](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
|
||||
```
|
||||
|
||||
31
tools/autotag/templates/highlights/6.2.2.md
Normal file
31
tools/autotag/templates/highlights/6.2.2.md
Normal file
@@ -0,0 +1,31 @@
|
||||
These release notes provide a summary of notable changes since the previous ROCm release.
|
||||
|
||||
```{note}
|
||||
As ROCm 6.2.2 was released shortly after 6.2.1, the changes between these versions
|
||||
are minimal. For a comprehensive overview of recent updates, the ROCm 6.2.1 release
|
||||
notes are appended to the end of this document.
|
||||
|
||||
For detailed information about the changes in ROCm 6.2.1, refer to the appended
|
||||
section: [ROCm 6.2.1 release notes](rocm-6-2-1-release-notes).
|
||||
```
|
||||
|
||||
The [Compatibility matrix](https://rocm.docs.amd.com/en/docs-6.2.2/compatibility/compatibility-matrix.html)
|
||||
provides the full list of supported hardware, operating systems, ecosystems, third-party components, and ROCm components
|
||||
for each ROCm release.
|
||||
|
||||
Release notes for previous ROCm releases are available in earlier versions of the documentation.
|
||||
See the [ROCm documentation release history](https://rocm.docs.amd.com/en/latest/release/versions.html).
|
||||
|
||||
## Release highlights
|
||||
|
||||
The following is a significant fix introduced in ROCm 6.2.2.
|
||||
|
||||
### Fixed Instinct MI300X error recovery failure
|
||||
|
||||
Improved the reliability of AMD Instinct MI300X accelerators in scenarios involving
|
||||
uncorrectable errors. Previously, error recovery did not occur as expected,
|
||||
potentially leaving the system in an undefined state. This fix ensures that error
|
||||
recovery functions as expected, maintaining system stability.
|
||||
|
||||
See the [original issue](#instinct-mi300x-gpu-recovery-failure-on-uncorrectable-errors)
|
||||
noted in the ROCm 6.2.1 release notes.
|
||||
86
tools/autotag/templates/known_issues/6.2.0.md
Normal file
86
tools/autotag/templates/known_issues/6.2.0.md
Normal file
@@ -0,0 +1,86 @@
|
||||
## ROCm known issues
|
||||
|
||||
ROCm known issues are noted on {fab}`github` [GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). For known
|
||||
issues related to individual components, review the [Detailed component changes](detailed-component-changes).
|
||||
|
||||
### Default processor affinity behavior for helper threads
|
||||
|
||||
Processor affinity is a critical setting to ensure that ROCm helper threads run on the correct cores. By default, ROCm
|
||||
helper threads are spawned on all available cores, ignoring the parent thread’s processor affinity. This can lead to
|
||||
threads competing for available cores, which may result in suboptimal performance. This behavior occurs by default if
|
||||
the environment variable `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is not set or is set to `1`. If
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is set to `0`, the ROCr runtime uses the parent process's core affinity mask when
|
||||
creating helper threads. The parent’s affinity mask should then be set to account for the presence of additional threads
|
||||
by ensuring the affinity mask contains enough cores. Depending on the affinity settings of the software environment,
|
||||
batch system, launch commands like `numactl`/`taskset`, or explicit mask manipulation by the application itself, changing
|
||||
the setting may be advantageous to performance.
|
||||
|
||||
To ensure the parent's core affinity mask is honored by the ROCm helper threads, set the
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable as follows:
|
||||
|
||||
```{code} shell
|
||||
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0
|
||||
```
|
||||
|
||||
To ensure ROCm helper threads run on all available cores, set the `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable
|
||||
as follows:
|
||||
|
||||
``` shell
|
||||
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=1
|
||||
```
|
||||
|
||||
Or the default:
|
||||
|
||||
``` shell
|
||||
|
||||
unset HSA_OVERRIDE_CPU_AFFINITY_DEBUG
|
||||
```
|
||||
|
||||
If unsure of the default processor affinity settings for your environment, run the following command from the shell:
|
||||
|
||||
``` shell
|
||||
|
||||
bash -c "echo taskset -p \$\$"
|
||||
```
|
||||
|
||||
See [issue #3493](https://github.com/ROCm/ROCm/issues/3493) on GitHub.
|
||||
|
||||
### Display issues on servers with Instinct MI300-series accelerators when loading AMDGPU driver
|
||||
|
||||
AMD Instinct MI300-series accelerators and third-party GPUs such as the Matrox G200 have an issue impacting video
|
||||
output. The issue was reproduced on a Dell server model PowerEdge XE9680. Servers from other vendors utilizing Matrox
|
||||
G200 cards may be impacted as well. This issue was found with ROCm 6.2.0 but is present in older ROCm versions.
|
||||
|
||||
The AMDGPU driver shipped with ROCm interferes with the operation of the display card video output. On Dell systems,
|
||||
this includes both the local video output and remote access via iDRAC. The display appears blank (black) after loading
|
||||
the `amdgpu` driver modules. Video output impacts both terminal access when running in `runlevel 3` and GUI access when
|
||||
running in `runlevel 5`. Server functionality can still be accessed via SSH or other remote connection methods.
|
||||
|
||||
See [issue #3494](https://github.com/ROCm/ROCm/issues/3494) on GitHub.
|
||||
|
||||
### KFDTest failure on Instinct MI300X with Oracle Linux 8.9
|
||||
|
||||
The `KFDEvictTest.QueueTest` is failing on the MI300X platform during KFD (Kernel Fusion Driver) tests, causing the full
|
||||
suite to not execute properly. This issue is suspected to be hardware-related.
|
||||
|
||||
See [issue #3495](https://github.com/ROCm/ROCm/issues/3495) on GitHub.
|
||||
|
||||
### Bandwidth limitation in gang and non-gang modes on Instinct MI300A
|
||||
|
||||
Expected target peak non-gang performance (~60GB/s) and target peak gang performance (~90GB/s) are not achieved. Both gang
|
||||
and non-gang performance are observed to be limited at 45GB/s.
|
||||
|
||||
This issue will be addressed in a future ROCm release.
|
||||
|
||||
See [issue #3496](https://github.com/ROCm/ROCm/issues/3496) on GitHub.
|
||||
|
||||
### rocm-llvm-alt
|
||||
|
||||
ROCm provides an optional package -- `rocm-llvm-alt` -- that provides a closed-source compiler for
|
||||
users interested in additional closed-source CPU optimizations. This feature is not functional in
|
||||
the ROCm 6.2.0 release. Users who attempt to invoke the closed-source compiler will experience an
|
||||
LLVM consumer-producer mismatch and the compilation will fail. There is no workaround that allows
|
||||
use of the closed-source compiler. It is recommended to compile using the default open-source
|
||||
compiler, which generates high-quality AMD CPU and AMD GPU code.
|
||||
|
||||
See [issue #3492](https://github.com/ROCm/ROCm/issues/3492) on GitHub.
|
||||
14
tools/autotag/templates/known_issues/6.2.1.md
Normal file
14
tools/autotag/templates/known_issues/6.2.1.md
Normal file
@@ -0,0 +1,14 @@
|
||||
## ROCm known issues
|
||||
|
||||
ROCm known issues are tracked on [GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). Known issues related to
|
||||
individual components are listed in the [Detailed component changes](detailed-component-changes) section.
|
||||
|
||||
### Instinct MI300X GPU recovery failure on uncorrectable errors
|
||||
|
||||
For the AMD Instinct MI300X accelerator, GPU recovery resets triggered by uncorrectable errors (UE) might not complete
|
||||
successfully, which can result in the system being left in an undefined state. A system reboot is needed to recover from
|
||||
this state. Additionally, error logging might fail in these situations, hindering diagnostics.
|
||||
|
||||
This issue is under investigation and will be resolved in a future ROCm release.
|
||||
|
||||
See [issue #3766](https://github.com/ROCm/ROCm/issues/3766) on GitHub.
|
||||
@@ -1,2 +1,5 @@
|
||||
## Operating system and hardware support changes
|
||||
|
||||
ROCm 6.2.1 adds support for Ubuntu 24.04.1 (kernel: 6.8 [GA]).
|
||||
|
||||
See the [Compatibility matrix](https://rocm.docs.amd.com/en/docs-6.2.1/compatibility/compatibility-matrix.html) for the full list of supported operating systems and hardware architectures.
|
||||
|
||||
@@ -1,85 +1,3 @@
|
||||
### Default processor affinity behavior for helper threads
|
||||
|
||||
Processor affinity is a critical setting to ensure that ROCm helper threads run on the correct cores. By default, ROCm
|
||||
helper threads are spawned on all available cores, ignoring the parent thread’s processor affinity. This can lead to
|
||||
threads competing for available cores, which may result in suboptimal performance. This behavior occurs by default if
|
||||
the environment variable `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is not set or is set to `1`. If
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is set to `0`, the ROCr runtime uses the parent process's core affinity mask when
|
||||
creating helper threads. The parent’s affinity mask should then be set to account for the presence of additional threads
|
||||
by ensuring the affinity mask contains enough cores. Depending on the affinity settings of the software environment,
|
||||
batch system, launch commands like `numactl`/`taskset`, or explicit mask manipulation by the application itself, changing
|
||||
the setting may be advantageous to performance.
|
||||
|
||||
To ensure the parent's core affinity mask is honored by the ROCm helper threads, set the
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable as follows:
|
||||
|
||||
```{code} shell
|
||||
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0
|
||||
```
|
||||
|
||||
To ensure ROCm helper threads run on all available cores, set the `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable
|
||||
as follows:
|
||||
|
||||
``` shell
|
||||
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=1
|
||||
```
|
||||
|
||||
Or the default:
|
||||
|
||||
``` shell
|
||||
|
||||
unset HSA_OVERRIDE_CPU_AFFINITY_DEBUG
|
||||
```
|
||||
|
||||
If unsure of the default processor affinity settings for your environment, run the following command from the shell:
|
||||
|
||||
``` shell
|
||||
|
||||
bash -c "echo taskset -p \$\$"
|
||||
```
|
||||
|
||||
See [issue #3493](https://github.com/ROCm/ROCm/issues/3493) on GitHub.
|
||||
|
||||
### Display issues on servers with Instinct MI300-series accelerators when loading AMDGPU driver
|
||||
|
||||
AMD Instinct MI300-series accelerators and third-party GPUs such as the Matrox G200 have an issue impacting video
|
||||
output. The issue was reproduced on a Dell server model PowerEdge XE9680. Servers from other vendors utilizing Matrox
|
||||
G200 cards may be impacted as well. This issue was found with ROCm 6.2.0 but is present in older ROCm versions.
|
||||
|
||||
The AMDGPU driver shipped with ROCm interferes with the operation of the display card video output. On Dell systems,
|
||||
this includes both the local video output and remote access via iDRAC. The display appears blank (black) after loading
|
||||
the `amdgpu` driver modules. Video output impacts both terminal access when running in `runlevel 3` and GUI access when
|
||||
running in `runlevel 5`. Server functionality can still be accessed via SSH or other remote connection methods.
|
||||
|
||||
See [issue #3494](https://github.com/ROCm/ROCm/issues/3494) on GitHub.
|
||||
|
||||
### KFDTest failure on Instinct MI300X with Oracle Linux 8.9
|
||||
|
||||
The `KFDEvictTest.QueueTest` is failing on the MI300X platform during KFD (Kernel Fusion Driver) tests, causing the full
|
||||
suite to not execute properly. This issue is suspected to be hardware-related.
|
||||
|
||||
See [issue #3495](https://github.com/ROCm/ROCm/issues/3495) on GitHub.
|
||||
|
||||
### Bandwidth limitation in gang and non-gang modes on Instinct MI300A
|
||||
|
||||
Expected target peak non-gang performance (~60GB/s) and target peak gang performance (~90GB/s) are not achieved. Both gang
|
||||
and non-gang performance are observed to be limited at 45GB/s.
|
||||
|
||||
This issue will be addressed in a future ROCm release.
|
||||
|
||||
See [issue #3496](https://github.com/ROCm/ROCm/issues/3496) on GitHub.
|
||||
|
||||
### rocm-llvm-alt
|
||||
|
||||
ROCm provides an optional package -- `rocm-llvm-alt` -- that provides a closed-source compiler for
|
||||
users interested in additional closed-source CPU optimizations. This feature is not functional in
|
||||
the ROCm 6.2.0 release. Users who attempt to invoke the closed-source compiler will experience an
|
||||
LLVM consumer-producer mismatch and the compilation will fail. There is no workaround that allows
|
||||
use of the closed-source compiler. It is recommended to compile using the default open-source
|
||||
compiler, which generates high-quality AMD CPU and AMD GPU code.
|
||||
|
||||
See [issue #3492](https://github.com/ROCm/ROCm/issues/3492) on GitHub.
|
||||
|
||||
## ROCm upcoming changes
|
||||
|
||||
The section notes upcoming changes to the ROCm software stack. For upcoming changes related to individual components, review
|
||||
|
||||
@@ -1,94 +1,11 @@
|
||||
### Default processor affinity behavior for helper threads
|
||||
|
||||
Processor affinity is a critical setting to ensure that ROCm helper threads run on the correct cores. By default, ROCm
|
||||
helper threads are spawned on all available cores, ignoring the parent thread’s processor affinity. This can lead to
|
||||
threads competing for available cores, which may result in suboptimal performance. This behavior occurs by default if
|
||||
the environment variable `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is not set or is set to `1`. If
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is set to `0`, the ROCr runtime uses the parent process's core affinity mask when
|
||||
creating helper threads. The parent’s affinity mask should then be set to account for the presence of additional threads
|
||||
by ensuring the affinity mask contains enough cores. Depending on the affinity settings of the software environment,
|
||||
batch system, launch commands like `numactl`/`taskset`, or explicit mask manipulation by the application itself, changing
|
||||
the setting may be advantageous to performance.
|
||||
|
||||
To ensure the parent's core affinity mask is honored by the ROCm helper threads, set the
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable as follows:
|
||||
|
||||
```{code} shell
|
||||
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0
|
||||
```
|
||||
|
||||
To ensure ROCm helper threads run on all available cores, set the `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable
|
||||
as follows:
|
||||
|
||||
``` shell
|
||||
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=1
|
||||
```
|
||||
|
||||
Or the default:
|
||||
|
||||
``` shell
|
||||
|
||||
unset HSA_OVERRIDE_CPU_AFFINITY_DEBUG
|
||||
```
|
||||
|
||||
If unsure of the default processor affinity settings for your environment, run the following command from the shell:
|
||||
|
||||
``` shell
|
||||
|
||||
bash -c "echo taskset -p \$\$"
|
||||
```
|
||||
|
||||
See [issue #3493](https://github.com/ROCm/ROCm/issues/3493) on GitHub.
|
||||
|
||||
### Display issues on servers with Instinct MI300-series accelerators when loading AMDGPU driver
|
||||
|
||||
AMD Instinct MI300-series accelerators and third-party GPUs such as the Matrox G200 have an issue impacting video
|
||||
output. The issue was reproduced on a Dell server model PowerEdge XE9680. Servers from other vendors utilizing Matrox
|
||||
G200 cards may be impacted as well. This issue was found with ROCm 6.2.0 but is present in older ROCm versions.
|
||||
|
||||
The AMDGPU driver shipped with ROCm interferes with the operation of the display card video output. On Dell systems,
|
||||
this includes both the local video output and remote access via iDRAC. The display appears blank (black) after loading
|
||||
the `amdgpu` driver modules. Video output impacts both terminal access when running in `runlevel 3` and GUI access when
|
||||
running in `runlevel 5`. Server functionality can still be accessed via SSH or other remote connection methods.
|
||||
|
||||
See [issue #3494](https://github.com/ROCm/ROCm/issues/3494) on GitHub.
|
||||
|
||||
### KFDTest failure on Instinct MI300X with Oracle Linux 8.9
|
||||
|
||||
The `KFDEvictTest.QueueTest` is failing on the MI300X platform during KFD (Kernel Fusion Driver) tests, causing the full
|
||||
suite to not execute properly. This issue is suspected to be hardware-related.
|
||||
|
||||
See [issue #3495](https://github.com/ROCm/ROCm/issues/3495) on GitHub.
|
||||
|
||||
### Bandwidth limitation in gang and non-gang modes on Instinct MI300A
|
||||
|
||||
Expected target peak non-gang performance (~60GB/s) and target peak gang performance (~90GB/s) are not achieved. Both gang
|
||||
and non-gang performance are observed to be limited at 45GB/s.
|
||||
|
||||
This issue will be addressed in a future ROCm release.
|
||||
|
||||
See [issue #3496](https://github.com/ROCm/ROCm/issues/3496) on GitHub.
|
||||
|
||||
### rocm-llvm-alt
|
||||
|
||||
ROCm provides an optional package -- `rocm-llvm-alt` -- that provides a closed-source compiler for
|
||||
users interested in additional closed-source CPU optimizations. This feature is not functional in
|
||||
the ROCm 6.2.0 release. Users who attempt to invoke the closed-source compiler will experience an
|
||||
LLVM consumer-producer mismatch and the compilation will fail. There is no workaround that allows
|
||||
use of the closed-source compiler. It is recommended to compile using the default open-source
|
||||
compiler, which generates high-quality AMD CPU and AMD GPU code.
|
||||
|
||||
See [issue #3492](https://github.com/ROCm/ROCm/issues/3492) on GitHub.
|
||||
|
||||
## ROCm upcoming changes
|
||||
|
||||
The section notes upcoming changes to the ROCm software stack. For upcoming changes related to individual components, review
|
||||
the [Detailed component changes](detailed-component-changes).
|
||||
The following changes to the ROCm software stack are anticipated for future releases.
|
||||
|
||||
### rocm-llvm-alt
|
||||
|
||||
The `rocm-llvm-alt` package will be removed in an upcoming release. Users relying on the
|
||||
functionality provided by the closed-source compiler should transition to the open-source compiler.
|
||||
Once the `rocm-llvm-alt` package is removed, any compilation requesting functionality provided by
|
||||
the closed-source compiler will result in a Clang warning: "*[AMD] proprietary optimization compiler
|
||||
has been removed*".
|
||||
The `rocm-llvm-alt` package will be removed in an upcoming release. Users relying on the functionality provided by the closed-source compiler should transition to the open-source compiler. Once the `rocm-llvm-alt` package is removed, any compilation requesting functionality provided by the closed-source compiler will result in a Clang warning: "*[AMD] proprietary optimization compiler has been removed*".
|
||||
|
||||
### rccl-rdma-sharp-plugins
|
||||
|
||||
The RCCL plugin package, `rccl-rdma-sharp-plugins`, will be removed in an upcoming ROCm release.
|
||||
|
||||
Reference in New Issue
Block a user