mirror of
https://github.com/ROCm/ROCm.git
synced 2026-01-09 14:48:06 -05:00
Add links to GH issues in 6.2.1 release notes (#3769)
* add MAD page
* link to GitHub issues in release notes known issues
* update templates for 6.2.1
* Revert "add MAD page"
This reverts commit 9cce72bba3.
* update wordlist for spellcheck linter
* add rccl note
* update rocal version change heading to be more obvious
* make rocal note more specific
* fix missing space
* fix capitalization
@@ -53,6 +53,7 @@ CSC
|
||||
CSE
|
||||
CSV
|
||||
CSn
|
||||
CTest
|
||||
CTests
|
||||
CU
|
||||
CUDA
|
||||
@@ -387,6 +388,7 @@ UAC
|
||||
UC
|
||||
UCC
|
||||
UCX
|
||||
UE
|
||||
UIF
|
||||
UMC
|
||||
USM
|
||||
@@ -653,6 +655,7 @@ quasirandom
|
||||
queueing
|
||||
rccl
|
||||
rdc
|
||||
rdma
|
||||
reStructuredText
|
||||
redirections
|
||||
refactorization
|
||||
|
||||
32
RELEASE.md
@@ -24,9 +24,12 @@ See the [ROCm documentation release history](https://rocm.docs.amd.com/en/latest
|
||||
|
||||
The following are notable new features and improvements in ROCm 6.2.1. For changes to individual components, see [Detailed component changes](#detailed-component-changes).
|
||||
|
||||
### rocAL version change
|
||||
### rocAL major version change
|
||||
|
||||
The version of rocAL has been updated to 2.0.0. Applications built using rocAL 1.0.0 must be recompiled to work with rocAL 2.0.0. See [the rocAL detailed changes](#rocal-2-0-0) for more information.
|
||||
The new version of rocAL introduces many new features, but does not modify any of the existing public API functions. However, the version number was incremented from 1.3 to 2.0.
|
||||
Applications linked to version 1.3 must be recompiled to link against version 2.0.
|
||||
|
||||
See [the rocAL detailed changes](#rocal-2-0-0) for more information.
|
||||
|
||||
### New support for FBGEMM (Facebook General Matrix Multiplication)
|
||||
|
||||
@@ -140,7 +143,7 @@ Click the component's updated version to go to a detailed list of its changes. C
|
||||
<th rowspan="1"></th>
|
||||
<th rowspan="1">Communication</th>
|
||||
<td><a href="https://rocm.docs.amd.com/projects/rccl/en/docs-6.2.1">RCCL</a></td>
|
||||
<td>2.20.5</td>
|
||||
<td>2.20.5 ⇒ <a href="#rccl-2-20-5">2.20.5</a></td>
|
||||
<td><a href="https://github.com/ROCm/rccl/releases/tag/rocm-6.2.1"><i
|
||||
class="fab fa-github fa-lg"></i></a></td>
|
||||
</tr>
|
||||
@@ -457,15 +460,30 @@ The following sections describe key changes to ROCm components.
|
||||
|
||||
### **Omnitrace** (1.11.2)
|
||||
|
||||
#### Known Issues
|
||||
#### Known issues
|
||||
|
||||
* Perfetto can no longer open Omnitrace proto files. Loading Perfetto trace output `.proto` files in the latest version of `ui.perfetto.dev` can result in a dialog with the message, "Oops, something went wrong! Please file a bug." The information in the dialog will refer to an "Unknown field type." The workaround is to open the files with the previous version of the Perfetto UI found at [https://ui.perfetto.dev/v46.0-35b3d9845/#!/](https://ui.perfetto.dev/v46.0-35b3d9845/#!/).
|
||||
Perfetto can no longer open Omnitrace proto files. Loading Perfetto trace output `.proto` files in the latest version of `ui.perfetto.dev` can result in a dialog with the message, "Oops, something went wrong! Please file a bug." The information in the dialog will refer to an "Unknown field type." The workaround is to open the files with the previous version of the Perfetto UI found at [https://ui.perfetto.dev/v46.0-35b3d9845/#!/](https://ui.perfetto.dev/v46.0-35b3d9845/#!/).
|
||||
|
||||
See [issue #3767](https://github.com/ROCm/ROCm/issues/3767) on GitHub.
|
||||
|
||||
### **RCCL** (2.20.5)
|
||||
|
||||
#### Known issues
|
||||
|
||||
On systems running Linux kernel 6.8.0, such as Ubuntu 24.04, GPUDirect RDMA is disabled, which degrades multi-node RCCL performance.
|
||||
This issue was reproduced with RCCL 2.20.5 (ROCm 6.2.0 and 6.2.1) on systems with Broadcom Thor-2 NICs and affects other systems with RoCE networks using Linux 6.8.0 or newer.
|
||||
Older RCCL versions are also impacted.
|
||||
|
||||
This issue will be addressed in a future ROCm release.
|
||||
|
||||
See [issue #3772](https://github.com/ROCm/ROCm/issues/3772) on GitHub.
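
To check whether a host falls in the affected kernel range described above, a minimal sketch follows. It assumes GNU `sort -V` is available; `kernel_at_least` is a hypothetical helper name, not part of ROCm or RCCL, and the check only compares version numbers without confirming that GPUDirect RDMA is actually disabled.

```shell
# Hypothetical helper: returns 0 when kernel release $1 is at or above base version $2.
kernel_at_least() {
    release="${1%%-*}"   # strip a distribution suffix such as "-45-generic"
    lowest="$(printf '%s\n%s\n' "$2" "$release" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

if kernel_at_least "$(uname -r)" "6.8.0"; then
    echo "Kernel $(uname -r): in the range reported for the GPUDirect RDMA issue"
else
    echo "Kernel $(uname -r): older than 6.8.0"
fi
```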
|
||||
|
||||
### **rocAL** (2.0.0)
|
||||
|
||||
#### Changes
|
||||
|
||||
* Version updated from 1.0.0 to 2.0.0. Applications built using rocAL 1.0.0 must be recompiled to work with rocAL 2.0.0.
|
||||
* The new version of rocAL introduces many new features, but does not modify any of the existing public API functions. However, the version number was incremented from 1.3 to 2.0.
|
||||
Applications linked to version 1.3 must be recompiled to link against version 2.0.
|
||||
* Added development and test packages.
|
||||
* Added C++ rocAL audio unit test and Python script to run and compare the outputs.
|
||||
* Added Python support for audio decoders.
|
||||
@@ -540,6 +558,8 @@ this state. Additionally, error logging might fail in these situations, hinderin
|
||||
|
||||
This issue is under investigation and will be resolved in a future ROCm release.
|
||||
|
||||
See [issue #3766](https://github.com/ROCm/ROCm/issues/3766) on GitHub.
|
||||
|
||||
## ROCm upcoming changes
|
||||
|
||||
The following changes to the ROCm software stack are anticipated for future releases.
|
||||
|
||||
@@ -1 +1,34 @@
|
||||
### Highlights will go here
|
||||
### rocAL major version change
|
||||
|
||||
The new version of rocAL introduces many new features, but does not modify any of the existing public API functions. However, the version number was incremented from 1.3 to 2.0.
|
||||
Applications linked to version 1.3 must be recompiled to link against version 2.0.
|
||||
|
||||
See [the rocAL detailed changes](#rocal-2-0-0) for more information.
|
||||
|
||||
### New support for FBGEMM (Facebook General Matrix Multiplication)
|
||||
|
||||
As of ROCm 6.2.1, ROCm supports Facebook General Matrix Multiplication (FBGEMM) and the related FBGEMM_GPU library.
|
||||
|
||||
FBGEMM is a low-precision, high-performance CPU kernel library for convolution and matrix multiplication. It is used for server-side inference and as a back end for PyTorch quantized operators. FBGEMM_GPU includes a collection of PyTorch GPU operator libraries for training and inference. For more information, see the ROCm [Model acceleration libraries guide](https://rocm.docs.amd.com/en/6.2.1/how-to/llm-fine-tuning-optimization/model-acceleration-libraries.html)
|
||||
and [PyTorch's FBGEMM GitHub repository](https://github.com/pytorch/FBGEMM).
|
||||
|
||||
### ROCm Offline Installer Creator changes
|
||||
|
||||
The [ROCm Offline Installer Creator 6.2.1](https://rocm.docs.amd.com/projects/install-on-linux/en/6.2.1/install/rocm-offline-installer.html) introduces several new features and improvements including:
|
||||
|
||||
* Logging support for create and install logs
|
||||
* More stringent checks for Linux versions and distributions
|
||||
* Updated prerequisite repositories
|
||||
* Fixed CTest issues
|
||||
|
||||
### ROCm documentation changes
|
||||
|
||||
There have been no changes to supported hardware or operating systems from ROCm 6.2.0 to ROCm 6.2.1.
|
||||
|
||||
* The Programming Model Reference and Understanding the Programming Model topics in HIP have been consolidated into one topic,
|
||||
[HIP programming model (conceptual)](https://rocm.docs.amd.com/projects/HIP/en/6.2.1/understand/programming_model.html).
|
||||
* The [HIP virtual memory management](https://rocm.docs.amd.com/projects/HIP/en/6.2.1/how-to/virtual_memory.html) and [HIP virtual memory management API](https://rocm.docs.amd.com/projects/HIP/en/6.2.1/reference/virtual_memory_reference.html) topics have been added.
|
||||
|
||||
```{note}
|
||||
The ROCm documentation, like all ROCm projects, is open source and available on GitHub. To contribute to ROCm documentation, see the [ROCm documentation contribution guidelines](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
|
||||
```
|
||||
|
||||
9
tools/autotag/templates/known_issues/6.2.1.md
Normal file
@@ -0,0 +1,9 @@
|
||||
### Instinct MI300X GPU recovery failure on uncorrectable errors
|
||||
|
||||
For the AMD Instinct MI300X accelerator, GPU recovery resets triggered by uncorrectable errors (UE) might not complete
|
||||
successfully, which can result in the system being left in an undefined state. A system reboot is needed to recover from
|
||||
this state. Additionally, error logging might fail in these situations, hindering diagnostics.
|
||||
|
||||
This issue is under investigation and will be resolved in a future ROCm release.
|
||||
|
||||
See [issue #3766](https://github.com/ROCm/ROCm/issues/3766) on GitHub.
|
||||
@@ -1,2 +1,2 @@
|
||||
## Operating system and hardware support changes
|
||||
There are no changes to supported hardware or operating systems from ROCm 6.2.0 to ROCm 6.2.1.
|
||||
|
||||
|
||||
@@ -1,94 +1,7 @@
|
||||
### Default processor affinity behavior for helper threads
|
||||
|
||||
Processor affinity is a critical setting to ensure that ROCm helper threads run on the correct cores. By default, ROCm
|
||||
helper threads are spawned on all available cores, ignoring the parent thread’s processor affinity. This can lead to
|
||||
threads competing for available cores, which may result in suboptimal performance. This behavior occurs by default if
|
||||
the environment variable `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is not set or is set to `1`. If
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is set to `0`, the ROCr runtime uses the parent process's core affinity mask when
|
||||
creating helper threads. The parent’s affinity mask should then be set to account for the presence of additional threads
|
||||
by ensuring the affinity mask contains enough cores. Depending on the affinity settings of the software environment,
|
||||
batch system, launch commands like `numactl`/`taskset`, or explicit mask manipulation by the application itself, changing
|
||||
the setting may be advantageous to performance.
|
||||
|
||||
To ensure the parent's core affinity mask is honored by the ROCm helper threads, set the
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable as follows:
|
||||
|
||||
```{code} shell
|
||||
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0
|
||||
```
|
||||
|
||||
To ensure ROCm helper threads run on all available cores, set the `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable
|
||||
as follows:
|
||||
|
||||
``` shell
|
||||
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=1
|
||||
```
|
||||
|
||||
Or the default:
|
||||
|
||||
``` shell
|
||||
|
||||
unset HSA_OVERRIDE_CPU_AFFINITY_DEBUG
|
||||
```
|
||||
|
||||
If unsure of the default processor affinity settings for your environment, run the following command from the shell:
|
||||
|
||||
``` shell
|
||||
|
||||
bash -c "taskset -p \$\$"
|
||||
```
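
As a rough sketch of how the affinity setting and a launch command can be combined, the following wraps `taskset` with `HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0` so that helper threads stay within the mask given to the parent. `run_pinned` and `allowed_cpus` are hypothetical names used only for illustration, and the CPU list is read from `/proc/self/status`, which assumes Linux.

```shell
# Hypothetical helper (not part of ROCm): launch a command on the given CPU list
# while asking ROCr to honor the parent's affinity mask for helper threads.
# Usage: run_pinned <cpu-list> <command> [args...]
run_pinned() {
    cpus="$1"
    shift
    HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0 taskset -c "$cpus" "$@"
}

# CPUs the current process is allowed to use, for example "0-15" (Linux only).
allowed_cpus="$(awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status)"

# The child process sees the variable; replace `sh -c ...` with your application.
run_pinned "$allowed_cpus" sh -c 'echo "HSA_OVERRIDE_CPU_AFFINITY_DEBUG=$HSA_OVERRIDE_CPU_AFFINITY_DEBUG"'
```

Passing a narrower list, such as the cores of one NUMA node, restricts both the application and its helper threads, so the list should contain enough cores for both.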
|
||||
|
||||
See [issue #3493](https://github.com/ROCm/ROCm/issues/3493) on GitHub.
|
||||
|
||||
### Display issues on servers with Instinct MI300-series accelerators when loading AMDGPU driver
|
||||
|
||||
AMD Instinct MI300-series accelerators and third-party GPUs such as the Matrox G200 have an issue impacting video
|
||||
output. The issue was reproduced on a Dell server model PowerEdge XE9680. Servers from other vendors utilizing Matrox
|
||||
G200 cards may be impacted as well. This issue was found with ROCm 6.2.0 but is present in older ROCm versions.
|
||||
|
||||
The AMDGPU driver shipped with ROCm interferes with the operation of the display card video output. On Dell systems,
|
||||
this includes both the local video output and remote access via iDRAC. The display appears blank (black) after loading
|
||||
the `amdgpu` driver modules. Video output impacts both terminal access when running in `runlevel 3` and GUI access when
|
||||
running in `runlevel 5`. Server functionality can still be accessed via SSH or other remote connection methods.
|
||||
|
||||
See [issue #3494](https://github.com/ROCm/ROCm/issues/3494) on GitHub.
|
||||
|
||||
### KFDTest failure on Instinct MI300X with Oracle Linux 8.9
|
||||
|
||||
The `KFDEvictTest.QueueTest` is failing on the MI300X platform during KFD (Kernel Fusion Driver) tests, causing the full
|
||||
suite to not execute properly. This issue is suspected to be hardware-related.
|
||||
|
||||
See [issue #3495](https://github.com/ROCm/ROCm/issues/3495) on GitHub.
|
||||
|
||||
### Bandwidth limitation in gang and non-gang modes on Instinct MI300A
|
||||
|
||||
Expected target peak non-gang performance (~60 GB/s) and target peak gang performance (~90 GB/s) are not achieved. Both gang
|
||||
and non-gang performance are observed to be limited to 45 GB/s.
|
||||
|
||||
This issue will be addressed in a future ROCm release.
|
||||
|
||||
See [issue #3496](https://github.com/ROCm/ROCm/issues/3496) on GitHub.
|
||||
|
||||
### rocm-llvm-alt
|
||||
|
||||
ROCm provides an optional package -- `rocm-llvm-alt` -- that provides a closed-source compiler for
|
||||
users interested in additional closed-source CPU optimizations. This feature is not functional in
|
||||
the ROCm 6.2.0 release. Users who attempt to invoke the closed-source compiler will experience an
|
||||
LLVM consumer-producer mismatch and the compilation will fail. There is no workaround that allows
|
||||
use of the closed-source compiler. It is recommended to compile using the default open-source
|
||||
compiler, which generates high-quality AMD CPU and AMD GPU code.
|
||||
The `rocm-llvm-alt` package will be removed in an upcoming release. Users relying on the functionality provided by the closed-source compiler should transition to the open-source compiler. Once the `rocm-llvm-alt` package is removed, any compilation requesting functionality provided by the closed-source compiler will result in a Clang warning: "*[AMD] proprietary optimization compiler has been removed*".
|
||||
|
||||
See [issue #3492](https://github.com/ROCm/ROCm/issues/3492) on GitHub.
|
||||
### rccl-rdma-sharp-plugins
|
||||
|
||||
## ROCm upcoming changes
|
||||
|
||||
This section notes upcoming changes to the ROCm software stack. For upcoming changes related to individual components, review
|
||||
the [Detailed component changes](#detailed-component-changes).
|
||||
|
||||
### rocm-llvm-alt
|
||||
|
||||
The `rocm-llvm-alt` package will be removed in an upcoming release. Users relying on the
|
||||
functionality provided by the closed-source compiler should transition to the open-source compiler.
|
||||
Once the `rocm-llvm-alt` package is removed, any compilation requesting functionality provided by
|
||||
the closed-source compiler will result in a Clang warning: "*[AMD] proprietary optimization compiler
|
||||
has been removed*".
|
||||
The RCCL plugin package, `rccl-rdma-sharp-plugins`, will be removed in an upcoming ROCm release.
|
||||
|
||||