GitHub issue added to 702 known issues (#5520)

* GitHub issue added to 702 known issues

* Added missing RCCL changelog
This commit is contained in:
Pratik Basyal
2025-10-15 09:58:23 -04:00
committed by GitHub
parent 170cb47a4f
commit f21cfe1171
2 changed files with 7 additions and 3 deletions

View File

@@ -912,11 +912,15 @@ HIP runtime has the following functional improvements which improves runtime per
* Compatibility with NCCL 2.25.1. * Compatibility with NCCL 2.25.1.
* Compatibility with NCCL 2.26.6. * Compatibility with NCCL 2.26.6.
#### Optimized
* Improved the performance of the `FP8` Sum operation by upcasting to `FP16`.
#### Resolved issues #### Resolved issues
* Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call. * Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call.
* Fixed unit test failures in tests ending with the `ManagedMem` and `ManagedMemGraph` suffixes. * Fixed unit test failures in tests ending with the `ManagedMem` and `ManagedMemGraph` suffixes.
* Fixed a suboptimal algorithmic switching point for AllReduce on the AMD Instinct MI300X. * Fixed a suboptimal algorithmic switching point for AllReduce on the AMD Instinct MI300X.
* Fixed broken functionality within the LL protocol on gfx950 by disabling inlining of LLGenericOp kernels.
* Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault" with a design change to use `comm` instead of `rank` for `mscclStatus`. The global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release. * Fixed the known issue "When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault" with a design change to use `comm` instead of `rank` for `mscclStatus`. The global map for `comm` to `mscclStatus` is still not thread safe but should be explicitly handled by mutexes for read-write operations. This is tested for correctness, but there is a plan to use a thread-safe map data structure in an upcoming release.
### **rocAL** (2.3.0) ### **rocAL** (2.3.0)

View File

@@ -699,7 +699,7 @@ The problem occurs when attempting to debug a program that contains code that ru
The ROCR Debug Agent might also become unresponsive when attempting to capture data from a program that is experiencing queue errors, memory faults, or other triggering events. The ROCR Debug Agent might also become unresponsive when attempting to capture data from a program that is experiencing queue errors, memory faults, or other triggering events.
For a detailed workaround, see the [Installation troubleshooting](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/install-faq.html#issue-10-rocm-debugging-tools-might-become-unresponsive-in-selinux-enabled-distributions) documentation. This issue will be fixed in a future ROCm release. For a detailed workaround, see the [Installation troubleshooting](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/install-faq.html#issue-10-rocm-debugging-tools-might-become-unresponsive-in-selinux-enabled-distributions) documentation. This issue will be fixed in a future ROCm release. See [GitHub issue #5498](https://github.com/ROCm/ROCm/issues/5498).
### MIGraphX Python API will fail when running on Python 3.13 ### MIGraphX Python API will fail when running on Python 3.13
@@ -708,11 +708,11 @@ Applications using the MIGraphX Python API will fail when running on Python 3.13
``` ```
ls -l /opt/rocm-7.0.0/lib/libmigraphx_py_*.so ls -l /opt/rocm-7.0.0/lib/libmigraphx_py_*.so
``` ```
The issue will be resolved in a future ROCm release. The issue will be resolved in a future ROCm release. See [GitHub issue #5500](https://github.com/ROCm/ROCm/issues/5500).
### Applications using OpenCV might fail due to package incompatibility between the OS ### Applications using OpenCV might fail due to package incompatibility between the OS
OpenCV packages built on Ubuntu 24.04 are incompatible with Debian 13 due to a version conflict. As a result, applications, tests, and samples that use OpenCV might fail. To avoid the version conflict, rebuild OpenCV with the version corresponding to Debian 13, then rebuild MIVisionX on top of it. As a workaround, rebuild OpenCV from source, followed by the application that uses OpenCV. This issue will be fixed in a future ROCm release. OpenCV packages built on Ubuntu 24.04 are incompatible with Debian 13 due to a version conflict. As a result, applications, tests, and samples that use OpenCV might fail. To avoid the version conflict, rebuild OpenCV with the version corresponding to Debian 13, then rebuild MIVisionX on top of it. As a workaround, rebuild OpenCV from source, followed by the application that uses OpenCV. This issue will be fixed in a future ROCm release. See [GitHub issue #5501](https://github.com/ROCm/ROCm/issues/5501).
## ROCm upcoming changes ## ROCm upcoming changes