rocBLAS precision known issue added [Develop] (#5641)

* rocBLAS precision known issue added
* IPC note removed
* Review feedback added
2026-01-07 22:03:58 -05:00 · 2025-11-07 19:45:33 -05:00
parent 148d6670ad
commit 4881887e2c
2 changed files with 7 additions and 7 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -618,7 +618,7 @@ for a complete overview of this release.

 #### Known issues

-* PyTorch and other Python applications might fail to profile device activities when it is unable to find the libraries in the default linker path. As a workaround, you need to explicitly add the library path to ``LD_LIBRARY_PATH``. For PyTorch use:
+* Profiling PyTorch and other AI workloads might fail because the required libraries cannot be found in the default linker path. As a workaround, explicitly add the library path to ``LD_LIBRARY_PATH``. For example, when using PyTorch with Python 3.10, add the following to the environment:

 ```
 export LD_LIBRARY_PATH=:/opt/venv/lib/python3.10/site-packages/torch/lib:$LD_LIBRARY_PATH
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -53,10 +53,6 @@ For more information about supported:

 * Operating systems, see [Supported operating systems](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.0/reference/system-requirements.html#supported-operating-systems) and [ROCm installation for Linux](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.0/).

-```{note}
-Starting ROCm 7.1.0, Upstream Inter-Process Communication (IPC) works with Checkpoint Restore in User space (CRIU) feature, but it requires the most up-to-date kernel and CRIU plugin. 
-```
-
 #### Virtualization support

 ROCm 7.1.0 adds Guest OS support for RHEL 10.0 in KVM SR-IOV for AMD Instinct MI355X and MI350X GPUs.
@@ -1309,7 +1305,7 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid

 #### Known issues

-* PyTorch and other Python applications might fail to profile device activities when it is unable to find the libraries in the default linker path. As a workaround, you need to explicitly add the library path to ``LD_LIBRARY_PATH``. For PyTorch use:
+* Profiling PyTorch and other AI workloads might fail because the required libraries cannot be found in the default linker path. As a workaround, explicitly add the library path to ``LD_LIBRARY_PATH``. For example, when using PyTorch with Python 3.10, add the following to the environment:

 ```
 export LD_LIBRARY_PATH=:/opt/venv/lib/python3.10/site-packages/torch/lib:$LD_LIBRARY_PATH
@@ -1511,7 +1507,11 @@ Running `hipblaslt-test` or `hipblaslt-bench` without installing the OpenBLAS de
 ```
 libopenblas.so.0: cannot open shared object file: No such file or directory
 ```
-As a workaround, first install `libopenblas-dev` or `libopenblas-devel`, depending on the package manager used. The issue will be fixed in a future ROCm release.
+As a workaround, first install `libopenblas-dev` or `libopenblas-devel`, depending on the package manager used. The issue will be fixed in a future ROCm release. See [GitHub issue #5639](https://github.com/ROCm/ROCm/issues/5639).
+
+### Reduced precision in gemm_ex operations for rocBLAS and hipBLAS
+
+Some `gemm_ex` operations with `half` or `f32_r` data types might yield results with 16-bit precision instead of the expected 32-bit precision when the matrix dimensions are `m=1` or `n=1`. The issue results from an optimization that enables the `_ex` APIs to use lower-precision multiplications. This limits the high-precision matrix operations performed in PyTorch with rocBLAS and hipBLAS. The issue will be fixed in a future ROCm release. See [GitHub issue #5640](https://github.com/ROCm/ROCm/issues/5640).

 ## ROCm resolved issues
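
The rocBLAS and hipBLAS known issue added in this change describes results that carry 16-bit rather than 32-bit precision. As a rough illustration of why that matters, and not a reproduction of the rocBLAS code path, the following Python sketch emulates a dot product (the `m=1`, `n=1` case of a matrix multiply) with every intermediate rounded to either 16-bit or 32-bit floating point. The helper names `to_half`, `to_single`, and `dot`, and the choice of `K = 4096` ones as input, are assumptions made for the illustration. It uses only the standard-library `struct` module.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 16-bit (binary16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 32-bit (binary32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(a, b, round_fn):
    """Dot product in which each product and each partial sum is rounded by round_fn."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = round_fn(acc + round_fn(x * y))
    return acc


# Hypothetical input: two length-K vectors of ones, so the exact result is K.
K = 4096
a = [1.0] * K
b = [1.0] * K

dot_fp16 = dot(a, b, to_half)    # every intermediate held in 16-bit precision
dot_fp32 = dot(a, b, to_single)  # every intermediate held in 32-bit precision

print("16-bit intermediates:", dot_fp16)
print("32-bit intermediates:", dot_fp32)
```

With these inputs, the 16-bit accumulator stalls at 2048 because 2049 is not representable in half precision, whereas the 32-bit accumulator reaches the exact sum of 4096.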