Compare commits

9 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Istvan Kiss | 27e8d9a012 | Update release notes links ROCm 7.1.1 | 2025-11-26 19:46:10 +01:00 |
| Pratik Basyal | a966db29ca | Known issue from 7.1.0 removed (#5702) (#5703) | 2025-11-26 12:30:28 -05:00 |
| Pratik Basyal | 9ea8a48b3a | Link and PyTorch version updated (#5700) (#5701) | 2025-11-26 12:01:12 -05:00 |
| Alex Xu | 9956d72614 | fix dependency | 2025-11-26 11:42:22 -05:00 |
| Alex Xu | 305d24f486 | Merge branch 'roc-7.1.x' into docs/7.1.1 | 2025-11-26 11:37:06 -05:00 |
| Alex Xu | 26f6b6b3e1 | Merge branch 'roc-7.1.x' into docs/7.1.1 | 2025-11-26 11:29:02 -05:00 |
| Alex Xu | d4cdbd79a3 | Merge branch 'develop' into docs/7.1.1 | 2025-11-26 08:47:19 -05:00 |
| alexxu-amd | 26d1ab7d27 | Update documentation requirements | 2025-11-25 16:30:46 -05:00 |
| alexxu-amd | 272c9f6be3 | Update documentation requirements | 2025-11-25 15:37:04 -05:00 |

6 changed files with 19 additions and 70 deletions

@@ -36,7 +36,6 @@ Andrej
Arb
Autocast
autograd
Backported
BARs
BatchNorm
BLAS
@@ -203,11 +202,9 @@ GenAI
GenZ
GitHub
Gitpod
hardcoded
HBM
HCA
HGX
HLO
HIPCC
hipDataType
HIPExtension
@@ -332,7 +329,6 @@ MoEs
Mooncake
Mpops
Multicore
multihost
Multithreaded
mx
MXFP
@@ -1024,7 +1020,6 @@ uncacheable
uncorrectable
underoptimized
unhandled
unfused
uninstallation
unmapped
unsqueeze

@@ -100,13 +100,12 @@ firmware, AMD GPU drivers, and the ROCm user space software.
01.25.16.03<br>
01.25.15.04
</td>
<td>
<td rowspan="2" style="vertical-align: middle;">
30.20.1<br>
30.20.0<br>
30.10.2<br>
30.10.1<br>
30.10
</td>
30.10</td>
<td rowspan="3" style="vertical-align: middle;">8.6.0.K</td>
</tr>
<tr>
@@ -115,13 +114,6 @@ firmware, AMD GPU drivers, and the ROCm user space software.
01.25.16.03<br>
01.25.15.04
</td>
<td>
30.20.1<br>
30.20.0<br>
30.10.2<br>
30.10.1<br>
30.10
</td>
</tr>
<tr>
<td>MI325X<a href="#footnote1"><sup>[1]</sup></a></td>
@@ -270,26 +262,26 @@ The [ROCm examples repository](https://github.com/ROCm/rocm-examples) has been e
:margin: auto 0 auto auto
:::{grid}
:margin: auto 0 auto auto
* [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/)
* [hipSPARSE](https://rocm.docs.amd.com/projects/hipSPARSE/en/latest/)
* [hipSPARSELt](https://rocm.docs.amd.com/projects/hipSPARSELt/en/latest/)
* [hipTensor](https://rocm.docs.amd.com/projects/hipTensor/en/latest/)
* [hipBLASLt](https://github.com/ROCm/rocm-examples/tree/amd-staging/Libraries/hipBLASLt)
* [hipSPARSE](https://github.com/ROCm/rocm-examples/tree/amd-staging/Libraries/hipSPARSE)
* [hipSPARSELt](https://github.com/ROCm/rocm-examples/tree/amd-staging/Libraries/hipSPARSELt)
* [hipTensor](https://github.com/ROCm/rocm-examples/tree/amd-staging/Libraries/hipTensor)
:::
:::{grid}
:margin: auto 0 auto auto
* [rocALUTION](https://rocm.docs.amd.com/projects/rocALUTION/en/latest/)
* [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/)
* [rocWMMA](https://rocm.docs.amd.com/projects/rocWMMA/en/latest/)
* [rocALUTION](https://github.com/ROCm/rocm-examples/tree/amd-staging/Libraries/rocALUTION)
* [ROCprofiler-SDK](https://github.com/ROCm/rocm-examples/tree/amd-staging/Libraries/rocProfiler-SDK)
* [rocWMMA](https://github.com/ROCm/rocm-examples/tree/amd-staging/Libraries/rocWMMA)
:::
::::
Usage examples are now available for the following performance analysis tools:
* [ROCm Compute Profiler](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/index.html)
* [ROCm Systems Profiler](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/index.html)
* [rocprofv3](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/how-to/using-rocprofv3.html)
* [ROCm Compute Profiler](https://github.com/ROCm/rocm-examples/tree/amd-staging/Tools/rocprof-compute)
* [ROCm Systems Profiler](https://github.com/ROCm/rocm-examples/tree/amd-staging/Tools/rocprof-systems)
* [rocprofv3](https://github.com/ROCm/rocm-examples/tree/amd-staging/Tools/rocprofv3)
The complete source code for the [HIP Graph Tutorial](https://rocm.docs.amd.com/projects/HIP/en/latest/tutorial/graph_api.html) is also available as part of the ROCm examples.
The complete source code for the [HIP Graph Tutorial](https://github.com/ROCm/rocm-examples/tree/amd-staging/HIP-Doc/Tutorials/graph_api) is also available as part of the ROCm examples.
### ROCm documentation updates
@@ -839,7 +831,7 @@ issues related to individual components, review the [Detailed component changes]
### RCCL performance degradation on AMD Instinct MI300X GPU with AMD Pollara AI NIC
If you're using RCCL on AMD Instinct MI300X GPUs with AMD Pollara AI NIC, you might observe performance degradation for specific collectives and message sizes. The affected collectives are `Scatter`, `AllToAll`, and `AlltoAllv`. It's recommended to avoid using RCCL packaged with ROCm 7.1.1. As a workaround, use the {fab}`github`[RCCL `develop` branch](https://github.com/ROCm/rccl/tree/develop), which contains the fix and will be included in a future ROCm release. See [GitHub issue #5717](https://github.com/ROCm/ROCm/issues/5717).
If you're using RCCL on AMD Instinct MI300X GPUs with AMD Pollara AI NIC, you might observe performance degradation for specific collectives and message sizes. The affected collectives are `Scatter`, `AllToAll`, and `AlltoAllv`. It's recommended to avoid using RCCL packaged with ROCm 7.1.1. As a workaround, use the {fab}`github`[RCCL `develop` branch](https://github.com/ROCm/rccl/tree/develop), which contains the fix and will be included in a future ROCm release.
### Segmentation fault in training models using TensorFlow 2.20.0 Docker images
@@ -847,7 +839,7 @@ Training models `tf2_tfm_resnet50_fp16_train` and `tf2_tfm_resnet50_fp32_train`
might fail with a segmentation fault when run on the TensorFlow 2.20.0 Docker
image with ROCm 7.1.1. As a workaround, use TensorFlow 2.19.x Docker image for
training the models in ROCm 7.1.1. This issue will be fixed in a future ROCm
release. See [GitHub issue #5718](https://github.com/ROCm/ROCm/issues/5718).
release.
### AMD SMI CLI triggers repeated kernel errors on GPUs with partitioning support
@@ -866,19 +858,11 @@ amdgpu 0000:15:00.0: amdgpu: renderD153 partition 1 not valid!
These repeated kernel logs can clutter the system logs and may cause
unnecessary concern about GPU health. However, this is a non-functional issue
and does not affect AMD SMI functionality or GPU performance. This issue will
be fixed in a future ROCm release. See [GitHub issue #5720](https://github.com/ROCm/ROCm/issues/5720).
be fixed in a future ROCm release.
### Excessive bad page logs in AMD GPU Driver (amdgpu)
Due to partial data corruption in the Electrically Erasable Programmable Read-Only Memory (EEPROM) and limited error handling in the AMD GPU Driver (amdgpu), excessive log output might occur when querying the reliability, availability, and serviceability (RAS) bad pages. This issue will be fixed in a future AMD GPU Driver (amdgpu) and ROCm release. See [GitHub issue #5719](https://github.com/ROCm/ROCm/issues/5719).
### Incorrect results in gemm_ex operations for rocBLAS and hipBLAS
Some `gemm_ex` operations with 8-bit input data types (`int8`, `float8`, `bfloat8`) for specific matrix dimensions (K = 1 and number of workgroups > 1) might yield incorrect results. The issue results from incorrect tailloop code that fails to consider workgroup index when calculating valid element size. The issue will be fixed in a future ROCm release. See [GitHub issue #5722](https://github.com/ROCm/ROCm/issues/5722).
### hipBLASLt performance variation for a particular FP8 GEMM operation on AMD Instinct MI325X GPUs
If you're using hipBLASLt on AMD Instinct MI325X GPUs for large FP8 GEMM operations (such as 9728x8192x65536), you might observe a noticeable performance variation. The issue is currently under investigation and will be fixed in a future ROCm release. See [GitHub issue #5734](https://github.com/ROCm/ROCm/issues/5734).
Due to partial data corruption of Electrically Erasable Programmable Read-Only Memory (EEPROM) and limited error handling in the AMD GPU Driver(amdgpu), excessive log output might result when querying the reliability, availability, and serviceability (RAS) bad pages. This issue will be fixed in a future AMD GPU Driver(amdgpu) and ROCm release.
## ROCm resolved issues

@@ -269,33 +269,6 @@ For a complete and up-to-date list of JAX public modules (for example, ``jax.num
JAX API modules are maintained by the JAX project and is subject to change.
Refer to the official Jax documentation for the most up-to-date information.
Key features and enhancements for ROCm 7.1
===============================================================================
- Enabled compilation of multihost HLO runner Python bindings.
- Backported multihost HLO runner bindings and some related changes to
:code:`FunctionalHloRunner`.
- Added :code:`requirements_lock_3_12` to enable building for Python 3.12.
- Removed hardcoded NHWC convolution layout for ``fp16`` precision to address the performance drops for ``fp16`` precision on gfx12xx GPUs.
- ROCprofiler-SDK integration:
- Integrated ROCprofiler-SDK (v3) to XLA to improve profiling of GPU events,
support both time-based and step-based profiling.
- Added unit tests for :code:`rocm_collector` and :code:`rocm_tracer`.
- Added Triton unsupported conversion from ``f8E4M3FNUZ`` to ``fp16`` with
rounding mode.
- Introduced :code:`CudnnFusedConvDecomposer` to revert fused convolutions
when :code:`ConvAlgorithmPicker` fails to find a fused algorithm, and removed
unfused fallback paths from :code:`RocmFusedConvRunner`.
Key features and enhancements for ROCm 7.0
===============================================================================

@@ -249,6 +249,3 @@ html_context = {
"granularity_type" : [('Coarse-grained', 'coarse-grained'), ('Fine-grained', 'fine-grained')],
"scope_type" : [('Device', 'device'), ('System', 'system')]
}
# Disable figure and table numbering
numfig = False

@@ -1,4 +1,4 @@
rocm-docs-core==1.30.1
rocm-docs-core==1.29.0
sphinx-reredirects
sphinx-sitemap
sphinxcontrib.datatemplates==0.11.0

@@ -187,7 +187,7 @@ requests==2.32.5
# via
# pygithub
# sphinx
rocm-docs-core==1.30.1
rocm-docs-core==1.29.0
# via -r requirements.in
rpds-py==0.29.0
# via