Compare commits


14 Commits

Author SHA1 Message Date
Pratik Basyal
af8ea73581 720 reference link update and note fixes [Develop] (#5883) (#5884)
* Links updated to 7.2.0

* Compatibility note fixed
2026-01-22 12:21:46 -05:00
Pratik Basyal
f1c86d7d29 720 Post GA Known Issues update (#5879)
* 7.2.0 Known issues and PLDM table updated (#5877)

* Known issues and PLDM table updated

* JAX workload known issues added

* Minor changes

* Minor update
2026-01-21 17:29:18 -05:00
Alex Xu
370816001e Merge branch 'roc-7.2.x' into docs/7.2.0 2026-01-21 15:29:08 -05:00
Swati Rawat
d5994da509 Merge pull request #5872 from SwRaw/swaraw_cherrypick
Cherrypicking replacement of rocm-smi with amd-smi from ROCm internal
2026-01-21 19:10:51 +05:30
srawat
c02f86c0e7 Update prerequisite-system-validation.rst 2026-01-21 17:43:10 +05:30
srawat
d3523c24d3 replace rocm-smi reference with amd-smi 2026-01-21 17:40:26 +05:30
Swati Rawat
1980239b81 Update docs/how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v24.12-dev.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-21 17:31:41 +05:30
Swati Rawat
c75fd6f532 Update docs/how-to/rocm-for-ai/system-setup/prerequisite-system-validation.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-21 17:31:05 +05:30
Swati Rawat
72cb598190 Update docs/how-to/rocm-for-ai/system-setup/prerequisite-system-validation.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-21 17:30:33 +05:30
Swati Rawat
9b55b77aaa Update docs/how-to/rocm-for-ai/system-setup/prerequisite-system-validation.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-21 17:29:45 +05:30
Swati Rawat
8267303e1d Update docs/how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v24.12-dev.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-21 17:29:04 +05:30
Swati Rawat
86d2c4e891 Update docs/how-to/rocm-for-ai/training/benchmark-docker/previous-versions/megatron-lm-v24.12-dev.rst
Co-authored-by: peterjunpark <git@peterjunpark.com>
2026-01-21 17:28:23 +05:30
srawat
2977e35330 Update single-gpu-fine-tuning-and-inference.rst 2026-01-21 17:27:13 +05:30
srawat
e95955f572 Update multi-gpu-fine-tuning-and-inference.rst 2026-01-21 17:27:13 +05:30
4 changed files with 22 additions and 12 deletions

View File

@@ -199,7 +199,7 @@ for a complete overview of this release.
* fftw_execute_dft_c2r
* fftwf_execute_dft_c2r
-### **HIPIFY** (22.2.0)
+### **HIPIFY** (22.0.0)
#### Added
@@ -279,6 +279,10 @@ for a complete overview of this release.
* Updated clang/llvm to AMD clang version 22.0.0 (equivalent to LLVM 22.0.0 with additional out-of-tree patches).
#### Upcoming changes
* As of ROCm 7.2.0, the [HIPCC](https://rocm.docs.amd.com/projects/HIPCC/en/latest/index.html) compiler is deprecated. HIPCC now invokes [AMD Clang](https://rocm.docs.amd.com/projects/llvm-project/en/latest/index.html). It's recommended that you now invoke AMD Clang directly rather than using HIPCC. There isn't any expected impact on usability, functionality, or performance when invoking AMD Clang directly. In a future ROCm release, HIPCC will become a symbolic link to AMD Clang.
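As a hedged sketch of the migration described above, a CMake-based project could point the HIP language support at AMD Clang directly rather than the hipcc wrapper. The install path and GPU architecture below are assumptions, not values from this release; adjust them for your system:

```cmake
# Illustrative sketch only: the compiler path and GPU architecture are
# assumptions; substitute the values for your ROCm install and hardware.
# Point CMake's HIP language support at AMD Clang instead of the
# deprecated hipcc wrapper.
set(CMAKE_HIP_COMPILER /opt/rocm/llvm/bin/amdclang++)  # assumed install path
set(CMAKE_HIP_ARCHITECTURES gfx942)                    # assumed target GPU
```

`CMAKE_HIP_COMPILER` and `CMAKE_HIP_ARCHITECTURES` are standard CMake HIP-language variables (CMake 3.21+); the same switch can be made on a plain command line by invoking `amdclang++ -x hip` where `hipcc` was used before.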
### **MIGraphX** (2.15.0)
#### Added

View File

@@ -139,7 +139,7 @@ GPU and baseboard firmware versioning might differ across GPU families.
</td>
</tr>
<tr>
-<td>MI300X</td>
+<td>MI300X<a href="#footnote2"><sup>[2]</sup></a></td>
<td>01.25.03.12</td>
<td rowspan="6" style="vertical-align: middle;">
30.30.0<br>
@@ -180,6 +180,7 @@ GPU and baseboard firmware versioning might differ across GPU families.
</div>
<p id="footnote1">[1]: For AMD Instinct MI325X KVM SR-IOV users, don't use AMD GPU driver (amdgpu) 30.20.0.</p>
<p id="footnote2">[2]: AMD Instinct MI300X KVM SR-IOV with Multi-VF (8 VF) support requires a compatible firmware BKC bundle for the GPU, which will be released in the coming months.</p>
#### Node power management for multi-GPU nodes added
@@ -245,7 +246,7 @@ New Stream Management API `hipStreamCopyAttributes` is implemented for CUDA Pari
The rocSHMEM communications library has added the GDA (GPUDirect Async) intra-node and inter-node communication backend conduit. This new backend enables communication between GPUs within a node or between nodes through an RNIC (RDMA NIC), using device-initiated GPU kernels to communicate with other GPUs. The GPU interacts directly with the RNIC, with no host (CPU) involvement in the critical path of communication.
-In addition to the already supported GDA NIC types, Mellanox CX-7 and Broadcom Thor2, ROCm 7.2.0 introduces support for AMD Pensando AI NIC installed with the corresponding driver and firmware versions that support GDA functionality. For more information, see [Installing rocSHMEM](https://rocm.docs.amd.com/projects/rocSHMEM/en/latest/install.html).
+In addition to the already supported GDA NIC types, Mellanox CX-7 and Broadcom Thor2, ROCm 7.2.0 introduces support for AMD Pensando AI NIC installed with the corresponding driver and firmware versions that support GDA functionality. For more information, see [Installing rocSHMEM](https://rocm.docs.amd.com/projects/rocSHMEM/en/docs-7.2.0/install.html).
### Software-managed plan cache support for hipTensor
@@ -285,7 +286,7 @@ MIGraphX has the following enhancements:
### AMDGPU wavefront size macro removal
-The `__AMDGCN_WAVEFRONT_SIZE` and `__AMDGCN_WAVEFRONT_SIZE__` macros, which provided a compile-time-constant wavefront size, are removed. Where required, the wavefront size should instead be queried using the `warpSize` variable in device code, or using `hipGetDeviceProperties` in host code. Neither of these will result in a compile-time constant. For more information, see [warpSize](https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/hip_cpp_language_extensions.html#warpsize).
+The `__AMDGCN_WAVEFRONT_SIZE` and `__AMDGCN_WAVEFRONT_SIZE__` macros, which provided a compile-time-constant wavefront size, are removed. Where required, the wavefront size should instead be queried using the `warpSize` variable in device code, or using `hipGetDeviceProperties` in host code. Neither of these will result in a compile-time constant. For more information, see [warpSize](https://rocm.docs.amd.com/projects/HIP/en/docs-7.2.0/how-to/hip_cpp_language_extensions.html#warpsize).
For cases where compile-time evaluation of the wavefront size cannot be avoided, uses of `__AMDGCN_WAVEFRONT_SIZE` or `__AMDGCN_WAVEFRONT_SIZE__` can be replaced with a user-defined macro or `constexpr` variable with the wavefront size(s) for the target hardware. For example:
```
// Illustrative sketch only: a user-defined compile-time wavefront size for
// builds that target known hardware (64 on CDNA/GCN GPUs, 32 on RDNA GPUs).
constexpr unsigned int MY_WAVEFRONT_SIZE = 64;
```
@@ -349,13 +350,13 @@ The ROCm Offline Installer Creator 7.2.0 includes the following features and imp
* Fixes for ROCm and driver minimum-mode installer creation on Oracle Linux 10.0.
* Added support for creating an offline installer for Oracle Linux 8, 9, and 10, where the kernel version of the target OS differs from the host OS creating the installer.
-See [ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/rocm-offline-installer.html) for more information.
+See [ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.2.0/install/rocm-offline-installer.html) for more information.
### ROCm Runfile Installer updates
The ROCm Runfile Installer 7.2.0 includes fixes for rocm-examples test script build issues.
-For more information, see [ROCm Runfile Installer](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/rocm-runfile-installer.html).
+For more information, see [ROCm Runfile Installer](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.2.0/install/rocm-runfile-installer.html).
### Expansion of the ROCm examples repository
@@ -375,7 +376,7 @@ Usage examples are now available for the [ROCgdb](https://github.com/ROCm/rocm-e
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
-* The newest resource for ROCm and HIP developers is the [AMD ROCm Programming Guide](https://rocm-handbook.amd.com/projects/amd-rocm-programming-guide/en/latest/). This guide introduces the core concepts, APIs, and best practices for programming with ROCm and the HIP programming language. It provides hands-on guidance for writing GPU kernels, managing memory, optimizing performance, and integrating HIP with the broader AMD ROCm ecosystem of tools and libraries. The [HIP documentation](https://rocm.docs.amd.com/projects/HIP/en/latest/index.html) set continues to provide detailed information, tutorials, and reference content.
+* The newest resource for ROCm and HIP developers is the [AMD ROCm Programming Guide](https://rocm-handbook.amd.com/projects/amd-rocm-programming-guide/en/docs-7.2.0/). This guide introduces the core concepts, APIs, and best practices for programming with ROCm and the HIP programming language. It provides hands-on guidance for writing GPU kernels, managing memory, optimizing performance, and integrating HIP with the broader AMD ROCm ecosystem of tools and libraries. The [HIP documentation](https://rocm.docs.amd.com/projects/HIP/en/docs-7.2.0/index.html) set continues to provide detailed information, tutorials, and reference content.
* The HIP Programming Guide section includes a new topic titled [“Understanding GPU performance”](https://rocm.docs.amd.com/projects/HIP/en/docs-7.2.0/understand/performance_optimization.html). It explains the theoretical foundations of GPU performance on AMD hardware. Understanding these concepts helps you analyze performance characteristics, identify bottlenecks, and make informed optimization decisions. Two other topics in this guide have been enhanced: [Performance guidelines](https://rocm.docs.amd.com/projects/HIP/en/docs-7.2.0/how-to/performance_guidelines.html) and [Hardware implementation](https://rocm.docs.amd.com/projects/HIP/en/docs-7.2.0/understand/hardware_implementation.html).
@@ -913,7 +914,7 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid
* fftw_execute_dft_c2r
* fftwf_execute_dft_c2r
-### **HIPIFY** (22.2.0)
+### **HIPIFY** (22.0.0)
#### Added
@@ -995,7 +996,7 @@ For a historical overview of ROCm component updates, see the {doc}`ROCm consolid
#### Upcoming changes
-* As of ROCm 7.2.0, the [HIPCC](https://rocm.docs.amd.com/projects/HIPCC/en/latest/index.html) compiler is deprecated. HIPCC now invokes [AMD Clang](https://rocm.docs.amd.com/projects/llvm-project/en/latest/index.html). It's recommended that you now invoke AMD Clang directly rather than using HIPCC. There isn't any expected impact on usability, functionality, or performance when invoking AMD Clang directly. In a future ROCm release, HIPCC will become a symbolic link to AMD Clang.
+* As of ROCm 7.2.0, the [HIPCC](https://rocm.docs.amd.com/projects/HIPCC/en/docs-7.2.0/index.html) compiler is deprecated. HIPCC now invokes [AMD Clang](https://rocm.docs.amd.com/projects/llvm-project/en/docs-7.2.0/index.html). It's recommended that you now invoke AMD Clang directly rather than using HIPCC. There isn't any expected impact on usability, functionality, or performance when invoking AMD Clang directly. In a future ROCm release, HIPCC will become a symbolic link to AMD Clang.
### **MIGraphX** (2.15.0)
@@ -1432,7 +1433,11 @@ python3 -m pip install --user .
`sudo` might be required. Use the `--break-system-packages` flag if pip installation or uninstallation fails.
```
-For detailed instructions, see [Install the Python library for multiple ROCm instances](https://rocm.docs.amd.com/projects/amdsmi/en/latest/install/install.html#install-the-python-library-for-multiple-rocm-instances). The issue will be fixed in a future ROCm release.
+For detailed instructions, see [Install the Python library for multiple ROCm instances](https://rocm.docs.amd.com/projects/amdsmi/en/latest/install/install.html#install-the-python-library-for-multiple-rocm-instances). The issue will be fixed in a future ROCm release. See [GitHub issue #5875](https://github.com/ROCm/ROCm/issues/5875).
### Intermittent errors when running JAX workloads
You might experience intermittent errors or segmentation faults when running JAX workloads. The issue is currently under investigation and will be addressed in an upcoming ROCm release. See [GitHub issue #5878](https://github.com/ROCm/ROCm/issues/5878).
### hipBLASLt performance variation for a particular FP8 GEMM operation on AMD Instinct MI325X GPUs
@@ -1477,7 +1482,7 @@ The following changes to the ROCm software stack are anticipated for future rele
### ROCm Offline Installer Creator deprecation
-The ROCm Offline Installer Creator is deprecated with the ROCm 7.2.0 release. Equivalent installation capabilities are available through the ROCm Runfile Installer, a self-extracting installer that is not based on OS package managers. This installer will be removed in a future release.
+The [ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.2.0/install/rocm-offline-installer.html) is deprecated with the ROCm 7.2.0 release and will be removed in a future release. Equivalent installation capabilities are available through the [ROCm Runfile Installer](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.2.0/install/rocm-runfile-installer.html), a self-extracting installer that is not based on OS package managers.
### ROCm SMI deprecation

View File

@@ -171,6 +171,7 @@ Operating systems, kernel and Glibc versions
*********************************************
For detailed information on the operating systems supported on ROCm 7.2.0 and the associated kernel and Glibc versions, see the latest :ref:`supported_distributions`. For version-specific information, see `ROCm 7.1.1 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.1.1/reference/system-requirements.html#supported-operating-systems>`__ and `ROCm 6.4.0 <https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.0/reference/system-requirements.html#supported-operating-systems>`__.
.. note::
* See `Red Hat Enterprise Linux Release Dates <https://access.redhat.com/articles/3078>`_ to learn about the specific kernel versions supported on Red Hat Enterprise Linux (RHEL).

View File

@@ -130,7 +130,7 @@ After loading the model in this way, the model is fully ready to use the resourc
torchtune for fine-tuning and inference
=============================================
-`torchtune <https://meta-pytorch.org/torchtune/main/>`_ is a PyTorch-native library for easy single and multi-GPU
+`torchtune <https://pytorch.org/torchtune/main/>`_ is a PyTorch-native library for easy single and multi-GPU
model fine-tuning and inference with LLMs.
#. Install torchtune using pip.