mirror of
https://github.com/ROCm/ROCm.git
synced 2026-01-09 14:48:06 -05:00
Sync develop branch
692
CHANGELOG.md
File diff suppressed because it is too large
300
RELEASE.md
@@ -1,4 +1,5 @@
|
||||
# ROCm 6.1 release highlights
|
||||
# ROCm 6.1.1 release notes
|
||||
|
||||
<!-- Disable lints since this is an auto-generated file. -->
|
||||
<!-- markdownlint-disable blanks-around-headers -->
|
||||
<!-- markdownlint-disable no-duplicate-header -->
|
||||
@@ -8,245 +9,140 @@
|
||||
|
||||
<!-- spellcheck-disable -->
|
||||
|
||||
The ROCm™ 6.1 release consists of new features and fixes to improve the stability and
|
||||
performance of AMD Instinct™ MI300 GPU applications. Notably, we've added:
|
||||
ROCm™ 6.1.1 introduces minor fixes and improvements to some tools and libraries.
|
||||
|
||||
* Full support for Ubuntu 22.04.4.
|
||||
## OS support
|
||||
|
||||
* **rocDecode**, a new ROCm component that provides high-performance video decode support for
|
||||
AMD GPUs. With rocDecode, you can decode compressed video streams while keeping the resulting
|
||||
YUV frames in video memory. With decoded frames in video memory, you can run video
|
||||
post-processing using ROCm HIP, avoiding unnecessary data copies via the PCIe bus.
|
||||
ROCm 6.1.1 has been tested against a pre-release version of Ubuntu 22.04.5 (kernel 6.8).
|
||||
|
||||
To learn more, refer to the rocDecode
|
||||
[documentation](https://rocm.docs.amd.com/projects/rocDecode/en/latest/).
|
||||
## AMD SMI
|
||||
|
||||
## OS and GPU support changes
|
||||
AMD SMI for ROCm 6.1.1
|
||||
|
||||
ROCm 6.1 adds the following operating system support:
|
||||
### Additions
|
||||
|
||||
* MI300A: Ubuntu 22.04.4 and RHEL 9.3
|
||||
* MI300X: Ubuntu 22.04.4
|
||||
- Added deferred error correctable counts to `amd-smi metric --ecc --ecc-blocks`.
|
||||
|
||||
Future releases will add additional operating systems to match the general offering. For older
|
||||
generations of supported AMD Instinct products, we’ve added Ubuntu 22.04.4 support.
|
||||
### Changes
|
||||
|
||||
```{tip}
|
||||
To view the complete list of supported GPUs and operating systems, refer to the system requirements
|
||||
page for
|
||||
[Linux](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html)
|
||||
and
|
||||
[Windows](https://rocm.docs.amd.com/projects/install-on-windows/en/latest/reference/system-requirements.html).
|
||||
- Updated the output of `amd-smi metric --ecc-blocks` to show counters available from blocks.
|
||||
- Updated the output of `amd-smi metric --clock` to reflect each engine.
|
||||
- Updated the output of `amd-smi topology --json` to align with output reported by host and guest systems.
|
||||
|
||||
### Fixes
|
||||
|
||||
- Fixed `amd-smi metric --clock`'s clock lock and deep sleep status.
|
||||
- Fixed an issue that would cause an error when resetting non-AMD GPUs.
|
||||
- Fixed `amd-smi metric --pcie` and `amdsmi_get_pcie_info()` when using RDNA3 (Navi 32 and Navi 31) hardware to prevent "UNKNOWN" reports.
|
||||
- Fixed the output results of `amd-smi process` when getting processes running on a device.
|
||||
|
||||
### Removals
|
||||
|
||||
- Removed the `amdsmi_get_gpu_process_info` API from the Python library. It was removed from the C library in an earlier release.
|
||||
|
||||
### Known issues
|
||||
|
||||
- `amd-smi bad-pages` can result in a `ValueError: Null pointer access` error when using some PMU firmware versions.
|
||||
|
||||
```{note}
|
||||
See the [detailed changelog](https://github.com/ROCm/amdsmi/blob/docs/6.1.1/CHANGELOG.md) with code samples for more information.
|
||||
```
|
||||
|
||||
## Installation packages
|
||||
## HIPCC
|
||||
|
||||
This release includes a new set of packages for every module (all libraries and binaries default to
|
||||
`DT_RPATH`). Package names have the suffix `rpath`; for example, the `rpath` variant of `rocminfo` is
|
||||
`rocminfo-rpath`.
|
||||
HIPCC for ROCm 6.1.1
|
||||
|
||||
```{warning}
|
||||
The new `rpath` packages will conflict with the default packages; they are meant to be used only in
|
||||
environments where legacy `DT_RPATH` is the preferred form of linking (instead of `DT_RUNPATH`). We
|
||||
do **not** recommend installing both sets of packages.
|
||||
```
|
||||
### Changes
|
||||
|
||||
## ROCm components
|
||||
- **Upcoming:** a future release will enable use of compiled binaries `hipcc.bin` and `hipconfig.bin` by default. No action is needed by users. You can continue calling high-level Perl scripts `hipcc` and `hipconfig`. `hipcc.bin` and `hipconfig.bin` will be invoked by the high-level Perl scripts. To revert to the previous behavior and invoke `hipcc.pl` and `hipconfig.pl`, set the `HIP_USE_PERL_SCRIPTS` environment variable to `1`.
|
||||
- **Upcoming:** a subsequent release will remove high-level Perl scripts `hipcc` and `hipconfig`. This release will remove the `HIP_USE_PERL_SCRIPTS` environment variable. It will rename `hipcc.bin` and `hipconfig.bin` to `hipcc` and `hipconfig` respectively. No action is needed by the users. To revert to the previous behavior, invoke `hipcc.pl` and `hipconfig.pl` explicitly.
|
||||
- **Upcoming:** a subsequent release will remove `hipcc.pl` and `hipconfig.pl`.
|
||||
|
||||
The following sections highlight select component-specific changes. For additional details, refer to the
|
||||
[Changelog](https://rocm.docs.amd.com/en/develop/about/CHANGELOG.html).
|
||||
## HIPIFY
|
||||
|
||||
### AMD System Management Interface (SMI) Tool
|
||||
HIPIFY for ROCm 6.1.1
|
||||
|
||||
* **New monitor command for GPU metrics**.
|
||||
Use the monitor command to customize, capture, collect, and observe GPU metrics on
|
||||
target devices.
|
||||
### Additions
|
||||
|
||||
* **Integration with E-SMI**.
|
||||
The EPYC™ System Management Interface In-band Library is a Linux C-library that provides in-band
|
||||
user space software APIs to monitor and control your CPU’s power, energy, performance, and other
|
||||
system management functionality. This integration enables access to CPU metrics and telemetry
|
||||
through the AMD SMI API and CLI tools.
|
||||
- Added support for LLVM 18.1.2.
|
||||
- Added support for cuDNN 9.0.0.
|
||||
- Added a new option: `--clang-resource-directory` to specify the clang resource path (the path to the parent folder for the `include` folder that contains `__clang_cuda_runtime_wrapper.h` and other header files used during the hipification process).
|
||||
|
||||
### Composable Kernel (CK)
|
||||
## ROCm SMI
|
||||
|
||||
* **New architecture support**.
|
||||
CK now supports the following architectures to enable efficient image denoising on the following
|
||||
AMD GPUs: gfx1030, gfx1100, gfx1031, gfx1101, gfx1032, gfx1102, gfx1034, gfx1103, gfx1035,
|
||||
gfx1036
|
||||
ROCm SMI for ROCm 6.1.1
|
||||
|
||||
* **FP8 rounding logic is replaced with stochastic rounding**.
|
||||
Stochastic rounding mimics a more realistic data behavior and improves model convergence.
|
||||
### Known issues
|
||||
|
||||
### HIP
|
||||
- ROCm SMI reports GPU utilization incorrectly for RDNA3 GPUs in some situations.
|
||||
|
||||
* **New environment variable to enable kernel run serialization**.
|
||||
The default `HIP_LAUNCH_BLOCKING` value is `0` (disabled), which causes kernels to run as defined in
|
||||
the queue. When set to `1` (enable), the HIP runtime serializes the kernel queue, which behaves the
|
||||
same as `AMD_SERIALIZE_KERNEL`.
|
||||
## Library changes in ROCm 6.1.1
|
||||
|
||||
### hipBLASLt
|
||||
| Library | Version |
|
||||
| ----------- | -------------------------------------------------------------------------- |
|
||||
| AMDMIGraphX | [2.9](https://github.com/ROCm/AMDMIGraphX/releases/tag/rocm-6.1.1) |
|
||||
| hipBLAS | [2.1.0](https://github.com/ROCm/hipBLAS/releases/tag/rocm-6.1.1) |
|
||||
| hipBLASLt | [0.7.0](https://github.com/ROCm/hipBLASLt/releases/tag/rocm-6.1.1) |
|
||||
| hipCUB | [3.1.0](https://github.com/ROCm/hipCUB/releases/tag/rocm-6.1.1) |
|
||||
| hipFFT | [1.0.14](https://github.com/ROCm/hipFFT/releases/tag/rocm-6.1.1) |
|
||||
| hipRAND | [2.10.17](https://github.com/ROCm/hipRAND/releases/tag/rocm-6.1.1) |
|
||||
| hipSOLVER | 2.1.0 ⇒ [2.1.1](https://github.com/ROCm/hipSOLVER/releases/tag/rocm-6.1.1) |
|
||||
| hipSPARSE | [3.0.1](https://github.com/ROCm/hipSPARSE/releases/tag/rocm-6.1.1) |
|
||||
| hipSPARSELt | [0.2.0](https://github.com/ROCm/hipSPARSELt/releases/tag/rocm-6.1.1) |
|
||||
| hipTensor | [1.2.0](https://github.com/ROCm/hipTensor/releases/tag/rocm-6.1.1) |
|
||||
| MIOpen | [3.1.0](https://github.com/ROCm/MIOpen/releases/tag/rocm-6.1.1) |
|
||||
| MIVisionX | [2.5.0](https://github.com/ROCm/MIVisionX/releases/tag/rocm-6.1.1) |
|
||||
| rccl | [2.18.6](https://github.com/ROCm/rccl/releases/tag/rocm-6.1.1) |
|
||||
| rocALUTION | [3.1.1](https://github.com/ROCm/rocALUTION/releases/tag/rocm-6.1.1) |
|
||||
| rocBLAS | [4.1.0](https://github.com/ROCm/rocBLAS/releases/tag/rocm-6.1.1) |
|
||||
| rocDecode | [0.5.0](https://github.com/ROCm/rocDecode/releases/tag/rocm-6.1.1) |
|
||||
| rocFFT | 1.0.26 ⇒ [1.0.27](https://github.com/ROCm/rocFFT/releases/tag/rocm-6.1.1) |
|
||||
| rocm-cmake | [0.12.0](https://github.com/ROCm/rocm-cmake/releases/tag/rocm-6.1.1) |
|
||||
| rocPRIM | [3.1.0](https://github.com/ROCm/rocPRIM/releases/tag/rocm-6.1.1) |
|
||||
| rocRAND | [3.0.1](https://github.com/ROCm/rocRAND/releases/tag/rocm-6.1.1) |
|
||||
| rocSOLVER | [3.25.0](https://github.com/ROCm/rocSOLVER/releases/tag/rocm-6.1.1) |
|
||||
| rocSPARSE | [3.1.2](https://github.com/ROCm/rocSPARSE/releases/tag/rocm-6.1.1) |
|
||||
| rocThrust | [3.0.1](https://github.com/ROCm/rocThrust/releases/tag/rocm-6.1.1) |
|
||||
| rocWMMA | [1.4.0](https://github.com/ROCm/rocWMMA/releases/tag/rocm-6.1.1) |
|
||||
| rpp | [1.5.0](https://github.com/ROCm/rpp/releases/tag/rocm-6.1.1) |
|
||||
| Tensile | [4.40.0](https://github.com/ROCm/Tensile/releases/tag/rocm-6.1.1) |
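In the table above, a row of the form `old ⇒ [new](link)` appears to mark a library whose version changed in this release, while a bare link appears to mean the version is unchanged (this reading is an assumption based on the table's notation). A minimal Python sketch for extracting only the changed libraries from such a table follows; `ROW` and `changed_libraries` are hypothetical helper names, not part of the autotag tooling.

```python
import re

# Matches one Markdown table row such as
# "| rocFFT | 1.0.26 ⇒ [1.0.27](https://...) |".
# The "old ⇒" part is optional; rows without it are treated as unchanged.
ROW = re.compile(
    r"^\|\s*(?P<name>[^|]+?)\s*\|\s*"
    r"(?:(?P<old>[\d.]+)\s*⇒\s*)?"
    r"\[(?P<new>[\d.]+)\]\((?P<url>[^)]+)\)\s*\|$"
)


def changed_libraries(table: str) -> dict:
    """Return {library: (old_version, new_version)} for rows with a '⇒'."""
    changed = {}
    for line in table.splitlines():
        match = ROW.match(line.strip())
        if match and match["old"]:
            changed[match["name"]] = (match["old"], match["new"])
    return changed


sample = """\
| Library | Version |
| --- | --- |
| hipRAND | [2.10.17](https://github.com/ROCm/hipRAND/releases/tag/rocm-6.1.1) |
| hipSOLVER | 2.1.0 ⇒ [2.1.1](https://github.com/ROCm/hipSOLVER/releases/tag/rocm-6.1.1) |
| rocFFT | 1.0.26 ⇒ [1.0.27](https://github.com/ROCm/rocFFT/releases/tag/rocm-6.1.1) |
"""
print(changed_libraries(sample))
```

Header and separator rows contain no version link, so the pattern skips them without special handling.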
|
||||
|
||||
* **New GemmTuning extension parameter**. GemmTuning allows you to set a split-k value for each solution, which makes performance tuning more practical.
|
||||
#### hipBLASLt 0.7.0
|
||||
|
||||
### hipFFT
|
||||
hipBLASLt 0.7.0 for ROCm 6.1.1
|
||||
|
||||
* **New multi-GPU support for single-process transforms**. Multiple GPUs can be used to perform a transform in a single process. Note that this initial
|
||||
implementation is a functional preview.
|
||||
##### Additions
|
||||
|
||||
### HIPIFY
|
||||
- Added `hipblasltExtSoftmax` extension API.
|
||||
- Added `hipblasltExtLayerNorm` extension API.
|
||||
- Added `hipblasltExtAMax` extension API.
|
||||
- Added the `GemmTuning` extension parameter, which lets users set split-k.
- Added support for mixed-precision datatypes: fp16/fp8 input with fp16 output.
|
||||
|
||||
* **Skipped code blocks**: Code blocks that are skipped by the preprocessor are no longer hipified under the
|
||||
`--default-preprocessor` option. To hipify everything, despite conditional preprocessor directives
|
||||
(`#if`, `#ifdef`, `#ifndef`, `#elif`, or `#else`), don't use the `--default-preprocessor` or `--amap` options.
|
||||
##### Deprecations
|
||||
|
||||
### hipSPARSELt
|
||||
- **Upcoming**: `algoGetHeuristic()` ext API for GroupGemm will be deprecated in a future release of hipBLASLt.
|
||||
|
||||
* **Structured sparsity matrix support extensions**
|
||||
Structured sparsity matrices help speed up deep-learning workloads. We now support `B` as the
|
||||
sparse matrix and `A` as the dense matrix in Sparse Matrix-Matrix Multiplication (SPMM). Prior to this
|
||||
release, we only supported sparse (matrix A) x dense (matrix B) matrix multiplication.
|
||||
### hipSOLVER 2.1.1
|
||||
|
||||
### hipTensor
|
||||
hipSOLVER 2.1.1 for ROCm 6.1.1
|
||||
|
||||
* **4D tensor permutation and contraction support**.
|
||||
You can now perform tensor permutation on 4D tensors and 4D contractions for F16, BF16, and
|
||||
Complex F32/F64 datatypes.
|
||||
#### Changes
|
||||
|
||||
### MIGraphX
|
||||
- By default, `BUILD_WITH_SPARSE` is now set to OFF on Microsoft Windows.
|
||||
|
||||
* **Improved performance for transformer-based models**.
|
||||
We added support for FlashAttention, which benefits models like BERT, GPT, and Stable Diffusion.
|
||||
#### Fixes
|
||||
|
||||
* **New Torch-MIGraphX driver**.
|
||||
This driver calls MIGraphX directly from PyTorch. It provides an `mgx_module` object that you can
|
||||
invoke like any other Torch module, but which utilizes the MIGraphX inference engine internally.
|
||||
Torch-MIGraphX supports FP32, FP16, and INT8 datatypes.
|
||||
- Fixed benchmark client build when `BUILD_WITH_SPARSE` is OFF.
|
||||
|
||||
* **FP8 support**. We now offer functional support for inference in the FP8E4M3FNUZ datatype. You
|
||||
can load an ONNX model in FP8E4M3FNUZ using C++ or Python APIs, or `migraphx-driver`.
|
||||
You can quantize a floating point model to FP8 format by using the `--fp8` flag with `migraphx-driver`.
|
||||
To accelerate inference, MIGraphX uses hardware acceleration on MI300 for FP8 by leveraging FP8
|
||||
support in various backend kernel libraries.
|
||||
### rocFFT 1.0.27
|
||||
|
||||
### MIOpen
|
||||
rocFFT 1.0.27 for ROCm 6.1.1
|
||||
|
||||
* **Improved performance for inference and convolutions**.
|
||||
Inference support is now provided for Find 2.0 fusion plans. Additionally, we've enhanced the Number of
|
||||
samples, Height, Width, and Channels (NHWC) convolution kernels for heuristics. NHWC stores data
|
||||
in a format where the height and width dimensions come first, followed by channels.
|
||||
#### Additions
|
||||
|
||||
### OpenMP
|
||||
- Enabled multi-GPU testing on systems without direct GPU interconnects.
|
||||
|
||||
* **Implicit Zero-copy is triggered automatically in XNACK-enabled MI300A systems**.
|
||||
Implicit Zero-copy behavior in `non unified_shared_memory` programs is triggered automatically in
|
||||
XNACK-enabled MI300A systems (for example, when using the `HSA_XNACK=1` environment
|
||||
variable). OpenMP supports the 'requires `unified_shared_memory`' directive to support programs
|
||||
that don’t want to copy data explicitly between the CPU and GPU. However, this requires that you add
|
||||
these directives to every translation unit of the program.
|
||||
#### Fixes
|
||||
|
||||
* **New MI300 FP atomics**. Application performance can now improve by leveraging fast floating-point atomics on MI300 (gfx942).
|
||||
|
||||
|
||||
### RCCL
|
||||
|
||||
* **NCCL 2.18.6 compatibility**.
|
||||
RCCL is now compatible with NCCL 2.18.6, which includes increasing the maximum IB network interfaces to 32 and fixing network device ordering when creating communicators with only one GPU
|
||||
per node.
|
||||
|
||||
* **Doubled simultaneous communication channels**.
|
||||
We improved MI300X performance by increasing the maximum number of simultaneous
|
||||
communication channels from 32 to 64.
|
||||
|
||||
### rocALUTION
|
||||
|
||||
* **New multiple node and GPU support**.
|
||||
Unsmoothed and smoothed aggregations and Ruge-Stueben AMG now work with multiple nodes
|
||||
and GPUs. For more information, refer to the
|
||||
[API documentation](https://rocm.docs.amd.com/projects/rocALUTION/en/latest/usermanual/solvers.html#unsmoothed-aggregation-amg).
|
||||
|
||||
### rocDecode
|
||||
|
||||
* **New ROCm component**.
|
||||
rocDecode is ROCm's newest component, providing high-performance video decode support for AMD
|
||||
GPUs. To learn more, refer to the
|
||||
[documentation](https://rocm.docs.amd.com/projects/rocDecode/en/latest/).
|
||||
|
||||
### ROCm Compiler
|
||||
|
||||
* **Combined projects**. ROCm Device-Libs, ROCm Compiler Support, and hipCC are now located in
|
||||
the `llvm-project/amd` subdirectory of AMD's fork of the LLVM project. Previously, these projects
|
||||
were maintained in separate repositories. Note that the projects themselves will continue to be
|
||||
packaged separately.
|
||||
|
||||
* **Split the 'rocm-llvm' package**. This package has been split into a required and an optional package:
|
||||
|
||||
* **rocm-llvm (required)**: A package containing the essential binaries needed for compilation.
* **rocm-llvm-dev (optional)**: A package containing binaries for compiler and application developers.
|
||||
|
||||
|
||||
### ROCm Data Center Tool (RDC)
|
||||
|
||||
* **C++ upgrades**.
|
||||
RDC was upgraded from C++11 to C++17 to enable a more modern C++ standard when writing RDC plugins.
|
||||
|
||||
### ROCm Performance Primitives (RPP)
|
||||
|
||||
* **New backend support**.
|
||||
Audio processing support added for the `HOST` backend and 3D Voxel kernels support
|
||||
for the `HOST` and `HIP` backends.
|
||||
|
||||
### ROCm Validation Suite
|
||||
|
||||
* **New datatype support**.
|
||||
Added BF16 and FP8 datatype support for General Matrix Multiply (GEMM) operations in the GPU Stress Test (GST) module. This enables additional performance benchmarking and stress testing with the newly supported datatypes.
|
||||
|
||||
### rocSOLVER
|
||||
|
||||
* **New EigenSolver routine**.
|
||||
Based on the Jacobi algorithm, a new EigenSolver routine was added to the library. This routine computes the eigenvalues and eigenvectors of a matrix with improved performance.
|
||||
|
||||
### ROCTracer
|
||||
|
||||
* **New versioning and callback enhancements**.
|
||||
ROCTracer now matches the versioning changes in the HIP runtime and supports runtime API callbacks and activity record logging. The APIs of different runtimes at different levels are considered different API domains with assigned domain IDs.
|
||||
|
||||
## Upcoming changes
|
||||
|
||||
* ROCm SMI will be deprecated in a future release. We advise **migrating to AMD SMI** now to
|
||||
prevent future workflow disruptions.
|
||||
|
||||
* hipCC supports, by default, the following compiler invocation flags:
|
||||
|
||||
* `-mllvm -amdgpu-early-inline-all=true`
|
||||
* `-mllvm -amdgpu-function-calls=false`
|
||||
|
||||
In a future ROCm release, hipCC will no longer support these flags. It will, instead, use the Clang
|
||||
defaults:
|
||||
|
||||
* `-mllvm -amdgpu-early-inline-all=false`
|
||||
* `-mllvm -amdgpu-function-calls=true`
|
||||
|
||||
To evaluate the impact of this change, include `--hipcc-func-supp` in your hipCC invocation.
|
||||
|
||||
For information on these flags, and the differences between hipCC and Clang, refer to
|
||||
[ROCm Compiler Interfaces](https://rocm.docs.amd.com/en/latest/reference/rocmcc.html#rocm-compiler-interfaces).
|
||||
|
||||
* Future ROCm releases will not provide `clang-ocl`. For more information, refer to the
|
||||
[`clang-ocl` README](https://github.com/ROCm/clang-ocl).
|
||||
|
||||
* The following operating systems will be supported in a future ROCm release. They are currently
|
||||
only available in beta.
|
||||
|
||||
* RHEL 9.4
|
||||
* RHEL 8.10
|
||||
* SLES 15 SP6
|
||||
|
||||
* As of ROCm 6.2, we’ve planned for **end-of-support** for:
|
||||
|
||||
* Ubuntu 20.04.5
|
||||
* SLES 15 SP4
|
||||
* RHEL/CentOS 7.9
|
||||
- Fixed a kernel launch failure when executing very large odd-length real-complex transforms.
|
||||
|
||||
@@ -8,7 +8,7 @@ subtrees:
|
||||
- entries:
|
||||
- file: what-is-rocm.rst
|
||||
- file: about/release-notes.md
|
||||
title: Release highlights
|
||||
title: Release notes
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: about/CHANGELOG.md
|
||||
|
||||
@@ -20,16 +20,16 @@
|
||||
* Run this for 5.6.0 (change for whatever version you require)
|
||||
* `GITHUB_ACCESS_TOKEN=my_token_here`
|
||||
|
||||
To generate the changelog from 5.0.0 up to and including 6.1.0:
|
||||
To generate the changelog from 5.0.0 up to and including 6.1.1:
|
||||
|
||||
```sh
|
||||
python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --do-previous --compile_file ../../CHANGELOG.md --branch release/rocm-rel-6.1 6.1.0
|
||||
python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --do-previous --compile_file ../../CHANGELOG.md --branch release/rocm-rel-6.1 6.1.1
|
||||
```
|
||||
|
||||
To generate the changelog only for 6.1.0:
|
||||
To generate the changelog only for 6.1.1:
|
||||
|
||||
```sh
|
||||
python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --compile_file ../../CHANGELOG.md --branch release/rocm-rel-6.1 6.1.0
|
||||
python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --compile_file ../../CHANGELOG.md --branch release/rocm-rel-6.1 6.1.1
|
||||
```
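The two invocations above differ only in the `--do-previous` flag and the version argument. A small Python sketch that assembles the argument list for either case follows; the helper name `tag_script_args` is hypothetical, `my_token_here` is a placeholder, and only flags that appear in the commands above are used.

```python
import shlex


def tag_script_args(version, branch, token, include_previous=False,
                    compile_file="../../CHANGELOG.md"):
    """Build the tag_script.py command line shown above as an argument list."""
    args = ["python3", "tag_script.py", "-t", token, "--no-release", "--no-pulls"]
    if include_previous:
        # Per the text above, this variant also covers the earlier releases.
        args.append("--do-previous")
    args += ["--compile_file", compile_file, "--branch", branch, version]
    return args


cmd = tag_script_args("6.1.1", "release/rocm-rel-6.1", "my_token_here")
print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the arguments ready for `subprocess.run` without shell quoting concerns.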
|
||||
|
||||
### Notes
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
The ROCm 6.0.2 point release consists of minor bug fixes to improve the stability of MI300 GPU applications. This release introduces several new driver features for system qualification on our partner server offerings.
|
||||
|
||||
The ROCm 6.0.2 point release consists of minor bug fixes to improve the stability of MI300 GPU applications. This release introduces several new driver features for system qualification on our partner server offerings.
|
||||
|
||||
#### hipFFT 1.0.13
|
||||
|
||||
|
||||
67
tools/autotag/templates/rocm_changes/6.1.1.md
Normal file
@@ -0,0 +1,67 @@
ROCm™ 6.1.1 introduces minor fixes and improvements to some tools and libraries.

### OS support

ROCm 6.1.1 has been tested against a pre-release version of Ubuntu 22.04.5 (kernel 6.8).

### AMD SMI

AMD SMI for ROCm 6.1.1

#### Additions

* Added deferred error correctable counts to `amd-smi metric --ecc --ecc-blocks`.

#### Changes

* Updated the output of `amd-smi metric --ecc-blocks` to show counters available from blocks.
* Updated the output of `amd-smi metric --clock` to reflect each engine.
* Updated the output of `amd-smi topology --json` to align with output reported by host and guest systems.

#### Fixes

* Fixed `amd-smi metric --clock`'s clock lock status and deep sleep status.
* Fixed an issue that would cause an error when attempting to reset non-AMD GPUs.
* Fixed `amd-smi metric --pcie` and `amdsmi_get_pcie_info()` when using RDNA3 (Navi 32 and Navi 31) hardware to prevent "UNKNOWN" reports.
* Fixed the output results of `amd-smi process` when getting processes running on a device.

#### Removals

* Removed the `amdsmi_get_gpu_process_info` API from the Python library. It was removed from the C library in an earlier release.

#### Known issues

* `amd-smi bad-pages` can result in a `ValueError: Null pointer access` error when using certain PMU firmware versions.

```{note}
See the [detailed changelog](https://github.com/ROCm/amdsmi/blob/docs/6.1.1/CHANGELOG.md) with code samples for more information.
```

### HIPCC

HIPCC for ROCm 6.1.1

#### Changes

* **Upcoming:** a future release will enable use of compiled binaries `hipcc.bin` and `hipconfig.bin` by default. No action is needed by users; you may continue calling high-level Perl scripts `hipcc` and `hipconfig`. `hipcc.bin` and `hipconfig.bin` will be invoked by the high-level Perl scripts. To revert to the previous behavior and invoke `hipcc.pl` and `hipconfig.pl`, set the `HIP_USE_PERL_SCRIPTS` environment variable to `1`.
* **Upcoming:** a subsequent release will remove high-level Perl scripts `hipcc` and `hipconfig`. This release will remove the `HIP_USE_PERL_SCRIPTS` environment variable. It will rename `hipcc.bin` and `hipconfig.bin` to `hipcc` and `hipconfig` respectively. No action is needed by the users. To revert to the previous behavior, invoke `hipcc.pl` and `hipconfig.pl` explicitly.
* **Upcoming:** a subsequent release will remove `hipcc.pl` and `hipconfig.pl`.
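For build scripts that need to keep using the Perl front ends once the compiled binaries become the default, the environment variable can be set per invocation rather than globally. The following is a minimal Python sketch, assuming the variable behaves as described in the first item above. `perl_frontend_env` is a hypothetical helper, the `/opt/rocm/bin` path is only an illustrative value, and the commented-out `subprocess.run` call assumes `hipcc` is on `PATH`.

```python
import os


def perl_frontend_env(base_env=None):
    """Return a copy of the environment with HIP_USE_PERL_SCRIPTS=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HIP_USE_PERL_SCRIPTS"] = "1"
    return env


env = perl_frontend_env({"PATH": "/opt/rocm/bin"})
print(env["HIP_USE_PERL_SCRIPTS"])
# Example use (not run here):
#   import subprocess
#   subprocess.run(["hipcc", "--version"], env=perl_frontend_env(), check=True)
```

Copying the environment leaves the calling process untouched, so other tools launched from the same script are not affected.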
|
||||
|
||||
### HIPIFY

HIPIFY for ROCm 6.1.1

#### Additions

* Added support for LLVM 18.1.2.
* Added support for cuDNN 9.0.0.
* Added a new option: `--clang-resource-directory` to specify the clang resource path (the path to the parent folder for the `include` folder that contains `__clang_cuda_runtime_wrapper.h` and other header files used during the hipification process).

### ROCm SMI

ROCm SMI for ROCm 6.1.1

#### Known issues

* ROCm SMI reports GPU utilization incorrectly for RDNA3 GPUs in some situations.
@@ -1,2 +1,2 @@
|
||||
from .defaults import TEMPLATES, PROCESSORS
|
||||
from .custom_templates import hipfort, mivisionx, rpp, rvs
|
||||
from .custom_templates import hipfort, mivisionx, rpp
|
||||
|
||||
@@ -1,41 +0,0 @@
|
||||
import re

from util.release_data import ReleaseLib
from util.defaults import TEMPLATES, PROCESSORS

TEMPLATES['composable_kernel'] = (
    (
        r"## (\(Unreleased\))? CK (?P<lib_version>\d+\.\d+(?:\.\d+))?"
        r"(?P<for_rocm> for ROCm )?"
        r"(?P<rocm_version>(?(for_rocm)\d+\.\d+(?:\.\d+)?|.*))?"
        r"\n"
        r"(?P<body>(?:(?!## ).*(?:(?!\n## )\n|(?=\n## )))*)"
    )
)


def composable_kernel_processor(data: ReleaseLib, template: str, _, __) -> bool:
    """Processor for releases."""
    changelog = data.repo.get_contents("CHANGELOG.md", data.commit)
    changelog = changelog.decoded_content.decode()
    pattern = re.compile(template)
    match = pattern.search(changelog)
    lib_version = match["lib_version"]
    data.message = (
        f"composable_kernel for ROCm"
        f" {data.full_version}"
    )

    data.lib_version = lib_version
    data.notes = f"""{match["body"]}"""

    change_pattern = re.compile(
        r"^#+ +(?P<type>[^\n]+)$\n*(?P<change>(^(?!#).*\n*)*)",
        re.RegexFlag.MULTILINE
    )
    for match in change_pattern.finditer(data.notes):
        data.data.changes[match["type"]] = match["change"]

    return True

PROCESSORS['composable_kernel'] = composable_kernel_processor
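The `change_pattern` expression in the removed processor above splits a release body into sections keyed by heading text. A small self-contained sketch of that step on a made-up changelog body follows; `split_changes` is a hypothetical wrapper, and trimming whitespace from each section is an addition for readability, not something the original code does.

```python
import re

# Same pattern as in the processor above: a Markdown heading line, then every
# following line up to (but not including) the next line that starts with '#'.
CHANGE_PATTERN = re.compile(
    r"^#+ +(?P<type>[^\n]+)$\n*(?P<change>(^(?!#).*\n*)*)",
    re.MULTILINE,
)


def split_changes(notes: str) -> dict:
    """Map each heading (e.g. 'Fixes') to the text beneath it."""
    return {
        m["type"]: m["change"].strip()
        for m in CHANGE_PATTERN.finditer(notes)
    }


notes = "### Additions\n\n- Added X.\n\n### Fixes\n\n- Fixed Y.\n"
print(split_changes(notes))
```

Because the pattern keys on any heading level, two sections with the same heading text would overwrite each other in the resulting dictionary.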
|
||||
@@ -4,6 +4,7 @@ from dataclasses import dataclass, field
|
||||
import os
|
||||
import re
|
||||
import shutil
|
||||
import sys
|
||||
from typing import Optional, Union, Dict, List, Tuple
|
||||
from github import Github, UnknownObjectException
|
||||
from github.Repository import Repository
|
||||
@@ -352,6 +353,8 @@ class ReleaseBundleFactory:
|
||||
"""Create a release bundle of libraries."""
|
||||
tag_name = f"rocm-{version}"
|
||||
libraries = { }
|
||||
|
||||
missing_branches = []
|
||||
|
||||
print(f"\nLibraries for rocm-{version}:")
|
||||
for name, remote in names_and_remotes:
|
||||
@@ -365,7 +368,13 @@ class ReleaseBundleFactory:
|
||||
continue
|
||||
|
||||
print(f" Defaulting to branch: {self.branch}")
|
||||
commit = repo.get_branch(self.branch).commit.sha
|
||||
try:
|
||||
repo_branch = repo.get_branch(self.branch)
|
||||
commit = repo_branch.commit.sha
|
||||
except Exception:
|
||||
print(f" - Could not find branch : {self.branch}")
|
||||
missing_branches.append(f"{self.branch} for {name}")
|
||||
continue
|
||||
|
||||
libraries[name] = ReleaseLib(
|
||||
name=name,
|
||||
@@ -381,6 +390,9 @@ class ReleaseBundleFactory:
|
||||
libraries=libraries
|
||||
)
|
||||
|
||||
for missing in missing_branches:
|
||||
print(f"Could not find the following branch: {missing}")
|
||||
|
||||
return data
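The change to `ReleaseBundleFactory` above replaces a hard failure on a missing branch with a collect-and-report pattern. A stripped-down sketch of that pattern without the GitHub dependency follows; `resolve_commits` and the dictionary-backed `lookup` are hypothetical stand-ins for `repo.get_branch(...).commit.sha`, and the sample data is made up.

```python
def resolve_commits(names, branch, lookup):
    """Return ({name: sha}, [missing descriptions]) for the given branch."""
    commits = {}
    missing = []
    for name in names:
        try:
            commits[name] = lookup(name, branch)
        except Exception:
            # Record the failure and keep going instead of aborting the run.
            missing.append(f"{branch} for {name}")
            continue
    return commits, missing


# Hypothetical data: only rocFFT has the requested branch.
branches = {("rocFFT", "release/rocm-rel-6.1"): "abc123"}
commits, missing = resolve_commits(
    ["rocFFT", "hipfort"],
    "release/rocm-rel-6.1",
    lambda name, branch: branches[(name, branch)],
)
print(commits)
for item in missing:
    print(f"Could not find the following branch: {item}")
```

Catching the broad `Exception` mirrors the diff; a narrower exception type may be preferable so that unrelated errors are not silently recorded as missing branches.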
|
||||
|
||||
def create_data_dict(
|
||||
|
||||