mirror of
https://github.com/ROCm/ROCm.git
synced 2026-01-08 22:28:06 -05:00
6.2 release notes (#111)
* generate release notes
* update release notes update release.md update anchors fix formatting
* add component notes
* remove known issues from toc
* update pydata sphinx table styling
* remove temp file
* add 6.2.0 templates
* add documentation improvements list
* update conf.py with 6.2.0 version and GA date
* update changelog headings
* remove rserp tickets
* add miopen cl
* remove bolding
* add Ram's feedback fix thing
* rm sub-bullets
* update new components formatting
* update amd smi version
* add css
* add table styles
* add component notes and KIs
* update os support wording
* update highlights
* update compilers cls
* fix links
* add KIs
* update KI wording
* add ram's suggestions
* add omniperf known issue fmt
* system -> system management in components table
* change rocthrust version to 3.0.1
* remove release highlight and add RVS changelog
* update highlights
* fix version nums, add rocr runtime
* reorder components table
* update compiler KI
* more compiler known issue under llvm-proj
* add space
* word
* fix internal links
* add gdb
* update pytorch autocast highligh
* add hipfft cl
* fix hipfft internal link
* fix svg icon color
* fix table
* remove rocblas highlight and update tf hl
* add fixes
* update highlights
* fix ck in table
* fix mivisionx rocal note
* fix link and dbgapi version
* fix link to llvm proj docs
* fix fmt
* add feedback
* add more changes move clang-ocl to upcoming changes add fixes fix some fmt fix table width fix formatting add fixes fix tensile fmt remove unused file update templates change words
* add known issue
* rm "for unknown reasons"
* fix hipsolver, platform -> software stack
* add amdsmi note
* rm mention of mi308 fmt
* add beta note to rocprofiler-sdk fix
* bold a heading
* move hipify under compilers
* Revert "move hipify under compilers" This reverts commit 83861f544a75bce1ea64b14871e1224161d34815.
* fix typos and GA date update text
* update words
* add processor affinity KI and remove rocHPL KI
* update processor affinity KI
* update llvm-proj KI fix
* update processor affinity KI update
* fix hip link
* update templates
* words
* update links to 6.2.0
* remove extra css
* fix some stuff in hip word
* add dell black screen hang ki word
* fix rocpydecode link
* remove sass files
2511
RELEASE.md
File diff suppressed because it is too large
10
docs/conf.py
@@ -25,16 +25,16 @@ latex_elements = {
 project = "ROCm Documentation"
 author = "Advanced Micro Devices, Inc."
 copyright = "Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved."
-version = "6.1.2"
-release = "6.1.2"
+version = "6.2.0"
+release = "6.2.0"
 setting_all_article_info = True
 all_article_info_os = ["linux", "windows"]
 all_article_info_author = ""

 # pages with specific settings
 article_pages = [
-{"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-06-04"},
-{"file": "about/changelog", "os": ["linux", "windows"], "date": "2024-06-04"},
+{"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-08-02"},
+{"file": "about/changelog", "os": ["linux", "windows"], "date": "2024-08-02"},
 {"file": "how-to/deep-learning-rocm", "os": ["linux"]},
 {"file": "how-to/rocm-for-ai/index", "os": ["linux"]},
 {"file": "how-to/rocm-for-ai/install", "os": ["linux"]},
@@ -104,7 +104,7 @@ html_theme = "rocm_docs_theme"
 html_theme_options = {"flavor": "rocm-docs-home"}

 html_static_path = ["sphinx/static/css"]
-html_css_files = ["rocm_custom.css"]
+html_css_files = ["rocm_custom.css", "rocm_rn.css"]

 html_title = "ROCm Documentation"
@@ -191,6 +191,8 @@ This section describes performance-based settings.
 echo 20 > /proc/sys/vm/compaction_proactiveness
 echo 1 > /proc/sys/vm/compact_unevictable_allowed

+.. _mi300a-processor-affinity:
+
 * **Change affinity of ROCm helper threads**

 This change prevents internal ROCm threads from having their CPU core affinity mask
@@ -9,12 +9,6 @@ subtrees:
 - file: what-is-rocm.rst
 - file: about/release-notes.md
 title: Release notes
-subtrees:
-- entries:
-- file: about/changelog.md
-title: Changelog
-- url: https://github.com/ROCm/ROCm/labels/Verified%20Issue
-title: Known issues

 - caption: Install
 entries:
@@ -1,6 +1,21 @@
/* Override PyData Sphinx Theme default colors */
html[data-theme='light'] {
    --pst-color-table-row-hover-bg: #E2E8F0;
}

html[data-theme='dark'] {
    --pst-color-table-row-hover-bg: #1E293B;
}

a svg {
    color: var(--pst-color-text-base);
}

a svg:hover {
    color: var(--pst-color-link-hover);
}

/* Adds container for big tables, used for Compatibility Matrix */

.format-big-table {
    white-space: nowrap;
}
}
126
docs/sphinx/static/css/rocm_rn.css
Normal file
@@ -0,0 +1,126 @@
#rocm-rn-components col {
    width: 6rem;
}
#rocm-rn-components col:nth-child(2) {
    width: 12rem;
}
#rocm-rn-components td {
    white-space: nowrap;
}
#rocm-rn-components td:last-of-type {
    text-align: center;
}
#rocm-rn-components a svg {
    color: var(--pst-color-text-base);
}
#rocm-rn-components a svg:hover {
    color: var(--pst-color-link-hover);
}
#rocm-rn-components .tbody-reverse-zebra tr:nth-child(2n + 1) td {
    background-color: var(--pst-color-table-row-zebra-high-bg);
}
#rocm-rn-components .tbody-reverse-zebra tr:nth-child(2n) td {
    background-color: var(--pst-color-table-row-zebra-low-bg);
}

#rocm-rn-components:has(tbody.rocm-components-libs th[rowspan]:first-of-type:hover) .rocm-components-libs,
#rocm-rn-components:has(tbody.rocm-components-libs th[rowspan]:first-of-type:hover) .rocm-components-libs td,
#rocm-rn-components:has(tbody.rocm-components-libs th[rowspan]:first-of-type:hover) tbody.rocm-components-libs th {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) .rocm-components-tools,
#rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) .rocm-components-tools td,
#rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) tbody.rocm-components-tools th {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-compilers th[rowspan]:first-of-type:hover) .rocm-components-compilers,
#rocm-rn-components:has(tbody.rocm-components-compilers th[rowspan]:first-of-type:hover) .rocm-components-compilers td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-runtimes th[rowspan]:first-of-type:hover) .rocm-components-runtimes,
#rocm-rn-components:has(tbody.rocm-components-runtimes th[rowspan]:first-of-type:hover) .rocm-components-runtimes td {
    background-color: var(--pst-color-table-row-hover-bg);
}

#rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) .rocm-components-tools td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-compilers th[rowspan]:first-of-type:hover) .rocm-components-compilers td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-runtimes th[rowspan]:first-of-type:hover) .rocm-components-runtimes td {
    background-color: var(--pst-color-table-row-hover-bg);
}

#rocm-rn-components:has(tbody.rocm-components-ml th[rowspan]:nth-of-type(2):hover) .rocm-components-ml td,
#rocm-rn-components:has(tbody.rocm-components-ml th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-communication th[rowspan]:nth-of-type(2):hover) .rocm-components-communication td,
#rocm-rn-components:has(tbody.rocm-components-communication th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-math th[rowspan]:nth-of-type(2):hover) .rocm-components-math td,
#rocm-rn-components:has(tbody.rocm-components-math th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-primitives th[rowspan]:nth-of-type(2):hover) .rocm-components-primitives td,
#rocm-rn-components:has(tbody.rocm-components-primitives th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-dev th[rowspan]:nth-of-type(2):hover) .rocm-components-dev td,
#rocm-rn-components:has(tbody.rocm-components-dev th[rowspan]:nth-of-type(2):hover) .rocm-components-tools th:first-of-type {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-perf th[rowspan]:nth-of-type(2):hover) .rocm-components-perf td,
#rocm-rn-components:has(tbody.rocm-components-perf th[rowspan]:nth-of-type(2):hover) .rocm-components-tools th:first-of-type {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-system th[rowspan]:nth-of-type(2):hover) .rocm-components-system td,
#rocm-rn-components:has(tbody.rocm-components-system th[rowspan]:nth-of-type(2):hover) .rocm-components-tools th:first-of-type {
    background-color: var(--pst-color-table-row-hover-bg);
}

#rocm-rn-components:has(tbody.rocm-components-ml td:hover) .rocm-components-ml th,
#rocm-rn-components:has(tbody.rocm-components-ml td:hover) .rocm-components-libs th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-ml td:hover) tr:hover > td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-communication td:hover) .rocm-components-communication th,
#rocm-rn-components:has(tbody.rocm-components-communication td:hover) .rocm-components-libs th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-communication td:hover) tr:hover > td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-math td:hover) .rocm-components-math th,
#rocm-rn-components:has(tbody.rocm-components-math td:hover) .rocm-components-libs th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-math td:hover) tr:hover > td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-primitives td:hover) .rocm-components-primitives th,
#rocm-rn-components:has(tbody.rocm-components-primitives td:hover) .rocm-components-libs th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-primitives td:hover) tr:hover > td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-dev td:hover) .rocm-components-dev th,
#rocm-rn-components:has(tbody.rocm-components-dev td:hover) .rocm-components-tools th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-dev td:hover) tr:hover > td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-perf td:hover) .rocm-components-perf th,
#rocm-rn-components:has(tbody.rocm-components-perf td:hover) .rocm-components-tools th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-perf td:hover) tr:hover > td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-system td:hover) .rocm-components-system th,
#rocm-rn-components:has(tbody.rocm-components-system td:hover) .rocm-components-tools th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-system td:hover) tr:hover > td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-compilers td:hover) .rocm-components-compilers th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-compilers td:hover) tr:hover > td {
    background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-runtimes td:hover) .rocm-components-runtimes th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-runtimes td:hover) tr:hover > td {
    background-color: var(--pst-color-table-row-hover-bg);
}
58
temp.md
Normal file
@@ -0,0 +1,58 @@
## Components

The following table lists ROCm components and their individual versions for ROCm 6.2.0. Find an overview of officially
supported versions of ROCm components, third-party libraries, and frameworks in the
[Compatibility matrix](https://rocm.docs.amd.com/en/latest/release/docs/6.2.0/compatibility/compatibility-matrix).

| Category | Group | Name | Version | |
|----------|-------|------|---------|:-:|
| **Libraries** | **Machine learning and computer vision** | [Composable Kernel](https://rocm.docs.amd.com/projects/composable_kernel/en/docs/6.2.0) | 1.1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/composable_kernel/releases/tag/rocm-6.2.0) |
| | | [MIGraphX](https://rocm.docs.amd.com/projects/AMDMIGraphX/en/docs/6.2.0) | 2.9 ⇒ [2.10](migraphx-2-10-0) | [{fab}`github fa-lg`](https://github.com/ROCm/AMDMIGraphX/releases/tag/rocm-6.2.0) |
| | | [MIOpen](https://rocm.docs.amd.com/projects/MIOpen/en/docs/6.2.0) | 3.1.0 ⇒ [3.2.0](miopen-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/MIOpen/releases/tag/rocm-6.2.0) |
| | | [MIVisionX](https://rocm.docs.amd.com/projects/MIVisionX/en/docs/6.2.0) | 2.5.0 ⇒ [3.0.0](mivisionx-3-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/MIVisionX/releases/tag/rocm-6.2.0) |
| | | [rocAL](https://rocm.docs.amd.com/projects/rocAL/en/docs/6.2.0) | 2.0.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocAL/releases/tag/rocm-6.2.0) |
| | | [rocDecode](https://rocm.docs.amd.com/projects/rocDecode/en/docs/6.2.0) | 0.6.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocDecode/releases/tag/rocm-6.2.0) |
| | | [rocPyDecode](https://rocm.docs.amd.com/projects/rocPyDecode/en/docs/6.2.0) | 0.1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocPyDecode/releases/tag/rocm-6.2.0) |
| | | [RPP](https://rocm.docs.amd.com/projects/rpp/en/docs/6.2.0) | 1.5.0 ⇒ [1.8.0](rpp-1-8-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rpp/releases/tag/rocm-6.2.0) |
| | **Communication** | [rccl](https://rocm.docs.amd.com/projects/rccl/en/docs/6.2.0) | 2.18.6 ⇒ [2.20.5](rccl-2-20-5) | [{fab}`github fa-lg`](https://github.com/ROCm/rccl/releases/tag/rocm-6.2.0) |
| | **Math** | [hipBLAS](https://rocm.docs.amd.com/projects/hipBLAS/en/docs/6.2.0) | 2.1.0 ⇒ [2.2.0](hipblas-2-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipBLAS/releases/tag/rocm-6.2.0) |
| | | [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/docs/6.2.0) | 0.7.0 ⇒ [0.8.0](hipblaslt-0-8-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipBLASLt/releases/tag/rocm-6.2.0) |
| | | [hipFFT](https://rocm.docs.amd.com/projects/hipFFT/en/docs/6.2.0) | [1.0.14](hipfft-1-0-14) | [{fab}`github fa-lg`](https://github.com/ROCm/hipFFT/releases/tag/rocm-6.2.0) |
| | | [hipfort](https://rocm.docs.amd.com/projects/hipfort/en/docs/6.2.0) | 0.4-0 | [{fab}`github fa-lg`](https://github.com/ROCm/hipfort/releases/tag/rocm-6.2.0) |
| | | [hipRAND](https://rocm.docs.amd.com/projects/hipRAND/en/docs/6.2.0) | 2.10.17 ⇒ [2.11.0](hiprand-2-11-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipRAND/releases/tag/rocm-6.2.0) |
| | | [hipSOLVER](https://rocm.docs.amd.com/projects/hipSOLVER/en/docs/6.2.0) | 2.1.1 ⇒ [2.2.0](hipsolver-2-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipSOLVER/releases/tag/rocm-6.2.0) |
| | | [hipSPARSE](https://rocm.docs.amd.com/projects/hipSPARSE/en/docs/6.2.0) | 3.0.1 ⇒ [3.1.1](hipsparse-3-1-1) | [{fab}`github fa-lg`](https://github.com/ROCm/hipSPARSE/releases/tag/rocm-6.2.0) |
| | | [hipSPARSELt](https://rocm.docs.amd.com/projects/hipSPARSELt/en/docs/6.2.0) | 0.2.0 ⇒ [0.2.1](hipsparselt-0-2-1) | [{fab}`github fa-lg`](https://github.com/ROCm/hipSPARSELt/releases/tag/rocm-6.2.0) |
| | | [rocALUTION](https://rocm.docs.amd.com/projects/rocALUTION/en/docs/6.2.0) | 3.1.1 ⇒ [3.2.0](rocalution-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocALUTION/releases/tag/rocm-6.2.0) |
| | | [rocBLAS](https://rocm.docs.amd.com/projects/rocBLAS/en/docs/6.2.0) | 4.1.0 ⇒ [4.2.0](rocblas-4-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocBLAS/releases/tag/rocm-6.2.0) |
| | | [rocFFT](https://rocm.docs.amd.com/projects/rocFFT/en/docs/6.2.0) | 1.0.27 ⇒ [1.0.28](rocfft-1-0-28) | [{fab}`github fa-lg`](https://github.com/ROCm/rocFFT/releases/tag/rocm-6.2.0) |
| | | [rocRAND](https://rocm.docs.amd.com/projects/rocRAND/en/docs/6.2.0) | 3.0.0 ⇒ [3.1.0](rocrand-3-1-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocRAND/releases/tag/rocm-6.2.0) |
| | | [rocSOLVER](https://rocm.docs.amd.com/projects/rocSOLVER/en/docs/6.2.0) | 3.25.0 ⇒ [3.26.0](rocsolver-3-26-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocSOLVER/releases/tag/rocm-6.2.0) |
| | | [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/docs/6.2.0) | 3.1.1 ⇒ [3.2.0](rocsparse-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocSPARSE/releases/tag/rocm-6.2.0) |
| | | [rocWMMA](https://rocm.docs.amd.com/projects/rocWMMA/en/docs/6.2.0) | 1.4.0 ⇒ [1.5.0](rocwmma-1-5-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocWMMA/releases/tag/rocm-6.2.0) |
| | | [Tensile](https://rocm.docs.amd.com/projects/tensile/en/docs/6.2.0) | 4.40.0 ⇒ [4.41.0](tensile-4-41-0) | [{fab}`github fa-lg`](https://github.com/ROCm/tensile/releases/tag/rocm-6.2.0) |
| | **Primitives** | [hipCUB](https://rocm.docs.amd.com/projects/hipCUB/en/docs/6.2.0) | 3.1.0 ⇒ [3.2.0](hipcub-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipCUB/releases/tag/rocm-6.2.0) |
| | | [hipTensor](https://rocm.docs.amd.com/projects/hipTensor/en/docs/6.2.0) | 1.2.0 ⇒ [1.3.0](hiptensor-1-3-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipTensor/releases/tag/rocm-6.2.0) |
| | | [rocPRIM](https://rocm.docs.amd.com/projects/rocPRIM/en/docs/6.2.0) | 3.1.0 ⇒ [3.2.0](rocprim-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocPRIM/releases/tag/rocm-6.2.0) |
| | | [rocThrust](https://rocm.docs.amd.com/projects/rocThrust/en/docs/6.2.0) | 3.0.0 ⇒ [3.1.0](rocthrust-3-1-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocThrust/releases/tag/rocm-6.2.0) |
| **Tools** | **Development** | [HIPIFY](https://rocm.docs.amd.com/projects/HIPIFY/docs/6.2.0) | 17.0.0 ⇒ [18.0.0](hipify-18-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/HIPIFY/releases/tag/rocm-6.2.0) |
| | | [ROCdbgapi](https://rocm.docs.amd.com/projects/ROCdbgapi/en/docs/6.2.0) | 0.71.0 ⇒ [0.76.0](rocdbgapi-0-76-0) | [{fab}`github fa-lg`](https://github.com/ROCm/ROCdbgapi/releases/tag/rocm-6.2.0) |
| | | [ROCm CMake](https://rocm.docs.amd.com/projects/rocm-cmake/en/docs/6.2.0) | 0.12.0 ⇒ [0.13.0](rocm-cmake-0-13-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocm-cmake/releases/tag/rocm-6.2.0) |
| | | [ROCm Debugger (ROCgdb)](https://rocm.docs.amd.com/projects/rocm-cmake/en/docs/6.2.0) | 13 ⇒ [15](rocgdb-15) | [{fab}`github fa-lg`](https://github.com/ROCm/ROCgdb/releases/tag/rocm-6.2.0) |
| | | [ROCr Debug Agent](https://rocm.docs.amd.com/projects/rocr_debug_agent/en/docs/6.2.0) | 2.0.3 | [{fab}`github fa-lg`](https://github.com/ROCm/rocr_debug_agent/releases/tag/rocm-6.2.0) |
| | **Performance** | [Omniperf](https://rocm.docs.amd.com/projects/omniperf/en/docs/6.2.0) | 2.0.1 | [{fab}`github fa-lg`](https://github.com/ROCm/omniperf/releases/tag/rocm-6.2.0) |
| | | [Omnitrace](https://rocm.docs.amd.com/projects/omnitrace/en/docs/6.2.0) | 1.11.2 | [{fab}`github fa-lg`](https://github.com/ROCm/omnitrace/releases/tag/rocm-6.2.0) |
| | | [ROCm Bandwidth Test](https://rocm.docs.amd.com/projects/rocm_bandwidth_test/en/docs/6.2.0) | 1.4.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_bandwidth_test/releases/tag/rocm-6.2.0) |
| | | [ROCProfiler](https://rocm.docs.amd.com/projects/ROCProfiler/en/docs/6.2.0) | 2.0.0 ⇒ [2.0.0](rocprofiler-2-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_bandwidth_test/releases/tag/rocm-6.2.0) |
| | | [ROCProfiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/docs/6.2.0) | 0.4.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_bandwidth_test/releases/tag/rocm-6.2.0) |
| | | [ROCTracer](https://rocm.docs.amd.com/projects/ROCTracer/en/docs/6.2.0) | 4.1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_bandwidth_test/releases/tag/rocm-6.2.0) |
| | **System** | [AMD SMI](https://rocm.docs.amd.com/projects/amdsmi/en/docs/6.2.0) | 24.5.2 ⇒ [24.6.1](amd-smi-24-6-1) | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [rocminfo](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 1.0.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [ROCm Data Center Tool](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 0.3.0 ⇒ [1.0.0](rocm-data-center-tool-1-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [ROCm SMI](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 7.2.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [ROCm Validation Suite](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [TransferBench](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 1.5.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | **Compilers** | [hipCC](https://rocm.docs.amd.com/projects/hipCC/en/docs/6.2.0) | 1.0.0 ⇒ [1.1.1](hipcc-1-1-1) | [{fab}`github fa-lg`](https://github.com/ROCm/llvm-project/releases/tag/rocm-6.2.0) |
| | | [llvm-project](https://rocm.docs.amd.com/projects/llvm-project/en/docs/6.2.0) | 17.0.0 ⇒ [18.0.0](llvm-project-18-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/llvm-project/releases/tag/rocm-6.2.0) |
| **Runtimes** | | [HIP](https://rocm.docs.amd.com/projects/HIP/en/docs/6.2.0) | 6.1 ⇒ [6.2](hip-6-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/HIP/releases/tag/rocm-6.2.0) |
| | | [ROCr Runtime](https://rocm.docs.amd.com/projects/ROCr-Runtime/en/docs/6.2.0) | 6.1 ⇒ [6.2](hip-6-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/ROCR-Runtime/releases/tag/rocm-6.2.0) |
1307
tools/autotag/templates/extra_components/6.2.0.md
Normal file
File diff suppressed because it is too large
223
tools/autotag/templates/highlights/6.2.0.md
Normal file
@@ -0,0 +1,223 @@

The release notes provide a comprehensive summary of changes since the previous ROCm release.

- [Release highlights](release-highlights)

- [Operating system and hardware support changes](operating-system-and-hardware-support-changes)

- [ROCm components versioning](rocm-components)

- [Detailed component changes](detailed-component-changes)

- [ROCm known issues](rocm-known-issues)

- [ROCm upcoming changes](rocm-upcoming-changes)

The [Compatibility matrix](https://rocm.docs.amd.com/en/latest/release/docs/6.2.0/compatibility/compatibility-matrix)
provides an overview of operating system, hardware, ecosystem, and ROCm component support across ROCm releases.

Release notes for previous ROCm releases are available in earlier versions of the documentation.
See the [ROCm documentation release history](https://rocm.docs.amd.com/en/latest/release/versions).

## Release highlights

This section introduces notable new features and improvements in ROCm 6.2. See
[Detailed component changes](#detailed-component-changes) for individual component changes.

### New components

ROCm 6.2.0 introduces the following new components to the ROCm software stack.

- **Omniperf** -- A kernel-level profiling tool for machine learning and high-performance computing (HPC) workloads
  running on AMD Instinct accelerators. Omniperf offers comprehensive profiling and advanced analysis via command line
  or a GUI dashboard. For more information, see
  [Omniperf](https://rocm.docs.amd.com/projects/omniperf/en/latest).

- **Omnitrace** -- A multi-purpose analysis tool for profiling and tracing applications running on the CPU or the CPU and GPU.
  It supports dynamic binary instrumentation, call-stack sampling, causal profiling, and other features for determining
  which function and line number are executing. For more information, see
  [Omnitrace](https://rocm.docs.amd.com/projects/omnitrace/en/latest).

- **rocPyDecode** -- A tool to access rocDecode APIs in Python. It connects Python and C/C++ libraries,
  enabling function calling and data passing between the two languages. The `rocpydecode.so` wrapper library makes the
  rocDecode APIs, which are written primarily in C/C++, available within Python. For more information, see
  [rocPyDecode](https://rocm.docs.amd.com/projects/rocpydecode/en/latest).

- **ROCprofiler-SDK** -- A profiling and tracing library for HIP and ROCm applications on AMD ROCm software,
  used to identify application performance bottlenecks and optimize application performance. The new APIs add restrictions for more
  efficient implementations and improved thread safety. A new window restriction specifies the services the tool can use.
  ROCprofiler-SDK also provides a tool library to help you write your tool implementations. `rocprofv3` uses this tool library
  to profile and trace applications for performance bottlenecks. Examples include API tracing, kernel tracing, and so on.
  For more information, see [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest).

```{note}
ROCprofiler-SDK for ROCm 6.2.0 is a beta release and subject to change.
```

### ROCm Offline Installer Creator introduced

The new ROCm Offline Installer Creator creates an installation package for a preconfigured setup of ROCm, the AMDGPU
driver, or a combination of the two on a target system without network access. This new tool customizes
multiple unique configurations for use when installing ROCm on a target. Other notable features include:

* A lightweight, easy-to-use user interface for configuring the creation of the installer

* Support for multiple Linux distributions

* Installer support for different ROCm releases and specific ROCm components

* Optional driver or driver-only installer creation

* Optional post-install preferences

* Lightweight installer packages, which are unique to the preconfigured ROCm setup

* Resolution and inclusion of dependency packages for offline installation

For more information, see
[ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/rocm-install-on-linux/en/latest/install/rocm-offline-installer.html).

### Math libraries default to Clang instead of HIPCC

The default compiler used to build the math libraries on Linux changes from `hipcc` to `amdclang++`.
Appropriate compiler flags are added to ensure these compilations build correctly. This change only applies when
building the libraries. Applications using the libraries can continue to be compiled using `hipcc` or `amdclang++` as
described in [ROCm compiler reference](https://rocm.docs.amd.com/projects/llvm-project/en/latest/reference/rocmcc.html).
The math libraries can also be built with `hipcc` using any of the previously available methods (for example, the `CXX`
environment variable, the `CMAKE_CXX_COMPILER` CMake variable, and so on). This change shouldn't affect performance or
functionality.
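
As a rough illustration of the compiler override described above, the following Python sketch assembles a CMake configure command that selects the C++ compiler through `CMAKE_CXX_COMPILER`, or alternatively prepares an environment with `CXX` set. The helper names and the `rocBLAS` and `build` directory names are hypothetical placeholders; each math library's own build scripts may expose different options.

```python
import os


def cmake_configure_command(source_dir, build_dir, cxx_compiler=None):
    """Return the argv list for a CMake configure step.

    When cxx_compiler is None, no override is passed and the project's
    own default compiler is used.
    """
    command = ["cmake", "-S", source_dir, "-B", build_dir]
    if cxx_compiler is not None:
        command.append(f"-DCMAKE_CXX_COMPILER={cxx_compiler}")
    return command


def environment_with_cxx(cxx_compiler, base_env=None):
    """Return a copy of the environment with CXX set to the chosen compiler."""
    env = dict(os.environ if base_env is None else base_env)
    env["CXX"] = cxx_compiler
    return env


# Hypothetical source and build directories, for illustration only.
print(cmake_configure_command("rocBLAS", "build", cxx_compiler="hipcc"))
print(environment_with_cxx("hipcc", base_env={})["CXX"])
```

Either result could be passed to `subprocess.run`. CMake reads `CXX` only when it first configures a fresh build directory, so an existing build directory may need to be cleared when switching compilers.
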

### Framework and library changes

This section highlights updates to supported deep learning frameworks and notable third-party library optimizations.

#### Additional PyTorch and TensorFlow support

ROCm 6.2.0 supports PyTorch versions 2.2 and 2.3 and TensorFlow version 2.16.

See [Installing PyTorch for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html)
and [Installing TensorFlow for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/tensorflow-install.html)
for installation instructions.

Refer to the
[Third-party support matrix](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/3rd-party-support-matrix.html#deep-learning)
for a comprehensive list of third-party frameworks and libraries supported by ROCm.

#### Optimized framework support for OpenXLA

PyTorch for ROCm and TensorFlow for ROCm now provide native support for OpenXLA. OpenXLA is an open-source ML compiler
ecosystem that enables developers to compile and optimize models from all leading ML frameworks. For more information, see
[Installing PyTorch for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html)
and [Installing TensorFlow for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/tensorflow-install.html).

#### PyTorch support for Autocast (automatic mixed precision)

PyTorch now supports Autocast for recurrent neural networks (RNNs) on ROCm. This can help to reduce computational
workloads and improve performance. Based on the information about the magnitude of values, Autocast can substitute the
original `float32` linear layers and convolutions with their `float16` or `bfloat16` variants. For more information, see
[Automatic mixed precision](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/train-a-model#automatic-mixed-precision-amp).
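
To make the role of value magnitude concrete, here is a minimal standard-library sketch that round-trips a few numbers through `float16` and `bfloat16`. It is not how PyTorch implements Autocast; the helper names are hypothetical, and the `bfloat16` conversion truncates rather than rounds, which is an assumption made to keep the example short.

```python
import struct


def to_float16(value):
    """Round-trip a Python float through IEEE 754 half precision."""
    try:
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:
        # struct refuses values beyond the float16 range; model that as infinity.
        return float("inf") if value > 0 else float("-inf")


def to_bfloat16(value):
    """Round-trip through bfloat16 by keeping the top 16 bits of the float32 encoding.

    This truncates instead of rounding, which is enough for a range comparison.
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


# Print each sample next to its float16 and bfloat16 round-trip.
for sample in (1.0, 0.1, 1e-8, 70000.0):
    print(sample, to_float16(sample), to_bfloat16(sample))
```

In this sketch, very small and very large samples fall outside what `float16` can represent, while `bfloat16` keeps the `float32` exponent range at the cost of fewer mantissa bits. This trade-off is one reason mixed-precision schemes keep some operations in `float32`.
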
|
||||
|
||||
#### Memory savings for bitsandbytes model quantization
|
||||
|
||||
The [ROCm-aware bitsandbytes library](https://github.com/ROCm/bitsandbytes) is a lightweight Python wrapper around HIP
|
||||
custom functions, in particular 8-bit optimizer, matrix multiplication, and 8-bit and 4-bit quantization functions.
|
||||
ROCm 6.2.0 introduces the following bitsandbytes changes:
|
||||
|
||||
- `Int8` matrix multiplication is enabled, and it includes the following functions:
|
||||
- `extract-outliers` – extracts rows and columns that have outliers in the inputs. They’re later used for matrix multiplication without quantization.
|
||||
- `transform` – row-to-column and column-to-row transformations are enabled, along with transpose operations. These are used before and after matmul computation.
|
||||
- `igemmlt` – new function for GEMM computation A*B^T. It uses
|
||||
[hipblasLtMatMul](https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/api-reference.html#hipblasltmatmul) and performs 8-bit GEMM operations.
|
||||
- `dequant_mm` – dequantizes output matrix to original data type using scaling factors from vector-wise quantization.
|
||||
- Blockwise quantization – input tensors are quantized for a fixed block size.
|
||||
- 4-bit quantization and dequantization functions – normalized `Float4` quantization, quantile estimation, and quantile quantization functions are enabled.
|
||||
- 8-bit and 32-bit optimizers are enabled.
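
The `Int8` path listed above (quantize, multiply as A*B^T, then dequantize with vector-wise scaling factors) can be illustrated with a small pure-Python sketch. This is not the bitsandbytes or hipBLASLt implementation. The function names are hypothetical, and row-wise absmax scaling is assumed here as one common form of vector-wise quantization.

```python
def quantize_rows(matrix):
    """Vector-wise (per-row) absmax quantization to the int8 range [-127, 127]."""
    quantized, scales = [], []
    for row in matrix:
        absmax = max(abs(v) for v in row) or 1.0  # avoid a zero scale for all-zero rows
        scale = absmax / 127.0
        quantized.append([round(v / scale) for v in row])
        scales.append(scale)
    return quantized, scales


def int8_matmul_abt(a_q, b_q):
    """Integer GEMM computing A * B^T (hardware would accumulate in int32)."""
    return [[sum(x * y for x, y in zip(a_row, b_row)) for b_row in b_q] for a_row in a_q]


def dequantize(c_int, a_scales, b_scales):
    """Rescale the integer result using the row scales of A and B."""
    return [[c * sa * sb for c, sb in zip(row, b_scales)] for row, sa in zip(c_int, a_scales)]


# Small illustrative inputs (made up for this sketch).
A = [[0.5, -1.0, 0.25], [2.0, 0.1, -0.3]]
B = [[1.0, 0.0, -1.0], [0.2, 0.4, 0.6]]

a_q, a_s = quantize_rows(A)
b_q, b_s = quantize_rows(B)
approx = dequantize(int8_matmul_abt(a_q, b_q), a_s, b_s)

# Full-precision reference for comparison.
exact = [[sum(x * y for x, y in zip(a_row, b_row)) for b_row in B] for a_row in A]
print(approx)
print(exact)
```

The dequantized result only approximates the full-precision product. That rounding error is why outlier rows and columns are extracted and multiplied without quantization.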
|
||||
|
||||
```{note}
These functions are included in bitsandbytes. They are not part of ROCm. However, ROCm 6.2.0 has enabled the fixes and
features to run them.
```
|
||||
|
||||
For more information, see [Model quantization techniques](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/model-quantization.html).
|
||||
|
||||
#### Improved vLLM support
|
||||
|
||||
ROCm 6.2.0 enhances vLLM support for inference on AMD Instinct accelerators, adding
|
||||
capabilities for `FP16`/`BF16` precision for LLMs, and `FP8` support for Llama.
|
||||
ROCm 6.2.0 adds support for the following vLLM features:
|
||||
|
||||
- MP:
|
||||
|
||||
Multi-GPU execution. Choose between MP and Ray using a flag. To set it to MP,
|
||||
use `--distributed-executor-backend=mp`. The default is still in flux and depends on the vLLM commit in use.
|
||||
|
||||
- FP8 KV cache:
|
||||
|
||||
Enhances computational efficiency and performance by significantly reducing memory usage and bandwidth requirements.
|
||||
The QUARK quantizer currently only supports Llama.
|
||||
|
||||
- Triton Flash Attention:
|
||||
|
||||
ROCm supports both Triton and Composable Kernel Flash Attention 2 in vLLM. The default is Triton, but you can change this
|
||||
setting using the `VLLM_USE_FLASH_ATTN_TRITON=False` environment variable.
|
||||
|
||||
- PyTorch TunableOp:
|
||||
|
||||
Improved optimization and tuning of GEMMs. It requires Docker with PyTorch 2.3 or later.
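
As a rough sketch of how the MP and Flash Attention settings above might be combined in a launcher script, the following Python helper assembles a command line and environment without starting anything. The helper name, the model identifier, and the use of the `vllm.entrypoints.openai.api_server` module as the entry point are assumptions made for illustration. Check them against the vLLM version you have installed.

```python
import os
import shlex


def build_vllm_launch(model, use_mp=True, use_triton_flash_attn=True):
    """Return (argv, env) for a vLLM server launch; nothing is executed here."""
    env = dict(os.environ)
    # Triton Flash Attention is the default; "False" selects the Composable Kernel path.
    env["VLLM_USE_FLASH_ATTN_TRITON"] = "True" if use_triton_flash_attn else "False"

    argv = ["python", "-m", "vllm.entrypoints.openai.api_server", "--model", model]
    # Choose between the multiprocessing (MP) and Ray distributed executors.
    argv.append("--distributed-executor-backend=" + ("mp" if use_mp else "ray"))
    return argv, env


# "example-org/example-model" is a placeholder, not a real model name.
argv, env = build_vllm_launch("example-org/example-model", use_triton_flash_attn=False)
print(shlex.join(argv))
```

The returned `argv` and `env` could then be passed to `subprocess.run(argv, env=env)`. Keeping construction separate from execution makes the chosen settings easy to log or inspect first.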
|
||||
|
||||
For more information about enabling these features, see
|
||||
[vLLM inference](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.html#vllm-inference).
|
||||
|
||||
ROCm has a vLLM branch for experimental features. This includes performance improvements, accuracy, and correctness testing.
|
||||
These features include:
|
||||
|
||||
- FP8 GEMMs: To improve the performance of FP8 quantization, work is underway on tuning the GEMM using the shapes used
|
||||
in the model's execution. This currently supports only Llama because the QUARK quantizer supports only Llama.
|
||||
|
||||
- Custom decode paged attention: Improves performance by efficiently managing memory and enabling faster attention
|
||||
computation in large-scale models. This benefits all workloads in `FP16` configurations.
|
||||
|
||||
To enable these experimental new features, see
|
||||
[vLLM inference](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.html#vllm-inference).
|
||||
Use the `rocm/vllm` branch when cloning the GitHub repo. The `vllm/ROCm_performance.md` document outlines
|
||||
all the accessible features, and the `vllm/Dockerfile.rocm` file can be used to build a Docker image.
|
||||
|
||||
### Enhanced performance tuning on AMD Instinct accelerators
|
||||
|
||||
ROCm is pretuned for high-performance computing workloads including large language models, generative AI, and scientific computing.
|
||||
The ROCm documentation provides comprehensive guidance on configuring your system for AMD Instinct accelerators. It includes
|
||||
detailed instructions on system settings and application tuning suggestions to help you fully leverage the capabilities of these
|
||||
accelerators for optimal performance. For more information, see
|
||||
[AMD MI300X tuning guides](https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/index.html) and
|
||||
[AMD MI300A system optimization](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300a.html).
|
||||
|
||||
### Removed clang-ocl
|
||||
|
||||
As of version 6.2, ROCm no longer provides the `clang-ocl` package. The project will be archived in the future.
|
||||
See the [clang-ocl README](https://github.com/ROCm/clang-ocl).
|
||||
|
||||
### ROCm documentation changes
|
||||
|
||||
The documentation for the ROCm components has been reorganized and reformatted in a standard look and feel. This
|
||||
improves the usability and readability of the documentation. For more information about the ROCm components, see
|
||||
[What is ROCm?](https://rocm.docs.amd.com/en/latest/what-is-rocm.html).
|
||||
|
||||
Since the release of ROCm 6.1, the documentation has added some key topics including:
|
||||
|
||||
- [AMD Instinct MI300X workload tuning guide](https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/workload.html)
|
||||
- [AMD Instinct MI300X system tuning guide](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300x.html)
|
||||
- [AMD Instinct MI300A system tuning guide](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300a.html)
|
||||
- [Using ROCm for AI](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/index.html)
|
||||
- [Using ROCm for HPC](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-hpc/index.html)
|
||||
- [Fine-tuning LLMs and inference optimization](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/index.html)
|
||||
- [LLVM reference documentation](https://rocm.docs.amd.com/projects/llvm-project/en/latest/)
|
||||
|
||||
The following topics have been significantly improved, expanded, or both:
|
||||
|
||||
- [HIP programming manual](https://rocm.docs.amd.com/projects/HIP/en/latest/)
|
||||
- [Compatibility matrix](https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html)
|
||||
|
||||
```{note}
All ROCm projects are open source and available on GitHub. To contribute to ROCm documentation, see the
[ROCm documentation contribution guidelines](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
```
|
||||
27
tools/autotag/templates/support/6.2.0.md
Normal file
@@ -0,0 +1,27 @@
|
||||
|
||||
## Operating system and hardware support changes
|
||||
|
||||
ROCm 6.2.0 adds support for the following operating system and kernel versions.
|
||||
|
||||
- Ubuntu 24.04 LTS (kernel: 6.8 [GA])
|
||||
|
||||
- RHEL 8.10 (kernel: 4.18.0-544)
|
||||
|
||||
- SLES 15 SP6 (kernel: 6.4)
|
||||
|
||||
ROCm 6.2.0 marks the end of support (EoS) for:
|
||||
|
||||
- Ubuntu 22.04.3
|
||||
|
||||
- RHEL 9.2
|
||||
|
||||
- RHEL 8.8
|
||||
|
||||
- SLES 15 SP4
|
||||
|
||||
- CentOS 7.9
|
||||
|
||||
ROCm 6.2.0 has been tested against pre-release Ubuntu 22.04.5 (kernel: 6.5 [HWE]).
|
||||
|
||||
See the [Compatibility matrix](https://rocm-stg.amd.com/en/docs/6.2.0/compatibility/compatibility-matrix.html) for an
|
||||
overview of supported operating systems and hardware architectures.
|
||||
79
tools/autotag/templates/upcoming_changes/6.2.0.md
Normal file
@@ -0,0 +1,79 @@
|
||||
|
||||
## ROCm known issues
|
||||
|
||||
ROCm known issues are noted on [{fab}`github` GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). For known
|
||||
issues related to individual components, review the [Detailed component changes](detailed-component-changes).
|
||||
|
||||
### Default processor affinity behavior for helper threads
|
||||
|
||||
Processor affinity is a critical setting to ensure that ROCm helper threads run on the correct cores. By default, ROCm
|
||||
helper threads are spawned on all available cores, ignoring the parent thread’s processor affinity. This can lead to
|
||||
threads competing for available cores, which may result in suboptimal performance. This behavior occurs by default if
|
||||
the environment variable `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is not set or is set to `1`. If
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is set to `0`, the ROCr runtime uses the parent process's core affinity mask when
|
||||
creating helper threads. The parent’s affinity mask should then be set to account for the presence of additional threads
|
||||
by ensuring the affinity mask contains enough cores. Depending on the affinity settings of the software environment,
|
||||
batch system, launch commands like `numactl`/`taskset`, or explicit mask manipulation by the application itself, changing
|
||||
the setting may be advantageous to performance.
|
||||
|
||||
To ensure the parent's core affinity mask is honored by the ROCm helper threads, set the
|
||||
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable as follows:
|
||||
|
||||
```{code} shell
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0
```
|
||||
|
||||
To ensure ROCm helper threads run on all available cores, set the `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable
|
||||
as follows:
|
||||
|
||||
``` shell
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=1
```
|
||||
|
||||
Alternatively, unset the variable to get the same default behavior:
|
||||
|
||||
``` shell
unset HSA_OVERRIDE_CPU_AFFINITY_DEBUG
```
|
||||
|
||||
If unsure of the default processor affinity settings for your environment, run the following command from the shell:
|
||||
|
||||
``` shell
bash -c "taskset -p \$\$"
```
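
The same check can be made from a Python launcher script. The sketch below reports the CPUs the current process may run on and prepares an environment in which ROCm helper threads honor the parent's affinity mask. The function names are hypothetical, and `os.sched_getaffinity` is available only on Linux, so the sketch falls back to assuming all CPUs on other platforms.

```python
import os


def affinity_report():
    """Summarize which CPUs the current process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        allowed = sorted(os.sched_getaffinity(0))  # 0 means the calling process
    else:
        # Fallback for non-Linux platforms: assume no restriction.
        allowed = list(range(os.cpu_count() or 1))
    total = os.cpu_count() or len(allowed)
    return {
        "allowed_cpus": allowed,
        "total_cpus": total,
        "restricted": len(allowed) < total,
    }


def env_honoring_parent_mask():
    """Copy the environment and ask the ROCr runtime to use the parent's mask."""
    env = dict(os.environ)
    env["HSA_OVERRIDE_CPU_AFFINITY_DEBUG"] = "0"
    return env


report = affinity_report()
child_env = env_honoring_parent_mask()
print(report)
```

If `restricted` is true, a launcher such as `numactl`, `taskset`, or a batch system has already narrowed the mask. In that case, make sure the mask contains enough cores for the helper threads before passing `child_env` to a child process.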
|
||||
|
||||
### KFDTest failure on Instinct MI300X with Oracle Linux 8.9
|
||||
|
||||
The `KFDEvictTest.QueueTest` test fails on the MI300X platform during KFD (Kernel Fusion Driver) tests, which prevents the full
suite from executing properly. This issue is suspected to be hardware-related.
|
||||
|
||||
### Bandwidth limitation in gang and non-gang modes on Instinct MI300A
|
||||
|
||||
Expected target peak non-gang performance (~60GB/s) and target peak gang performance (~90GB/s) are not achieved. Both gang
|
||||
and non-gang performance are observed to be limited to 45GB/s.
|
||||
|
||||
This issue will be addressed in a future ROCm release.
|
||||
|
||||
### rocm-llvm-alt
|
||||
|
||||
ROCm provides an optional package -- `rocm-llvm-alt` -- that provides a closed-source compiler for
|
||||
users interested in additional closed-source CPU optimizations. This feature is not functional in
|
||||
the ROCm 6.2.0 release. Users who attempt to invoke the closed-source compiler will experience an
|
||||
LLVM consumer-producer mismatch and the compilation will fail. There is no workaround that allows
|
||||
use of the closed-source compiler. It is recommended to compile using the default open-source
|
||||
compiler, which generates high-quality AMD CPU and AMD GPU code.
|
||||
|
||||
## ROCm upcoming changes
|
||||
|
||||
This section notes upcoming changes to the ROCm software stack. For upcoming changes related to individual components, review
|
||||
the [Detailed component changes](detailed-component-changes).
|
||||
|
||||
### rocm-llvm-alt
|
||||
|
||||
The `rocm-llvm-alt` package will be removed in an upcoming release. Users relying on the
|
||||
functionality provided by the closed-source compiler should transition to the open-source compiler.
|
||||
Once the `rocm-llvm-alt` package is removed, any compilation requesting functionality provided by
|
||||
the closed-source compiler will result in a Clang warning: "*[AMD] proprietary optimization compiler
|
||||
has been removed*".
|
||||