6.2 release notes (#111)

* generate release notes

* update release notes

update release.md

update anchors

fix formatting

* add component notes

* remove known issues from toc

* update pydata sphinx table styling

* remove temp file

* add 6.2.0 templates

* add documentation improvements list

* update conf.py with 6.2.0 version and GA date

* update changelog headings

* remove rserp tickets

* add miopen cl

* remove bolding

* add Ram's feedback

fix  thing

* rm sub-bullets

* update new components formatting

* update amd smi version

* add css

* add table styles

* add component notes and KIs

* update os support wording

* update highlights

* update compilers cls

* fix links

* add KIs

* update KI wording

* add ram's suggestions

* add omniperf known issue

fmt

* system -> system management in components table

* change rocthrust version to 3.0.1

* remove release highlight and add RVS changelog

* update highlights

* fix version nums, add rocr runtime

* reorder components table

* update compiler KI

* more compiler known issue under llvm-proj

* add space

* word

* fix internal links

* add gdb

* update pytorch autocast highligh

* add hipfft cl

* fix hipfft internal link

* fix svg icon color

* fix table

* remove rocblas highlight and update tf hl

* add fixes

* update highlights

* fix ck in table

* fix mivisionx rocal note

* fix link and dbgapi version

* fix link to llvm proj docs

* fix fmt

* add feedback

* add more changes

move clang-ocl to upcoming changes

add fixes

fix some fmt

fix table width

fix formatting

add fixes

fix tensile fmt

remove unused file

update templates

change words

* add known issue

* rm "for unknown reasons"

* fix hipsolver, platform -> software stack

* add amdsmi note

* rm mention of mi308

fmt

* add beta note to rocprofiler-sdk

fix

* bold a heading

* move hipify under compilers

* Revert "move hipify under compilers"

This reverts commit 83861f544a75bce1ea64b14871e1224161d34815.

* fix typos and GA date

update text

* update words

* add processor affinity KI and remove rocHPL KI

* update processor affinity KI

* update llvm-proj KI

fix

* update processor affinity KI

update

* fix hip link

* update templates

* words

* update links to 6.2.0

* remove extra css

* fix some stuff in hip

word

* add dell black screen hang ki

word

* fix rocpydecode link

* remove sass files
This commit is contained in:
Peter Park
2024-08-02 12:40:33 -04:00
committed by GitHub
parent 717ec0df34
commit 63d3dfd344
11 changed files with 3763 additions and 605 deletions

2511
RELEASE.md

File diff suppressed because it is too large Load Diff

View File

@@ -25,16 +25,16 @@ latex_elements = {
project = "ROCm Documentation"
author = "Advanced Micro Devices, Inc."
copyright = "Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved."
version = "6.1.2"
release = "6.1.2"
version = "6.2.0"
release = "6.2.0"
setting_all_article_info = True
all_article_info_os = ["linux", "windows"]
all_article_info_author = ""
# pages with specific settings
article_pages = [
{"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-06-04"},
{"file": "about/changelog", "os": ["linux", "windows"], "date": "2024-06-04"},
{"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-08-02"},
{"file": "about/changelog", "os": ["linux", "windows"], "date": "2024-08-02"},
{"file": "how-to/deep-learning-rocm", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/install", "os": ["linux"]},
@@ -104,7 +104,7 @@ html_theme = "rocm_docs_theme"
html_theme_options = {"flavor": "rocm-docs-home"}
html_static_path = ["sphinx/static/css"]
html_css_files = ["rocm_custom.css"]
html_css_files = ["rocm_custom.css", "rocm_rn.css"]
html_title = "ROCm Documentation"

View File

@@ -191,6 +191,8 @@ This section describes performance-based settings.
echo 20 > /proc/sys/vm/compaction_proactiveness
echo 1 > /proc/sys/vm/compact_unevictable_allowed
.. _mi300a-processor-affinity:
* **Change affinity of ROCm helper threads**
This change prevents internal ROCm threads from having their CPU core affinity mask

View File

@@ -9,12 +9,6 @@ subtrees:
- file: what-is-rocm.rst
- file: about/release-notes.md
title: Release notes
subtrees:
- entries:
- file: about/changelog.md
title: Changelog
- url: https://github.com/ROCm/ROCm/labels/Verified%20Issue
title: Known issues
- caption: Install
entries:

View File

@@ -1,6 +1,21 @@
/* Override PyData Sphinx Theme default colors */
html[data-theme='light'] {
--pst-color-table-row-hover-bg: #E2E8F0;
}
html[data-theme='dark'] {
--pst-color-table-row-hover-bg: #1E293B;
}
a svg {
color: var(--pst-color-text-base);
}
a svg:hover {
color: var(--pst-color-link-hover);
}
/* Adds container for big tables, used for Compatibility Matrix */
.format-big-table {
white-space: nowrap;
}
}

View File

@@ -0,0 +1,126 @@
#rocm-rn-components col {
width: 6rem;
}
#rocm-rn-components col:nth-child(2) {
width: 12rem;
}
#rocm-rn-components td {
white-space: nowrap;
}
#rocm-rn-components td:last-of-type {
text-align: center;
}
#rocm-rn-components a svg {
color: var(--pst-color-text-base);
}
#rocm-rn-components a svg:hover {
color: var(--pst-color-link-hover);
}
#rocm-rn-components .tbody-reverse-zebra tr:nth-child(2n + 1) td {
background-color: var(--pst-color-table-row-zebra-high-bg);
}
#rocm-rn-components .tbody-reverse-zebra tr:nth-child(2n) td {
background-color: var(--pst-color-table-row-zebra-low-bg);
}
#rocm-rn-components:has(tbody.rocm-components-libs th[rowspan]:first-of-type:hover) .rocm-components-libs,
#rocm-rn-components:has(tbody.rocm-components-libs th[rowspan]:first-of-type:hover) .rocm-components-libs td,
#rocm-rn-components:has(tbody.rocm-components-libs th[rowspan]:first-of-type:hover) tbody.rocm-components-libs th {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) .rocm-components-tools,
#rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) .rocm-components-tools td,
#rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) tbody.rocm-components-tools th {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-compilers th[rowspan]:first-of-type:hover) .rocm-components-compilers,
#rocm-rn-components:has(tbody.rocm-components-compilers th[rowspan]:first-of-type:hover) .rocm-components-compilers td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-runtimes th[rowspan]:first-of-type:hover) .rocm-components-runtimes,
#rocm-rn-components:has(tbody.rocm-components-runtimes th[rowspan]:first-of-type:hover) .rocm-components-runtimes td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) .rocm-components-tools td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-compilers th[rowspan]:first-of-type:hover) .rocm-components-compilers td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-runtimes th[rowspan]:first-of-type:hover) .rocm-components-runtimes td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-ml th[rowspan]:nth-of-type(2):hover) .rocm-components-ml td,
#rocm-rn-components:has(tbody.rocm-components-ml th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-communication th[rowspan]:nth-of-type(2):hover) .rocm-components-communication td,
#rocm-rn-components:has(tbody.rocm-components-communication th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-math th[rowspan]:nth-of-type(2):hover) .rocm-components-math td,
#rocm-rn-components:has(tbody.rocm-components-math th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-primitives th[rowspan]:nth-of-type(2):hover) .rocm-components-primitives td,
#rocm-rn-components:has(tbody.rocm-components-primitives th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-dev th[rowspan]:nth-of-type(2):hover) .rocm-components-dev td,
#rocm-rn-components:has(tbody.rocm-components-dev th[rowspan]:nth-of-type(2):hover) .rocm-components-tools th:first-of-type {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-perf th[rowspan]:nth-of-type(2):hover) .rocm-components-perf td,
#rocm-rn-components:has(tbody.rocm-components-perf th[rowspan]:nth-of-type(2):hover) .rocm-components-tools th:first-of-type {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-system th[rowspan]:nth-of-type(2):hover) .rocm-components-system td,
#rocm-rn-components:has(tbody.rocm-components-system th[rowspan]:nth-of-type(2):hover) .rocm-components-tools th:first-of-type {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-ml td:hover) .rocm-components-ml th,
#rocm-rn-components:has(tbody.rocm-components-ml td:hover) .rocm-components-libs th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-ml td:hover) tr:hover > td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-communication td:hover) .rocm-components-communication th,
#rocm-rn-components:has(tbody.rocm-components-communication td:hover) .rocm-components-libs th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-communication td:hover) tr:hover > td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-math td:hover) .rocm-components-math th,
#rocm-rn-components:has(tbody.rocm-components-math td:hover) .rocm-components-libs th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-math td:hover) tr:hover > td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-primitives td:hover) .rocm-components-primitives th,
#rocm-rn-components:has(tbody.rocm-components-primitives td:hover) .rocm-components-libs th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-primitives td:hover) tr:hover > td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-dev td:hover) .rocm-components-dev th,
#rocm-rn-components:has(tbody.rocm-components-dev td:hover) .rocm-components-tools th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-dev td:hover) tr:hover > td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-perf td:hover) .rocm-components-perf th,
#rocm-rn-components:has(tbody.rocm-components-perf td:hover) .rocm-components-tools th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-perf td:hover) tr:hover > td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-system td:hover) .rocm-components-system th,
#rocm-rn-components:has(tbody.rocm-components-system td:hover) .rocm-components-tools th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-system td:hover) tr:hover > td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-compilers td:hover) .rocm-components-compilers th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-compilers td:hover) tr:hover > td {
background-color: var(--pst-color-table-row-hover-bg);
}
#rocm-rn-components:has(tbody.rocm-components-runtimes td:hover) .rocm-components-runtimes th:first-of-type,
#rocm-rn-components:has(tbody.rocm-components-runtimes td:hover) tr:hover > td {
background-color: var(--pst-color-table-row-hover-bg);
}

58
temp.md Normal file
View File

@@ -0,0 +1,58 @@
## Components
The following table lists ROCm components and their individual versions for ROCm 6.2.0. Find an overview of officially
supported versions of ROCm components, third-party libraries, and frameworks in the
[Compatibility matrix](https://rocm.docs.amd.com/en/latest/release/docs/6.2.0/compatibility/compatibility-matrix).
| Category | Group | Name | Version | |
|----------|-------|------|---------|:-:|
| **Libraries** | **Machine learning and computer vision** | [Composable Kernel](https://rocm.docs.amd.com/projects/composable_kernel/en/docs/6.2.0) | 1.1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/composable_kernel/releases/tag/rocm-6.2.0) |
| | | [MIGraphX](https://rocm.docs.amd.com/projects/AMDMIGraphX/en/docs/6.2.0) | 2.9 ⇒ [2.10](migraphx-2-10-0) | [{fab}`github fa-lg`](https://github.com/ROCm/AMDMIGraphX/releases/tag/rocm-6.2.0) |
| | | [MIOpen](https://rocm.docs.amd.com/projects/MIOpen/en/docs/6.2.0) | 3.1.0 ⇒ [3.2.0](miopen-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/MIOpen/releases/tag/rocm-6.2.0) |
| | | [MIVisionX](https://rocm.docs.amd.com/projects/MIVisionX/en/docs/6.2.0) | 2.5.0 ⇒ [3.0.0](mivisionx-3-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/MIVisionX/releases/tag/rocm-6.2.0) |
| | | [rocAL](https://rocm.docs.amd.com/projects/rocAL/en/docs/6.2.0) | 2.0.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocAL/releases/tag/rocm-6.2.0) |
| | | [rocDecode](https://rocm.docs.amd.com/projects/rocDecode/en/docs/6.2.0) | 0.6.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocDecode/releases/tag/rocm-6.2.0) |
| | | [rocPyDecode](https://rocm.docs.amd.com/projects/rocPyDecode/en/docs/6.2.0) | 0.1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocPyDecode/releases/tag/rocm-6.2.0) |
| | | [RPP](https://rocm.docs.amd.com/projects/rpp/en/docs/6.2.0) | 1.5.0 ⇒ [1.8.0](rpp-1-8-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rpp/releases/tag/rocm-6.2.0) |
| | **Communication** | [rccl](https://rocm.docs.amd.com/projects/rccl/en/docs/6.2.0) | 2.18.6 ⇒ [2.20.5](rccl-2-20-5) | [{fab}`github fa-lg`](https://github.com/ROCm/rccl/releases/tag/rocm-6.2.0) |
| | **Math** | [hipBLAS](https://rocm.docs.amd.com/projects/hipBLAS/en/docs/6.2.0) | 2.1.0 ⇒ [2.2.0](hipblas-2-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipBLAS/releases/tag/rocm-6.2.0) |
| | | [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/docs/6.2.0) | 0.7.0 ⇒ [0.8.0](hipblaslt-0-8-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipBLASLt/releases/tag/rocm-6.2.0) |
| | | [hipFFT](https://rocm.docs.amd.com/projects/hipFFT/en/docs/6.2.0) | [1.0.14](hipfft-1-0-14) | [{fab}`github fa-lg`](https://github.com/ROCm/hipFFT/releases/tag/rocm-6.2.0) |
| | | [hipfort](https://rocm.docs.amd.com/projects/hipfort/en/docs/6.2.0) | 0.4-0 | [{fab}`github fa-lg`](https://github.com/ROCm/hipfort/releases/tag/rocm-6.2.0) |
| | | [hipRAND](https://rocm.docs.amd.com/projects/hipRAND/en/docs/6.2.0) | 2.10.17 ⇒ [2.11.0](hiprand-2-11-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipRAND/releases/tag/rocm-6.2.0) |
| | | [hipSOLVER](https://rocm.docs.amd.com/projects/hipSOLVER/en/docs/6.2.0) | 2.1.1 ⇒ [2.2.0](hipsolver-2-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipSOLVER/releases/tag/rocm-6.2.0) |
| | | [hipSPARSE](https://rocm.docs.amd.com/projects/hipSPARSE/en/docs/6.2.0) | 3.0.1 ⇒ [3.1.1](hipsparse-3-1-1) | [{fab}`github fa-lg`](https://github.com/ROCm/hipSPARSE/releases/tag/rocm-6.2.0) |
| | | [hipSPARSELt](https://rocm.docs.amd.com/projects/hipSPARSELt/en/docs/6.2.0) | 0.2.0 ⇒ [0.2.1](hipsparselt-0-2-1) | [{fab}`github fa-lg`](https://github.com/ROCm/hipSPARSELt/releases/tag/rocm-6.2.0) |
| | | [rocALUTION](https://rocm.docs.amd.com/projects/rocALUTION/en/docs/6.2.0) | 3.1.1 ⇒ [3.2.0](rocalution-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocALUTION/releases/tag/rocm-6.2.0) |
| | | [rocBLAS](https://rocm.docs.amd.com/projects/rocBLAS/en/docs/6.2.0) | 4.1.0 ⇒ [4.2.0](rocblas-4-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocBLAS/releases/tag/rocm-6.2.0) |
| | | [rocFFT](https://rocm.docs.amd.com/projects/rocFFT/en/docs/6.2.0) | 1.0.27 ⇒ [1.0.28](rocfft-1-0-28) | [{fab}`github fa-lg`](https://github.com/ROCm/rocFFT/releases/tag/rocm-6.2.0) |
| | | [rocRAND](https://rocm.docs.amd.com/projects/rocRAND/en/docs/6.2.0) | 3.0.0 ⇒ [3.1.0](rocrand-3-1-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocRAND/releases/tag/rocm-6.2.0) |
| | | [rocSOLVER](https://rocm.docs.amd.com/projects/rocSOLVER/en/docs/6.2.0) | 3.25.0 ⇒ [3.26.0](rocsolver-3-26-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocSOLVER/releases/tag/rocm-6.2.0) |
| | | [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/docs/6.2.0) | 3.1.1 ⇒ [3.2.0](rocsparse-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocSPARSE/releases/tag/rocm-6.2.0) |
| | | [rocWMMA](https://rocm.docs.amd.com/projects/rocWMMA/en/docs/6.2.0) | 1.4.0 ⇒ [1.5.0](rocwmma-1-5-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocWMMA/releases/tag/rocm-6.2.0) |
| | | [Tensile](https://rocm.docs.amd.com/projects/tensile/en/docs/6.2.0) | 4.40.0 ⇒ [4.41.0](tensile-4-41-0) | [{fab}`github fa-lg`](https://github.com/ROCm/tensile/releases/tag/rocm-6.2.0) |
| | **Primitives** | [hipCUB](https://rocm.docs.amd.com/projects/hipCUB/en/docs/6.2.0) | 3.1.0 ⇒ [3.2.0](hipcub-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipCUB/releases/tag/rocm-6.2.0) |
| | | [hipTensor](https://rocm.docs.amd.com/projects/hipTensor/en/docs/6.2.0) | 1.2.0 ⇒ [1.3.0](hiptensor-1-3-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipTensor/releases/tag/rocm-6.2.0) |
| | | [rocPRIM](https://rocm.docs.amd.com/projects/rocPRIM/en/docs/6.2.0) | 3.1.0 ⇒ [3.2.0](rocprim-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocPRIM/releases/tag/rocm-6.2.0) |
| | | [rocThrust](https://rocm.docs.amd.com/projects/rocThrust/en/docs/6.2.0) | 3.0.0 ⇒ [3.1.0](rocthrust-3-1-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocThrust/releases/tag/rocm-6.2.0) |
| **Tools** | **Development** | [HIPIFY](https://rocm.docs.amd.com/projects/HIPIFY/docs/6.2.0) | 17.0.0 ⇒ [18.0.0](hipify-18-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/HIPIFY/releases/tag/rocm-6.2.0) |
| | | [ROCdbgapi](https://rocm.docs.amd.com/projects/ROCdbgapi/en/docs/6.2.0) | 0.71.0 ⇒ [0.76.0](rocdbgapi-0-76-0) | [{fab}`github fa-lg`](https://github.com/ROCm/ROCdbgapi/releases/tag/rocm-6.2.0) |
| | | [ROCm CMake](https://rocm.docs.amd.com/projects/rocm-cmake/en/docs/6.2.0) | 0.12.0 ⇒ [0.13.0](rocm-cmake-0-13-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocm-cmake/releases/tag/rocm-6.2.0) |
| | | [ROCm Debugger (ROCgdb)](https://rocm.docs.amd.com/projects/rocm-cmake/en/docs/6.2.0) | 13 ⇒ [15](rocgdb-15) | [{fab}`github fa-lg`](https://github.com/ROCm/ROCgdb/releases/tag/rocm-6.2.0) |
| | | [ROCr Debug Agent](https://rocm.docs.amd.com/projects/rocr_debug_agent/en/docs/6.2.0) | 2.0.3 | [{fab}`github fa-lg`](https://github.com/ROCm/rocr_debug_agent/releases/tag/rocm-6.2.0) |
| | **Performance** | [Omniperf](https://rocm.docs.amd.com/projects/omniperf/en/docs/6.2.0) | 2.0.1 | [{fab}`github fa-lg`](https://github.com/ROCm/omniperf/releases/tag/rocm-6.2.0) |
| | | [Omnitrace](https://rocm.docs.amd.com/projects/omnitrace/en/docs/6.2.0) | 1.11.2 | [{fab}`github fa-lg`](https://github.com/ROCm/omnitrace/releases/tag/rocm-6.2.0) |
| | | [ROCm Bandwidth Test](https://rocm.docs.amd.com/projects/rocm_bandwidth_test/en/docs/6.2.0) | 1.4.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_bandwidth_test/releases/tag/rocm-6.2.0) |
| | | [ROCProfiler](https://rocm.docs.amd.com/projects/ROCProfiler/en/docs/6.2.0) | 2.0.0 ⇒ [2.0.0](rocprofiler-2-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_bandwidth_test/releases/tag/rocm-6.2.0) |
| | | [ROCProfiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/docs/6.2.0) | 0.4.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_bandwidth_test/releases/tag/rocm-6.2.0) |
| | | [ROCTracer](https://rocm.docs.amd.com/projects/ROCTracer/en/docs/6.2.0) | 4.1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_bandwidth_test/releases/tag/rocm-6.2.0) |
| | **System** | [AMD SMI](https://rocm.docs.amd.com/projects/amdsmi/en/docs/6.2.0) | 24.5.2 ⇒ [24.6.1](amd-smi-24-6-1) | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [rocminfo](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 1.0.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [ROCm Data Center Tool](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 0.3.0 ⇒ [1.0.0](rocm-data-center-tool-1-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [ROCm SMI](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 7.2.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [ROCm Validation Suite](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | | [TransferBench](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 1.5.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
| | **Compilers** | [hipCC](https://rocm.docs.amd.com/projects/hipCC/en/docs/6.2.0) | 1.0.0 ⇒ [1.1.1](hipcc-1-1-1) | [{fab}`github fa-lg`](https://github.com/ROCm/llvm-project/releases/tag/rocm-6.2.0) |
| | | [llvm-project](https://rocm.docs.amd.com/projects/llvm-project/en/docs/6.2.0) | 17.0.0 ⇒ [18.0.0](llvm-project-18-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/llvm-project/releases/tag/rocm-6.2.0) |
| **Runtimes** | | [HIP](https://rocm.docs.amd.com/projects/HIP/en/docs/6.2.0) | 6.1 ⇒ [6.2](hip-6-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/HIP/releases/tag/rocm-6.2.0) |
| | | [ROCr Runtime](https://rocm.docs.amd.com/projects/ROCr-Runtime/en/docs/6.2.0) | 6.1 ⇒ [6.2](hip-6-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/ROCR-Runtime/releases/tag/rocm-6.2.0) |

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,223 @@
The release notes provide a comprehensive summary of changes since the previous ROCm release.
- [Release highlights](release-highlights)
- [Operating system and hardware support changes](operating-system-and-hardware-support-changes)
- [ROCm components versioning](rocm-components)
- [Detailed component changes](detailed-component-changes)
- [ROCm known issues](rocm-known-issues)
- [ROCm upcoming changes](rocm-upcoming-changes)
The [Compatibility matrix](https://rocm.docs.amd.com/en/latest/release/docs/6.2.0/compatibility/compatibility-matrix)
provides an overview of operating system, hardware, ecosystem, and ROCm component support across ROCm releases.
Release notes for previous ROCm releases are available in earlier versions of the documentation.
See the [ROCm documentation release history](https://rocm.docs.amd.com/en/latest/release/versions).
## Release highlights
This section introduces notable new features and improvements in ROCm 6.2. See the
[Detailed component changes](#detailed-component-changes) for individual component changes.
### New components
ROCm 6.2.0 introduces the following new components to the ROCm software stack.
- **Omniperf** -- A kernel-level profiling tool for machine learning and high-performance computing (HPC) workloads
running on AMD Instinct accelerators. Omniperf offers comprehensive profiling and advanced analysis via command line
or a GUI dashboard. For more information, see
[Omniperf](https://rocm.docs.amd.com/projects/omniperf/en/latest).
- **Omnitrace** -- A multi-purpose analysis tool for profiling and tracing applications running on the CPU or the CPU and GPU.
It supports dynamic binary instrumentation, call-stack sampling, causal profiling, and other features for determining
which function and line number are executing. For more information, see
[Omnitrace](https://rocm.docs.amd.com/projects/omnitrace/en/latest).
- **rocPyDecode** -- A tool to access rocDecode APIs in Python. It connects Python and C/C++ libraries,
enabling function calling and data passing between the two languages. The `rocpydecode.so` library, a wrapper, uses
rocDecode APIs written primarily in C/C++ within Python. For more information, see
[rocPyDecode](https://rocm.docs.amd.com/projects/rocpydecode/en/latest).
- **ROCprofiler-SDK** -- ROCprofiler-SDK is a profiling and tracing library for HIP and ROCm applications on AMD ROCm software
used to identify application performance bottlenecks and optimize their performance. The new APIs add restrictions for more
efficient implementations and improved thread safety. A new window restriction specifies the services the tool can use.
ROCprofiler-SDK also provides a tool library to help you write your tool implementations. `rocprofv3` uses this tool library
to profile and trace applications for performance bottlenecks. Examples include API tracing, kernel tracing, and so on.
For more information, see [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest).
```{note}
ROCprofiler-SDK for ROCm 6.2.0 is a beta release and subject to change.
```
### ROCm Offline Installer Creator introduced
The new ROCm Offline Installer Creator creates an installation package for a preconfigured setup of ROCm, the AMDGPU
driver, or a combination of the two on a target system without network access. This new tool customizes
multiple unique configurations for use when installing ROCm on a target. Other notable features include:
* A lightweight, easy-to-use user interface for configuring the creation of the installer
* Support for multiple Linux distributions
* Installer support for different ROCm releases and specific ROCm components
* Optional driver or driver-only installer creation
* Optional post-install preferences
* Lightweight installer packages, which are unique to the preconfigured ROCm setup
* Resolution and inclusion of dependency packages for offline installation
For more information, see
[ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/rocm-install-on-linux/en/latest/install/rocm-offline-installer.html).
### Math libraries default to Clang instead of HIPCC
The default compiler used to build the math libraries on Linux changes from `hipcc` to `amdclang++`.
Appropriate compiler flags are added to ensure these compilations build correctly. This change only applies when
building the libraries. Applications using the libraries can continue to be compiled using `hipcc` or `amdclang++` as
described in [ROCm compiler reference](https://rocm.docs.amd.com/projects/llvm-project/en/latest/reference/rocmcc.html).
The math libraries can also be built with `hipcc` using any of the previously available methods (for example, the `CXX`
environment variable, the `CMAKE_CXX_COMPILER` CMake variable, and so on). This change shouldn't affect performance or
functionality.
### Framework and library changes
This section highlights updates to supported deep learning frameworks and notable third-party library optimizations.
#### Additional PyTorch and TensorFlow support
ROCm 6.2.0 supports PyTorch versions 2.2 and 2.3 and TensorFlow version 2.16.
See [Installing PyTorch for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html)
and [Installing TensorFlow for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/tensorflow-install.html)
for installation instructions.
Refer to the
[Third-party support matrix](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/3rd-party-support-matrix.html#deep-learning)
for a comprehensive list of third-party frameworks and libraries suppported by ROCm.
#### Optimized framework support for OpenXLA
PyTorch for ROCm and TensorFlow for ROCm now provide native support for OpenXLA. OpenXLA is an open-source ML compiler
ecosystem that enables developers to compile and optimize models from all leading ML frameworks. For more information, see
[Installing PyTorch for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html)
and [Installing TensorFlow for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/tensorflow-install.html).
#### PyTorch support for Autocast (automatic mixed precision)
PyTorch now supports Autocast for recurrent neural networks (RNNs) on ROCm. This can help to reduce computational
workloads and improve performance. Based on the information about the magnitude of values, Autocast can substitute the
original `float32` linear layers and convolutions with their `float16` or `bfloat16` variants. For more information, see
[Automatic mixed precision](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/train-a-model#automatic-mixed-precision-amp).
#### Memory savings for bitsandbytes model quantization
The [ROCm-aware bitsandbytes library](https://github.com/ROCm/bitsandbytes) is a lightweight Python wrapper around HIP
custom functions, in particular 8-bit optimizer, matrix multiplication, and 8-bit and 4-bit quantization functions.
ROCm 6.2.0 introduces the following bitsandbytes changes:
- `Int8` matrix multiplication is enabled, and it includes the following functions:
- `extract-outliers` extracts rows and columns that have outliers in the inputs. Theyre later used for matrix multiplication without quantization.
- `transform` row-to-column and column-to-row transformations are enabled, along with transpose operations. These are used before and after matmul computation.
- `igemmlt` new function for GEMM computation A*B^T. It uses
[hipblasLtMatMul](https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/api-reference.html#hipblasltmatmul) and performs 8-bit GEMM operations.
- `dequant_mm` dequantizes output matrix to original data type using scaling factors from vector-wise quantization.
- Blockwise quantization input tensors are quantized for a fixed block size.
- 4-bit quantization and dequantization functions normalized `Float4` quantization, quantile estimation, and quantile quantization functions are enabled.
- 8-bit and 32-bit optimizers are enabled.
```{note}
These functions are included in bitsandbytes. They are not part of ROCm. However, ROCm 6.2.0 has enabled the fixes and
features to run them.
```
For more information, see [Model quantization techniques](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/model-quantization.html).
#### Improved vLLM support
ROCm 6.2.0 enhances vLLM support for inference on AMD Instinct accelerators, adding
capabilities for `FP16`/`BF16` precision for LLMs, and `FP8` support for Llama.
ROCm 6.2.0 adds support for the following vLLM features:
- MP:
Multi-GPU execution. Choose between MP and Ray using a flag. To set it to MP,
use `--distributed-executor-backed=mp`. The default depends on the commit in flux.
- FP8 KV cache:
Enhances computational efficiency and performance by significantly reducing memory usage and bandwidth requirements.
The QUARK quantizer currently only supports Llama.
- Triton Flash Attention:
ROCm supports both Triton and Composable Kernel Flash Attention 2 in vLLM. The default is Triton, but you can change this
setting using the `VLLM_USE_FLASH_ATTN_TRITON=False` environment variable.
- PyTorch TunableOp:
Improved optimization and tuning of GEMMs. It requires Docker with PyTorch 2.3 or later.
For more information about enabling these features, see
[vLLM inference](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.html#vllm-inference).
ROCm has a vLLM branch for experimental features. This includes performance improvements, accuracy, and correctness testing.
These features include:
- FP8 GEMMs: To improve the performance of FP8 quantization, work is underway on tuning the GEMM using the shapes used
in the model's execution. It only supports LLAMA because the QUARK quantizer currently only supports Llama.
- Custom decode paged attention: Improves performance by efficiently managing memory and enabling faster attention
computation in large-scale models. This benefits all workloads in `FP16` configurations.
To enable these experimental new features, see
[vLLM inference](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.html#vllm-inference).
Use the `rocm/vllm` branch when cloning the GitHub repo. The `vllm/ROCm_performance.md` document outlines
all the accessible features, and the `vllm/Dockerfile.rocm` file can be used.
### Enhanced performance tuning on AMD Instinct accelerators
ROCm is pretuned for high-performance computing workloads including large language models, generative AI, and scientific computing.
The ROCm documentation provides comprehensive guidance on configuring your system for AMD Instinct accelerators. It includes
detailed instructions on system settings and application tuning suggestions to help you fully leverage the capabilities of these
accelerators for optimal performance. For more information, see
[AMD MI300X tuning guides](https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/index.html) and
[AMD MI300A system optimization](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300x.html).
### Removed clang-ocl
As of version 6.2, ROCm no longer provides the `clang-ocl` package. The project will be archived in the future.
See the [clang-ocl README](https://github.com/ROCm/clang-ocl).
### ROCm documentation changes
The documentation for the ROCm components has been reorganized and reformatted in a standard look and feel. This
improves the usability and readability of the documentation. For more information about the ROCm components, see
[What is ROCm?](https://rocm.docs.amd.com/en/latest/what-is-rocm.html).
Since the release of ROCm 6.1, the documentation has added some key topics including:
- [AMD Instinct MI300X workload tuning guide](https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/workload.html)
- [AMD Instinct MI300X system tuning guide](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300x.html)
- [AMD Instinct MI300A system tuning guide](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300a.html)
- [Using ROCm for AI](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/index.html)
- [Using ROCm for HPC](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-hpc/index.html)
- [Fine-tuning LLMs and inference optimization](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/index.html)
- [LLVM reference documentation](https://rocm.docs.amd.com/projects/llvm-project/en/latest/)
The following topics have been significantly improved, expanded, or both:
- [HIP programming manual](https://rocm.docs.amd.com/projects/HIP/en/latest/)
- [Compatibility matrix](https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html)
```{note}
All ROCm projects are open source and available on GitHub. To contribute to ROCm documentation, see the
[ROCm documentation contribution guidelines](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
```

View File

@@ -0,0 +1,27 @@
## Operating system and hardware support changes
ROCm 6.2.0 adds support for the following operating system and kernel versions.
- Ubuntu 24.04 LTS (kernel: 6.8 [GA])
- RHEL 8.10 (kernel: 4.18.0-544)
- SLES 15 SP6 (kernel: 6.4)
ROCm 6.2.0 marks the end of support (EoS) for:
- Ubuntu 22.04.3
- RHEL 9.2
- RHEL 8.8
- SLES 15 SP 4
- CentOS 7.9
ROCm 6.2.0 has been tested against pre-release Ubuntu 22.04.5 (kernel: 6.5 [HWE]).
See the [Compatibility matrix](https://rocm-stg.amd.com/en/docs/6.2.0/compatibility/compatibility-matrix.html) for an
overview of supported operating systems and hardware architectures.

View File

@@ -0,0 +1,79 @@
## ROCm known issues
ROCm known issues are noted on [{fab}`github` GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). For known
issues related to individual components, review the [Detailed component changes](detailed-component-changes).
### Default processor affinity behavior for helper threads
Processor affinity is a critical setting to ensure that ROCm helper threads run on the correct cores. By default, ROCm
helper threads are spawned on all available cores, ignoring the parent threads processor affinity. This can lead to
threads competing for available cores, which may result in suboptimal performance. This behavior occurs by default if
the environment variable `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is not set or is set to `1`. If
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is set to `0`, the ROCr runtime uses the parent process's core affinity mask when
creating helper threads. The parents affinity mask should then be set to account for the presence of additional threads
by ensuring the affinity mask contains enough cores. Depending on the affinity settings of the software environment,
batch system, launch commands like `numactl`/`taskset`, or explicit mask manipulation by the application itself, changing
the setting may be advantageous to performance.
To ensure the parent's core affinity mask is honored by the ROCm helper threads, set the
`HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable as follows:
```{code} shell
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0
```
To ensure ROCm helper threads run on all available cores, set the `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable
as follows:
``` shell
export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=1
```
Or the default:
``` shell
unset HSA_OVERRIDE_CPU_AFFINITY_DEBUG
```
If unsure of the default processor affinity settings for your environment, run the following command from the shell:
``` shell
bash -c "echo taskset -p \$\$"
```
### KFDTest failure on Instinct MI300X with Oracle Linux 8.9
The `KFDEvictTest.QueueTest` is failing on the MI300X platform during KFD (Kernel Fusion Driver) tests, causing the full
suite to not execute properly. This issue is suspected to be hardware-related.
### Bandwidth limitation in gang and non-gang modes on Instinct MI300A
Expected target peak non-gang performance (~60GB/s) and target peak gang performance (~90GB/s) are not achieved. Both gang
and non-gang performance are observed to be limited at 45GB/s.
This issue will be addressed in a future ROCm release.
### rocm-llvm-alt
ROCm provides an optional package -- `rocm-llvm-alt` -- that provides a closed-source compiler for
users interested in additional closed-source CPU optimizations. This feature is not functional in
the ROCm 6.2.0 release. Users who attempt to invoke the closed-source compiler will experience an
LLVM consumer-producer mismatch and the compilation will fail. There is no workaround that allows
use of the closed-source compiler. It is recommended to compile using the default open-source
compiler, which generates high-quality AMD CPU and AMD GPU code.
## ROCm upcoming changes
The section notes upcoming changes to the ROCm software stack. For upcoming changes related to individual components, review
the [Detailed component changes](detailed-component-changes).
### rocm-llvm-alt
The `rocm-llvm-alt` package will be removed in an upcoming release. Users relying on the
functionality provided by the closed-source compiler should transition to the open-source compiler.
Once the `rocm-llvm-alt` package is removed, any compilation requesting functionality provided by
the closed-source compiler will result in a Clang warning: "*[AMD] proprietary optimization compiler
has been removed*".