mirror of
https://github.com/ROCm/ROCm.git
synced 2026-01-08 22:28:06 -05:00
6.3.0 release notes (#199)
* generate 6.3.0 RELEASE.md * add 6.3.0 os/hw support * regenerate changelog * update table * add amd smi and fix fmt * add rocjpeg note * add missed changelog entries * update ga date * add SHARK toolkit introduced note update SHARK note * Edited some components (#202) * Edited some components * fixed formatting on rocal * markdown fail on the last commit; fixed * capitalization fix * Copy edit component change logs (#203) * fix some formatting * fix table and add OpenCL note fix fmt fix more formatting * add radeon note * add rocmsmi * Updated hipCUB, rocPrim, and rocThrust (#206) * fix some stuff * add transferbench * Edits to RCCL 6.3 change log (#207) * Update tools/autotag/templates/upcoming_changes/6.3.0.md * fix formatting * fix sphinx underline warning * add @lpaoletti's highlights * fix os support * add missing kernel version * fix heading * add bitsandbytes ki * Copy edits to release notes (#208) * Copy edits to release notes * Additional updates to release notes * updated shark AI toolkit description * fix formatting * update opencl * update opencl fixes and updates * Update RELEASE.md Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * Update RELEASE.md Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * fix omnitools rename text * Apply suggestions from code review Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * Update RELEASE.md * Update RELEASE.md * Update RELEASE.md * Update RELEASE.md * Update RELEASE.md * Update RELEASE.md * update omniperf and tesile notes * Update RELEASE.md * Update RELEASE.md * Update RELEASE.md * Update RELEASE.md * Update RELEASE.md * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * made some copy edits (#209) * Apply suggestions from code review * Update RELEASE.md * Apply suggestions from code review Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * indent * add more highlights * update shark urls * add omni notes 
* Apply suggestions from code review Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * update some changelogs * Update RELEASE.md Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * Update RELEASE.md Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * Update RELEASE.md Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * update some cls * and missed changelogs * add missed component updates * fix links * add amdgpu-dkms highlight * Update RELEASE.md Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * change links * add fixed issues * @neon60's changes Co-authored-by: Istvan Kiss <neon60@gmail.com> * Apply suggestions from code review Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com> * rm extra hip docs * add hip links * add fixed issue fix * Update RELEASE.md Co-authored-by: Istvan Kiss <neon60@gmail.com> * Update RELEASE.md Co-authored-by: Istvan Kiss <neon60@gmail.com> * Update RELEASE.md Co-authored-by: Istvan Kiss <neon60@gmail.com> * fix ri * fix zebra * Update RELEASE.md Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * rm extra amd smi info * Apply suggestions from code review Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> * add more about omni renmae fix rename stuff * Update RELEASE.md Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * Update RELEASE.md Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * fix formatting * wording * fix link * update aotriton * remove libraries performance improved * fix rhel version * fix urls shorten title * Apply suggestions from code review Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com> * Release notes updates (#212) * Made language more precise (#211) MIVisionX and rocAL were changed. An awkward sentence in rocAL was also fixed. 
* add rocprofiler * add rdc add rdc entry * Update RELEASE.md Co-authored-by: Istvan Kiss <neon60@gmail.com> * Update RELEASE.md Co-authored-by: Istvan Kiss <neon60@gmail.com> * Update RELEASE.md Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com> * remove bitsandbytes known issue * fix missed hip doc * update rocprof-compute version to 3.0.0 * remove words * change hiprand ver to 2.11.0 * update new components descriptions * add # * fix tensile versions * fix versions and add missed cls * Update RELEASE.md Co-authored-by: Istvan Kiss <neon60@gmail.com> * remove resolved issue for #3493 * add rdc note * add hiprand known issue add hiprand known issue add asterisk for hiprand ki asterisk formatting asterisk link asterisk * rdc known issue * @lpaoletti updates * @wenchenvincent add CK to Transformer Engine note * fix links fix links * add roct thunk interface note * rm 'previously' * Apply suggestions from code review Co-authored-by: Istvan Kiss <neon60@gmail.com> * add known issues * add mi300x cpfw known issue * add mi300x cpfw known issue add note * spacing * update te error KI * rm incorrect user impact in TE known issue * correct description of transformer engine fatal python error known issue * update autotag/templates * fix order * fix typo * update .wordlist.txt w/ lib names * add missing css classes * remove ROCT-Thunk-Interface from ROCm licenses * add rocJPEG LICENSE * fix table zebra b/c added rows * fix capitalization in toc * update URLs post-review * update AMD SMI changelog * update ROCm SMI changelog * add opencl icd stale file kI words * remove Azure Linux * update omnitrace note * add mi200 DLM known issue * update omnitrace note update omnitrace note wording update omnitrace note * update 6.3 ga to 11/26 * update KIs wording * Update tools/autotag/templates/highlights/6.3.0.md Co-authored-by: Istvan Kiss <neon60@gmail.com> * Update tools/autotag/templates/highlights/6.3.0.md Co-authored-by: Istvan Kiss <neon60@gmail.com> * 
update TransferBench note * remove transferbench remove transferbench * remove gfx12, 1151 * remove sr-iov * rm tb * css classes * rm gfx12 * add back transferbench * add transferbench to table * rm transferbench, add as KI * update transferbench KI workaround * add rocprof-comp KI fix * fix tensile * add backward weights conv KI update * remove RHEL 8.9 from OS EOS * remove mi200 perf drop for DLMs * add RHEL 8.9 to end of support OSes * add omniperf/omnitrace KIs * remove bf16 statement in mi300x KI * update rvs versions in compat * add amd smi KI update update * words * update GA date for 6.3.0 * add rvs KI * add KI links same * rvs in compat * update tf versions * add rvs changelog * update rn templates * add possessives to wordlist --------- Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com> Co-authored-by: Jeffrey Novotny <jnovotny@amd.com> Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> Co-authored-by: Istvan Kiss <neon60@gmail.com> Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com> Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
@@ -13,6 +13,7 @@ AMDMIGraphX
|
||||
AMI
AOCC
AOMP
AOTriton
APBDIS
APIC
APIs
|
||||
@@ -158,6 +159,7 @@ HWS
|
||||
Haswell
Higgs
Hyperparameters
ICD
ICV
IDE
IDEs
|
||||
@@ -208,6 +210,7 @@ MiB
|
||||
MIGraphX
MIOpen
MIOpenGEMM
MIOpen's
MIVisionX
MLM
MMA
|
||||
@@ -295,7 +298,9 @@ PipelineParallel
|
||||
PnP
PowerEdge
PowerShell
Profiler's
PyPi
Pytest
PyTorch
Qcycles
Qwen
|
||||
@@ -303,6 +308,7 @@ RAII
|
||||
RAS
RCCL
RDC
RDC's
RDMA
RDNA
README
|
||||
@@ -342,6 +348,7 @@ SENDMSG
|
||||
SGPR
SGPRs
SHA
SHARK's
SIGQUIT
SIMD
SIMDs
|
||||
@@ -521,6 +528,7 @@ devsel
|
||||
dimensionality
disambiguates
distro
dkms
el
embeddings
enablement
|
||||
@@ -686,6 +694,7 @@ rocALUTION
|
||||
rocBLAS
rocDecode
rocFFT
rocJPEG
rocLIB
rocMLIR
rocPRIM
|
||||
@@ -778,6 +787,7 @@ vectorize
|
||||
vectorized
vectorizer
vectorizes
virtualized
vjxb
voxel
walkthrough
|
||||
|
||||
1745
RELEASE.md
File diff suppressed because it is too large
@@ -59,6 +59,7 @@ additional licenses. Please review individual repositories for more information.
|
||||
| [rocDecode](https://github.com/ROCm/rocDecode) | [MIT](https://github.com/ROCm/rocDecode/blob/develop/LICENSE) |
| [rocFFT](https://github.com/ROCm/rocFFT/) | [MIT](https://github.com/ROCm/rocFFT/blob/develop/LICENSE.md) |
| [ROCgdb](https://github.com/ROCm/ROCgdb/) | [GNU General Public License v2.0](https://github.com/ROCm/ROCgdb/blob/amd-master/COPYING) |
| [rocJPEG](https://github.com/ROCm/rocJPEG/) | [MIT](https://github.com/ROCm/rocJPEG/blob/develop/LICENSE) |
| [ROCK-Kernel-Driver](https://github.com/ROCm/ROCK-Kernel-Driver/) | [GPL 2.0 WITH Linux-syscall-note](https://github.com/ROCm/ROCK-Kernel-Driver/blob/master/COPYING) |
| [rocminfo](https://github.com/ROCm/rocminfo/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocminfo/blob/amd-staging/License.txt) |
| [ROCm Bandwidth Test](https://github.com/ROCm/rocm_bandwidth_test/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocm_bandwidth_test/blob/master/LICENSE.txt) |
|
||||
@@ -84,7 +85,6 @@ additional licenses. Please review individual repositories for more information.
|
||||
| [rocSPARSE](https://github.com/ROCm/rocSPARSE/) | [MIT](https://github.com/ROCm/rocSPARSE/blob/develop/LICENSE.md) |
| [rocThrust](https://github.com/ROCm/rocThrust/) | [Apache 2.0](https://github.com/ROCm/rocThrust/blob/develop/LICENSE) |
| [ROCTracer](https://github.com/ROCm/roctracer/) | [MIT](https://github.com/ROCm/roctracer/blob/amd-master/LICENSE) |
| [ROCT-Thunk-Interface](https://github.com/ROCm/ROCT-Thunk-Interface/) | [MIT](https://github.com/ROCm/ROCT-Thunk-Interface/blob/master/LICENSE.md) |
| [rocWMMA](https://github.com/ROCm/rocWMMA/) | [MIT](https://github.com/ROCm/rocWMMA/blob/develop/LICENSE.md) |
| [Tensile](https://github.com/ROCm/Tensile/) | [MIT](https://github.com/ROCm/Tensile/blob/develop/LICENSE.md) |
| [TransferBench](https://github.com/ROCm/TransferBench) | [MIT](https://github.com/ROCm/TransferBench/blob/develop/LICENSE.md) |
|
||||
|
||||
@@ -22,7 +22,7 @@ ROCm Version,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
|
||||
,,,,,,,,,,
|
||||
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,
|
||||
:doc:`PyTorch <rocm-install-on-linux:install/3rd-party/pytorch-install>`,"2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
|
||||
:doc:`TensorFlow <rocm-install-on-linux:install/3rd-party/tensorflow-install>`,"2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1"
|
||||
:doc:`TensorFlow <rocm-install-on-linux:install/3rd-party/tensorflow-install>`,"2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1"
|
||||
:doc:`JAX <rocm-install-on-linux:install/3rd-party/jax-install>`,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26
|
||||
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1
|
||||
,,,,,,,,,,
|
||||
@@ -86,7 +86,7 @@ ROCm Version,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
|
||||
:doc:`ROCm Data Center Tool <rdc:index>`,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0
|
||||
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
|
||||
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.4.0,7.3.0,7.3.0,7.3.0,7.3.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0
|
||||
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,rocm-6.3.0,rocm-6.2.4,rocm-6.2.2,rocm-6.2.1,rocm-6.2.0,rocm-6.1.2,rocm-6.1.1,rocm-6.1.0,rocm-6.0.2,rocm-6.0.0
|
||||
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000
|
||||
,,,,,,,,,,
|
||||
PERFORMANCE TOOLS,,,,,,,,,,
|
||||
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0
|
||||
|
||||
|
@@ -49,7 +49,7 @@ compatibility and system requirements.
|
||||
,,,
|
||||
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix:,,
|
||||
:doc:`PyTorch <rocm-install-on-linux:install/3rd-party/pytorch-install>`,"2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13"
|
||||
:doc:`TensorFlow <rocm-install-on-linux:install/3rd-party/tensorflow-install>`,"2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1"
|
||||
:doc:`TensorFlow <rocm-install-on-linux:install/3rd-party/tensorflow-install>`,"2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1"
|
||||
:doc:`JAX <rocm-install-on-linux:install/3rd-party/jax-install>`,0.4.26,0.4.26,0.4.26
|
||||
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.17.3,1.17.3
|
||||
,,,
|
||||
@@ -113,7 +113,7 @@ compatibility and system requirements.
|
||||
:doc:`ROCm Data Center Tool <rdc:index>`,0.3.0,0.3.0,0.3.0
|
||||
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0
|
||||
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.4.0,7.3.0,7.0.0
|
||||
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,rocm-6.3.0,rocm-6.2.4,rocm-6.1.0
|
||||
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.1.0,1.0.60204,1.0.60100
|
||||
,,,
|
||||
PERFORMANCE TOOLS,,,
|
||||
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0
|
||||
|
||||
@@ -30,15 +30,15 @@ if os.environ.get("READTHEDOCS", "") == "True":
|
||||
project = "ROCm Documentation"
|
||||
author = "Advanced Micro Devices, Inc."
|
||||
copyright = "Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved."
|
||||
version = "6.2.4"
|
||||
release = "6.2.4"
|
||||
version = "6.3.0"
|
||||
release = "6.3.0"
|
||||
setting_all_article_info = True
|
||||
all_article_info_os = ["linux", "windows"]
|
||||
all_article_info_author = ""
|
||||
|
||||
# pages with specific settings
|
||||
article_pages = [
|
||||
{"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-11-06"},
|
||||
{"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-12-03"},
|
||||
{"file": "how-to/deep-learning-rocm", "os": ["linux"]},
|
||||
{"file": "how-to/rocm-for-ai/index", "os": ["linux"]},
|
||||
{"file": "how-to/rocm-for-ai/install", "os": ["linux"]},
|
||||
|
||||
@@ -479,7 +479,7 @@ Change affinity of ROCm helper threads
|
||||
This change prevents internal ROCm threads from having their CPU core affinity mask
|
||||
set to all CPU cores available. With this setting, the threads inherit their parent's
|
||||
CPU core affinity mask. If you have any questions regarding this setting,
|
||||
contact your MI300A platform vendor. To enable this setting, enter the following command:
|
||||
contact your MI300X platform vendor. To enable this setting, enter the following command:
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
|
||||
@@ -272,7 +272,7 @@ ability to collect timeline traces of the accelerator software stack as well as
|
||||
.. _mi300x-rocprof-compute:
|
||||
|
||||
ROCm Compute Profiler
|
||||
^^^^^^^^
|
||||
^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>` is a system performance profiler for high-performance computing (HPC) and
|
||||
machine learning (ML) workloads using Instinct accelerators. Under the hood, ROCm Compute Profiler uses
|
||||
@@ -301,7 +301,7 @@ a web-based GUI or command-line analyzer, depending on your preference.
|
||||
.. _mi300x-rocprof-systems:
|
||||
|
||||
ROCm Systems Profiler
|
||||
^^^^^^^^^
|
||||
^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>` is a comprehensive profiling and tracing tool for parallel applications,
|
||||
including HPC and ML packages, written in C, C++, Fortran, HIP, OpenCL, and Python which execute on the CPU or CPU and
|
||||
|
||||
@@ -8,6 +8,7 @@
|
||||
|
||||
| Version | Release date |
|
||||
| ------- | ------------ |
|
||||
| [6.3.0](https://rocm.docs.amd.com/en/docs-6.3.0/) | December 3, 2024 |
|
||||
| [6.2.4](https://rocm.docs.amd.com/en/docs-6.2.4/) | November 6, 2024 |
|
||||
| [6.2.2](https://rocm.docs.amd.com/en/docs-6.2.2/) | September 27, 2024 |
|
||||
| [6.2.1](https://rocm.docs.amd.com/en/docs-6.2.1/) | September 20, 2024 |
|
||||
|
||||
@@ -69,7 +69,7 @@ subtrees:
|
||||
- file: how-to/llm-fine-tuning-optimization/optimizing-triton-kernel.rst
|
||||
title: Optimize Triton kernels
|
||||
- file: how-to/llm-fine-tuning-optimization/profiling-and-debugging.rst
|
||||
title: Profile and Debug
|
||||
title: Profile and debug
|
||||
- file: how-to/system-optimization/index.rst
|
||||
title: System optimization
|
||||
subtrees:
|
||||
|
||||
164
tools/autotag/templates/highlights/6.3.0.md
Normal file
@@ -0,0 +1,164 @@
|
||||
# ROCm 6.3.0 release notes
|
||||
|
||||
The release notes provide a summary of notable changes since the previous ROCm release.
|
||||
|
||||
- [Release highlights](#release-highlights)
|
||||
|
||||
- [Operating system and hardware support changes](#operating-system-and-hardware-support-changes)
|
||||
|
||||
- [ROCm components versioning](#rocm-components)
|
||||
|
||||
- [Detailed component changes](#detailed-component-changes)
|
||||
|
||||
- [ROCm known issues](#rocm-known-issues)
|
||||
|
||||
- [ROCm resolved issues](#rocm-resolved-issues)
|
||||
|
||||
- [ROCm upcoming changes](#rocm-upcoming-changes)
|
||||
|
||||
```{note}
If you’re using Radeon™ PRO or Radeon GPUs in a workstation setting with a
display connected, continue to use ROCm 6.2.3. See the [Use ROCm on Radeon
GPUs](https://rocm.docs.amd.com/projects/radeon/en/latest/index.html)
documentation to verify compatibility and system requirements.
```
|
||||
|
||||
## Release highlights
|
||||
|
||||
The following are notable new features and improvements in ROCm 6.3.0. For changes to individual components, see
|
||||
[Detailed component changes](#detailed-component-changes).
|
||||
|
||||
### rocJPEG added
|
||||
|
||||
ROCm 6.3.0 introduces the rocJPEG library to the ROCm software stack. rocJPEG is a high-performance
JPEG decode SDK for AMD GPUs. For more information, see the [rocJPEG
documentation](https://rocm.docs.amd.com/projects/rocJPEG/en/docs-6.3.0/index.html).
|
||||
|
||||
### ROCm Compute Profiler and ROCm Systems Profiler
|
||||
|
||||
These ROCm components have been renamed to reflect their new direction as part of the ROCm software
|
||||
stack.
|
||||
|
||||
- **ROCm Compute Profiler**, formerly Omniperf. For more information, see the [ROCm Compute Profiler
  documentation](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/docs-6.3.0/index.html) and
  [https://github.com/ROCm/rocprofiler-compute](https://github.com/ROCm/rocprofiler-compute) on GitHub.

- **ROCm Systems Profiler**, formerly Omnitrace. For more information, see the [ROCm Systems Profiler
  documentation](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/docs-6.3.0/index.html) and
  [https://github.com/ROCm/rocprofiler-systems](https://github.com/ROCm/rocprofiler-systems) on GitHub.
  For future compatibility, the Omnitrace project is available at [https://github.com/ROCm/omnitrace](https://github.com/ROCm/omnitrace).
  See the [Omnitrace documentation](https://rocm.docs.amd.com/projects/omnitrace/en/latest/index.html).
|
||||
|
||||
```{note}
Update any references to the old binary names `omniperf` and `omnitrace` to
ensure compatibility with the new `rocprof-compute` and `rocprof-sys-*` binaries.
This might include updating environment variables, commands, and paths as
needed to avoid disruptions to your profiling or tracing workflows.

See [ROCm Compute Profiler](#rocm-compute-profiler-3-0-0) and [ROCm Systems
Profiler](#rocm-systems-profiler-0-1-0).
```
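
One way to find leftover references to the old binary names is to search your scripts for them. The sketch below is only an illustration: `find_old_profiler_refs` is a hypothetical helper, not part of ROCm, and it does nothing more than wrap standard `grep` to list files that still mention `omniperf` or `omnitrace` as whole words.

```shell
# Hypothetical helper (not shipped with ROCm): list files under a directory
# that still mention the old profiler binary names.
find_old_profiler_refs() {
  # -r: recurse into the directory (default: current directory)
  # -l: print only the names of matching files
  # -w: match whole words, so "omniperf" matches but "myomniperfx" does not
  # -E: extended regex, used here for the alternation
  grep -rlwE 'omniperf|omnitrace' "${1:-.}"
}
```

For example, `find_old_profiler_refs ./ci` would print each file under an assumed `./ci` directory that needs updating. Whether a match should become `rocprof-compute` or one of the `rocprof-sys-*` binaries still has to be decided by hand.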
|
||||
|
||||
### SHARK AI toolkit for high-speed inferencing and serving introduced
|
||||
|
||||
SHARK is an open-source toolkit for high-performance serving of popular generative AI and large
|
||||
language models. In its initial release, SHARK contains the [Shortfin high-performance serving
|
||||
engine](https://github.com/nod-ai/shark-ai/tree/main/shortfin), which is the SHARK inferencing
|
||||
library that includes example server applications for popular models.
|
||||
|
||||
This initial release includes support for serving the Stable Diffusion XL model on AMD Instinct™
|
||||
MI300 devices using ROCm. See SHARK's [release
|
||||
page](https://github.com/nod-ai/shark-ai/releases/tag/v3.0.0) on GitHub to get started.
|
||||
|
||||
### PyTorch 2.4 support added
|
||||
|
||||
ROCm 6.3.0 adds support for PyTorch 2.4. See the [Compatibility
|
||||
matrix](https://rocm.docs.amd.com/en/docs-6.3.0/compatibility/compatibility-matrix.html#framework-support-compatibility-matrix)
|
||||
for the complete list of PyTorch versions tested for compatibility with ROCm.
|
||||
|
||||
### Flash Attention kernels in Triton and Composable Kernel (CK) added to Transformer Engine
|
||||
|
||||
Composable Kernel-based and Triton-based Flash Attention kernels have been integrated into
|
||||
Transformer Engine via the ROCm Composable Kernel and AOTriton libraries. The
|
||||
Transformer Engine can now optionally select a flexible and optimized Attention
|
||||
solution for AMD GPUs. For more information, see [Fused Attention Backends on
|
||||
ROCm](https://github.com/ROCm/TransformerEngine/tree/dev?tab=readme-ov-file#fused-attention-backends-on-rocm)
|
||||
on GitHub.
|
||||
|
||||
### HIP compatibility
|
||||
|
||||
HIP now includes the `hipStreamLegacy` API. It's equivalent to NVIDIA `cudaStreamLegacy`. For more
|
||||
information, see [Global enum and
|
||||
defines](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/reference/hip_runtime_api/global_defines_enums_structs_files/global_enum_and_defines.html#c.hipStreamLegacy)
|
||||
in the HIP runtime API documentation.
|
||||
|
||||
### Unload active amdgpu-dkms module without a system reboot
|
||||
|
||||
On Instinct MI200 and MI300 systems, you can now unload the active `amdgpu-dkms` modules, and reinstall
|
||||
and reload newer modules without a system reboot. If the new `dkms` package includes newer firmware
|
||||
components, the driver will first reset the device and then load newer firmware components.
|
||||
|
||||
### ROCm Offline Installer Creator updates
|
||||
|
||||
The ROCm Offline Installer Creator 6.3 introduces a new feature to uninstall the previous version of
|
||||
ROCm on the non-connected target system before installing a new version. This feature is only supported
|
||||
on the Ubuntu distribution. See the [ROCm Offline Installer
|
||||
Creator](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.3.0/install/rocm-offline-installer.html)
|
||||
documentation for more information.
|
||||
|
||||
### OpenCL ICD loader separated from ROCm
|
||||
|
||||
The OpenCL ICD loader is no longer delivered as part of ROCm, and must be installed separately
|
||||
as part of the [ROCm installation
|
||||
process](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.3.0). For Ubuntu and RHEL
|
||||
installations, the required package is installed as part of the setup described in
|
||||
[Prerequisites](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.3.0/install/prerequisites.html).
|
||||
In other supported Linux distributions like SUSE, the required package must be installed in separate steps, which are included in the installation instructions.
|
||||
|
||||
Because the OpenCL path is now separate from the ROCm installation for versioned and multi-version
|
||||
installations, you must manually define the `LD_LIBRARY_PATH` to point to the ROCm
|
||||
installation library as described in the [Post-installation
|
||||
instructions](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.3.0/install/post-install.html).
|
||||
If the `LD_LIBRARY_PATH` is not set as needed for versioned or multi-version installations, OpenCL
|
||||
applications like `clinfo` will fail to run and return an error.
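
As a rough sketch of the `LD_LIBRARY_PATH` step, assuming a versioned installation under `/opt/rocm-6.3.0` (the prefix and the `ROCM_LIB_DIR` variable are illustrative assumptions; the Post-installation instructions remain the authoritative reference):

```shell
# Assumption: ROCm 6.3.0 is installed under /opt/rocm-6.3.0.
# Override ROCM_LIB_DIR (an illustrative variable) if your prefix differs.
ROCM_LIB_DIR="${ROCM_LIB_DIR:-/opt/rocm-6.3.0/lib}"

# Prepend the ROCm library directory and keep any existing entries.
# The ${VAR:+...} form avoids a trailing ":" when LD_LIBRARY_PATH was empty.
export LD_LIBRARY_PATH="${ROCM_LIB_DIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

Running `clinfo` in the same shell afterwards is a quick way to check that the OpenCL runtime is found.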
|
||||
|
||||
### ROCT Thunk Interface integrated into ROCr runtime
|
||||
|
||||
The ROCT Thunk Interface package is now integrated into the ROCr runtime. As a result, the ROCT package
|
||||
is no longer included as a separate package in the ROCm software stack.
|
||||
|
||||
### ROCm documentation updates
|
||||
|
||||
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a
|
||||
wider variety of user needs and use cases.
|
||||
|
||||
- Documentation for Tensile is now available. Tensile is a library that creates
|
||||
benchmark-driven backend implementations for GEMMs, serving primarily as a
|
||||
backend component of rocBLAS. See the [Tensile
|
||||
documentation](https://rocm.docs.amd.com/projects/Tensile/en/docs-6.3.0/src/index.html).
|
||||
|
||||
- New documentation has been added to explain the advantages of enabling the IOMMU in passthrough
|
||||
mode for Instinct accelerators and Radeon GPUs. See [Input-Output Memory Management
|
||||
Unit](https://rocm.docs.amd.com/en/docs-6.3.0/conceptual/iommu.html).
|
||||
|
||||
- The HIP documentation has been updated and includes the following new topics:

  - [What is HIP?](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/what_is_hip.html)
  - [HIP environment variables](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/reference/env_variables.html)
  - [Initialization](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/how-to/hip_runtime_api/initialization.html)
    and [error handling](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/how-to/hip_runtime_api/error_handling.html)
  - [Hardware features](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/reference/hardware_features.html)
  - [Call stack](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/how-to/hip_runtime_api/call_stack.html)
  - [External resource interoperability](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/how-to/hip_runtime_api/external_interop.html)

- The following HIP documentation topics have been updated:

  - [HIP FAQ](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/faq.html)
  - [Deprecated APIs](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/reference/deprecated_api_list.html)
  - [Performance guidelines](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/how-to/performance_guidelines.html)

- The following HIP documentation topics have been reorganized to improve usability:

  - [HIP documentation landing page](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/index.html)
  - [HIP runtime API reference topics](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/reference/hip_runtime_api_reference.html)
  - [Programming guide](https://rocm.docs.amd.com/projects/HIP/en/docs-6.3.0/how-to/hip_runtime_api.html)
|
||||
122
tools/autotag/templates/known_issues/6.3.0.md
Normal file
@@ -0,0 +1,122 @@
|
||||
## ROCm known issues
|
||||
|
||||
ROCm known issues are noted on {fab}`github` [GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). For known
|
||||
issues related to individual components, review the [Detailed component changes](#detailed-component-changes).
|
||||
|
||||
### Instinct MI300X reports incorrect raw GPU timestamps
|
||||
|
||||
On MI300X accelerators, the command processor firmware reports incorrect raw GPU timestamps. This
|
||||
issue is under investigation and will be addressed in a future release.
|
||||
|
||||
### Instinct MI300 series: backward weights convolution performance issue
|
||||
|
||||
A performance issue affects certain tensor shapes during backward weights convolution when using
|
||||
FP16 or FP32 data types on Instinct MI300 series accelerators. This issue will be addressed in a future ROCm release.
|
||||
|
||||
To mitigate the issue during model training, set the following environment variables:
|
||||
|
||||
```bash
export MIOPEN_FIND_MODE=3
export MIOPEN_FIND_ENFORCE=3
```
|
||||
|
||||
These settings enable auto-tuning on the first occurrence of a new tensor shape. The tuning results
|
||||
are stored in the user database, eliminating the need for repeated tuning when the same shape is
|
||||
encountered in subsequent runs. See the
|
||||
[MIOpen](https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/workload.html#miopen)
|
||||
section in the workload optimization guide to learn more about MIOpen's auto-tuning capabilities.
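
If you would rather not export these variables for the whole shell session, one option is to scope them to a single command. The sketch below assumes a hypothetical wrapper named `with_miopen_tuning`; it is not part of MIOpen and relies only on the shell's standard per-command environment assignment.

```shell
# Hypothetical wrapper: run one command with the MIOpen tuning variables set,
# without exporting them into the rest of the shell session.
with_miopen_tuning() {
  # Assignments placed before a command apply only to that command's environment.
  MIOPEN_FIND_MODE=3 MIOPEN_FIND_ENFORCE=3 "$@"
}
```

A training run could then be started as `with_miopen_tuning python train.py`, where `train.py` stands in for your own script.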
|
||||
|
||||
### TransferBench package not functional
|
||||
|
||||
TransferBench packages included in the ROCm 6.3.0 release are not compiled properly and are not
|
||||
functional for most GPU targets, with the exception of gfx906. Full functionality will be available
|
||||
in a future ROCm release.
|
||||
|
||||
TransferBench is a utility for benchmarking simultaneous transfers between user-specified devices
(CPUs or GPUs). See the [TransferBench
documentation](https://rocm.docs.amd.com/projects/TransferBench/en/docs-6.3.0/index.html). To use
TransferBench in the meantime, download the properly compiled packages from
[https://github.com/ROCm/TransferBench/releases](https://github.com/ROCm/TransferBench/releases).
|
||||
|
||||
### ROCm Compute Profiler post-upgrade
|
||||
|
||||
In ROCm 6.3.0, the `omniperf` package is now named `rocprofiler-compute`. As a result, running `apt install omniperf` will fail to locate the package.
|
||||
Instead, use `apt install rocprofiler-compute`. See [ROCm Compute Profiler 3.0.0](#rocm-compute-profiler-3-0-0).
|
||||
|
||||
When upgrading from ROCm 6.2 to 6.3, any existing `/opt/rocm-6.2/../omniperf` folders are not
|
||||
automatically removed. To clean up these folders, manually uninstall Omniperf using `apt remove omniperf`.
|
||||
|
||||
### ROCm Systems Profiler post-upgrade
|
||||
|
||||
In ROCm 6.3.0, the `omnitrace` package is now named `rocprofiler-systems`. As a result, running `apt install omnitrace` will fail to locate the package.
|
||||
Instead, use `apt install rocprofiler-systems`. See [ROCm Systems Profiler 0.1.0](#rocm-systems-profiler-0-1-0).
|
||||
|
||||
When upgrading from ROCm 6.2 to 6.3, any existing `/opt/rocm-6.2/../omnitrace` folders are not
|
||||
automatically removed. To clean up these folders, manually uninstall Omnitrace using `apt remove omnitrace`.
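
The two profiler package renames in these post-upgrade notes can be captured in a small lookup, which may be handy in provisioning scripts. This is a sketch only: `renamed_rocm_pkg` and `print_upgrade_steps` are hypothetical helper names, and the second function prints the `apt` commands quoted in the notes rather than running them.

```shell
# Map a pre-6.3 package name to its ROCm 6.3.0 name
# (names taken from the post-upgrade notes).
renamed_rocm_pkg() {
  case "$1" in
    omniperf)  echo "rocprofiler-compute" ;;
    omnitrace) echo "rocprofiler-systems" ;;
    *)         echo "$1" ;;  # packages that were not renamed pass through
  esac
}

# Print (do not execute) the cleanup and install commands for one old package.
print_upgrade_steps() {
  echo "sudo apt remove $1"
  echo "sudo apt install $(renamed_rocm_pkg "$1")"
}
```

Calling `print_upgrade_steps omniperf` shows the removal of the old package followed by the installation of `rocprofiler-compute`, so the commands can be reviewed before anything is changed.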
|
||||
|
### Stale file due to OpenCL ICD loader deprecation

When upgrading from ROCm 6.2.x to ROCm 6.3.0, the [removal of the `rocm-icd-loader`
package](#opencl-icd-loader-separated-from-rocm) leaves a stale file in the old `rocm-6.2.x`
directory. This has no functional impact. As a workaround, manually uninstall the
`rocm-icd-loader` package to remove the stale file. This issue will be addressed in a future ROCm
release.

### ROCm Compute Profiler CTest failure in CI

When running ROCm Compute Profiler's (`rocprof-compute`) CTest in the Azure CI environment, the
`rocprof-compute` execution test fails. The failure has two causes: an outdated test file that was
not renamed (`omniperf` to `rocprof-compute`), and the `ROCM_PATH` environment variable not being
set in the Azure CI environment, which prevents the tool from extracting chip information as expected.
This issue will be addressed in a future ROCm release.

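If you see a similar failure in your own CI, one thing to check is whether `ROCM_PATH` is set before the tests run. The sketch below assumes ROCm is installed under `/opt/rocm` (the usual default prefix, but an assumption here). It does not address the outdated test file and is not a confirmed fix for this issue.

```shell
# Fall back to /opt/rocm (assumed install prefix) when ROCM_PATH is unset or empty,
# then export it so child processes such as test runners can see it.
: "${ROCM_PATH:=/opt/rocm}"
export ROCM_PATH
echo "ROCM_PATH=${ROCM_PATH}"
```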
### MIVisionX memory access fault in Canny edge detection

Canny edge detection kernels might access out-of-bounds memory locations while
computing gradient intensities on edge pixels. This issue is isolated to
Canny-specific use cases on Instinct MI300 series accelerators. This issue is
resolved in the [MIVisionX `develop` branch](https://github.com/ROCm/mivisionx)
and will be part of a future ROCm release.

### Transformer Engine test_distributed_fused_attn aborts with fatal Python error

The `test_distributed_fused_attn` Pytest case for JAX in [Transformer Engine
for ROCm](https://github.com/ROCm/TransformerEngine) fails with a fatal Python
error under certain conditions. The root cause is unrelated to Transformer Engine
and is instead due to an issue within XLA. This XLA issue is under investigation and
will be addressed in a future release.

### AMD SMI manual build issue

Manual builds of AMD SMI fail due to a broken link in its build configuration.
This affects past AMD SMI releases as well. The fix is underway and will be
applied to all branches at [https://github.com/ROCm/amdsmi](https://github.com/ROCm/amdsmi).

### ROCm Data Center Tool incorrect RHEL9 package version

In the versions of ROCm Data Center Tool (RDC) included with ROCm 6.2 for RHEL9, the RDC version
number was incorrectly set to `1.0.0`. ROCm 6.3 includes RDC with the correct version number.

```{important}
If you're using RHEL9, you must first uninstall the existing ROCm 6.2 RDC 1.0.0 package with
`sudo yum remove rdc` before upgrading to the ROCm 6.3 RDC package with `sudo yum install rdc`.
```

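The order of the two `yum` commands in the note above matters, so it can help to script it. The following is a minimal sketch, assuming a POSIX shell on RHEL9. The function name `rdc_upgrade_steps` and the `APPLY` switch are hypothetical, and by default the function only prints the commands rather than running them.

```shell
# Hypothetical helper: list the RHEL9 RDC upgrade steps in the required order.
# Prints the commands by default; set APPLY=1 to actually run them.
rdc_upgrade_steps() {
  for step in "yum remove rdc" "yum install rdc"; do
    if [ "${APPLY:-0}" = "1" ]; then
      # $step is intentionally unquoted so it splits into command words.
      sudo $step
    else
      echo "sudo $step"
    fi
  done
}
```

Calling `rdc_upgrade_steps` previews the steps, and `APPLY=1 rdc_upgrade_steps` runs them.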
### ROCm Validation Suite needs specified configuration file

ROCm Validation Suite might fail on certain platforms if executed without the `-c` option
specifying the configuration file. See [RVS command line
options](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/docs-6.3.0/ug1main.html#command-line-options)
for more information. This issue will be addressed in a future release.

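One way to avoid launching RVS without a configuration file is to wrap the call. This is a minimal sketch under stated assumptions: the wrapper name `run_rvs` is hypothetical, `rvs` is assumed to be on your `PATH`, and the configuration file path is whatever applies to your platform.

```shell
# Hypothetical wrapper: refuse to launch rvs without an explicit, readable config file.
run_rvs() {
  conf="$1"
  if [ -z "$conf" ] || [ ! -r "$conf" ]; then
    echo "usage: run_rvs <path-to-rvs-config>" >&2
    return 2
  fi
  # -c selects the configuration file, as described above.
  rvs -c "$conf"
}
```

The wrapper returns status 2 without calling `rvs` when the argument is missing or unreadable.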
## ROCm resolved issues

The following are previously known issues resolved in this release. For resolved issues related to
individual components, review the [Detailed component changes](#detailed-component-changes).

### Bandwidth limitation in gang and non-gang modes on Instinct MI300A

Fixed an issue where expected target peak non-gang performance (~60 GB/s) and target peak gang
performance (~90 GB/s) were not achieved. Previously, both gang and non-gang performance were
observed to be limited to 45 GB/s. See [issue #3496](https://github.com/ROCm/ROCm/issues/3496) on
GitHub.

11
tools/autotag/templates/resolved_issues/6.3.0.md
Normal file
@@ -0,0 +1,11 @@
## ROCm resolved issues

The following are previously known issues resolved in this release. For resolved issues related to
individual components, review the [Detailed component changes](#detailed-component-changes).

### Bandwidth limitation in gang and non-gang modes on Instinct MI300A

Fixed an issue where expected target peak non-gang performance (~60 GB/s) and target peak gang
performance (~90 GB/s) were not achieved. Previously, both gang and non-gang performance were
observed to be limited to 45 GB/s. See [issue #3496](https://github.com/ROCm/ROCm/issues/3496) on
GitHub.

25
tools/autotag/templates/support/6.3.0.md
Normal file
@@ -0,0 +1,25 @@
## Operating system and hardware support changes

ROCm 6.3.0 adds support for the following operating system and kernel versions:

- Ubuntu 24.04.2 (kernel: 6.8 [GA], 6.11 [HWE])
- Ubuntu 22.04.5 (kernel: 5.15 [GA], 6.8 [HWE])
- RHEL 9.5 (kernel: 5.14.0)
- Oracle Linux 8.10 (kernel: 5.15.0)

See installation instructions at [ROCm installation for
Linux](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.3.0/).

ROCm 6.3.0 marks the end of support (EoS) for:

- Ubuntu 24.04.1
- Ubuntu 22.04.4
- RHEL 9.3
- RHEL 8.9
- Oracle Linux 8.9

Hardware support remains unchanged in this release.

See the [Compatibility
matrix](https://rocm.docs.amd.com/en/docs-6.3.0/compatibility/compatibility-matrix.html)
for more information about operating system and hardware compatibility.

13
tools/autotag/templates/upcoming_changes/6.3.0.md
Normal file
@@ -0,0 +1,13 @@
## ROCm upcoming changes

The following changes to the ROCm software stack are anticipated for future releases.

### AMDGPU wavefront size compiler macro deprecation

The `__AMDGCN_WAVEFRONT_SIZE__` macro will be deprecated in an upcoming
release. It is recommended to remove any use of this macro. For more information, see [AMDGPU
support](https://rocm.docs.amd.com/projects/llvm-project/en/latest/LLVM/clang/html/AMDGPUSupport.html).

### HIPCC Perl scripts deprecation

The HIPCC Perl scripts (`hipcc.pl` and `hipconfig.pl`) will be removed in an upcoming release.