Sync develop branch

2026-01-09 22:58:17 -05:00 · 2024-08-02 11:13:45 -06:00
parent f087dafca2 10f8efa7e8
commit 33ce708926
76 changed files with 4961 additions and 11988 deletions
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -242,6 +242,7 @@ OAMs
 OCP
 OEM
 OFED
 OMM
 OMP
 OMPI
 OMPT
@@ -278,6 +279,7 @@ PyPi
 PyTorch
 Qcycles
 RAII
 RAS
 RCCL
 RDC
 RDMA
@@ -368,6 +370,7 @@ UC
 UCC
 UCX
 UIF
 UMC
 USM
 UTCL
 UTIL
@@ -429,6 +432,7 @@ bfloat
 bilinear
 bitsandbytes
 blit
 bootloader
 boson
 bosons
 buildable
@@ -592,6 +596,7 @@ pragma
 pre
 prebuilt
 precompiled
 preconfigured
 prefetch
 prefetchable
 prefill
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
--- a/RELEASE.md
+++ b/RELEASE.md
--- a/docs/about/compatibility/openmp.md
+++ b/docs/about/compatibility/openmp.md
@@ -1,482 +0,0 @@
 <head>
  <meta charset="UTF-8">
  <meta name="description" content="OpenMP support in ROCm">
  <meta name="keywords" content="OpenMP, LLVM, OpenMP toolchain">
 </head>
 # OpenMP support in ROCm
 ## Introduction
 The ROCm™ installation includes an LLVM-based implementation that fully supports
 the OpenMP 4.5 standard and a subset of OpenMP 5.0, 5.1, and 5.2 standards.
 Fortran, C/C++ compilers, and corresponding runtime libraries are included.
 Along with host APIs, the OpenMP compilers support offloading code and data onto
 GPU devices. This document briefly describes the installation location of the
 OpenMP toolchain, example usage of device offloading, and usage of `rocprof`
 with OpenMP applications. The GPUs supported are the same as those supported by
 this ROCm release. See the list of supported GPUs for {doc}`Linux<rocm-install-on-linux:reference/system-requirements>` and
 {doc}`Windows<rocm-install-on-windows:reference/system-requirements>`.
 The ROCm OpenMP compiler is implemented using LLVM compiler technology.
 The following image illustrates the internal steps taken to translate a user’s application into an executable that can offload computation to the AMDGPU. The compilation is a two-pass process. Pass 1 compiles the application to generate the CPU code and Pass 2 links the CPU code to the AMDGPU device code.
 ![OpenMP toolchain](../../data/reference/openmp/openmp-toolchain.svg "OpenMP toolchain")
 ### Installation
 The OpenMP toolchain is automatically installed as part of the standard ROCm
 installation and is available under `/opt/rocm-{version}/llvm`. The
 sub-directories are:
 * bin: Compilers (`flang` and `clang`) and other binaries.
 * examples: The usage section below shows how to compile and run these programs.
 * include: Header files.
 * lib: Libraries including those required for target offload.
 * lib-debug: Debug versions of the above libraries.
 ## OpenMP: usage
 The example programs can be compiled and run by pointing the environment
 variable `ROCM_PATH` to the ROCm install directory.
 **Example:**
 ```bash
 export ROCM_PATH=/opt/rocm-{version}
 cd $ROCM_PATH/share/openmp-extras/examples/openmp/veccopy
 sudo make run
 ```
 :::{note}
 `sudo` is required since we are building inside the `/opt` directory.
 Alternatively, copy the files to your home directory first.
 :::
 The above invocation of Make compiles and runs the program. Note the options
 that are required for target offload from an OpenMP program:
 ```bash
 -fopenmp --offload-arch=<gpu-arch>
 ```
 :::{note}
 The compiler also accepts the alternative offloading notation:
 ```bash
 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=<gpu-arch>
 ```
 :::
 Obtain the value of `gpu-arch` by running the following command:
 ```bash
 % /opt/rocm-{version}/bin/rocminfo | grep gfx
 ```
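 As an illustration of the kind of source these offload flags apply to, the following is a minimal sketch of a vector copy in an OpenMP target region. The helper name `vec_copy` is hypothetical and is not taken from the shipped `veccopy` example.
 ```cpp
 #include <vector>
 // Hypothetical helper: copies `src` into a new vector inside a target region.
 // When offloading is not enabled, the pragma is ignored and the loop runs on the host.
 std::vector<int> vec_copy(const std::vector<int>& src) {
   const int n = static_cast<int>(src.size());
   std::vector<int> dst(src.size(), 0);
   const int* in = src.data();
   int* out = dst.data();
   // Map the input to the device and copy the output back to the host.
   #pragma omp target teams distribute parallel for map(to: in[0:n]) map(from: out[0:n])
   for (int i = 0; i < n; ++i)
     out[i] = in[i];
   return dst;
 }
 ```
 A file containing this function would be built with the flags shown above, for example `clang++ -O2 -fopenmp --offload-arch=<gpu-arch>`, with `<gpu-arch>` taken from the `rocminfo` output.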
 [//]: # (dated link below, needs updating)
 See the complete list of [compiler command-line references](https://github.com/ROCm/llvm-project/blob/amd-staging/openmp/docs/CommandLineArgumentReference.rst).
 ### Using `rocprof` with OpenMP
 The following steps describe a typical workflow for using `rocprof` with OpenMP
 code compiled with AOMP:
 1. Run `rocprof` with the program command line:
    ```bash
    % rocprof <application> <args>
    ```
    This produces a `results.csv` file in the user’s current directory that
    shows basic stats such as kernel names, grid size, number of registers used,
    etc. The user can choose to specify the preferred output file name using the
    `-o` option.
 2. Add options for a detailed result:
   ```bash
   % rocprof --stats <application> <args>
   ```
   The stats option produces timestamps for the kernels. Look into the output
   CSV file for the field, `DurationNs`, which is useful in getting an
   understanding of the critical kernels in the code.
   Apart from `--stats`, the option `--timestamp on` produces a timestamp for
   the kernels.
 3. After learning about the required kernels, the user can take a detailed look
   at each one of them. `rocprof` has support for hardware counters: a set of
   basic and a set of derived ones. See the complete list of counters using the
   options `--list-basic` and `--list-derived`. `rocprof` accepts either a text or
   an XML file as an input.
 For more details on `rocprof`, refer to the {doc}`ROCProfilerV1 User Manual <rocprofiler:rocprofv1>`.
 ### Using tracing options
 **Prerequisite:** When using the `--sys-trace` option, compile the OpenMP
 program with:
 ```bash
    -Wl,-rpath,/opt/rocm-{version}/lib -lamdhip64
 ```
 The following tracing options are widely used to generate useful information:
 * **`--hsa-trace`**: This option is used to get a JSON output file with the HSA
  API execution traces and a flat profile in a CSV file.
 * **`--sys-trace`**: This allows programmers to trace both HIP and HSA calls.
  Since this option results in loading ``libamdhip64.so``, follow the
  prerequisite as mentioned above.
 A CSV and a JSON file are produced by the above trace options. The CSV file
 presents the data in a tabular format, and the JSON file can be visualized using
 Google Chrome at chrome://tracing/ or [Perfetto](https://perfetto.dev/).
 Navigate to Chrome or Perfetto and load the JSON file to see the timeline of the
 HSA calls.
 For more details on tracing, refer to the {doc}`ROCProfilerV1 User Manual <rocprofiler:rocprofv1>`.
 ### Environment variables
 :::{table}
 :widths: auto
 | Environment Variable        | Purpose                  |
 | --------------------------- | ---------------------------- |
 | `OMP_NUM_TEAMS`             | To set the number of teams for kernel launch, which is otherwise chosen by the implementation by default. You can set this number (subject to implementation limits) for performance tuning. |
 | `LIBOMPTARGET_KERNEL_TRACE` | To print useful statistics for device operations. Setting it to 1 and running the program emits the name of every kernel launched, the number of teams and threads used, and the corresponding register usage. Setting it to 2 additionally emits timing information for kernel launches and data transfer operations between the host and the device. |
 | `LIBOMPTARGET_INFO`         | To print informational messages from the device runtime as the program executes. Setting it to a value of 1 or higher prints fine-grain information, and setting it to -1 prints complete information. |
 | `LIBOMPTARGET_DEBUG`        | To get detailed debugging information about data transfer operations and kernel launch when using a debug version of the device library. Set this environment variable to 1 to get the detailed information from the library. |
 | `GPU_MAX_HW_QUEUES`         | To set the number of HSA queues in the OpenMP runtime. The HSA queues are created on demand up to the maximum value as supplied here. The queue creation starts with a single initialized queue to avoid unnecessary allocation of resources. The provided value is capped if it exceeds the recommended, device-specific value. |
 | `LIBOMPTARGET_AMDGPU_MAX_ASYNC_COPY_BYTES` | To set the threshold size up to which data transfers are initiated asynchronously. The default threshold size is `1*1024*1024` bytes (1 MB). |
 | `OMPX_FORCE_SYNC_REGIONS` | To force the runtime to execute all operations synchronously, i.e., wait for an operation to complete immediately. This affects data transfers and kernel execution. While it is mainly designed for debugging, it may have a minor positive effect on performance in certain situations. |
 :::
 ## OpenMP: features
 The OpenMP programming model is greatly enhanced with the following new features
 implemented in the past releases.
 (openmp_usm)=
 ### Asynchronous behavior in OpenMP target regions
 * Controlling Asynchronous Behavior
 The OpenMP offloading runtime executes in an asynchronous fashion by default, allowing multiple data transfers to start concurrently. However, if the data to be transferred is larger than the default threshold of 1 MB, the runtime falls back to a synchronous data transfer. Transfers involving buffers that are already locked always execute asynchronously.
 You can overrule this default behavior by setting `LIBOMPTARGET_AMDGPU_MAX_ASYNC_COPY_BYTES` and `OMPX_FORCE_SYNC_REGIONS`. See the [Environment Variables](#environment-variables) table for details.
 * Multithreaded Offloading on the Same Device
 The `libomptarget` plugin for GPU offloading allows creation of separate configurable HSA queues per chiplet, which enables two or more threads to concurrently offload to the same device.
 * Parallel Memory Copy Invocations
 Implicit asynchronous execution of single target region enables parallel memory copy invocations.
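 The multithreaded offloading described above can be sketched as follows: each host thread offloads its own chunk of an array to the same device. This is an illustrative sketch only; the function name `scale_chunks` and the chunking scheme are assumptions, and whether the offloads actually overlap depends on the runtime and the queues available.
 ```cpp
 #include <vector>
 // Hypothetical helper: scales `data` in place, one target region per host thread.
 // Without OpenMP enabled, the pragmas are ignored and the chunks run one after another.
 void scale_chunks(std::vector<double>& data, double factor, int chunks) {
   if (chunks < 1) chunks = 1;
   double* p = data.data();
   const long n = static_cast<long>(data.size());
   // Each host thread handles one chunk and issues its own offload.
   #pragma omp parallel for num_threads(chunks)
   for (int c = 0; c < chunks; ++c) {
     const long begin = n * c / chunks;
     const long len = n * (c + 1) / chunks - begin;
     // Map only this thread's slice of the array to and from the device.
     #pragma omp target teams distribute parallel for map(tofrom: p[begin:len])
     for (long i = begin; i < begin + len; ++i)
       p[i] *= factor;
   }
 }
 ```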
 ### Unified shared memory
 Unified Shared Memory (USM) provides a pointer-based approach to memory
 management. To implement USM, fulfill the following system requirements along
 with Xnack capability.
 #### Prerequisites
 * Linux Kernel versions above 5.14
 * Latest KFD driver packaged in ROCm stack
 * Xnack, as USM support can only be tested with applications compiled with Xnack
  capability
 #### Xnack capability
 When enabled, Xnack capability allows GPU threads to access CPU (system) memory,
 allocated with OS-allocators, such as `malloc`, `new`, and `mmap`. Xnack must be
 enabled both at compile- and run-time. To enable Xnack support at compile-time,
 use:
 ```bash
 --offload-arch=gfx908:xnack+
 ```
 Or use another functionally equivalent option Xnack-any:
 ```bash
 --offload-arch=gfx908
 ```
 To enable Xnack functionality at runtime on a per-application basis,
 use environment variable:
 ```bash
 HSA_XNACK=1
 ```
 When Xnack support is not needed:
 * Build the applications to maximize resource utilization using:
 ```bash
 --offload-arch=gfx908:xnack-
 ```
 * At runtime, set the `HSA_XNACK` environment variable to 0.
 #### Unified shared memory pragma
 This OpenMP pragma is available on MI200 through `xnack+` support.
 ```bash
 #pragma omp requires unified_shared_memory
 ```
 As stated in the OpenMP specifications, this pragma makes the map clause on
 target constructs optional. By default, on MI200, all memory allocated on the
 host is fine grain. Using the map clause on a target construct is allowed, which
 transforms the access semantics of the associated memory to coarse grain.
 A simple program demonstrating the use of this feature is:
 ```bash
 $ cat parallel_for.cpp
 #include <stdlib.h>
 #include <stdio.h>
 #define N 64
 #pragma omp requires unified_shared_memory
 int main() {
  int n = N;
  int *a = new int[n];
  int *b = new int[n];
  for(int i = 0; i < n; i++)
    b[i] = i;
  #pragma omp target parallel for map(to:b[:n])
  for(int i = 0; i < n; i++)
    a[i] = b[i];
  for(int i = 0; i < n; i++)
    if(a[i] != i)
       printf("error at %d: expected %d, got %d\n", i, i, a[i]);
  return 0;
 }
 $ clang++ -O2 -target x86_64-pc-linux-gnu -fopenmp --offload-arch=gfx90a:xnack+ parallel_for.cpp
 $ HSA_XNACK=1 ./a.out
 ```
 In the above code example, pointer “a” is not mapped in the target region, while
 pointer “b” is. Both are valid pointers on the GPU device and passed by-value to
 the kernel implementing the target region. This means the pointer values on the
 host and the device are the same.
 The difference between the memory pages pointed to by these two variables is
 that the pages pointed to by “a” are in fine-grain memory, while the pages pointed
 to by “b” are in coarse-grain memory during and after the execution of the
 target region. This is accomplished in the OpenMP runtime library with calls to
 the ROCr runtime to set the pages pointed to by “b” as coarse grain.
 ### OMPT target support
 The OpenMP runtime in ROCm implements a subset of the OMPT device APIs, as
 described in the OpenMP specification document. These APIs allow first-party
 tools to examine the profile and kernel traces that execute on a device. A tool
 can register callbacks for data transfer and kernel dispatch entry points or use
 APIs to start and stop tracing for device-related activities such as data
 transfer and kernel dispatch timings and associated metadata. If device tracing
 is enabled, trace records for device activities are collected during program
 execution and returned to the tool using the APIs described in the
 specification.
 The following example demonstrates how a tool uses the supported OMPT target
 APIs. The `README` in `/opt/rocm/llvm/examples/tools/ompt` outlines the steps to
 be followed, and the provided example can be run as shown below:
 ```bash
 cd $ROCM_PATH/share/openmp-extras/examples/tools/ompt/veccopy-ompt-target-tracing
 sudo make run
 ```
 The file `veccopy-ompt-target-tracing.c` simulates how a tool initiates device
 activity tracing. The file `callbacks.h` shows the callbacks registered and
 implemented by the tool.
 ### Floating point atomic operations
 The MI200-series GPUs support the generation of hardware floating-point atomics
 using the OpenMP atomic pragma. The support includes single- and
 double-precision floating-point atomic operations. The programmer must ensure
 that the memory subjected to the atomic operation is in coarse-grain memory by
 mapping it explicitly with the help of map clauses when not implicitly mapped by
 the compiler as per the [OpenMP
 specifications](https://www.openmp.org/specifications/). This makes these
 hardware floating-point atomic instructions “fast,” as they are faster than
 using a default compare-and-swap loop scheme, but at the same time “unsafe,” as
 they are not supported on fine-grain memory. The operation in
 `unified_shared_memory` mode also requires programmers to map the memory
 explicitly when not implicitly mapped by the compiler.
 To request fast floating-point atomic instructions at the file level, use
 compiler flag `-munsafe-fp-atomics` or a hint clause on a specific pragma:
 ```bash
 double a = 0.0;
 #pragma omp atomic hint(AMD_fast_fp_atomics)
 a = a + 1.0;
 ```
 :::{note}
 `AMD_unsafe_fp_atomics` is an alias for `AMD_fast_fp_atomics`, and
 `AMD_safe_fp_atomics` is implemented with a compare-and-swap loop.
 :::
 To disable the generation of fast floating-point atomic instructions at the file
 level, build using the option `-msafe-fp-atomics` or use a hint clause on a
 specific pragma:
 ```bash
 double a = 0.0;
 #pragma omp atomic hint(AMD_safe_fp_atomics)
 a = a + 1.0;
 ```
 The hint clause value always has a precedence over the compiler flag, which
 allows programmers to create atomic constructs with a different behavior than
 the rest of the file.
 See the example below, where the user builds the program using
 `-msafe-fp-atomics` to select a file-wide “safe atomic” compilation. However,
 the fast atomics hint clause over variable “a” takes precedence and operates on
 “a” using a fast/unsafe floating-point atomic, while the variable “b” in the
 absence of a hint clause is operated upon using safe floating-point atomics as
 per the compiler flag.
 ```bash
 double a = 0.0;
 #pragma omp atomic hint(AMD_fast_fp_atomics)
 a = a + 1.0;
 double b = 0.0;
 #pragma omp atomic
 b = b + 1.0;
 ```
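 To show the explicit mapping requirement in context, the following sketch accumulates into a `double` that is mapped with a `map` clause before being updated atomically in a target region. The function name `atomic_sum` is hypothetical. The AMD-specific hint clause is left out so that the code compiles with any OpenMP compiler; whether hardware floating-point atomics are generated depends on the GPU and on flags such as `-munsafe-fp-atomics`.
 ```cpp
 #include <vector>
 // Hypothetical helper: sums `values` using an atomic update on the device.
 double atomic_sum(const std::vector<double>& values) {
   const int n = static_cast<int>(values.size());
   const double* v = values.data();
   double sum = 0.0;
   // `sum` is mapped explicitly, as the text above requires for fast atomics.
   #pragma omp target teams distribute parallel for map(to: v[0:n]) map(tofrom: sum)
   for (int i = 0; i < n; ++i) {
     #pragma omp atomic
     sum += v[i];
   }
   return sum;
 }
 ```
 A reduction clause is usually the better choice for a plain sum; the atomic form is used here only to illustrate the mapping rule.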
 ### AddressSanitizer tool
 AddressSanitizer (ASan) is a memory error detector tool utilized by applications to
 detect various errors ranging from spatial issues such as out-of-bound access to
 temporal issues such as use-after-free. The AOMP compiler supports ASan for AMD
 GPUs with applications written in both HIP and OpenMP.
 **Features supported on host platform (Target x86_64):**
 * Use-after-free
 * Buffer overflows
 * Heap buffer overflow
 * Stack buffer overflow
 * Global buffer overflow
 * Use-after-return
 * Use-after-scope
 * Initialization order bugs
 **Features supported on AMDGPU platform (`amdgcn-amd-amdhsa`):**
 * Heap buffer overflow
 * Global buffer overflow
 **Software (kernel/OS) requirements:** Unified Shared Memory support with Xnack
 capability. See the section on [Unified Shared Memory](#unified-shared-memory)
 for prerequisites and details on Xnack.
 **Example:**
 * Heap buffer overflow
 ```bash
 int main() {
 .......  // Some program statements
 .......  // Some program statements
 #pragma omp target map(to : A[0:N], B[0:N]) map(from: C[0:N])
 {
 #pragma omp parallel for
    for(int i =0 ; i < N; i++){
    C[i+10] = A[i] + B[i];
  }   // end of for loop
 }
 .......   // Some program statements
 }// end of main
 ```
 See the complete sample code for heap buffer overflow
 [here](https://github.com/ROCm/aomp/blob/aomp-dev/examples/tools/asan/heap_buffer_overflow/openmp/vecadd-HBO.cpp).
 * Global buffer overflow
 ```bash
 #pragma omp declare target
   int A[N],B[N],C[N];
 #pragma omp end declare target
 int main(){
 ......  // some program statements
 ......  // some program statements
 #pragma omp target data map(to:A[0:N],B[0:N]) map(from: C[0:N])
 {
 #pragma omp target update to(A,B)
 #pragma omp target parallel for
 for(int i=0; i<N; i++){
    C[i]=A[i*100]+B[i+22];
 } // end of for loop
 #pragma omp target update from(C)
 }
 ........  // some program statements
 } // end of main
 ```
 See the complete sample code for global buffer overflow
 [here](https://github.com/ROCm/aomp/blob/aomp-dev/examples/tools/asan/global_buffer_overflow/openmp/vecadd-GBO.cpp).
 ### Clang compiler option for kernel optimization
 You can use the clang compiler option `-fopenmp-target-fast` for kernel optimization if certain constraints implied by its component options are satisfied. `-fopenmp-target-fast` enables the following options:
 * `-fopenmp-target-ignore-env-vars`: It enables code generation of specialized kernels including no-loop and cross-team reductions.
 * `-fopenmp-assume-no-thread-state`: It enables the compiler to assume that no thread in a parallel region modifies an Internal Control Variable (`ICV`), thus potentially reducing the device runtime code execution.
 * `-fopenmp-assume-no-nested-parallelism`: It enables the compiler to assume that no thread in a parallel region encounters a parallel region, thus potentially reducing the device runtime code execution.
 * `-O3` if no `-O*` is specified by the user.
 ### Specialized kernels
 Clang will attempt to generate specialized kernels based on compiler options and OpenMP constructs. The following specialized kernels are supported:
 * No-loop
 * Big-jump-loop
 * Cross-team reductions
 To enable the generation of specialized kernels, follow these guidelines:
 * Do not specify teams, threads, and schedule-related environment variables. The `num_teams` clause in an OpenMP target construct acts as an override and prevents the generation of the no-loop kernel. If the specification of `num_teams` clause is a user requirement then clang tries to generate the big-jump-loop kernel instead of the no-loop kernel.
 * Assert the absence of the teams, threads, and schedule-related environment variables by adding the command-line option `-fopenmp-target-ignore-env-vars`.
 * To automatically enable the specialized kernel generation, use `-Ofast` or `-fopenmp-target-fast` for compilation.
 * To disable specialized kernel generation, use `-fno-openmp-target-ignore-env-vars`.
 #### No-loop kernel generation
 The no-loop kernel generation feature improves the performance of the generated code by emitting a specialized kernel for certain OpenMP target constructs, such as `target teams distribute parallel for`. The specialized kernel assumes that every thread executes a single iteration of the user loop, so the runtime launches a total number of GPU threads equal to or greater than the iteration space size of the target region loop. This allows the compiler to generate code for the loop body without an enclosing loop, resulting in reduced control-flow complexity and potentially better performance.
 #### Big-jump-loop kernel generation
 A no-loop kernel is not generated if the OpenMP teams construct uses a `num_teams` clause. Instead, the compiler attempts to generate a different specialized kernel called the big-jump-loop kernel. The kernel is launched with a grid size determined by the number of teams specified by the OpenMP `num_teams` clause and the `blocksize` chosen either by the compiler or specified by the corresponding OpenMP clause.
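 The difference between the two loop forms can be sketched with a pair of otherwise identical functions. The names `saxpy_default` and `saxpy_fixed_teams` and the team count of 64 are illustrative assumptions. Which kernel clang actually generates depends on the compiler options described above.
 ```cpp
 #include <algorithm>
 #include <cstddef>
 #include <vector>
 // No `num_teams` clause: per the text, a candidate for the no-loop kernel.
 void saxpy_default(float a, const std::vector<float>& x, std::vector<float>& y) {
   const int n = static_cast<int>(std::min(x.size(), y.size()));
   const float* xp = x.data();
   float* yp = y.data();
   #pragma omp target teams distribute parallel for map(to: xp[0:n]) map(tofrom: yp[0:n])
   for (int i = 0; i < n; ++i)
     yp[i] = a * xp[i] + yp[i];
 }
 // Explicit `num_teams` clause: per the text, a candidate for the big-jump-loop kernel.
 void saxpy_fixed_teams(float a, const std::vector<float>& x, std::vector<float>& y) {
   const int n = static_cast<int>(std::min(x.size(), y.size()));
   const float* xp = x.data();
   float* yp = y.data();
   #pragma omp target teams distribute parallel for num_teams(64) map(to: xp[0:n]) map(tofrom: yp[0:n])
   for (int i = 0; i < n; ++i)
     yp[i] = a * xp[i] + yp[i];
 }
 ```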
 #### Cross-team optimized reduction kernel generation
 If the OpenMP construct has a reduction clause, the compiler attempts to generate optimized code by utilizing efficient cross-team communication. New APIs for cross-team reduction are implemented in the device runtime and are automatically generated by clang.
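 A loop of the following shape carries a reduction clause and is therefore the kind of construct this optimization targets. This is a minimal sketch; the function name `dot` is hypothetical, and whether the cross-team reduction kernel is actually selected depends on the compiler options in use.
 ```cpp
 #include <algorithm>
 #include <cstddef>
 #include <vector>
 // Hypothetical helper: dot product computed with a reduction in a target region.
 double dot(const std::vector<double>& x, const std::vector<double>& y) {
   const int n = static_cast<int>(std::min(x.size(), y.size()));
   const double* xp = x.data();
   const double* yp = y.data();
   double sum = 0.0;
   // Partial sums from all teams are combined into `sum` by the reduction clause.
   #pragma omp target teams distribute parallel for reduction(+:sum) map(to: xp[0:n], yp[0:n]) map(tofrom: sum)
   for (int i = 0; i < n; ++i)
     sum += xp[i] * yp[i];
   return sum;
 }
 ```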
--- a/docs/about/license.md
+++ b/docs/about/license.md
@@ -25,66 +25,69 @@ additional licenses. Please review individual repositories for more information.
 <!-- spellcheck-disable -->
 | Component | License |
 |:---------------------|:-------------------------|
 | [HIP](https://github.com/ROCm/HIP/) | [MIT](https://github.com/ROCm/HIP/blob/develop/LICENSE.txt) |
 | [HIPCC](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/hipcc) | [MIT](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/hipcc/LICENSE.txt) |
 | [HIPIFY](https://github.com/ROCm/HIPIFY/) | [MIT](https://github.com/ROCm/HIPIFY/blob/amd-staging/LICENSE.txt) |
 | [AMDMIGraphX](https://github.com/ROCm/AMDMIGraphX/) | [MIT](https://github.com/ROCm/AMDMIGraphX/blob/develop/LICENSE) |
 | [MIOpen](https://github.com/ROCm/MIOpen/) | [MIT](https://github.com/ROCm/MIOpen/blob/develop/LICENSE.txt) |
 | [MIVisionX](https://github.com/ROCm/MIVisionX/) | [MIT](https://github.com/ROCm/MIVisionX/blob/develop/LICENSE.txt) |
 | [AMD Common Language Runtime (CLR)](https://github.com/ROCm/clr) | [MIT](https://github.com/ROCm/clr/blob/develop/LICENCE) |
-| [ROCm-Core](https://github.com/ROCm/rocm-core) | [MIT](https://github.com/ROCm/rocm-core/blob/master/copyright) |
+| [AMD SMI](https://github.com/ROCm/amdsmi) | [MIT](https://github.com/ROCm/amdsmi/blob/develop/LICENSE) |
 | [hipamd](https://github.com/ROCm/clr/tree/develop/hipamd) | [MIT](https://github.com/ROCm/clr/blob/develop/hipamd/LICENSE.txt) |
 | [ROCm-OpenCL-Runtime](https://github.com/ROCm/clr/tree/develop/opencl) | [MIT](https://github.com/ROCm/clr/blob/develop/opencl/LICENSE.txt) |
 | [Tensile](https://github.com/ROCm/Tensile/) | [MIT](https://github.com/ROCm/Tensile/blob/develop/LICENSE.md) |
 | [aomp](https://github.com/ROCm/aomp/) | [Apache 2.0](https://github.com/ROCm/aomp/blob/aomp-dev/LICENSE) |
 | [aomp-extras](https://github.com/ROCm/aomp-extras/) | [MIT](https://github.com/ROCm/aomp-extras/blob/aomp-dev/LICENSE) |
 | [llvm-project](https://github.com/ROCm/llvm-project/) | [Apache](https://github.com/ROCm/llvm-project/blob/amd-staging/LICENSE.TXT) |
 | [llvm-project/flang](https://github.com/ROCm/llvm-project/tree/amd-staging/flang) | [Apache 2.0](https://github.com/ROCm/llvm-project/blob/amd-staging/flang/LICENSE.TXT) |
 | [Code Object Manager (Comgr)](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/comgr) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/comgr/LICENSE.txt) |
 | [ROCm-Device-Libs](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/LICENSE.TXT) |
 | [clang-ocl](https://github.com/ROCm/clang-ocl/) | [MIT](https://github.com/ROCm/clang-ocl/blob/master/LICENSE) |
 | [ROCK-Kernel-Driver](https://github.com/ROCm/ROCK-Kernel-Driver/) | [GPL 2.0 WITH Linux-syscall-note](https://github.com/ROCm/ROCK-Kernel-Driver/blob/master/COPYING) |
 | [ROCT-Thunk-Interface](https://github.com/ROCm/ROCT-Thunk-Interface/) | [MIT](https://github.com/ROCm/ROCT-Thunk-Interface/blob/master/LICENSE.md) |
 | [ROCR-Runtime](https://github.com/ROCm/ROCR-Runtime/) | [The University of Illinois/NCSA](https://github.com/ROCm/ROCR-Runtime/blob/master/LICENSE.txt) |
 | [ROCR Debug Agent](https://github.com/ROCm/rocr_debug_agent/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocr_debug_agent/blob/amd-staging/LICENSE.txt) |
 | [Composable Kernel](https://github.com/ROCm/composable_kernel) | [MIT](https://github.com/ROCm/composable_kernel/blob/develop/LICENSE) |
 | [half](https://github.com/ROCm/half/) | [MIT](https://github.com/ROCm/half/blob/rocm/LICENSE.txt) |
 | [HIP](https://github.com/ROCm/HIP/) | [MIT](https://github.com/ROCm/HIP/blob/develop/LICENSE.txt) |
 | [hipamd](https://github.com/ROCm/clr/tree/develop/hipamd) | [MIT](https://github.com/ROCm/clr/blob/develop/hipamd/LICENSE.txt) |
 | [hipBLAS](https://github.com/ROCm/hipBLAS/) | [MIT](https://github.com/ROCm/hipBLAS/blob/develop/LICENSE.md) |
 | [hipBLASLt](https://github.com/ROCm/hipBLASLt/) | [MIT](https://github.com/ROCm/hipBLASLt/blob/develop/LICENSE.md) |
 | [HIPCC](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/hipcc) | [MIT](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/hipcc/LICENSE.txt) |
 | [hipCUB](https://github.com/ROCm/hipCUB/) | [Custom](https://github.com/ROCm/hipCUB/blob/develop/LICENSE.txt) |
 | [hipFFT](https://github.com/ROCm/hipFFT/) | [MIT](https://github.com/ROCm/hipFFT/blob/develop/LICENSE.md) |
-| [hipFORT](https://github.com/ROCm/hipfort/) | [MIT](https://github.com/ROCm/hipfort/blob/develop/LICENSE) |
+| [hipfort](https://github.com/ROCm/hipfort/) | [MIT](https://github.com/ROCm/hipfort/blob/develop/LICENSE) |
 | [HIPIFY](https://github.com/ROCm/HIPIFY/) | [MIT](https://github.com/ROCm/HIPIFY/blob/amd-staging/LICENSE.txt) |
 | [hipRAND](https://github.com/ROCm/hipRAND/) | [MIT](https://github.com/ROCm/hipRAND/blob/develop/LICENSE.txt) |
 | [hipSOLVER](https://github.com/ROCm/hipSOLVER/) | [MIT](https://github.com/ROCm/hipSOLVER/blob/develop/LICENSE.md) |
 | [hipSPARSE](https://github.com/ROCm/hipSPARSE/) | [MIT](https://github.com/ROCm/hipSPARSE/blob/develop/LICENSE.md) |
 | [hipSPARSELt](https://github.com/ROCm/hipSPARSELt/) | [MIT](https://github.com/ROCm/hipSPARSELt/blob/develop/LICENSE.md) |
 | [hipTensor](https://github.com/ROCm/hipTensor) | [MIT](https://github.com/ROCm/hipTensor/blob/develop/LICENSE) |
 | hsa-amd-aqlprofile | [AMD Software EULA](https://www.amd.com/en/legal/eula/amd-software-eula.html) |
 | [llvm-project](https://github.com/ROCm/llvm-project/) | [Apache](https://github.com/ROCm/llvm-project/blob/amd-staging/LICENSE.TXT) |
 | [llvm-project/flang](https://github.com/ROCm/llvm-project/tree/amd-staging/flang) | [Apache 2.0](https://github.com/ROCm/llvm-project/blob/amd-staging/flang/LICENSE.TXT) |
 | [MIGraphX](https://github.com/ROCm/AMDMIGraphX/) | [MIT](https://github.com/ROCm/AMDMIGraphX/blob/develop/LICENSE) |
 | [MIOpen](https://github.com/ROCm/MIOpen/) | [MIT](https://github.com/ROCm/MIOpen/blob/develop/LICENSE.txt) |
 | [MIVisionX](https://github.com/ROCm/MIVisionX/) | [MIT](https://github.com/ROCm/MIVisionX/blob/develop/LICENSE.txt) |
 | [Omniperf](https://github.com/ROCm/omniperf) | [MIT](https://github.com/ROCm/omniperf/blob/main/LICENSE) |
 | [Omnitrace](https://github.com/ROCm/omnitrace) | [MIT](https://github.com/ROCm/omnitrace/blob/main/LICENSE) |
 | [rocAL](https://github.com/ROCm/rocAL) | [MIT](https://github.com/ROCm/rocAL/blob/develop/LICENSE.txt) |
 | [rocALUTION](https://github.com/ROCm/rocALUTION/) | [MIT](https://github.com/ROCm/rocALUTION/blob/develop/LICENSE.md) |
 | [rocBLAS](https://github.com/ROCm/rocBLAS/) | [MIT](https://github.com/ROCm/rocBLAS/blob/develop/LICENSE.md) |
 | [ROCdbgapi](https://github.com/ROCm/ROCdbgapi/) | [MIT](https://github.com/ROCm/ROCdbgapi/blob/amd-staging/LICENSE.txt) |
 | [rocDecode](https://github.com/ROCm/rocDecode) | [MIT](https://github.com/ROCm/rocDecode/blob/develop/LICENSE) |
 | [rocFFT](https://github.com/ROCm/rocFFT/) | [MIT](https://github.com/ROCm/rocFFT/blob/develop/LICENSE.md) |
-| [rocPRIM](https://github.com/ROCm/rocPRIM/) | [MIT](https://github.com/ROCm/rocPRIM/blob/develop/LICENSE.txt) |
+| [ROCgdb](https://github.com/ROCm/ROCgdb/) | [GNU General Public License v2.0](https://github.com/ROCm/ROCgdb/blob/amd-master/COPYING) |
 | [ROCK-Kernel-Driver](https://github.com/ROCm/ROCK-Kernel-Driver/) | [GPL 2.0 WITH Linux-syscall-note](https://github.com/ROCm/ROCK-Kernel-Driver/blob/master/COPYING) |
 | [rocminfo](https://github.com/ROCm/rocminfo/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocminfo/blob/amd-staging/License.txt) |
 | [ROCm Bandwidth Test](https://github.com/ROCm/rocm_bandwidth_test/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocm_bandwidth_test/blob/master/LICENSE.txt) |
 | [ROCm CMake](https://github.com/ROCm/rocm-cmake/) | [MIT](https://github.com/ROCm/rocm-cmake/blob/develop/LICENSE) |
 | [ROCm Communication Collectives Library (RCCL)](https://github.com/ROCm/rccl/) | [Custom](https://github.com/ROCm/rccl/blob/develop/LICENSE.txt) |
 | [ROCm-Core](https://github.com/ROCm/rocm-core) | [MIT](https://github.com/ROCm/rocm-core/blob/master/copyright) |
 | [ROCm Data Center (RDC)](https://github.com/ROCm/rdc/) | [MIT](https://github.com/ROCm/rdc/blob/develop/LICENSE) |
 | [ROCm-Device-Libs](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/LICENSE.TXT) |
 | [ROCm-OpenCL-Runtime](https://github.com/ROCm/clr/tree/develop/opencl) | [MIT](https://github.com/ROCm/clr/blob/develop/opencl/LICENSE.txt) |
 | [ROCm Performance Primitives (RPP)](https://github.com/ROCm/rpp) | [MIT](https://github.com/ROCm/rpp/blob/develop/LICENSE) |
 | [ROCm SMI Lib](https://github.com/ROCm/rocm_smi_lib/) | [MIT](https://github.com/ROCm/rocm_smi_lib/blob/develop/License.txt) |
 | [ROCm Validation Suite](https://github.com/ROCm/ROCmValidationSuite/) | [MIT](https://github.com/ROCm/ROCmValidationSuite/blob/master/LICENSE) |
 | [rocPRIM](https://github.com/ROCm/rocPRIM/) | [MIT](https://github.com/ROCm/rocPRIM/blob/develop/LICENSE.txt) |
 | [ROCProfiler](https://github.com/ROCm/rocprofiler/) | [MIT](https://github.com/ROCm/rocprofiler/blob/amd-master/LICENSE) |
 | [ROCprofiler-SDK](https://github.com/ROCm/rocprofiler-sdk) | [MIT](https://github.com/ROCm/rocprofiler-sdk/blob/amd-mainline/LICENSE) |
 | [rocPyDecode](https://github.com/ROCm/rocPyDecode) | [MIT](https://github.com/ROCm/rocPyDecode/blob/develop/LICENSE) |
 | [rocRAND](https://github.com/ROCm/rocRAND/) | [MIT](https://github.com/ROCm/rocRAND/blob/develop/LICENSE.txt) |
 | [ROCr Debug Agent](https://github.com/ROCm/rocr_debug_agent/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocr_debug_agent/blob/amd-staging/LICENSE.txt) |
 | [ROCR-Runtime](https://github.com/ROCm/ROCR-Runtime/) | [The University of Illinois/NCSA](https://github.com/ROCm/ROCR-Runtime/blob/master/LICENSE.txt) |
 | [rocSOLVER](https://github.com/ROCm/rocSOLVER/) | [BSD-2-Clause](https://github.com/ROCm/rocSOLVER/blob/develop/LICENSE.md) |
 | [rocSPARSE](https://github.com/ROCm/rocSPARSE/) | [MIT](https://github.com/ROCm/rocSPARSE/blob/develop/LICENSE.md) |
 | [rocThrust](https://github.com/ROCm/rocThrust/) | [Apache 2.0](https://github.com/ROCm/rocThrust/blob/develop/LICENSE) |
 | [rocWMMA](https://github.com/ROCm/rocWMMA/) | [MIT](https://github.com/ROCm/rocWMMA/blob/develop/LICENSE.md) |
 | [ROCm Communication Collectives Library (RCCL)](https://github.com/ROCm/rccl/) | [Custom](https://github.com/ROCm/rccl/blob/develop/LICENSE.txt) |
 | [ROCm Data Center (RDC)](https://github.com/ROCm/rdc/) | [MIT](https://github.com/ROCm/rdc/blob/develop/LICENSE) |
 | [ROCm CMake](https://github.com/ROCm/rocm-cmake/) | [MIT](https://github.com/ROCm/rocm-cmake/blob/develop/LICENSE) |
 | [ROCdbgapi](https://github.com/ROCm/ROCdbgapi/) | [MIT](https://github.com/ROCm/ROCdbgapi/blob/amd-staging/LICENSE.txt) |
 | [ROCgdb](https://github.com/ROCm/ROCgdb/) | [GNU General Public License v2.0](https://github.com/ROCm/ROCgdb/blob/amd-master/COPYING) |
 | [ROCm SMI Lib](https://github.com/ROCm/rocm_smi_lib/) | [MIT](https://github.com/ROCm/rocm_smi_lib/blob/develop/License.txt) |
 | [AMD SMI](https://github.com/ROCm/amdsmi) | [MIT](https://github.com/ROCm/amdsmi/blob/develop/LICENSE) |
 | [rocminfo](https://github.com/ROCm/rocminfo/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocminfo/blob/amd-staging/License.txt) |
 | [ROCProfiler](https://github.com/ROCm/rocprofiler/) | [MIT](https://github.com/ROCm/rocprofiler/blob/amd-master/LICENSE) |
 | [ROCTracer](https://github.com/ROCm/roctracer/) | [MIT](https://github.com/ROCm/roctracer/blob/amd-master/LICENSE) |
-| [ROCm Bandwidth Test](https://github.com/ROCm/rocm_bandwidth_test/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocm_bandwidth_test/blob/master/LICENSE.txt) |
+| [ROCT-Thunk-Interface](https://github.com/ROCm/ROCT-Thunk-Interface/) | [MIT](https://github.com/ROCm/ROCT-Thunk-Interface/blob/master/LICENSE.md) |
 | [rocWMMA](https://github.com/ROCm/rocWMMA/) | [MIT](https://github.com/ROCm/rocWMMA/blob/develop/LICENSE.md) |
 | [Tensile](https://github.com/ROCm/Tensile/) | [MIT](https://github.com/ROCm/Tensile/blob/develop/LICENSE.md) |
 | [TransferBench](https://github.com/ROCm/TransferBench) | [MIT](https://github.com/ROCm/TransferBench/blob/develop/LICENSE.md) |
 | [ROCmValidationSuite](https://github.com/ROCm/ROCmValidationSuite/) | [MIT](https://github.com/ROCm/ROCmValidationSuite/blob/master/LICENSE) |
 | hsa-amd-aqlprofile | [AMD Software EULA](https://www.amd.com/en/legal/eula/amd-software-eula.html) |
 Open sourced ROCm components are released via public GitHub
 repositories, packages on [https://repo.radeon.com](https://repo.radeon.com) and other distribution channels.
--- a/docs/compatibility/compatibility-matrix.rst
+++ b/docs/compatibility/compatibility-matrix.rst
@@ -8,121 +8,169 @@ Compatibility matrix
 Use this matrix to view the ROCm compatibility across successive major and minor releases.
 You can also refer to the :ref:`past versions of ROCm compatibility matrix<past-rocm-compatibility-matrix>`.
 .. container:: format-big-table
  .. csv-table:: 
-      :header: "ROCm Version", "6.1.0", "6.0.0"
+      :header: "ROCm Version", "6.2.0", "6.1.2", "6.0.0"
      :stub-columns: 1
-      :doc:`Operating Systems <rocm-install-on-linux:reference/system-requirements>`, "Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3"
+      :doc:`Operating Systems <rocm-install-on-linux:reference/system-requirements>`, "Ubuntu 24.04","",""
-      ,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
+      ,"Ubuntu 22.04.5 [#Ubuntu220405]_, 22.04.4","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3"
-      ,"RHEL 9.4 [#red-hat94]_, 9.3, 9.2","RHEL 9.3, 9.2"
+      ,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
-      ,"RHEL 8.9, 8.8","RHEL 8.9, 8.8"
+      ,"RHEL 9.4, 9.3","RHEL 9.4 [#red-hat94]_, 9.3, 9.2","RHEL 9.3, 9.2"
-      ,"SLES 15 SP5, SP4","SLES 15 SP5, SP4"
+      ,"RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8"
-      ,CentOS 7.9,CentOS 7.9
+      ,"SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4"
-      ,"Oracle Linux 8.9 [#oracle89]_"
+      ,,CentOS 7.9,CentOS 7.9
-      ,,
+      ,"Oracle Linux 8.9 [#oracle89]_","Oracle Linux 8.9 [#oracle89]_",""
-      :doc:`GFX Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA3,CDNA3
+      ,".. _architecture-support-compatibility-matrix:",,
-      ,CDNA2,CDNA2
+      :doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA3,CDNA3,CDNA3
-      ,CDNA,CDNA
+      ,CDNA2,CDNA2,CDNA2
-      ,RDNA3,RDNA3
+      ,CDNA,CDNA,CDNA
-      ,RDNA2,RDNA2
+      ,RDNA3,RDNA3,RDNA3
-      ,,
+      ,RDNA2,RDNA2,RDNA2
-      :doc:`GFX Card <rocm-install-on-linux:reference/system-requirements>`,gfx1100,gfx1100
+      ,".. _gpu-support-compatibility-matrix:",,
-      ,gfx1030,gfx1030
+      :doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx1100,gfx1100,gfx1100
-      ,gfx942 [#]_, gfx942 [#]_
+      ,gfx1030,gfx1030,gfx1030
-      ,gfx90a,gfx90a
+      ,gfx942 [#mi300_620]_, gfx942 [#mi300_612]_, gfx942 [#mi300_600]_
-      ,gfx908,gfx908
+      ,gfx90a,gfx90a,gfx90a
-      ,,
+      ,gfx908,gfx908,gfx908
-      ECOSYSTEM SUPPORT:,,
+      ,,,
-      :doc:`PyTorch <rocm-install-on-linux:how-to/3rd-party/pytorch-install>`,"2.1, 2.0, 1.13","2.1, 2.0, 1.13"
+      FRAMEWORK SUPPORT,".. _framework-support-compatibility-matrix:",,
-      :doc:`TensorFlow <rocm-install-on-linux:how-to/3rd-party/tensorflow-install>`,"2.15, 2.14, 2.13","2.14, 2.13, 2.12"
+      :doc:`PyTorch <rocm-install-on-linux:how-to/3rd-party/pytorch-install>`,"2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
-      :doc:`JAX <rocm-install-on-linux:how-to/3rd-party/jax-install>`,0.4.26,0.4.26
+      :doc:`TensorFlow <rocm-install-on-linux:how-to/3rd-party/tensorflow-install>`,"2.16.1, 2.15.1, 2.14.1","2.15, 2.14, 2.13","2.14, 2.13, 2.12"
-      `ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.14.1
+      :doc:`JAX <rocm-install-on-linux:how-to/3rd-party/jax-install>`,0.4.26,0.4.26,0.4.26
-      ,,
+      `ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.17.3,1.14.1
-      3RD PARTY COMMUNICATION LIBS:,,
+      ,,,
-      `UCC <https://github.com/ROCm/ucc>`_,>=1.2.0,>=1.2.0
+      THIRD PARTY COMMS,".. _thirdpartycomms-support-compatibility-matrix:",,
-      `UCX <https://github.com/ROCm/ucx>`_,>=1.14.1,>=1.14.1
+      `UCC <https://github.com/ROCm/ucc>`_,>=1.2.0,>=1.2.0,>=1.2.0
-      ,,
+      `UCX <https://github.com/ROCm/ucx>`_,>=1.15.0,>=1.14.1,>=1.14.1
-      3RD PARTY ALGORITHM LIBS:,,
+      ,,,
-      Thrust,2.1.0,2.0.1
+      THIRD PARTY ALGORITHM,".. _thirdpartyalgorithm-support-compatibility-matrix:",,
-      CUB,2.1.0,2.0.1
+      Thrust,2.2.0,2.1.0,2.0.1
-      ,,
+      CUB,2.2.0,2.1.0,2.0.1
-      ML & COMPUTER VISION LIBS:,,
+      ,,,
-      :doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0
+      ML & COMPUTER VISION,".. _mllibs-support-compatibility-matrix:",,
-      :doc:`MIGraphX <amdmigraphx:index>`,2.9.0,2.8.0
+      :doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0
-      :doc:`MIOpen <miopen:index>`,3.1.0,3.0.0
+      :doc:`MIGraphX <amdmigraphx:index>`,2.10.0,2.9.0,2.8.0
-      :doc:`MIVisionX <mivisionx:index>`,2.5.0,2.5.0
+      :doc:`MIOpen <miopen:index>`,3.2.0,3.1.0,3.0.0
-      :doc:`rocDecode <rocdecode:index>`,0.5.0,N/A
+      :doc:`MIVisionX <mivisionx:index>`,3.0.0,2.5.0,2.5.0
-      :doc:`ROCm Performance Primitives (RPP) <rpp:index>`,1.5.0,1.4.0
+      :doc:`rocDecode <rocdecode:index>`,0.6.0,0.6.0,N/A
-      ,,
+      :doc:`RPP <rpp:index>`,1.8.0,1.5.0,1.4.0
-      COMMUNICATION:,,
+      :doc:`rocPyDecode <rocpydecode:index>`,0.1.0,N/A,N/A
-      :doc:`RCCL <rccl:index>`,2.18.6,2.18.3
+      ,,,
-      ,,
+      COMMUNICATION,".. _commlibs-support-compatibility-matrix:",,
-      MATH LIBS:,,
+      :doc:`RCCL <rccl:index>`,2.20.5,2.18.6,2.18.3
-      `half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0
+      ,,,
-      :doc:`hipBLAS <hipblas:index>`,2.1.0,2.0.0
+      MATH LIBS,".. _mathlibs-support-compatibility-matrix:",,
-      :doc:`hipBLASLt <hipblaslt:index>`,0.7.0,0.6.0
+      `half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0
-      :doc:`hipFFT <hipfft:index>`,1.0.14,1.0.13
+      :doc:`hipBLAS <hipblas:index>`,2.2.0,2.1.0,2.0.0
-      :doc:`hipFORT <hipfort:index>`,0.4.0,0.4.0
+      :doc:`hipBLASLt <hipblaslt:index>`,0.8.0,0.7.0,0.6.0
-      :doc:`hipRAND <hiprand:index>`,2.10.16,2.10.16
+      :doc:`hipFFT <hipfft:index>`,1.0.14,1.0.14,1.0.13
-      :doc:`hipSOLVER <hipsolver:index>`,2.1.0,2.0.0
+      :doc:`hipFORT <hipfort:index>`,0.4.0,0.4.0,0.4.0
-      :doc:`hipSPARSE <hipsparse:index>`,3.0.1,3.0.0
+      :doc:`hipRAND <hiprand:index>`,2.11.0,2.10.16,2.10.16
-      :doc:`hipSPARSELt <hipsparselt:index>`,0.1.0,0.1.0
+      :doc:`hipSOLVER <hipsolver:index>`,2.2.0,2.1.1,2.0.0
-      :doc:`rocALUTION <rocalution:index>`,3.1.1,3.0.3
+      :doc:`hipSPARSE <hipsparse:index>`,3.1.1,3.0.1,3.0.0
-      :doc:`rocBLAS <rocblas:index>`,4.1.0,4.0.0
+      :doc:`hipSPARSELt <hipsparselt:index>`,0.2.1,0.2.0,0.1.0
-      :doc:`rocFFT <rocfft:index>`,1.0.27,1.0.23
+      :doc:`rocALUTION <rocalution:index>`,3.2.0,3.1.1,3.0.3
-      :doc:`rocRAND <rocrand:index>`,3.0.1,2.10.17
+      :doc:`rocBLAS <rocblas:index>`,4.2.0,4.1.2,4.0.0
-      :doc:`rocSOLVER <rocsolver:index>`,3.25.0,3.24.0
+      :doc:`rocFFT <rocfft:index>`,1.0.28,1.0.27,1.0.23
-      :doc:`rocSPARSE <rocsparse:index>`,3.1.2,3.0.2
+      :doc:`rocRAND <rocrand:index>`,3.1.0,3.0.1,2.10.17
-      :doc:`rocWMMA <rocwmma:index>`,1.4.0,1.3.0
+      :doc:`rocSOLVER <rocsolver:index>`,3.26.0,3.25.0,3.24.0
-      `Tensile <https://github.com/ROCm/Tensile>`_,4.40.0,4.39.0
+      :doc:`rocSPARSE <rocsparse:index>`,3.2.0,3.1.2,3.0.2
-      ,,
+      :doc:`rocWMMA <rocwmma:index>`,1.5.0,1.4.0,1.3.0
-      PRIMITIVES:,,
+      `Tensile <https://github.com/ROCm/Tensile>`_,4.40.0,4.40.0,4.39.0
-      :doc:`hipCUB <hipcub:index>`,3.1.0,3.0.0
+      ,,,
-      :doc:`hipTensor <hiptensor:index>`,1.2.0,1.1.0
+      PRIMITIVES,".. _primitivelibs-support-compatibility-matrix:",,
-      :doc:`rocPRIM <rocprim:index>`,3.1.0,3.0.0
+      :doc:`hipCUB <hipcub:index>`,3.2.0,3.1.0,3.0.0
-      :doc:`rocThrust <rocthrust:index>`,3.0.1,3.0.0
+      :doc:`hipTensor <hiptensor:index>`,1.3.0,1.2.0,1.1.0
-      ,,
+      :doc:`rocPRIM <rocprim:index>`,3.2.0,3.1.0,3.0.0
-      SUPPORT LIBS:,,
+      :doc:`rocThrust <rocthrust:index>`,3.0.1,3.0.1,3.0.0
-      `hipother <https://github.com/ROCm/hipother>`_,6.1.40091,6.0.32830
+      ,,,
-      :doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.12.0,0.11.0
+      SUPPORT LIBS,,,
-      `rocm-core <https://github.com/ROCm/rocm-core>`_,6.1.0,6.0.0
+      `hipother <https://github.com/ROCm/hipother>`_,6.2.41133,6.1.40093,6.1.32830
-      `ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,20240125.3.30,20231016.2.245
+      `rocm-core <https://github.com/ROCm/rocm-core>`_,6.2.0,6.1.2,6.0.0
-      ,,
+      `ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,20240607.1.4246,20240125.5.08,20231016.2.245
-      TOOLS:,,
+      ,,,
-      :doc:`AMD SMI <amdsmi:index>`,24.4.1,23.4.2
+      SYSTEM MGMT TOOLS,".. _tools-support-compatibility-matrix:",,
-      :doc:`HIPIFY <hipify:index>`,17.0.0,17.0.0
+      :doc:`AMD SMI <amdsmi:index>`,24.6.2,24.5.1,23.4.2
-      :doc:`ROCdbgapi <rocdbgapi:index>`,0.71.0,0.71.0
+      :doc:`ROCm Data Center Tool <rdc:index>`,1.0.0,0.3.0,0.3.0
-      :doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0
+      :doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0
-      :doc:`ROCProfiler <rocprofiler:index>`,2.0.60100,2.0.0
+      :doc:`ROCm SMI <rocm_smi_lib:index>`,7.3.0,7.2.0,6.0.0
-      `rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.3.0,N/A
+      :doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,rocm-6.2.0,rocm-6.1.2,rocm-6.0.0
-      :doc:`ROCTracer <roctracer:index>`,4.1.60100,4.1.0
+      ,,,
-      :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0
+      PERFORMANCE TOOLS,,,
-      :doc:`ROCm Data Center Tool <rdc:index>`,0.3.0,0.3.0
+      :doc:`Omniperf <omniperf:index>`,2.0.1,N/A,N/A
-      :doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,14.1.0,13.2.0
+      :doc:`Omnitrace <omnitrace:index>`,1.11.2,N/A,N/A
-      :doc:`ROCm SMI <rocm_smi_lib:index>`,7.0.0,6.0.0
+      :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0
-      :doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,rocm-6.1.0,rocm-6.0.0
+      :doc:`ROCProfiler <rocprofiler:index>`,2.0.60200,2.0.60102,2.0.60000
-      :doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.0.3,2.0.3
+      :doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.4.0,N/A,N/A
-      :doc:`TransferBench <transferbench:index>`,1.48,1.46
+      :doc:`ROCTracer <roctracer:index>`,4.1.60200,4.1.60102,4.1.60000
-      ,,
+      ,,,
-      COMPILERS:,,
+      DEVELOPMENT TOOLS,,,
-      `clang-ocl <https://github.com/ROCm/clang-ocl>`_,0.5.0,0.5.0
+      :doc:`HIPIFY <hipify:index>`,18.0.0.24232,17.0.0.24193,17.0.0.23483
-      `Flang <https://github.com/ROCm/flang>`_,17.0.0.24103,17.0.0.23483
+      :doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.13.0,0.12.0,0.11.0
-      `llvm-project <https://github.com/ROCm/llvm-project>`_,17.0.0.24103,17.0.0.23483
+      :doc:`ROCdbgapi <rocdbgapi:index>`,0.76.0,0.71.0,0.71.0
-      `OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,17.0.0.24103,17.0.0.23483
+      :doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,14.2.0,14.1.0,13.2.0
-      ,,
+      `rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.4.0,0.3.0,N/A
-      RUNTIMES:,,
+      :doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.0.3,2.0.3,2.0.3
-      :doc:`HIP <hip:index>`,6.1.40091,6.0.32830
+      ,,,
-      `OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0
+      COMPILERS,".. _compilers-support-compatibility-matrix:",,
-      :doc:`ROCR-Runtime <rocr-runtime:index>`,1.13.0,1.12.0
+      `clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,0.5.0,0.5.0
      `Flang <https://github.com/ROCm/flang>`_,18.0.0.24232,17.0.0.24193,17.0.0.23483
      `llvm-project <https://github.com/ROCm/llvm-project>`_,18.0.0.24232,17.0.0.24193,17.0.0.23483
      `OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.24232,17.0.0.24193,17.0.0.23483
      ,,,
      RUNTIMES,".. _runtime-support-compatibility-matrix:",,
      :doc:`HIP <hip:index>`,6.2.41133,6.1.40093,6.1.32830
      `OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0
      :doc:`ROCR-Runtime <rocr-runtime:index>`,1.13.0,1.13.0,1.12.0
 .. rubric:: Footnotes
-.. [#red-hat94] **For ROCm 6.1** - RHEL 9.4 is supported only on AMD Instinct MI300A.
+.. [#Ubuntu220405] Preview support of Ubuntu 22.04.5 only
-.. [#oracle89] **For ROCm 6.1.1** - Oracle Linux is supported only on AMD Instinct MI300X.
+.. [#red-hat94] RHEL 9.4 is supported only on AMD Instinct MI300A.
-.. [#] **For ROCm 6.1** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4.
+.. [#oracle89] Oracle Linux is supported only on AMD Instinct MI300X.
-.. [#] **For ROCm 6.0** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9 and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
+.. [#mi300_620] **For ROCm 6.2.0** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
 .. [#mi300_612] **For ROCm 6.1.2** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4 and Oracle Linux.
 .. [#mi300_600] **For ROCm 6.0.0** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
 ..
   Footnotes and ref anchors in below historical tables should be appended with "-past-60", to differentiate from the 
   footnote references in the above, latest, compatibility matrix.  It also makes find & replace easy.
   An easy way to work is to download the historical .csv file and open it in Excel. Then, when the content is ready, 
   delete the columns you don't need, to build the current compatibility matrix to use in above table.  Find & replace all
   instances of "-past-60" to make it ready for above table.
 .. _past-rocm-compatibility-matrix:
 Past versions of ROCm compatibility matrix
 ***************************************************
 Expand for full historical view of:
 .. dropdown:: ROCm 6.0 - Present
   You can `download the entire .csv <../downloads/compatibility-matrix-historical-6.0.csv>`_ for offline reference.
   .. csv-table::
      :file: ../data/reference/compatibility-matrix-historical-6.0.csv
      :widths: 20,10,10,10,10,10,10
      :header-rows: 1
      :stub-columns: 1
   .. rubric:: Footnotes
   .. [#Ubuntu220405-past-60] Preview support of Ubuntu 22.04.5 only
   .. [#red-hat94-past-60] RHEL 9.4 is supported only on AMD Instinct MI300A.
   .. [#oracle89-past-60] Oracle Linux is supported only on AMD Instinct MI300X.
   .. [#mi300_620-past-60] **For ROCm 6.2.0** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
   .. [#mi300_612-past-60] **For ROCm 6.1.2** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4 and Oracle Linux.
   .. [#mi300_611-past-60] **For ROCm 6.1.1** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4 and Oracle Linux.
   .. [#mi300_610-past-60] **For ROCm 6.1.0** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4.
   .. [#mi300_602-past-60] **For ROCm 6.0.2** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
   .. [#mi300_600-past-60] **For ROCm 6.0.0** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
--- a/docs/conceptual/compiler-disambiguation.md
+++ b/docs/conceptual/compiler-disambiguation.md
@@ -1,21 +0,0 @@
 <head>
  <meta charset="UTF-8">
  <meta name="description" content="ROCm compilers disambiguation">
  <meta name="keywords" content="compilers, compiler naming, AMD, ROCm">
 </head>
 # ROCm compilers disambiguation
 ROCm ships multiple compilers of varying origins and purposes. This article
 disambiguates compiler naming used throughout the documentation.
 ## Compiler terms
 | Term | Description |
 | - | - |
 | `amdclang++` | Clang/LLVM-based compiler that is part of `rocm-llvm` package. The source code is available at <a href="https://github.com/ROCm/llvm-project" target="_blank">https://github.com/ROCm/llvm-project</a>. |
 | AOCC | Closed-source clang-based compiler that includes additional CPU optimizations. Offered as part of ROCm via the `rocm-llvm-alt` package. For details, see <a href="https://developer.amd.com/amd-aocc/" target="_blank">https://developer.amd.com/amd-aocc/</a>. |
 | HIP-Clang | Informal term for the `amdclang++` compiler |
 | HIPIFY | Tools including `hipify-clang` and `hipify-perl`, used to automatically translate CUDA source code into portable HIP C++. The source code is available at <a href="https://github.com/ROCm/HIPIFY" target="_blank">https://github.com/ROCm/HIPIFY</a> |
 | `hipcc` | HIP compiler driver. A utility that invokes `clang` or `nvcc` depending on the target and passes the appropriate include and library options for the target compiler and HIP infrastructure. The source code is available at <a href="https://github.com/ROCm/HIPCC" target="_blank">https://github.com/ROCm/HIPCC</a>. |
 | ROCmCC | Clang/LLVM-based compiler. ROCmCC in itself is not a binary but refers to the overall compiler. |
--- a/docs/conceptual/compiler-topics.md
+++ b/docs/conceptual/compiler-topics.md
@@ -9,6 +9,6 @@
 The following topics describe using specific features of the compilation tools:
-* [Using AddressSanitizer](./using-gpu-sanitizer.md)
+* [ROCm compiler infrastructure](https://rocm.docs.amd.com/projects/llvm-project/en/latest/index.html)
-* [Compiler disambiguation](./compiler-disambiguation.md)
+* [Using AddressSanitizer](https://rocm.docs.amd.com/projects/llvm-project/en/latest/conceptual/using-gpu-sanitizer.html)
-* [OpenMP support in ROCm](../about/compatibility/openmp.md)
+* [OpenMP support](https://rocm.docs.amd.com/projects/llvm-project/en/latest/conceptual/openmp.html)
--- a/docs/conceptual/using-gpu-sanitizer.md
+++ b/docs/conceptual/using-gpu-sanitizer.md
@@ -1,431 +0,0 @@
 <head>
  <meta charset="UTF-8">
  <meta name="description" content="Using the LLVM ASan on a GPU">
  <meta name="keywords" content="LLVM, ASan, address sanitizer, AddressSanitizer, instrumented
  libraries, instrumented applications, AMD, ROCm">
 </head>
 # Using the AddressSanitizer on a GPU (beta release)
 The LLVM AddressSanitizer (ASan) provides a process that allows developers to detect runtime addressing errors in applications and libraries. The detection is achieved using a combination of compiler-added instrumentation and runtime techniques, including function interception and replacement.
 Until now, the LLVM ASan process was only available for traditional purely CPU applications. However, ROCm has extended this mechanism to additionally allow the detection of some addressing errors on the GPU in heterogeneous applications. Ideally, developers should treat heterogeneous HIP and OpenMP applications exactly like pure CPU applications. However, this simplicity has not been achieved yet.
 This document provides documentation on using ROCm ASan.
 For information about LLVM ASan, see the [LLVM documentation](https://clang.llvm.org/docs/AddressSanitizer.html).
 :::{note}
 The beta release of LLVM ASan for ROCm is currently tested and validated on Ubuntu 20.04.
 :::
 ## Compiling for ASan
 The ASan process begins by compiling the application of interest with the ASan instrumentation.
 Recommendations for doing this are:
 * Compile as many application and dependent library sources as possible using an AMD-built clang-based compiler such as `amdclang++`.
 * Add the following options to the existing compiler and linker options:
  * `-fsanitize=address` - enables instrumentation
  * `-shared-libsan` - use shared version of runtime
  * `-g` - add debug info for improved reporting
 * Explicitly use `xnack+` in the offload architecture option. For example, `--offload-arch=gfx90a:xnack+`
 Other architectures are allowed, but their device code will not be instrumented and a warning will be emitted.
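As a minimal sketch of the recommendations above, the following snippet assembles a compile command from the listed options. The source file name `vector_add.hip` and the output name `vector_add_asan` are hypothetical, and the use of `hipcc` as the driver is an assumption; `amdclang++` or another AMD-built clang-based compiler may be more appropriate for a given build.

```bash
# Options taken from the recommendations above.
ASAN_FLAGS="-fsanitize=address -shared-libsan -g"
OFFLOAD_ARCH="gfx90a:xnack+"   # example target from the text

# Hypothetical file names, for illustration only.
SRC="vector_add.hip"
OUT="vector_add_asan"

# Build the full command line as a string so it can be reviewed first.
asan_compile_cmd() {
  echo "hipcc ${ASAN_FLAGS} --offload-arch=${OFFLOAD_ARCH} -o ${OUT} ${SRC}"
}

# Print the command; run it manually (or with eval) once it looks right.
asan_compile_cmd
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without ROCm installed.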
 :::{tip}
 It is not an error to compile some files without ASan instrumentation, but doing so reduces the ability of the process to detect addressing errors. However, if the main program "`a.out`" does not directly depend on the ASan runtime (`libclang_rt.asan-x86_64.so`) after the build completes (check by running `ldd` (List Dynamic Dependencies) or `readelf`), the application will immediately report an error at runtime as described in the next section.
 :::
 :::{note}
 When compiling OpenMP programs with ASan instrumentation, it is currently necessary to set the environment variable `LIBRARY_PATH` to `/opt/rocm-<version>/lib/llvm/lib/asan:/opt/rocm-<version>/lib/asan`. At runtime, it may be necessary to add `/opt/rocm-<version>/lib/llvm/lib/asan` to `LD_LIBRARY_PATH`.
 :::
 ### About compilation time
 When `-fsanitize=address` is used, the LLVM compiler adds instrumentation code around every memory operation. This added code must be handled by all downstream components of the compiler toolchain and results in increased overall compilation time. This increase is especially evident in the AMDGPU device compiler and has in a few instances raised the compile time to an unacceptable level.
 There are a few options if the compile time becomes unacceptable:
 * Avoid instrumentation of the files which have the worst compile times. This will reduce the effectiveness of the ASan process.
 * Add the option `-fsanitize-recover=address` to the compiles with the worst compile times. This option simplifies the added instrumentation resulting in faster compilation. See below for more information.
 * Disable instrumentation on a per-function basis by adding `__attribute__((no_sanitize("address")))` to functions found to be responsible for the large compile time. Again, this will reduce the effectiveness of the process.
 ## Installing ROCm GPU ASan packages
 For a complete ROCm GPU Sanitizer installation, including packages, instrumented HSA and HIP runtimes, tools, and math libraries, use the following command:
 ```bash
    sudo apt-get install rocm-ml-sdk-asan
 ```
 ## Using AMD-supplied ASan instrumented libraries
 ROCm releases have optional packages that contain additional ASan instrumented builds of the ROCm libraries (usually found in `/opt/rocm-<version>/lib`). The instrumented libraries have identical names to the regular uninstrumented libraries, and are located in `/opt/rocm-<version>/lib/asan`.
 These additional libraries are built using the `amdclang++` and `hipcc` compilers, while some uninstrumented libraries are built with `g++`. The preexisting build options are used but, as described above, additional options are used: `-fsanitize=address`, `-shared-libsan` and `-g`.
 These additional libraries avoid additional developer effort to locate repositories, identify the correct branch, check out the correct tags, and other efforts needed to build the libraries from the source. And they extend the ability of the process to detect addressing errors into the ROCm libraries themselves.
 When adjusting an application build to add instrumentation, linking against these instrumented libraries is unnecessary. For example, any `-L` `/opt/rocm-<version>/lib` compiler options need not be changed. However, the instrumented libraries should be used when the application is run. It is particularly important that the instrumented language runtimes, like `libamdhip64.so` and `librocm-core.so`, are used; otherwise, device invalid access detections may not be reported.
 ## Running ASan instrumented applications
 ### Preparing to run an instrumented application
 Here are a few recommendations to consider before running an ASan instrumented heterogeneous application.
 * Ensure the Linux kernel running on the system has Heterogeneous Memory Management (HMM) support. A kernel version of 5.6 or higher should be sufficient.
 * Ensure XNACK is enabled
  * For `gfx90a` (MI-2X0) or `gfx940` (MI-3X0), set the environment variable `HSA_XNACK=1`.
  * For `gfx906` (MI-50) or `gfx908` (MI-100), set the environment variable `HSA_XNACK=1` and also ensure the amdgpu kernel module is loaded with module argument `noretry=0`.
 This requirement is due to the fact that the XNACK setting for these GPUs is system-wide.
 * Ensure that the application will use the instrumented libraries when it runs. The output from the shell command `ldd <application name>` can be used to see which libraries will be used.
 If the instrumented libraries are not listed by `ldd`, the environment variable `LD_LIBRARY_PATH` may need to be adjusted, or in some cases an `RPATH` compiled into the application may need to be changed and the application recompiled.
 * Ensure that the application depends on the ASan runtime. This can be checked by running the command `readelf -d <application name> | grep NEEDED` and verifying that shared library: `libclang_rt.asan-x86_64.so` appears in the output.
 If it does not appear, when executed the application will quickly output an ASan error that looks like:
 ```bash
 ==3210==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
 ```
 * Ensure that the application `llvm-symbolizer` can be executed, and that it is located in `/opt/rocm-<version>/llvm/bin`. This executable is not strictly required, but if found is used to translate ("symbolize") a host-side instruction address into a more useful function name, file name, and line number (assuming the application has been built to include debug information).
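The runtime dependency check described above can be wrapped in a small helper. This is a sketch only: the function name `has_asan_runtime` is hypothetical, and the sample line imitates the general shape of `readelf -d` output rather than quoting a real binary.

```bash
# Reads `readelf -d` output on stdin and succeeds if the ASan runtime
# appears among the NEEDED shared libraries.
has_asan_runtime() {
  grep NEEDED | grep -q 'libclang_rt\.asan'
}

# Illustrative input line (made up for this sketch).
sample=' 0x0000000000000001 (NEEDED)  Shared library: [libclang_rt.asan-x86_64.so]'

if printf '%s\n' "$sample" | has_asan_runtime; then
  echo "ASan runtime is a direct dependency"
else
  echo "ASan runtime is missing from the NEEDED list"
fi
```

For a real application, the input would come from `readelf -d <application name> | has_asan_runtime`.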
 There is an environment variable, `ASAN_OPTIONS`, that can be used to adjust the runtime behavior of the ASan runtime itself. There are more than a hundred "flags" that can be adjusted (see an old list at [flags](https://github.com/google/sanitizers/wiki/AddressSanitizerFlags)) but the default settings are correct and should be used in most cases. It must be noted that these options only affect the host ASan runtime. The device runtime only currently supports the default settings for the few relevant options.
 There are three `ASAN_OPTIONS` flags of note.
 * `halt_on_error=0/1 default 1`.
  This tells the ASan runtime to halt the application immediately after detecting and reporting an addressing error. The default makes sense because the application has entered the realm of undefined behavior. If the developer wishes to have the application continue anyway, this option can be set to zero. However, the application and libraries should then be compiled with the additional option `-fsanitize-recover=address`. Note that the ROCm optional ASan instrumented libraries are not compiled with this option and if an error is detected within one of them, but halt_on_error is set to 0, more undefined behavior will occur.
 * `detect_leaks=0/1 default 1`.
  This option directs the ASan runtime to enable the [Leak Sanitizer](https://clang.llvm.org/docs/LeakSanitizer.html) (LSan). For heterogeneous applications, this default results in significant output from the leak sanitizer when the application exits due to allocations made by the language runtime which are not considered to be leaks. This output can be avoided by adding `detect_leaks=0` to the `ASAN_OPTIONS`, or alternatively by producing an LSan suppression file (syntax described [here](https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer)) and activating it with environment variable `LSAN_OPTIONS=suppressions=/path/to/suppression/file`. When using a suppression file, a suppression report is printed by default. The suppression report can be disabled by using the `LSAN_OPTIONS` flag `print_suppressions=0`.
 * `quarantine_size_mb=N default 256`
  This option defines the number of megabytes (MB) `N` of memory that the ASan runtime will hold after it is `freed` to detect use-after-free situations. This memory is unavailable for other purposes. The default of 256 MB may be too small to detect some use-after-free situations, especially given that the large size of many GPU memory allocations may push `freed` allocations out of quarantine before the attempted use.
  :::{note}
  Setting the value of `quarantine_size_mb` larger may enable more problematic uses to be detected, but at the cost of reducing memory available for other purposes.
  :::
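A possible environment setup combining the flags above might look as follows. The value `512` for `quarantine_size_mb` is an arbitrary illustration, not a recommendation, and disabling leak detection is only one of the two approaches described above. Multiple flags in `ASAN_OPTIONS` are separated by colons.

```bash
# Enable XNACK for the run (see the preparation steps above).
export HSA_XNACK=1

# Silence LSan output from language-runtime allocations and enlarge the
# quarantine; 512 is an illustrative value only.
export ASAN_OPTIONS="detect_leaks=0:quarantine_size_mb=512"

echo "HSA_XNACK=${HSA_XNACK}"
echo "ASAN_OPTIONS=${ASAN_OPTIONS}"
```

The instrumented application would then be launched from the same shell so that it inherits these variables.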
 ## Runtime overhead
 Running an ASan instrumented application incurs
 overheads which may result in unacceptably long runtimes
 or failure to run at all.
 ### Higher execution time
 ASan detection works by checking each address at runtime
 before the address is actually accessed by a load, store, or atomic
 instruction.
 This checking involves an additional load to "shadow" memory which
 records whether the address is "poisoned" or not, and additional logic
 that decides whether to produce a detection report or not.
 This extra runtime work can cause the application to slow down by
 a factor of three or more, depending on how many memory accesses are
 executed.
 For heterogeneous applications, the shadow memory must be accessible by all devices
 and this can mean that shadow accesses from some devices may be more costly
 than non-shadow accesses.
 ### Higher memory use
 The address checking described above relies on the compiler to surround
 each program variable with a red zone and on ASan
 runtime to surround each runtime memory allocation with a red zone and
 fill the shadow corresponding to each red zone with poison.
 The added memory for the red zones is additional overhead on top
 of the 13% overhead for the shadow memory itself.
 Applications which consume most of one or more available memory pools when
 run normally are likely to encounter allocation failures when run with
 instrumentation.
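 The shadow overhead follows from the mapping stated in the shadow byte legend of an ASan report: one shadow byte represents 8 application bytes, so the shadow is 1/8 (12.5%, rounded to 13% above) of the memory it covers. The following is a rough sketch of that arithmetic for a hypothetical 64 GiB memory pool; the pool size is an assumption chosen for illustration, and red zones add to the result.
 ```bash
 pool_gib=64                      # hypothetical pool size in GiB
 shadow_gib=$(( pool_gib / 8 ))   # one shadow byte per 8 application bytes
 echo "shadow for ${pool_gib} GiB: ${shadow_gib} GiB"
 ```
 An application that already uses nearly all of such a pool has no room left for the shadow, which is why allocation failures are likely.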
 ## Runtime reporting
 It is not the intention of this document to provide a detailed explanation of all the types of reports that can be output by the ASan runtime. Instead, the focus is on the differences between the standard reports for CPU issues, and reports for GPU issues.
 An invalid address detection report for the CPU always starts with
 ```bash
 ==<PID>==ERROR: AddressSanitizer: <problem type> on address <memory address> at pc <pc> bp <bp> sp <sp> <access> of size <N> at <memory address> thread T0
 ```
 and continues with a stack trace for the access, a stack trace for the allocation and deallocation, if relevant, and a dump of the shadow near the `<memory address>`.
 In contrast, an invalid address detection report for the GPU always starts with
 ```bash
 ==<PID>==ERROR: AddressSanitizer: <problem type> on amdgpu device <device> at pc <pc> <access> of size <n> in workgroup id (<X>,<Y>,<Z>)
 ```
 Above, `<device>` is the integer device ID, and `(<X>, <Y>, <Z>)` is the ID of the workgroup or block where the invalid address was detected.
 While the CPU report includes a call stack for the thread attempting the invalid access, the GPU report is currently limited to a call stack of size one, i.e. the (symbolized) address of the invalid access, e.g.
 ```bash
 #0 <pc> in <function signature> at /path/to/file.hip:<line>:<column>
 ```
 This short call stack is followed by a GPU unique section that looks like
 ```bash
 Thread ids and accessed addresses:
 <lid0> <maddr0> : <lid1> <maddr1> : ...
 ```
 where each `<lid j> <maddr j>` indicates the lane ID and the invalid memory address held by lane `j` of the wavefront attempting the invalid access.
 Additionally, reports for invalid GPU accesses to memory allocated by GPU code via `malloc` or `new` starting with, for example,
 ```bash
 ==1234==ERROR: AddressSanitizer: heap-buffer-overflow on amdgpu device 0 at pc 0x7fa9f5c92dcc
 ```
 or
 ```bash
 ==5678==ERROR: AddressSanitizer: heap-use-after-free on amdgpu device 3 at pc 0x7f4c10062d74
 ```
 currently may include one or two surprising CPU side tracebacks mentioning `hostcall`. This is due to how `malloc` and `free` are implemented for GPU code, and these call stacks can be ignored.
 ## Running ASan with `rocgdb`
 `rocgdb` can be used to further investigate ASan detected errors, with some preparation.
 Currently, the ASan runtime complains when starting `rocgdb` without preparation.
 ```bash
 $ rocgdb my_app
 ==1122==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
 ```
 This is solved by setting environment variable `LD_PRELOAD` to the path to the ASan runtime, whose path can be obtained using the command
 ```bash
 amdclang++ -print-file-name=libclang_rt.asan-x86_64.so
 ```
 You should also set the environment variable `HIP_ENABLE_DEFERRED_LOADING=0` before debugging HIP applications.
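 The two settings above can be prepared in one step. The following is a sketch, assuming `amdclang++` is on `PATH`; the commented-out launch line is one possible way to apply `LD_PRELOAD` to a single `rocgdb` session, and `my_app` is the placeholder name used above.
 ```bash
 # Resolve the ASan runtime path with the command shown above; skip if the compiler is not on PATH.
 if command -v amdclang++ >/dev/null 2>&1; then
     asan_rt=$(amdclang++ -print-file-name=libclang_rt.asan-x86_64.so)
     echo "LD_PRELOAD=${asan_rt}"
 fi
 # Needed before debugging HIP applications, as noted above.
 export HIP_ENABLE_DEFERRED_LOADING=0
 # Start the debugger with the runtime preloaded for this invocation only.
 # LD_PRELOAD="${asan_rt}" rocgdb my_app
 ```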
 After starting `rocgdb` breakpoints can be set on the ASan runtime error reporting entry points of interest. For example, if an ASan error report includes
 ```bash
 WRITE of size 4 in workgroup id (10,0,0)
 ```
 the `rocgdb` command needed to stop the program before the report is printed is
 ```bash
 (gdb) break __asan_report_store4
 ```
 Similarly, the appropriate command for a report including
 ```bash
 READ of size <N> in workgroup id (1,2,3)
 ```
 is
 ```bash
 (gdb) break __asan_report_load<N>
 ```
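 The mapping from a report line to a breakpoint name can be scripted. The following is a small sketch; `asan_breakpoint` is a hypothetical helper, not part of ROCm, and it only applies the naming pattern described above. Whether an entry point exists for a given size depends on the ASan runtime.
 ```bash
 # Map the access line of an ASan report to the matching rocgdb break command.
 asan_breakpoint() {
     # $1: access line, for example "WRITE of size 4 in workgroup id (10,0,0)"
     local kind size
     kind=${1%% *}    # first word: READ or WRITE
     size=$(printf '%s\n' "$1" | sed -n 's/^[A-Z]* of size \([0-9][0-9]*\).*/\1/p')
     if [ -z "$size" ]; then
         echo "no access size found in: $1" >&2
         return 1
     fi
     case "$kind" in
         READ)  echo "break __asan_report_load${size}" ;;
         WRITE) echo "break __asan_report_store${size}" ;;
         *)     echo "unrecognized access line: $1" >&2; return 1 ;;
     esac
 }
 asan_breakpoint "WRITE of size 4 in workgroup id (10,0,0)"
 ```
 The printed line can be pasted at the `(gdb)` prompt.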
 It is possible to set breakpoints on all ASan report functions using these commands:
 ```bash
 $ rocgdb <path to application>
 (gdb) start <command line arguments>
 (gdb) rbreak ^__asan_report
 (gdb) c
 ```
 ## Using ASan with a short HIP application
 Consider the following simple and short demo of using the Address Sanitizer with a HIP application:
 ```C++
 #include <cstdlib>
 #include <hip/hip_runtime.h>
 __global__ void
 set1(int *p)
 {
    int i = blockDim.x*blockIdx.x + threadIdx.x;
    p[i] = 1;
 }
 int
 main(int argc, char **argv)
 {
    int m = std::atoi(argv[1]);
    int n1 = std::atoi(argv[2]);
    int n2 = std::atoi(argv[3]);
    int c = std::atoi(argv[4]);
    int *dp;
    hipMalloc(&dp, m*sizeof(int));
    hipLaunchKernelGGL(set1, dim3(n1), dim3(n2), 0, 0, dp);
    int *hp = (int*)malloc(c * sizeof(int));
    hipMemcpy(hp, dp, m*sizeof(int), hipMemcpyDeviceToHost);
    hipDeviceSynchronize();
    hipFree(dp);
    free(hp);
    std::puts("Done.");
    return 0;
 }
 ```
 This application will attempt to access invalid addresses for certain command line arguments. In particular, if `m < n1 * n2` some device threads will attempt to access
 unallocated device memory.
 Or, if `c < m`, the `hipMemcpy` function will copy past the end of the `malloc` allocated memory.
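 Both conditions can be evaluated before a run. The following is a small sketch, assuming `sizeof(int)` is 4 bytes; `predict` is a hypothetical helper that takes the same four arguments as the application.
 ```bash
 # Predict which invalid accesses the demo triggers for a given argument set.
 predict() {
     local m=$1 n1=$2 n2=$3 c=$4
     local threads=$(( n1 * n2 ))    # total device threads launched
     if [ "$m" -lt "$threads" ]; then
         echo "device: $(( threads - m )) threads write past the ${m}-element hipMalloc allocation"
     fi
     if [ "$c" -lt "$m" ]; then
         # 4 is the assumed sizeof(int)
         echo "host: hipMemcpy writes $(( (m - c) * 4 )) bytes past the malloc allocation"
     fi
 }
 predict 100 11 10 100
 predict 100 10 10 99
 ```
 The first call reports 10 out-of-bounds device threads, and the second reports a 4-byte host overflow.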
 **Note**: The `hipcc` compiler is used here for simplicity.
 Compiling without XNACK results in a warning.
 ```bash
 $ hipcc -g --offload-arch=gfx90a:xnack- -fsanitize=address -shared-libsan mini.hip -o mini
 clang++: warning: ignoring '-fsanitize=address' option for offload arch 'gfx90a:xnack-', as it is not currently supported there. Use it with an offload arch containing 'xnack+' instead [-Woption-ignored]
 ```
 The binary compiled above will run, but the GPU code will not be instrumented and the `m < n1 * n2` error will not be detected. Switching to `--offload-arch=gfx90a:xnack+` in the command above results in a warning-free compilation and an instrumented application. After setting `PATH`, `LD_LIBRARY_PATH` and `HSA_XNACK` as described earlier, a check of the binary with `ldd` yields the following,
 ```bash
 $ ldd mini
        linux-vdso.so.1 (0x00007ffd1a5ae000)
        libclang_rt.asan-x86_64.so => /opt/rocm-6.1.0-99999/llvm/lib/clang/17.0.0/lib/linux/libclang_rt.asan-x86_64.so (0x00007fb9c14b6000)
        libamdhip64.so.5 => /opt/rocm-6.1.0-99999/lib/asan/libamdhip64.so.5 (0x00007fb9bedd3000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb9beba8000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb9bea59000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb9bea3e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb9be84a000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb9be844000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb9be821000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fb9be817000)
        libamd_comgr.so.2 => /opt/rocm-6.1.0-99999/lib/asan/libamd_comgr.so.2 (0x00007fb9b4382000)
        libhsa-runtime64.so.1 => /opt/rocm-6.1.0-99999/lib/asan/libhsa-runtime64.so.1 (0x00007fb9b3b00000)
        libnuma.so.1 => /lib/x86_64-linux-gnu/libnuma.so.1 (0x00007fb9b3af3000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb9c2027000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fb9b3ad7000)
        libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007fb9b3aa7000)
        libelf.so.1 => /lib/x86_64-linux-gnu/libelf.so.1 (0x00007fb9b3a89000)
        libdrm.so.2 => /opt/amdgpu/lib/x86_64-linux-gnu/libdrm.so.2 (0x00007fb9b3a70000)
        libdrm_amdgpu.so.1 => /opt/amdgpu/lib/x86_64-linux-gnu/libdrm_amdgpu.so.1 (0x00007fb9b3a62000)
 ```
 This confirms that the address sanitizer runtime is linked in, and the ASan instrumented versions of the runtime libraries are used.
 Checking the `PATH` yields
 ```bash
 $ which llvm-symbolizer
 /opt/rocm-6.1.0-99999/llvm/bin/llvm-symbolizer
 ```
 Lastly, a check of the OS kernel version yields
 ```bash
 $ uname -rv
 5.15.0-73-generic #80~20.04.1-Ubuntu SMP Wed May 17 14:58:14 UTC 2023
 ```
 which indicates that the required HMM support (kernel version > 5.6) is available. This completes the necessary setup. Running with `m = 100`, `n1 = 11`, `n2 = 10` and `c = 100` should produce
 a report for an invalid access by the last 10 threads.
 ```bash
 =================================================================
 ==3141==ERROR: AddressSanitizer: heap-buffer-overflow on amdgpu device 0 at pc 0x7fb1410d2cc4
 WRITE of size 4 in workgroup id (10,0,0)
  #0 0x7fb1410d2cc4 in set1(int*) at /home/dave/mini/mini.cpp:0:10
 Thread ids and accessed addresses:
 00 : 0x7fb14371d190 01 : 0x7fb14371d194 02 : 0x7fb14371d198 03 : 0x7fb14371d19c 04 : 0x7fb14371d1a0 05 : 0x7fb14371d1a4 06 : 0x7fb14371d1a8 07 : 0x7fb14371d1ac
 08 : 0x7fb14371d1b0 09 : 0x7fb14371d1b4
 0x7fb14371d190 is located 0 bytes after 400-byte region [0x7fb14371d000,0x7fb14371d190)
 allocated by thread T0 here:
    #0 0x7fb151c76828 in hsa_amd_memory_pool_allocate /work/dave/git/compute/external/llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:692:3
    #1 ...
    #12 0x7fb14fb99ec4 in hipMalloc /work/dave/git/compute/external/clr/hipamd/src/hip_memory.cpp:568:3
    #13 0x226630 in hipError_t hipMalloc<int>(int**, unsigned long) /opt/rocm-6.1.0-99999/include/hip/hip_runtime_api.h:8367:12
    #14 0x226630 in main /home/dave/mini/mini.cpp:19:5
    #15 0x7fb14ef02082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16
 Shadow bytes around the buggy address:
  0x7fb14371cf00: ...
 =>0x7fb14371d180: 00 00[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7fb14371d200: ...
 Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  ...
 ==3141==ABORTING
 ```
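 The addresses in this report can be checked against the launch parameters. Workgroup 10 with 10 threads per workgroup covers indices 100 to 109, so lane `j` writes at byte offset `400 + 4*j` from the start of the 400-byte region. The following sketch reproduces that arithmetic from the values printed above, again assuming `sizeof(int)` is 4 bytes.
 ```bash
 base=$(( 0x7fb14371d000 ))    # start of the region reported above
 end=$(( 0x7fb14371d190 ))     # one past the last valid byte
 echo "region size: $(( end - base )) bytes"
 for lane in 0 9; do
     # each lane writes one 4-byte int, starting right after the region
     printf 'lane %02d writes at 0x%x\n' "$lane" $(( end + 4 * lane ))
 done
 ```
 The computed addresses for lanes 00 and 09 match the first and last entries under "Thread ids and accessed addresses".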
 Running with `m = 100`, `n1 = 10`, `n2 = 10` and `c = 99` should produce a report for an invalid copy.
 ```shell
 =================================================================
 ==2817==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x514000150dcc at pc 0x7f5509551aca bp 0x7ffc90a7ae50 sp 0x7ffc90a7a610
 WRITE of size 400 at 0x514000150dcc thread T0
    #0 0x7f5509551ac9 in __asan_memcpy /work/dave/git/compute/external/llvm-project/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:61:3
    #1 ...
    #9 0x7f5507462a28 in hipMemcpy_common(void*, void const*, unsigned long, hipMemcpyKind, ihipStream_t*) /work/dave/git/compute/external/clr/hipamd/src/hip_memory.cpp:637:10
    #10 0x7f5507464205 in hipMemcpy /work/dave/git/compute/external/clr/hipamd/src/hip_memory.cpp:642:3
    #11 0x226844 in main /home/dave/mini/mini.cpp:22:5
    #12 0x7f55067c3082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16
    #13 0x22605d in _start (/home/dave/mini/mini+0x22605d)
 0x514000150dcc is located 0 bytes after 396-byte region [0x514000150c40,0x514000150dcc)
 allocated by thread T0 here:
    #0 0x7f5509553dcf in malloc /work/dave/git/compute/external/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x226817 in main /home/dave/mini/mini.cpp:21:21
    #2 0x7f55067c3082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16
 SUMMARY: AddressSanitizer: heap-buffer-overflow /work/dave/git/compute/external/llvm-project/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:61:3 in __asan_memcpy
 Shadow bytes around the buggy address:
  0x514000150b00: ...
 =>0x514000150d80: 00 00 00 00 00 00 00 00 00[04]fa fa fa fa fa fa
  0x514000150e00: ...
 Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  ...
 ==2817==ABORTING
 ```
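 The numbers in this report are consistent with `c = 99`: the host buffer holds 396 bytes while the copy writes 400. The bracketed shadow byte `04` marks a partially addressable 8-byte granule in which only the first 4 bytes are valid. The following is a brief sketch of that arithmetic, using the sizes printed above.
 ```bash
 region=396    # bytes allocated by malloc (99 ints)
 copy=400      # bytes written by hipMemcpy (100 ints)
 echo "overflow: $(( copy - region )) bytes"
 echo "addressable bytes in last shadow granule: $(( region % 8 ))"
 ```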
 ## Known issues with using GPU sanitizer
 * Red zones must have limited size. It is possible for an invalid access to completely miss a red zone and not be detected.
 * Lack of detection or false reports can be caused by the runtime not properly maintaining red zone shadows.
 * Lack of detection on the GPU might also be due to the implementation not instrumenting accesses to all GPU specific address spaces. For example, in the current implementation accesses to "private" or "stack" variables on the GPU are not instrumented, and accesses to HIP shared variables (also known as "local data store" or "LDS") are also not instrumented.
 * It can also be the case that a memory fault is reported for an invalid address even with the instrumentation. This is usually caused by the invalid address being so wild that its shadow address is outside any memory region, and the fault actually occurs on the access to the shadow address. It is also possible to hit a memory fault for the `NULL` pointer. While address 0 does have a shadow location, it is not poisoned by the runtime.
 * There is currently a bug which can result in memory faults being reported when running instrumented device code which makes use of `malloc`, `free`, `new`, or `delete`.
 * There is currently a bug which can result in undefined symbols being reported at compile time when instrumented device code makes use of `new` and `delete`.
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -7,9 +7,10 @@
 import os
 import shutil
 # Keep capitalization due to similar linking on GitHub's markdown preview.
 shutil.copy2("../RELEASE.md", "./about/release-notes.md")
-shutil.copy2("../CHANGELOG.md", "./about/changelog.md")
+
 os.system("mkdir -p ../_readthedocs/html/downloads")
 os.system("cp data/reference/compatibility-matrix-historical-6.0.csv ../_readthedocs/html/downloads/compatibility-matrix-historical-6.0.csv")
 latex_engine = "xelatex"
 latex_elements = {
@@ -29,16 +30,16 @@ if os.environ.get("READTHEDOCS", "") == "True":
 project = "ROCm Documentation"
 author = "Advanced Micro Devices, Inc."
 copyright = "Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved."
-version = "6.1.2"
+version = "6.2.0"
-release = "6.1.2"
+release = "6.2.0"
 setting_all_article_info = True
 all_article_info_os = ["linux", "windows"]
 all_article_info_author = ""
 # pages with specific settings
 article_pages = [
-    {"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-06-04"},
+    {"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-08-02"},
-    {"file": "about/changelog", "os": ["linux", "windows"], "date": "2024-06-04"},
+    {"file": "about/changelog", "os": ["linux", "windows"], "date": "2024-08-02"},
    {"file": "how-to/deep-learning-rocm", "os": ["linux"]},
    {"file": "how-to/rocm-for-ai/index", "os": ["linux"]},
    {"file": "how-to/rocm-for-ai/install", "os": ["linux"]},
@@ -99,11 +100,16 @@ extensions = ["rocm_docs", "sphinx_reredirects"]
 external_projects_current_project = "rocm"
 html_baseurl = os.environ.get("READTHEDOCS_CANONICAL_URL", "rocm-stg.amd.com")
 html_context = {}
 if os.environ.get("READTHEDOCS", "") == "True":
    html_context["READTHEDOCS"] = True
 html_theme = "rocm_docs_theme"
 html_theme_options = {"flavor": "rocm-docs-home"}
 html_static_path = ["sphinx/static/css"]
-html_css_files = ["rocm_custom.css"]
+html_css_files = ["rocm_custom.css", "rocm_rn.css"]
 html_title = "ROCm Documentation"
--- a/docs/contribute/contributing.md
+++ b/docs/contribute/contributing.md
@@ -56,6 +56,10 @@ To make edits to our documentation via PR, follow these steps:
 6. Change directory into the `./docs` folder and make any documentation changes locally using your preferred code editor. Follow the guidelines listed on the
   [documentation structure](./doc-structure.md) page.
   ```{note}
   Spell checking is performed for pull requests by {doc}`ROCm Docs Core<rocm-docs-core:index>`. To ensure your PR passes spell checking you might need to add new words or acronyms to the `.wordlist.txt` file as described in {doc}`Spell Check<rocm-docs-core:user_guide/spellcheck>`.
   ```
 7. Optionally run a local test build of the documentation to ensure the content builds and looks as expected. In your terminal, run the following commands from within the `./docs` folder of your cloned repository:
     ```bash
--- a/docs/data/how-to/tuning-guides/mi300a-rocm-bandwidth-test-output.png
+++ b/docs/data/how-to/tuning-guides/mi300a-rocm-bandwidth-test-output.png
--- a/docs/data/how-to/tuning-guides/mi300a-rocm-peak-bandwidth-output.png
+++ b/docs/data/how-to/tuning-guides/mi300a-rocm-peak-bandwidth-output.png
--- a/docs/data/how-to/tuning-guides/mi300a-rocm-smi-output.png
+++ b/docs/data/how-to/tuning-guides/mi300a-rocm-smi-output.png
--- a/docs/data/how-to/tuning-guides/mi300a-rocm-smi-showhw-output.png
+++ b/docs/data/how-to/tuning-guides/mi300a-rocm-smi-showhw-output.png
--- a/docs/data/how-to/tuning-guides/mi300a-rocm-smi-showtopo-output.png
+++ b/docs/data/how-to/tuning-guides/mi300a-rocm-smi-showtopo-output.png
--- a/docs/data/reference/banner-compilers.jpg
+++ b/docs/data/reference/banner-compilers.jpg
--- a/docs/data/reference/banner-runtimes.jpg
+++ b/docs/data/reference/banner-runtimes.jpg
--- a/docs/data/reference/compatibility-matrix-historical-6.0.csv
+++ b/docs/data/reference/compatibility-matrix-historical-6.0.csv
@@ -0,0 +1,111 @@
 ROCm Version,6.2.0, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
      :doc:`Operating Systems <rocm-install-on-linux:reference/system-requirements>`,Ubuntu 24.04,,,,,
      ,"Ubuntu 22.04.5 [#Ubuntu220405-past-60]_, 22.04.4","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3"
      ,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
      ,"RHEL 9.4, 9.3","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2"
      ,"RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8"
      ,"SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4"
      ,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9
      ,Oracle Linux 8.9 [#oracle89-past-60]_,Oracle Linux 8.9 [#oracle89-past-60]_,Oracle Linux 8.9 [#oracle89-past-60]_,,,
      ,".. _architecture-support-compatibility-matrix-past-60:",,,,,
      :doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3
      ,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2
      ,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA
      ,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3
      ,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2
      ,".. _gpu-support-compatibility-matrix-past-60:",,,,,
      :doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100
      ,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030
      ,gfx942 [#mi300_620-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_611-past-60]_, gfx942 [#mi300_610-past-60]_, gfx942 [#mi300_602-past-60]_, gfx942 [#mi300_600-past-60]_
      ,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a
      ,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908
      ,,,,,,
      FRAMEWORK SUPPORT,".. _framework-support-compatibility-matrix-past-60:",,,,,
      :doc:`PyTorch <rocm-install-on-linux:how-to/3rd-party/pytorch-install>`,"2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
      :doc:`TensorFlow <rocm-install-on-linux:how-to/3rd-party/tensorflow-install>`,"2.16.1, 2.15.1, 2.14.1","2.15, 2.14, 2.13","2.15, 2.14, 2.13","2.15, 2.14, 2.13","2.14, 2.13, 2.12","2.14, 2.13, 2.12"
      :doc:`JAX <rocm-install-on-linux:how-to/3rd-party/jax-install>`,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26
      `ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1
      ,,,,,,
      THIRD PARTY COMMS,".. _thirdpartycomms-support-compatibility-matrix-past-60:",,,,,
      `UCC <https://github.com/ROCm/ucc>`_,>=1.2.0,>=1.2.0,>=1.2.0,>=1.2.0,>=1.2.0,>=1.2.0
      `UCX <https://github.com/ROCm/ucx>`_,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1
      ,,,,,,
      THIRD PARTY ALGORITHM,".. _thirdpartyalgorithm-support-compatibility-matrix-past-60:",,,,,
      Thrust,2.2.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
      CUB,2.2.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
      ,,,,,,
      ML & COMPUTER VISION,".. _mllibs-support-compatibility-matrix-past-60:",,,,,
      :doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0
      :doc:`MIGraphX <amdmigraphx:index>`,2.10.0,2.9.0,2.9.0,2.9.0,2.8.0,2.8.0
      :doc:`MIOpen <miopen:index>`,3.2.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
      :doc:`MIVisionX <mivisionx:index>`,3.0.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0
      :doc:`rocDecode <rocdecode:index>`,0.6.0,0.6.0,0.5.0,0.5.0,N/A,N/A
      :doc:`RPP <rpp:index>`,1.8.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0
      :doc:`rocPyDecode <rocpydecode:index>`,0.1.0,N/A,N/A,N/A,N/A,N/A
      ,,,,,,
      COMMUNICATION,".. _commlibs-support-compatibility-matrix-past-60:",,,,,
      :doc:`RCCL <rccl:index>`,2.20.5,2.18.6,2.18.6,2.18.6,2.18.3,2.18.3
      ,,,,,,
      MATH LIBS,".. _mathlibs-support-compatibility-matrix-past-60:",,,,,
      `half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0
      :doc:`hipBLAS <hipblas:index>`,2.2.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0
      :doc:`hipBLASLt <hipblaslt:index>`,0.8.0,0.7.0,0.7.0,0.7.0,0.6.0,0.6.0
      :doc:`hipFFT <hipfft:index>`,1.0.14,1.0.14,1.0.14,1.0.14,1.0.13,1.0.13
      :doc:`hipFORT <hipfort:index>`,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0
      :doc:`hipRAND <hiprand:index>`,2.11.0,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16
      :doc:`hipSOLVER <hipsolver:index>`,2.2.0,2.1.1,2.1.1,2.1.0,2.0.0,2.0.0
      :doc:`hipSPARSE <hipsparse:index>`,3.1.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
      :doc:`hipSPARSELt <hipsparselt:index>`,0.2.1,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0
      :doc:`rocALUTION <rocalution:index>`,3.2.0,3.1.1,3.1.1,3.1.1,3.0.3,3.0.3
      :doc:`rocBLAS <rocblas:index>`,4.2.0,4.1.2,4.1.0,4.1.0,4.0.0,4.0.0
      :doc:`rocFFT <rocfft:index>`,1.0.28,1.0.27,1.0.27,1.0.26,1.0.25,1.0.23
      :doc:`rocRAND <rocrand:index>`,3.1.0,3.0.1,3.0.1,3.0.1,3.0.0,2.10.17
      :doc:`rocSOLVER <rocsolver:index>`,3.26.0,3.25.0,3.25.0,3.25.0,3.24.0,3.24.0
      :doc:`rocSPARSE <rocsparse:index>`,3.2.0,3.1.2,3.1.2,3.1.2,3.0.2,3.0.2
      :doc:`rocWMMA <rocwmma:index>`,1.5.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0
      `Tensile <https://github.com/ROCm/Tensile>`_,4.40.0,4.40.0,4.40.0,4.40.0,4.39.0,4.39.0
      ,,,,,,
      PRIMITIVES,".. _primitivelibs-support-compatibility-matrix-past-60:",,,,,
      :doc:`hipCUB <hipcub:index>`,3.2.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
      :doc:`hipTensor <hiptensor:index>`,1.3.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0
      :doc:`rocPRIM <rocprim:index>`,3.2.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
      :doc:`rocThrust <rocthrust:index>`,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
      ,,,,,,
      SUPPORT LIBS,,,,,,
      `hipother <https://github.com/ROCm/hipother>`_,6.2.41133,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
      `rocm-core <https://github.com/ROCm/rocm-core>`_,6.2.0,6.1.2,6.1.1,6.1.0,6.0.2,6.0.0
      `ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,20240607.1.4246,20240125.5.08,20240125.5.08,20240125.3.30,20231016.2.245,20231016.2.245
      ,,,,,,
      SYSTEM MGMT TOOLS,".. _tools-support-compatibility-matrix-past-60:",,,,,
      :doc:`AMD SMI <amdsmi:index>`,24.6.2,24.5.1,24.5.1,24.4.1,23.4.2,23.4.2
      :doc:`ROCm Data Center Tool <rdc:index>`,1.0.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0
      :doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
      :doc:`ROCm SMI <rocm_smi_lib:index>`,7.3.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0
      :doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,rocm-6.2.0,rocm-6.1.2,rocm-6.1.1,rocm-6.1.0,rocm-6.0.2,rocm-6.0.0
      ,,,,,,
      PERFORMANCE TOOLS,,,,,,
      :doc:`Omniperf <omniperf:index>`,2.0.1,N/A,N/A,N/A,N/A,N/A
      :doc:`Omnitrace <omnitrace:index>`,1.11.2,N/A,N/A,N/A,N/A,N/A
      :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0
      :doc:`ROCProfiler <rocprofiler:index>`,2.0.60200,2.0.60102,2.0.60101,2.0.60100,2.0.60002,2.0.60000
      :doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.4.0,N/A,N/A,N/A,N/A,N/A
      :doc:`ROCTracer <roctracer:index>`,4.1.60200,4.1.60102,4.1.60101,4.1.60100,4.1.60002,4.1.60000
      ,,,,,,
      DEVELOPMENT TOOLS,,,,,,
      :doc:`HIPIFY <hipify:index>`,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
      :doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.13.0,0.12.0,0.12.0,0.12.0,0.11.0,0.11.0
      :doc:`ROCdbgapi <rocdbgapi:index>`,0.76.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0
      :doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,14.2.0,14.1.0,14.1.0,14.1.0,13.2.0,13.2.0
      `rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.4.0,0.3.0,0.3.0,0.3.0,N/A,N/A
      :doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3
      ,,,,,,
      COMPILERS,".. _compilers-support-compatibility-matrix-past-60:",,,,,
      `clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0
      `Flang <https://github.com/ROCm/flang>`_,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
      `llvm-project <https://github.com/ROCm/llvm-project>`_,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
      `OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.24232,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
      ,,,,,,
      RUNTIMES,".. _runtime-support-compatibility-matrix-past-60:",,,,,
      :doc:`HIP <hip:index>`,6.2.41133,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
      `OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0
      :doc:`ROCR-Runtime <rocr-runtime:index>`,1.13.0,1.13.0,1.13.0,1.13.0,1.12.0,1.12.0
--- a/docs/data/rocm-software-stack-6_1_0.jpg
+++ b/docs/data/rocm-software-stack-6_1_0.jpg
--- a/docs/data/rocm-software-stack-6_2_0.jpg
+++ b/docs/data/rocm-software-stack-6_2_0.jpg
--- a/docs/data/unused-images/banner-optimization.jpg
+++ b/docs/data/unused-images/banner-optimization.jpg
--- a/docs/how-to/build-rocm.rst
+++ b/docs/how-to/build-rocm.rst
@@ -0,0 +1,23 @@
 .. meta::
    :description: Build ROCm from source
    :keywords: build ROCm, source, ROCm source, ROCm, repo, make, makefile
 .. _building-rocm:
 *************************************************************
 Build ROCm from source
 *************************************************************
 ROCm is an open-source stack that you can build from source code. The source code is available from `<https://github.com/ROCm/ROCm>`__.
 The general steps to build ROCm are:
 #. Clone the ROCm source code
 #. Prepare the build environment
 #. Run the build command
 Because the ROCm stack is constantly evolving, the most current instructions are stored with the source code in GitHub.  
 For detailed build instructions, see `Build ROCm from source <https://github.com/ROCm/ROCm?tab=readme-ov-file#build-rocm-from-source>`_.
--- a/docs/how-to/system-optimization/index.rst
+++ b/docs/how-to/system-optimization/index.rst
@@ -65,6 +65,12 @@ their own performance testing for additional tuning.
     - `CDNA 3 architecture <https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf>`_
   * - :doc:`AMD Instinct MI300A <mi300a>`
     - `AMD Instinct MI300 instruction set architecture <https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf>`_
     - `CDNA 3 architecture <https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf>`_
   * - :doc:`AMD Instinct MI200 <mi200>`
     - `AMD Instinct MI200 instruction set architecture <https://www.amd.com/system/files/TechDocs/instinct-mi200-cdna2-instruction-set-architecture.pdf>`_
--- a/docs/how-to/system-optimization/mi300a.rst
+++ b/docs/how-to/system-optimization/mi300a.rst
@@ -0,0 +1,393 @@
 .. meta::
   :description: AMD Instinct MI300A system settings
   :keywords: AMD, Instinct, MI300A, HPC, tuning, BIOS settings, NBIO, ROCm,
              environment variable, performance, accelerator, GPU, EPYC, GRUB,
              operating system
 ***************************************************
 AMD Instinct MI300A system optimization
 ***************************************************
 This topic discusses the operating system settings and system management commands for 
 the AMD Instinct MI300A accelerator. This topic can help you optimize performance.
 System settings
 ========================================
 This section reviews the system settings required to configure an MI300A SOC system and
 optimize its performance.
 The MI300A system-on-a-chip (SOC) design requires you to review and potentially adjust your OS configuration as explained in 
 the :ref:`operating-system-settings-label` section. These settings are critical for 
 performance because the OS on an accelerated processing unit (APU) is responsible for memory management across the CPU and GPU accelerators.
 In the APU memory model, system settings are available to limit GPU memory allocation. 
 This limit is important because legacy software often determines the 
 amount of allowable memory at start-up time
 by probing discrete memory until it is exhausted. If left unchecked, this practice 
 can starve the OS of resources. 
 System BIOS settings
 -----------------------------------
 System BIOS settings are preconfigured for optimal performance from the 
 platform vendor. This means that you do not need to adjust these settings 
 when using MI300A. If you have any questions regarding these settings, 
 contact your MI300A platform vendor.
 GRUB settings 
 -----------------------------------
 The ``/etc/default/grub`` file is used to configure the GRUB bootloader on modern Linux distributions. 
 Linux uses the string assigned to ``GRUB_CMDLINE_LINUX`` in this file as
 its command line parameters during boot.
 Appending strings using the Linux command line
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 It is recommended that you append the following string to ``GRUB_CMDLINE_LINUX``.
 ``pci=realloc=off``
  This setting disables the automatic reallocation
  of PCI resources, so Linux is able to unambiguously detect all GPUs on the
  MI300A-based system. It's used when Single Root I/O Virtualization (SR-IOV) Base
  Address Registers (BARs) have not been allocated by the BIOS. This can help
  avoid potential issues with certain hardware configurations.
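 As an illustration, after the edit the line in ``/etc/default/grub`` might look like the following. ``<existing parameters>`` is a placeholder for whatever your system already sets and is not a literal value.

 .. code-block:: shell

    GRUB_CMDLINE_LINUX="<existing parameters> pci=realloc=off"
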
 Validating the IOMMU setting
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 IOMMU is a system-specific IO mapping mechanism for DMA mapping
 and isolation. IOMMU is turned off by default in the operating system settings 
 for optimal performance.
 To verify IOMMU is turned off, first install the ``acpica-tools`` package using your 
 package manager.
 .. code-block:: shell
   sudo apt install acpica-tools
 Then confirm that the following commands do not return any results.
 .. code-block:: shell
   sudo acpidump | grep IVRS
   sudo acpidump | grep DMAR
 Update GRUB
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Use this command to update GRUB to use the modified configuration:
 .. code-block:: shell
   sudo grub2-mkconfig -o /boot/grub2/grub.cfg
 On some Debian-based systems, the ``grub2-mkconfig`` command might not be available. In this case,
 use ``grub-mkconfig`` instead. Verify that you have the
 correct version by using the following command:
 .. code-block:: shell
   grub-mkconfig --version
 .. _operating-system-settings-label:
 Operating system settings 
 -----------------------------------
 The operating system provides several options to customize and tune performance. For more information 
 about supported operating systems, see the :doc:`Compatibility matrix <../../compatibility/compatibility-matrix>`. 
 If you are using a distribution other than RHEL or SLES, the latest Linux kernel is recommended.
 Performance considerations for Zen 4, the CPU core architecture in the MI300A, 
 require Linux kernel version 5.18 or later. 
 This section describes performance-based settings.
 * **Enable transparent huge pages** 
  To enable transparent huge pages, use one of the following methods:
  * From the command line, run the following command:
    .. code-block:: shell
       echo always > /sys/kernel/mm/transparent_hugepage/enabled  
  * Set the Linux kernel parameter ``transparent_hugepage`` as follows in the 
    relevant ``.cfg`` file for your system.
    .. code-block:: cfg
       transparent_hugepage=always
 * **Limit the maximum and single memory allocations on the GPU**
  Many AI-related applications were originally developed on discrete GPUs. Some of these applications 
  have fixed problem sizes associated with the targeted GPU size, and some attempt to determine the 
  system memory limits by allocating chunks until failure. These techniques can cause issues in an 
  APU with a shared space.
  To allow these applications to run on the APU without further changes, 
  ROCm supports a default memory policy that restricts the percentage of the GPU that can be allocated. 
  The following environment variables control this feature: 
  * ``GPU_MAX_ALLOC_PERCENT``
  * ``GPU_SINGLE_ALLOC_PERCENT``
  These settings can be added to the default shell environment or the user environment. The effect of the memory allocation 
  settings varies depending on the system, configuration, and task. They might require adjustment, especially when performing GPU benchmarks. Setting these values to ``100`` 
  lets the GPU allocate any amount of free memory. However, the risk of encountering 
  an operating system out-of-memory (OOM) condition increases when almost 
  all the available memory is used.
  Before setting either of these variables to ``100``, 
  carefully consider the expected CPU workload allocation and the anticipated OS usage. 
  For instance, if the OS requires 8 GB on a 128 GB system, setting these 
  variables to ``100`` authorizes a single 
  workload to allocate up to 120 GB of memory. Unless the system has swap space configured, 
  any over-allocation attempts will be handled by the OOM policies.
 * **Disable NUMA (Non-uniform memory access) balancing**
  ROCm uses information from the compiled application to ensure an affinity exists
  between the GPU agent processes and their CPU hosts or co-processing agents. 
  Because the APU also runs OS threads, 
  including memory-management threads, the default kernel NUMA policies can
  adversely impact workload performance without additional tuning.
  .. note::
     At the kernel level, ``pci=realloc`` can also be set to ``off`` as an additional tuning measure. 
  To disable NUMA balancing, use one of the following methods:
  * From the command line, run the following command:
    .. code-block:: shell
       echo 0 > /proc/sys/kernel/numa_balancing   
  * Set the following Linux kernel parameters in the 
    relevant ``.cfg`` file for your system.
    .. code-block:: cfg
       pci=realloc=off numa_balancing=disable  
 * **Enable compaction**
  Compaction is necessary for proper MI300A operation because the APU dynamically shares memory 
  between the CPU and GPU. Compaction can be done proactively, which reduces 
  allocation costs, or performed during allocation, in which case it is part of the background activities. 
  Without compaction, the MI300A application performance eventually degrades as fragmentation increases. 
  In RHEL distributions, compaction is disabled by default. In Ubuntu, it's enabled by default. 
  To enable compaction, enter the following commands using the command line:
  .. code-block:: shell
     echo 20 > /proc/sys/vm/compaction_proactiveness 
     echo 1 > /proc/sys/vm/compact_unevictable_allowed  
 .. _mi300a-processor-affinity:
 * **Change affinity of ROCm helper threads**
  This change prevents internal ROCm threads from having their CPU core affinity mask 
  set to all CPU cores available. With this setting, the threads inherit their parent's 
  CPU core affinity mask. If you have any questions regarding this setting, 
  contact your MI300A platform vendor. To enable this setting, enter the following command:
  .. code-block:: shell
     export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0 
 * **CPU core states and C-states**
  The system BIOS handles these settings for the MI300A. 
  They don't need to be configured on the operating system.
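 The operating system settings in this list can be inspected without changing anything. The
 following read-only sketch prints the current value of each kernel file and environment variable
 named above. The helper name ``show_setting`` is hypothetical, and a file reported as
 ``not available`` only means that the running kernel does not expose it.

 .. code-block:: shell

   # Hypothetical helper: print the current value of a kernel setting file,
   # or a marker if the file is absent or unreadable.
   show_setting() {
     if [ -r "$1" ]; then
       printf '%s: %s\n' "$1" "$(cat "$1")"
     else
       printf '%s: not available\n' "$1"
     fi
   }

   show_setting /sys/kernel/mm/transparent_hugepage/enabled
   show_setting /proc/sys/kernel/numa_balancing
   show_setting /proc/sys/vm/compaction_proactiveness
   show_setting /proc/sys/vm/compact_unevictable_allowed

   # Environment variables from this section; "unset" means the variable is not exported.
   for var in GPU_MAX_ALLOC_PERCENT GPU_SINGLE_ALLOC_PERCENT HSA_OVERRIDE_CPU_AFFINITY_DEBUG; do
     printf '%s=%s\n' "$var" "$(printenv "$var" || echo unset)"
   done

 Values written with ``echo`` to ``/proc`` or ``/sys`` do not persist across reboots, so running a
 check like this after a restart shows whether the kernel parameters were applied.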
 System management
 ========================================
 For a complete guide on installing, managing, and uninstalling ROCm on Linux, see
 :doc:`Quick-start (Linux)<rocm-install-on-linux:tutorial/quick-start>`. To verify that the
 installation was successful, see the
 :doc:`Post-installation instructions<rocm-install-on-linux:how-to/native-install/post-install>` and 
 :doc:`ROCm tools <../../reference/rocm-tools>` guides. If verification
 fails, consult the :doc:`System debugging guide <../system-debugging>`.
 .. _hw-verification-rocm-label:
 Hardware verification with ROCm 
 -----------------------------------
 ROCm includes tools to query the system structure. To query
 the GPU hardware, use the ``rocm-smi`` command.
 ``rocm-smi`` reports statistics per socket, so the power results combine CPU and GPU utilization. 
 In an idle state on a multi-socket system, some power imbalances are expected because 
 the distribution of OS threads can keep some APU devices at higher power states.
 .. note::
   The MI300A VRAM settings show as ``N/A``. 
 .. image:: ../../data/how-to/tuning-guides/mi300a-rocm-smi-output.png
   :alt: Output from the rocm-smi command
 The ``rocm-smi --showhw`` command shows the available system
 GPUs and their device ID and firmware details.
 In the MI300A hardware settings, the system BIOS handles the UMC RAS. The 
 ROCm-supplied GPU driver does not manage this setting.
 This results in a value of ``DISABLED`` for the ``UMC RAS`` setting. 
 .. image:: ../../data/how-to/tuning-guides/mi300a-rocm-smi-showhw-output.png
   :alt: Output from the ``rocm-smi --showhw`` command
 To see the system structure, the localization of the GPUs in the system, and the 
 fabric connections between the system components, use the ``rocm-smi --showtopo`` command.
 * The first block of the output shows the distance between the GPUs. The weight is a qualitative 
  measure of the “distance” data must travel to reach one GPU from another. 
  While the values do not have a precise physical meaning, the higher the value the 
  more hops are required to reach the destination from the source GPU.
 * The second block contains a matrix named “Hops between two GPUs”, where ``1`` means 
  the two GPUs are directly connected with XGMI, ``2`` means both GPUs are linked to the 
  same CPU socket and GPU communications go through the CPU, and ``3`` means 
  both GPUs are linked to different CPU sockets so communications go 
  through both CPU sockets.
 * The third block indicates the link types between the GPUs. This can either be 
  ``XGMI`` for AMD Infinity Fabric links or ``PCIE`` for PCIe Gen4 links.
 * The fourth block reveals the localization of a GPU with respect to the NUMA organization 
  of the shared memory of the AMD EPYC processors.
 .. image:: ../../data/how-to/tuning-guides/mi300a-rocm-smi-showtopo-output.png
   :alt: Output from the ``rocm-smi --showtopo`` command
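 The three queries described in this section can be run together. This sketch wraps them in a
 hypothetical function, ``query_rocm_smi``, that first checks whether ``rocm-smi`` is on the
 ``PATH``; the options used are the ones named above.

 .. code-block:: shell

   # Hypothetical wrapper: run the rocm-smi queries from this section if the tool is installed.
   query_rocm_smi() {
     if ! command -v rocm-smi >/dev/null 2>&1; then
       echo "rocm-smi not found; check that ROCm is installed and on PATH"
       return 1
     fi
     rocm-smi              # per-socket summary
     rocm-smi --showhw     # GPUs, device IDs, and firmware details
     rocm-smi --showtopo   # weights, hops, link types, and NUMA placement
   }

   query_rocm_smi || true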
 Testing inter-device bandwidth
 -----------------------------------
 The ``rocm-smi --showtopo`` command from the :ref:`hw-verification-rocm-label` section 
 displays the system structure and shows how the GPUs are located and connected within this
 structure. For more information, use the :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`, which can run benchmarks to
 show the effective link bandwidth between the system components.
 For information on how to install the ROCm Bandwidth Test, see :doc:`Building the environment <rocm_bandwidth_test:install/install>`.
 The output lists the available compute devices (CPUs and GPUs), including
 their device ID and PCIe ID:
 .. image:: ../../data/how-to/tuning-guides/mi300a-rocm-bandwidth-test-output.png
   :alt: Output from the rocm-bandwidth-test utility
 It also displays the measured bandwidth for unidirectional and
 bidirectional transfers between the devices on the CPU and GPU:
 .. image:: ../../data/how-to/tuning-guides/mi300a-rocm-peak-bandwidth-output.png
   :alt: Bandwidth information from the rocm-bandwidth-test utility
 Abbreviations
 =============
 APBDIS
  Algorithmic Performance Boost Disable
 APU
  Accelerated processing unit
 BAR
  Base Address Register
 BIOS
  Basic Input/Output System
 CBS
  Common BIOS Settings
 CCD
  Compute Core Die
 CDNA
  Compute DNA
 CLI
  Command Line Interface
 CPU
  Central Processing Unit
 cTDP
  Configurable Thermal Design Power
 DF
  Data Fabric
 DMA
  Direct Memory Access
 GPU
  Graphics Processing Unit
 GRUB
  Grand Unified Bootloader
 HBM
  High Bandwidth Memory
 HPC
  High Performance Computing
 IOMMU
  Input-Output Memory Management Unit
 ISA
  Instruction Set Architecture
 NBIO
  North Bridge Input/Output
 NUMA
  Non-Uniform Memory Access
 OOM
  Out of Memory
 PCI
  Peripheral Component Interconnect
 PCIe
  PCI Express
 POR
  Power-On Reset
 RAS
  Reliability, availability and serviceability
 SMI
  System Management Interface
 SMT
  Simultaneous Multi-threading
 SOC
  System On Chip
 SR-IOV
  Single Root I/O Virtualization
 TSME
  Transparent Secure Memory Encryption
 UMC
  Unified Memory Controller
 VRAM
  Video RAM
 xGMI
  Inter-chip Global Memory Interconnect 
--- a/docs/how-to/system-optimization/mi300x.rst
+++ b/docs/how-to/system-optimization/mi300x.rst
@@ -473,6 +473,18 @@ It is recommended to set the following environment variable:
   This is the default option as of ROCm 6.2.
 Change affinity of ROCm helper threads
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 This change prevents internal ROCm threads from having their CPU core affinity mask 
 set to all CPU cores available. With this setting, the threads inherit their parent's 
 CPU core affinity mask. If you have any questions regarding this setting, 
 contact your MI300X platform vendor. To enable this setting, enter the following command:
 .. code-block:: shell
   export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0 
 IOMMU configuration -- systems with 256 CPU threads
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -695,8 +707,8 @@ Bidirectional bandwidth
   Bidirectional bandwidth
-Acronyms
+Abbreviations
-========
+=============
 AMI
  American Megatrends International
--- a/docs/how-to/tuning-guides/mi300x/workload.rst
+++ b/docs/how-to/tuning-guides/mi300x/workload.rst
@@ -162,12 +162,12 @@ tools available depending on their specific profiling needs.
 * ROCProfiler tool collects kernel execution performance
  metrics. For more information, see the
-  `ROCProfiler <https://rocm.docs.amd.com/projects/rocprofiler/en/latest/rocprofv1.html>`_
+  :doc:`ROCProfiler <rocprofiler:index>`
  documentation.
 * Omniperf builds upon ROCProfiler but provides more guided analysis.
  For more information, see
-  `Omniperf documentation <https://rocm.github.io/omniperf/>`_.
+  :doc:`Omniperf documentation <omniperf:index>`.
 Refer to :doc:`/how-to/llm-fine-tuning-optimization/profiling-and-debugging`
 to explore commonly used profiling tools and their usage patterns.
--- a/docs/index.md
+++ b/docs/index.md
@@ -5,53 +5,35 @@
  reference, ROCm, AMD">
 </head>
-# AMD ROCm™ documentation
+# AMD ROCm documentation
-Welcome to the ROCm docs home page! If you're new to ROCm, you can review the following
+ROCm is an open-source software platform optimized to extract HPC and AI workload
-resources to learn more about our products and what we support:
+performance from AMD Instinct accelerators and AMD Radeon GPUs while maintaining
 compatibility with industry software frameworks. For more information, see [What is ROCm?](./what-is-rocm.rst)
-* [What is ROCm?](./what-is-rocm.rst)
+If you're using Radeon GPUs, consider reviewing {doc}`Radeon-specific ROCm documentation<radeon:index>`.
 * [Release notes](./about/release-notes.md)
-You can install ROCm on our Radeon™, Radeon™ PRO, and Instinct™ GPUs. If you're using Radeon
+Installation instructions are available from:
 GPUs, we recommend reading the
 {doc}`Radeon-specific ROCm documentation<radeon:index>`.
-For hands-on applications, refer to our [ROCm blogs](https://rocm.blogs.amd.com/) site.
+* {doc}`ROCm installation for Linux<rocm-install-on-linux:index>`
 * {doc}`HIP SDK installation for Windows<rocm-install-on-windows:index>`
 * [Deep learning frameworks installation](./how-to/deep-learning-rocm.rst)
 * [Build ROCm from source](./how-to/build-rocm.rst)
-Our documentation is organized into the following categories:
+ROCm documentation is organized into the following categories:
 ::::{grid} 1 2 2 2
 :class-container: rocm-doc-grid
 :::{grid-item-card}
-:img-top: ./data/banner-installation.jpg
+:class-card: sd-text-black
 :img-alt: Install documentation
 :padding: 2
 * Linux
  * {doc}`Quick start guide<rocm-install-on-linux:tutorial/quick-start>`
  * {doc}`Linux install guide<rocm-install-on-linux:how-to/native-install/index>`
  * {doc}`Package manager integration<rocm-install-on-linux:how-to/native-install/package-manager-integration>`
  * {doc}`Install Docker containers<rocm-install-on-linux:how-to/docker>`
  * {doc}`ROCm & Spack<rocm-install-on-linux:how-to/spack>`
 * Windows
  * {doc}`Windows install guide<rocm-install-on-windows:how-to/install>`
  * {doc}`Application deployment guidelines<rocm-install-on-windows:conceptual/deployment-guidelines>`
 * [Deep learning frameworks](./how-to/deep-learning-rocm.rst)
  * {doc}`PyTorch for ROCm<rocm-install-on-linux:how-to/3rd-party/pytorch-install>`
  * {doc}`TensorFlow for ROCm<rocm-install-on-linux:how-to/3rd-party/tensorflow-install>`
  * {doc}`JAX for ROCm<rocm-install-on-linux:how-to/3rd-party/jax-install>`
 :::
 :::{grid-item-card}
 :img-top: ./data/banner-compatibility.jpg
 :img-alt: Compatibility information
 :padding: 2
 * [Compatibility matrix](./compatibility/compatibility-matrix.rst)
-* {doc}`System requirements (Linux)<rocm-install-on-linux:reference/system-requirements>`
+* {doc}`Linux system requirements<rocm-install-on-linux:reference/system-requirements>`
-* {doc}`System requirements (Windows)<rocm-install-on-windows:reference/system-requirements>`
+* {doc}`Windows system requirements<rocm-install-on-windows:reference/system-requirements>`
 * {doc}`Third-party support<rocm-install-on-linux:reference/3rd-party-support-matrix>`
 * {doc}`User/kernel space<rocm-install-on-linux:reference/user-kernel-space-compat-matrix>`
 * {doc}`Docker<rocm-install-on-linux:reference/docker-image-support-matrix>`
@@ -60,28 +42,8 @@ Our documentation is organized into the following categories:
 * {doc}`ROCm on Radeon GPUs<radeon:index>`
 :::
 <!-- markdownlint-disable MD051 -->
 :::{grid-item-card}
 :img-top: ./data/banner-reference.jpg
 :img-alt: Reference documentation
 :padding: 2
 * [API libraries](./reference/api-libraries.md)
  * [Artificial intelligence](#artificial-intelligence-apis)
  * [C++ primitives](#cpp-primitives)
  * [Communication](#communication-libraries)
  * [Math](#math-apis)
  * [Random number generators](#random-number-apis)
  * [HIP runtime](#hip-runtime)
 * [Tools](./reference/rocm-tools.md)
  * [Development](#development-tools)
  * [Performance analysis](#performance-tools)
  * [System](#system-tools)
 * [Hardware specifications](./reference/gpu-arch-specs.rst)
 :::
 <!-- markdownlint-enable MD051 -->
 :::{grid-item-card}
 :class-card: sd-text-black
 :img-top: ./data/banner-howto.jpg
 :img-alt: How-to documentation
 :padding: 2
@@ -91,6 +53,7 @@ Our documentation is organized into the following categories:
 * [Fine-tuning LLMs and inference optimization](./how-to/llm-fine-tuning-optimization/index.rst)
 * [System optimization](./how-to/system-optimization/index.rst)
  * [AMD Instinct MI300X](./how-to/system-optimization/mi300x.rst)
  * [AMD Instinct MI300A](./how-to/system-optimization/mi300a.rst)
  * [AMD Instinct MI200](./how-to/system-optimization/mi200.md)
  * [AMD Instinct MI100](./how-to/system-optimization/mi100.md)
  * [AMD Instinct RDNA2](./how-to/system-optimization/w6000-v620.md)
@@ -99,23 +62,18 @@ Our documentation is organized into the following categories:
  * [Workload tuning](./how-to/tuning-guides/mi300x/workload.rst)
 * [System debugging](./how-to/system-debugging.md)
 * [GPU-enabled MPI](./how-to/gpu-enabled-mpi.rst)
-* [Using compiler features](./conceptual/compiler-topics.md)
+* [Using advanced compiler features](./conceptual/compiler-topics.md)
  * [Using AddressSanitizer](./conceptual/using-gpu-sanitizer.md)
  * [Compiler disambiguation](./conceptual/compiler-disambiguation.md)
  * [OpenMP support in ROCm](./about/compatibility/openmp.md)
 * [Setting the number of CUs](./how-to/setting-cus)  
 * [GitHub examples](https://github.com/amd/rocm-examples)
 :::
 :::{grid-item-card}
 :class-card: sd-text-black
 :img-top: ./data/banner-conceptual.jpg
 :img-alt: Conceptual documentation
 :padding: 2
 * [GPU architecture](./conceptual/gpu-arch.md)
  * [MI100](./conceptual/gpu-arch/mi100.md)
  * [MI250](./conceptual/gpu-arch/mi250.md)
  * [MI300](./conceptual/gpu-arch/mi300.md)
 * [GPU memory](./conceptual/gpu-memory.md)
 * [File structure (Linux FHS)](./conceptual/file-reorg.md)
 * [GPU isolation techniques](./conceptual/gpu-isolation.md)
@@ -125,4 +83,23 @@ Our documentation is organized into the following categories:
 * [Inference optimization with MIGraphX](./conceptual/ai-migraphx-optimization.md)
 :::
 <!-- markdownlint-disable MD051 -->
 :::{grid-item-card}
 :class-card: sd-text-black
 :img-top: ./data/banner-reference.jpg
 :img-alt: Reference documentation
 :padding: 2
 * [Libraries](./reference/api-libraries.md)
  * [Artificial intelligence](#artificial-intelligence-apis)
  * [C++ primitives](#cpp-primitives)
  * [Communication](#communication-libraries)
  * [Math](#math-apis)
  * [Random number generators](#random-number-apis)
  * [HIP runtime](#hip-runtime)
 * [ROCm tools and compilers](./reference/rocm-tools.md)
 * [GPU hardware specifications](./reference/gpu-arch-specs.rst)
 :::
 <!-- markdownlint-enable MD051 -->
 ::::
--- a/docs/reference/api-libraries.md
+++ b/docs/reference/api-libraries.md
@@ -6,7 +6,7 @@
  algebra, AMD">
 </head>
-# ROCm API libraries
+# ROCm libraries
 ::::{grid} 1 2 2 2
 :class-container: rocm-doc-grid
--- a/docs/reference/rocm-tools.md
+++ b/docs/reference/rocm-tools.md
@@ -6,40 +6,11 @@
  algebra, AMD">
 </head>
-# ROCm tools
+# ROCm tools, compilers, and runtimes
 ::::{grid} 1 2 2 2
 :class-container: rocm-doc-grid
 (development-tools)=
 :::{grid-item-card}
 :class-card: sd-text-black
 :img-top: ../data/reference/banner-development.jpg
 :img-alt: Development tools
 :padding: 2
 * {doc}`HIPIFY <hipify:index>`
 * {doc}`ROCdbgapi <rocdbgapi:index>`
 * [ROCmCC](./rocmcc.md)
 * {doc}`ROCm Debugger (ROCgdb) <rocgdb:index>`
 * {doc}`ROCr Debug Agent <rocr_debug_agent:index>`
 :::
 (performance-tools)=
 :::{grid-item-card}
 :class-card: sd-text-black
 :img-top: ../data/reference/banner-performance.jpg
 :img-alt: Performance tools
 :padding: 2
 * {doc}`ROCm Bandwidth Test <rocm_bandwidth_test:index>`
 * {doc}`ROCProfiler <rocprofiler:profiler_home_page>`
 * [rocprofiler-register](https://github.com/ROCm/rocprofiler-register)
 * {doc}`ROCTracer <roctracer:index>`
 :::
 (system-tools)=
 :::{grid-item-card}
@@ -49,10 +20,67 @@
 :padding: 2
 * {doc}`AMD SMI <amdsmi:index>`
 * {doc}`rocminfo <rocminfo:index>`
 * {doc}`ROCm Data Center Tool <rdc:index>`
 * {doc}`rocminfo <rocminfo:index>`
 * {doc}`ROCm SMI <rocm_smi_lib:index>`
 * {doc}`ROCm Validation Suite <rocmvalidationsuite:index>`
 :::
 (performance-tools)=
 :::{grid-item-card}
 :class-card: sd-text-black
 :img-top: ../data/reference/banner-performance.jpg
 :img-alt: Performance tools
 :padding: 2
 * {doc}`Omniperf <omniperf:index>`
 * {doc}`Omnitrace <omnitrace:index>`
 * {doc}`ROCm Bandwidth Test <rocm_bandwidth_test:index>`
 * {doc}`ROCProfiler <rocprofiler:index>`
 * {doc}`ROCprofiler-SDK <rocprofiler-sdk:index>`
 * {doc}`ROCTracer <roctracer:index>`
 :::
 (development-tools)=
 :::{grid-item-card}
 :class-card: sd-text-black
 :img-top: ../data/reference/banner-development.jpg
 :img-alt: Development tools
 :padding: 2
 * {doc}`ROCm CMake <rocmcmakebuildtools:index>`
 * {doc}`HIPIFY <hipify:index>`
 * {doc}`ROCdbgapi <rocdbgapi:index>`
 * {doc}`ROCm Debugger (ROCgdb) <rocgdb:index>`
 * {doc}`ROCr Debug Agent <rocr_debug_agent:index>`
 :::
 (compilers)=
 :::{grid-item-card}
 :class-card: sd-text-black
 :img-top: ../data/reference/banner-compilers.jpg
 :img-alt: Compilers
 :padding: 2
 * {doc}`ROCm Compilers <llvm-project:index>`
 * {doc}`HIPCC <hipcc:index>`
 * [FLANG](https://github.com/ROCm/flang/)
 :::
 (runtimes)=
 :::{grid-item-card}
 :class-card: sd-text-black
 :img-top: ../data/reference/banner-runtimes.jpg
 :img-alt: Runtimes
 :padding: 2
 * {doc}`AMD Common Language Runtime (CLR) <hip:understand/amd_clr>`
 * {doc}`HIP <hip:index>`
 * {doc}`ROCR-Runtime <rocr-runtime:index>`
 :::
 ::::
--- a/docs/reference/rocmcc.md
+++ b/docs/reference/rocmcc.md
--- a/docs/sphinx/_toc.yml.in
+++ b/docs/sphinx/_toc.yml.in
@@ -9,12 +9,6 @@ subtrees:
  - file: what-is-rocm.rst
  - file: about/release-notes.md
    title: Release notes
    subtrees:
    - entries:
      - file: about/changelog.md
        title: Changelog
  - url: https://github.com/ROCm/ROCm/labels/Verified%20Issue
    title: Known issues
 - caption: Install
  entries:
@@ -24,28 +18,8 @@ subtrees:
    title: HIP SDK on Windows
  - file: how-to/deep-learning-rocm.md
    title: Deep learning frameworks
-
+  - file: how-to/build-rocm.rst
- caption: Compatibility
+    title: Build ROCm from source
  entries:
  - file: compatibility/compatibility-matrix.rst
    title: Compatibility matrix
  - url: https://rocm.docs.amd.com/projects/install-on-linux/en/${branch}/reference/system-requirements.html
    title: Linux
  - url: https://rocm.docs.amd.com/projects/install-on-windows/en/${branch}/reference/system-requirements.html
    title: Windows
  - file: compatibility/precision-support.rst
    title: Precision support
  - url: https://rocm.docs.amd.com/projects/install-on-linux/en/${branch}/reference/3rd-party-support-matrix.html
    title: Third-party
 - caption: Reference
  entries:
    - file: reference/api-libraries.md
      title: API libraries
    - file: reference/rocm-tools.md
      title: Tools
    - file: reference/gpu-arch-specs.rst
      title: Hardware specifications
 - caption: How to
  entries:
@@ -87,6 +61,8 @@ subtrees:
    - entries:
      - file: how-to/system-optimization/mi300x.rst
        title: AMD Instinct MI300X
      - file: how-to/system-optimization/mi300a.rst
        title: AMD Instinct MI300A
      - file: how-to/system-optimization/mi200.md
        title: AMD Instinct MI200
      - file: how-to/system-optimization/mi100.md
@@ -105,19 +81,32 @@ subtrees:
  - file: how-to/gpu-enabled-mpi.rst
    title: Using MPI
  - file: conceptual/compiler-topics.md
-    title: Using compiler features
+    title: Using advanced compiler features
    subtrees:
    - entries:
-      - file: conceptual/using-gpu-sanitizer.md
+      - url: https://rocm.docs.amd.com/projects/llvm-project/en/latest/index.html
        title: ROCm compiler infrastructure
      - url: https://rocm.docs.amd.com/projects/llvm-project/en/latest/conceptual/using-gpu-sanitizer.html
        title: Using AddressSanitizer
-      - file: conceptual/compiler-disambiguation.md
+      - url: https://rocm.docs.amd.com/projects/llvm-project/en/latest/conceptual/openmp.html
        title: Compiler disambiguation
      - file: about/compatibility/openmp.md
        title: OpenMP support
  - file: how-to/setting-cus
    title: Setting the number of CUs  
  - url: https://github.com/amd/rocm-examples
-    title: GitHub examples
+    title: ROCm examples
 - caption: Compatibility
  entries:
  - file: compatibility/compatibility-matrix.rst
    title: Compatibility matrix
  - url: https://rocm.docs.amd.com/projects/install-on-linux/en/${branch}/reference/system-requirements.html
    title: Linux
  - url: https://rocm.docs.amd.com/projects/install-on-windows/en/${branch}/reference/system-requirements.html
    title: Windows
  - file: compatibility/precision-support.rst
    title: Precision support
  - url: https://rocm.docs.amd.com/projects/install-on-linux/en/${branch}/reference/3rd-party-support-matrix.html
    title: Third-party
 - caption: Conceptual
  entries:
@@ -166,6 +155,15 @@ subtrees:
  - file: conceptual/ai-migraphx-optimization.md
    title: Inference optimization with MIGraphX
 - caption: Reference
  entries:
    - file: reference/api-libraries.md
      title: ROCm libraries
    - file: reference/rocm-tools.md
      title: ROCm tools, compilers, and runtimes
    - file: reference/gpu-arch-specs.rst
      title: Hardware specifications
 - caption: Contribute
  entries:
  - file: contribute/contributing.md
--- a/docs/sphinx/requirements.in
+++ b/docs/sphinx/requirements.in
@@ -1,2 +1,2 @@
 rocm-docs-core==1.6.1
-sphinx-reredirects
+sphinx-reredirects
--- a/docs/sphinx/requirements.txt
+++ b/docs/sphinx/requirements.txt
@@ -36,7 +36,7 @@ docutils==0.21.2
    #   myst-parser
    #   pydata-sphinx-theme
    #   sphinx
-fastjsonschema==2.19.1
+fastjsonschema==2.20.0
    # via rocm-docs-core
 gitdb==4.0.11
    # via gitpython
@@ -62,13 +62,13 @@ mdurl==0.1.2
    # via markdown-it-py
 myst-parser==3.0.1
    # via rocm-docs-core
-packaging==24.0
+packaging==24.1
    # via
    #   pydata-sphinx-theme
    #   sphinx
 pycparser==2.22
    # via cffi
-pydata-sphinx-theme==0.15.3
+pydata-sphinx-theme==0.15.4
    # via
    #   rocm-docs-core
    #   sphinx-book-theme
@@ -112,7 +112,7 @@ sphinx==7.3.7
    #   sphinx-external-toc
    #   sphinx-notfound-page
    #   sphinx-reredirects
-sphinx-book-theme==1.1.2
+sphinx-book-theme==1.1.3
    # via rocm-docs-core
 sphinx-copybutton==0.5.2
    # via rocm-docs-core
@@ -138,7 +138,7 @@ sphinxcontrib-serializinghtml==1.1.10
    # via sphinx
 tomli==2.0.1
    # via sphinx
-typing-extensions==4.12.1
+typing-extensions==4.12.2
    # via
    #   pydata-sphinx-theme
    #   pygithub
--- a/docs/sphinx/static/css/rocm_custom.css
+++ b/docs/sphinx/static/css/rocm_custom.css
@@ -1,6 +1,21 @@
 /* Override PyData Sphinx Theme default colors */
 html[data-theme='light'] {
    --pst-color-table-row-hover-bg: #E2E8F0;
 }
 html[data-theme='dark'] {
    --pst-color-table-row-hover-bg: #1E293B;
 }
 a svg {
  color: var(--pst-color-text-base);
 }
 a svg:hover {
  color: var(--pst-color-link-hover);
 }
 /* Adds container for big tables, used for Compatibility Matrix */
 .format-big-table {
    white-space: nowrap;
-  }
+}
--- a/docs/sphinx/static/css/rocm_rn.css
+++ b/docs/sphinx/static/css/rocm_rn.css
@@ -0,0 +1,126 @@
 #rocm-rn-components col {
  width: 6rem;
 }
 #rocm-rn-components col:nth-child(2) {
  width: 12rem;
 }
 #rocm-rn-components td {
  white-space: nowrap;
 }
 #rocm-rn-components td:last-of-type {
  text-align: center;
 }
 #rocm-rn-components a svg {
  color: var(--pst-color-text-base);
 }
 #rocm-rn-components a svg:hover {
  color: var(--pst-color-link-hover);
 }
 #rocm-rn-components .tbody-reverse-zebra tr:nth-child(2n + 1) td {
  background-color: var(--pst-color-table-row-zebra-high-bg);
 }
 #rocm-rn-components .tbody-reverse-zebra tr:nth-child(2n) td {
  background-color: var(--pst-color-table-row-zebra-low-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-libs th[rowspan]:first-of-type:hover) .rocm-components-libs,
 #rocm-rn-components:has(tbody.rocm-components-libs th[rowspan]:first-of-type:hover) .rocm-components-libs td,
 #rocm-rn-components:has(tbody.rocm-components-libs th[rowspan]:first-of-type:hover) tbody.rocm-components-libs th {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) .rocm-components-tools,
 #rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) .rocm-components-tools td,
 #rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) tbody.rocm-components-tools th {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-compilers th[rowspan]:first-of-type:hover) .rocm-components-compilers,
 #rocm-rn-components:has(tbody.rocm-components-compilers th[rowspan]:first-of-type:hover) .rocm-components-compilers td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-runtimes th[rowspan]:first-of-type:hover) .rocm-components-runtimes,
 #rocm-rn-components:has(tbody.rocm-components-runtimes th[rowspan]:first-of-type:hover) .rocm-components-runtimes td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-tools th[rowspan]:first-of-type:hover) .rocm-components-tools td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-compilers th[rowspan]:first-of-type:hover) .rocm-components-compilers td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-runtimes th[rowspan]:first-of-type:hover) .rocm-components-runtimes td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-ml th[rowspan]:nth-of-type(2):hover) .rocm-components-ml td,
 #rocm-rn-components:has(tbody.rocm-components-ml th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-communication th[rowspan]:nth-of-type(2):hover) .rocm-components-communication td,
 #rocm-rn-components:has(tbody.rocm-components-communication th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-math th[rowspan]:nth-of-type(2):hover) .rocm-components-math td,
 #rocm-rn-components:has(tbody.rocm-components-math th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-primitives th[rowspan]:nth-of-type(2):hover) .rocm-components-primitives td,
 #rocm-rn-components:has(tbody.rocm-components-primitives th[rowspan]:nth-of-type(2):hover) .rocm-components-libs th:first-of-type {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-dev th[rowspan]:nth-of-type(2):hover) .rocm-components-dev td,
 #rocm-rn-components:has(tbody.rocm-components-dev th[rowspan]:nth-of-type(2):hover) .rocm-components-tools th:first-of-type {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-perf th[rowspan]:nth-of-type(2):hover) .rocm-components-perf td,
 #rocm-rn-components:has(tbody.rocm-components-perf th[rowspan]:nth-of-type(2):hover) .rocm-components-tools th:first-of-type {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-system th[rowspan]:nth-of-type(2):hover) .rocm-components-system td,
 #rocm-rn-components:has(tbody.rocm-components-system th[rowspan]:nth-of-type(2):hover) .rocm-components-tools th:first-of-type {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-ml td:hover) .rocm-components-ml th,
 #rocm-rn-components:has(tbody.rocm-components-ml td:hover) .rocm-components-libs th:first-of-type,
 #rocm-rn-components:has(tbody.rocm-components-ml td:hover) tr:hover > td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-communication td:hover) .rocm-components-communication th,
 #rocm-rn-components:has(tbody.rocm-components-communication td:hover) .rocm-components-libs th:first-of-type,
 #rocm-rn-components:has(tbody.rocm-components-communication td:hover) tr:hover > td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-math td:hover) .rocm-components-math th,
 #rocm-rn-components:has(tbody.rocm-components-math td:hover) .rocm-components-libs th:first-of-type,
 #rocm-rn-components:has(tbody.rocm-components-math td:hover) tr:hover > td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-primitives td:hover) .rocm-components-primitives th,
 #rocm-rn-components:has(tbody.rocm-components-primitives td:hover) .rocm-components-libs th:first-of-type,
 #rocm-rn-components:has(tbody.rocm-components-primitives td:hover) tr:hover > td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-dev td:hover) .rocm-components-dev th,
 #rocm-rn-components:has(tbody.rocm-components-dev td:hover) .rocm-components-tools th:first-of-type,
 #rocm-rn-components:has(tbody.rocm-components-dev td:hover) tr:hover > td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-perf td:hover) .rocm-components-perf th,
 #rocm-rn-components:has(tbody.rocm-components-perf td:hover) .rocm-components-tools th:first-of-type,
 #rocm-rn-components:has(tbody.rocm-components-perf td:hover) tr:hover > td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-system td:hover) .rocm-components-system th,
 #rocm-rn-components:has(tbody.rocm-components-system td:hover) .rocm-components-tools th:first-of-type,
 #rocm-rn-components:has(tbody.rocm-components-system td:hover) tr:hover > td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-compilers td:hover) .rocm-components-compilers th:first-of-type,
 #rocm-rn-components:has(tbody.rocm-components-compilers td:hover) tr:hover > td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
 #rocm-rn-components:has(tbody.rocm-components-runtimes td:hover) .rocm-components-runtimes th:first-of-type,
 #rocm-rn-components:has(tbody.rocm-components-runtimes td:hover) tr:hover > td {
  background-color: var(--pst-color-table-row-hover-bg);
 }
--- a/docs/what-is-rocm.rst
+++ b/docs/what-is-rocm.rst
@@ -1,4 +1,4 @@
-.. meta::
+ .. meta::
  :description: What is ROCm
  :keywords: ROCm components, ROCm projects, introduction, ROCm, AMD, runtimes, compilers, tools, libraries, API
@@ -10,7 +10,7 @@ ROCm is an open-source stack, composed primarily of open-source software, design
 graphics processing unit (GPU) computation. ROCm consists of a collection of drivers, development
 tools, and APIs that enable GPU programming from low-level kernel to end-user applications.
-.. image:: data/rocm-software-stack-6_1_0.jpg
+.. image:: data/rocm-software-stack-6_2_0.jpg
  :width: 800
  :alt: AMD's ROCm software stack and neighboring technologies.
  :align: center
@@ -44,9 +44,10 @@ Machine Learning & Computer Vision
  ":doc:`MIGraphX <amdmigraphx:index>`", "Graph inference engine that accelerates machine learning model inference"
  ":doc:`MIOpen <miopen:index>`", "An open source deep-learning library"
  ":doc:`MIVisionX <mivisionx:index>`", "Set of comprehensive computer vision and machine learning libraries, utilities, and applications"
  ":doc:`ROCm Performance Primitives (RPP) <rpp:index>`", "Comprehensive high-performance computer vision library for AMD processors with HIP/OpenCL/CPU back-ends"
  ":doc:`rocAL <rocal:index>`", "An augmentation library designed to decode and process images and videos"
  ":doc:`rocDecode <rocdecode:index>`", "High-performance SDK for access to video decoding features on AMD GPUs"
-  ":doc:`ROCm Performance Primitives (RPP) <rpp:index>`", "Comprehensive high-performance computer vision library for AMD processors with HIP/OpenCL/CPU back-ends"
+  ":doc:`rocPyDecode <rocpydecode:index>`", "Provides access to rocDecode APIs in both Python and C/C++ languages"
 Communication
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -94,22 +95,41 @@ Primitives
 Tools
 -----------------------------------------------
 System Management
 ^^^^^^^^^^^^^^^^^
 .. csv-table::
  :header: "Component", "Description"
  ":doc:`AMD SMI <amdsmi:index>`", "C library for Linux that provides a user space interface for applications to monitor and control AMD devices"
  ":doc:`HIPIFY <hipify:index>`", "Translates CUDA source code into portable HIP C++"
  ":doc:`ROCdbgapi <rocdbgapi:index>`", "ROCm debugger API library"
  ":doc:`ROCm compilers <./reference/rocmcc>`", "Clang/LLVM-based compiler"
  ":doc:`rocminfo <rocminfo:index>`", "Reports system information"
  ":doc:`ROCProfiler <rocprofiler:index>`", "Profiling tool for HIP applications"
  ":doc:`ROCTracer <roctracer:index>`", "Intercepts runtime API calls and traces asynchronous activity"
  ":doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`", "Captures the performance characteristics of buffer copying and kernel read/write operations"
  ":doc:`ROCm CMake <rocmcmakebuildtools:index>`", "Collection of CMake modules for common build and development tasks"
  ":doc:`ROCm Data Center Tool <rdc:index>`", "Simplifies administration and addresses key infrastructure challenges in AMD GPUs in cluster and data-center environments"
-  ":doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`", "Source-level debugger for Linux, based on the GNU Debugger (GDB)"
+  ":doc:`rocminfo <rocminfo:index>`", "Reports system information"
  ":doc:`ROCm SMI <rocm_smi_lib:index>`", "C library for Linux that provides a user space interface for applications to monitor and control GPU applications"
  ":doc:`ROCm Validation Suite <rocmvalidationsuite:index>`", "Detects and troubleshoots common problems affecting AMD GPUs running in a high-performance computing environment"
 Performance
 ^^^^^^^^^^^
 .. csv-table::
  :header: "Component", "Description"
  ":doc:`Omniperf <omniperf:index>`", "System performance profiling tool for machine learning and HPC workloads"
  ":doc:`Omnitrace <omnitrace:index>`", "Comprehensive profiling and tracing tool for HIP applications"
  ":doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`", "Captures the performance characteristics of buffer copying and kernel read/write operations"
  ":doc:`ROCProfiler <rocprofiler:index>`", "Profiling tool for HIP applications"
  ":doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`", "Toolkit for developing analysis tools for profiling and tracing GPU compute applications. This toolkit is in beta and subject to change"
  ":doc:`ROCTracer <roctracer:index>`", "Intercepts runtime API calls and traces asynchronous activity"
 Development
 ^^^^^^^^^^^
 .. csv-table::
  :header: "Component", "Description"
  ":doc:`HIPIFY <hipify:index>`", "Translates CUDA source code into portable HIP C++"
  ":doc:`ROCm CMake <rocmcmakebuildtools:index>`", "Collection of CMake modules for common build and development tasks"
  ":doc:`ROCdbgapi <rocdbgapi:index>`", "ROCm debugger API library"
  ":doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`", "Source-level debugger for Linux, based on the GNU Debugger (GDB)"
  ":doc:`ROCr Debug Agent <rocr_debug_agent:index>`", "Prints the state of all AMD GPU wavefronts that caused a queue error by sending a SIGQUIT signal to the process while the program is running"
 Compilers
@@ -118,9 +138,9 @@ Compilers
 .. csv-table::
  :header: "Component", "Description"
  ":doc:`HIPCC <hipcc:index>`", "Compiler driver utility that calls Clang or NVCC and passes the appropriate include and library options for the target compiler and HIP infrastructure"
  ":doc:`ROCm compilers <llvm-project:index>`", "ROCm LLVM compiler infrastructure"
  "`FLANG <https://github.com/ROCm/flang/>`_", "An out-of-tree Fortran compiler targeting LLVM"
  ":doc:`hipCC <hipcc:index>`", "Compiler driver utility that calls Clang or NVCC and passes the appropriate include and library options for the target compiler and HIP infrastructure"
  "`LLVM (amdclang) <https://github.com/ROCm/llvm-project>`_ ", "Toolkit for the construction of highly optimized compilers, optimizers, and runtime environments"
 Runtimes
 -----------------------------------------------
--- a/temp.md
+++ b/temp.md
@@ -0,0 +1,58 @@
 ## Components
 The following table lists ROCm components and their individual versions for ROCm 6.2.0. Find an overview of officially
 supported versions of ROCm components, third-party libraries, and frameworks in the
 [Compatibility matrix](https://rocm.docs.amd.com/en/latest/release/docs/6.2.0/compatibility/compatibility-matrix).
 | Category | Group | Name | Version |   |
 |----------|-------|------|---------|:-:|
 | **Libraries** | **Machine learning and computer vision** | [Composable Kernel](https://rocm.docs.amd.com/projects/composable_kernel/en/docs/6.2.0) | 1.1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/composable_kernel/releases/tag/rocm-6.2.0) |
 |  |  | [MIGraphX](https://rocm.docs.amd.com/projects/AMDMIGraphX/en/docs/6.2.0) | 2.9&nbsp;&Rightarrow;&nbsp;[2.10](migraphx-2-10-0) | [{fab}`github fa-lg`](https://github.com/ROCm/AMDMIGraphX/releases/tag/rocm-6.2.0) |
 |  |  | [MIOpen](https://rocm.docs.amd.com/projects/MIOpen/en/docs/6.2.0) | 3.1.0&nbsp;&Rightarrow;&nbsp;[3.2.0](miopen-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/MIOpen/releases/tag/rocm-6.2.0) |
 |  |  | [MIVisionX](https://rocm.docs.amd.com/projects/MIVisionX/en/docs/6.2.0) | 2.5.0&nbsp;&Rightarrow;&nbsp;[3.0.0](mivisionx-3-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/MIVisionX/releases/tag/rocm-6.2.0) |
 |  |  | [rocAL](https://rocm.docs.amd.com/projects/rocAL/en/docs/6.2.0) | 2.0.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocAL/releases/tag/rocm-6.2.0) |
 |  |  | [rocDecode](https://rocm.docs.amd.com/projects/rocDecode/en/docs/6.2.0) | 0.6.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocDecode/releases/tag/rocm-6.2.0) |
 |  |  | [rocPyDecode](https://rocm.docs.amd.com/projects/rocPyDecode/en/docs/6.2.0) | 0.1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocPyDecode/releases/tag/rocm-6.2.0) |
 |  |  | [RPP](https://rocm.docs.amd.com/projects/rpp/en/docs/6.2.0) | 1.5.0&nbsp;&Rightarrow;&nbsp;[1.8.0](rpp-1-8-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rpp/releases/tag/rocm-6.2.0) |
 |  | **Communication** | [RCCL](https://rocm.docs.amd.com/projects/rccl/en/docs/6.2.0) | 2.18.6&nbsp;&Rightarrow;&nbsp;[2.20.5](rccl-2-20-5) | [{fab}`github fa-lg`](https://github.com/ROCm/rccl/releases/tag/rocm-6.2.0) |
 |  | **Math** | [hipBLAS](https://rocm.docs.amd.com/projects/hipBLAS/en/docs/6.2.0) | 2.1.0&nbsp;&Rightarrow;&nbsp;[2.2.0](hipblas-2-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipBLAS/releases/tag/rocm-6.2.0) |
 |  |  | [hipBLASLt](https://rocm.docs.amd.com/projects/hipBLASLt/en/docs/6.2.0) | 0.7.0&nbsp;&Rightarrow;&nbsp;[0.8.0](hipblaslt-0-8-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipBLASLt/releases/tag/rocm-6.2.0) |
 |  |  | [hipFFT](https://rocm.docs.amd.com/projects/hipFFT/en/docs/6.2.0) | [1.0.14](hipfft-1-0-14) | [{fab}`github fa-lg`](https://github.com/ROCm/hipFFT/releases/tag/rocm-6.2.0) |
 |  |  | [hipfort](https://rocm.docs.amd.com/projects/hipfort/en/docs/6.2.0) | 0.4-0 | [{fab}`github fa-lg`](https://github.com/ROCm/hipfort/releases/tag/rocm-6.2.0) |
 |  |  | [hipRAND](https://rocm.docs.amd.com/projects/hipRAND/en/docs/6.2.0) | 2.10.17&nbsp;&Rightarrow;&nbsp;[2.11.0](hiprand-2-11-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipRAND/releases/tag/rocm-6.2.0) |
 |  |  | [hipSOLVER](https://rocm.docs.amd.com/projects/hipSOLVER/en/docs/6.2.0) | 2.1.1&nbsp;&Rightarrow;&nbsp;[2.2.0](hipsolver-2-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipSOLVER/releases/tag/rocm-6.2.0) |
 |  |  | [hipSPARSE](https://rocm.docs.amd.com/projects/hipSPARSE/en/docs/6.2.0) | 3.0.1&nbsp;&Rightarrow;&nbsp;[3.1.1](hipsparse-3-1-1) | [{fab}`github fa-lg`](https://github.com/ROCm/hipSPARSE/releases/tag/rocm-6.2.0) |
 |  |  | [hipSPARSELt](https://rocm.docs.amd.com/projects/hipSPARSELt/en/docs/6.2.0) | 0.2.0&nbsp;&Rightarrow;&nbsp;[0.2.1](hipsparselt-0-2-1) | [{fab}`github fa-lg`](https://github.com/ROCm/hipSPARSELt/releases/tag/rocm-6.2.0) |
 |  |  | [rocALUTION](https://rocm.docs.amd.com/projects/rocALUTION/en/docs/6.2.0) | 3.1.1&nbsp;&Rightarrow;&nbsp;[3.2.0](rocalution-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocALUTION/releases/tag/rocm-6.2.0) |
 |  |  | [rocBLAS](https://rocm.docs.amd.com/projects/rocBLAS/en/docs/6.2.0) | 4.1.0&nbsp;&Rightarrow;&nbsp;[4.2.0](rocblas-4-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocBLAS/releases/tag/rocm-6.2.0) |
 |  |  | [rocFFT](https://rocm.docs.amd.com/projects/rocFFT/en/docs/6.2.0) | 1.0.27&nbsp;&Rightarrow;&nbsp;[1.0.28](rocfft-1-0-28) | [{fab}`github fa-lg`](https://github.com/ROCm/rocFFT/releases/tag/rocm-6.2.0) |
 |  |  | [rocRAND](https://rocm.docs.amd.com/projects/rocRAND/en/docs/6.2.0) | 3.0.0&nbsp;&Rightarrow;&nbsp;[3.1.0](rocrand-3-1-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocRAND/releases/tag/rocm-6.2.0) |
 |  |  | [rocSOLVER](https://rocm.docs.amd.com/projects/rocSOLVER/en/docs/6.2.0) | 3.25.0&nbsp;&Rightarrow;&nbsp;[3.26.0](rocsolver-3-26-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocSOLVER/releases/tag/rocm-6.2.0) |
 |  |  | [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/docs/6.2.0) | 3.1.1&nbsp;&Rightarrow;&nbsp;[3.2.0](rocsparse-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocSPARSE/releases/tag/rocm-6.2.0) |
 |  |  | [rocWMMA](https://rocm.docs.amd.com/projects/rocWMMA/en/docs/6.2.0) | 1.4.0&nbsp;&Rightarrow;&nbsp;[1.5.0](rocwmma-1-5-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocWMMA/releases/tag/rocm-6.2.0) |
 |  |  | [Tensile](https://rocm.docs.amd.com/projects/tensile/en/docs/6.2.0) | 4.40.0&nbsp;&Rightarrow;&nbsp;[4.41.0](tensile-4-41-0) | [{fab}`github fa-lg`](https://github.com/ROCm/tensile/releases/tag/rocm-6.2.0) |
 |  | **Primitives** | [hipCUB](https://rocm.docs.amd.com/projects/hipCUB/en/docs/6.2.0) | 3.1.0&nbsp;&Rightarrow;&nbsp;[3.2.0](hipcub-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipCUB/releases/tag/rocm-6.2.0) |
 |  |  | [hipTensor](https://rocm.docs.amd.com/projects/hipTensor/en/docs/6.2.0) | 1.2.0&nbsp;&Rightarrow;&nbsp;[1.3.0](hiptensor-1-3-0) | [{fab}`github fa-lg`](https://github.com/ROCm/hipTensor/releases/tag/rocm-6.2.0) |
 |  |  | [rocPRIM](https://rocm.docs.amd.com/projects/rocPRIM/en/docs/6.2.0) | 3.1.0&nbsp;&Rightarrow;&nbsp;[3.2.0](rocprim-3-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocPRIM/releases/tag/rocm-6.2.0) |
 |  |  | [rocThrust](https://rocm.docs.amd.com/projects/rocThrust/en/docs/6.2.0) | 3.0.0&nbsp;&Rightarrow;&nbsp;[3.1.0](rocthrust-3-1-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocThrust/releases/tag/rocm-6.2.0) |
 | **Tools** | **Development** | [HIPIFY](https://rocm.docs.amd.com/projects/HIPIFY/en/docs/6.2.0) | 17.0.0&nbsp;&Rightarrow;&nbsp;[18.0.0](hipify-18-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/HIPIFY/releases/tag/rocm-6.2.0) |
 |  |  | [ROCdbgapi](https://rocm.docs.amd.com/projects/ROCdbgapi/en/docs/6.2.0) | 0.71.0&nbsp;&Rightarrow;&nbsp;[0.76.0](rocdbgapi-0-76-0) | [{fab}`github fa-lg`](https://github.com/ROCm/ROCdbgapi/releases/tag/rocm-6.2.0) |
 |  |  | [ROCm CMake](https://rocm.docs.amd.com/projects/rocm-cmake/en/docs/6.2.0) | 0.12.0&nbsp;&Rightarrow;&nbsp;[0.13.0](rocm-cmake-0-13-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocm-cmake/releases/tag/rocm-6.2.0) |
 |  |  | [ROCm Debugger (ROCgdb)](https://rocm.docs.amd.com/projects/ROCgdb/en/docs/6.2.0) | 13&nbsp;&Rightarrow;&nbsp;[15](rocgdb-15) | [{fab}`github fa-lg`](https://github.com/ROCm/ROCgdb/releases/tag/rocm-6.2.0) |
 |  |  | [ROCr Debug Agent](https://rocm.docs.amd.com/projects/rocr_debug_agent/en/docs/6.2.0) | 2.0.3 | [{fab}`github fa-lg`](https://github.com/ROCm/rocr_debug_agent/releases/tag/rocm-6.2.0) |
 |  | **Performance** | [Omniperf](https://rocm.docs.amd.com/projects/omniperf/en/docs/6.2.0) | 2.0.1 | [{fab}`github fa-lg`](https://github.com/ROCm/omniperf/releases/tag/rocm-6.2.0) |
 |  |  | [Omnitrace](https://rocm.docs.amd.com/projects/omnitrace/en/docs/6.2.0) | 1.11.2 | [{fab}`github fa-lg`](https://github.com/ROCm/omnitrace/releases/tag/rocm-6.2.0) |
 |  |  | [ROCm Bandwidth Test](https://rocm.docs.amd.com/projects/rocm_bandwidth_test/en/docs/6.2.0) | 1.4.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_bandwidth_test/releases/tag/rocm-6.2.0) |
 |  |  | [ROCProfiler](https://rocm.docs.amd.com/projects/ROCProfiler/en/docs/6.2.0) | 2.0.0&nbsp;&Rightarrow;&nbsp;[2.0.0](rocprofiler-2-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rocprofiler/releases/tag/rocm-6.2.0) |
 |  |  | [ROCProfiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/docs/6.2.0) | 0.4.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocprofiler-sdk/releases/tag/rocm-6.2.0) |
 |  |  | [ROCTracer](https://rocm.docs.amd.com/projects/ROCTracer/en/docs/6.2.0) | 4.1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/roctracer/releases/tag/rocm-6.2.0) |
 |  | **System** | [AMD SMI](https://rocm.docs.amd.com/projects/amdsmi/en/docs/6.2.0) | 24.5.2&nbsp;&Rightarrow;&nbsp;[24.6.1](amd-smi-24-6-1) | [{fab}`github fa-lg`](https://github.com/ROCm/amdsmi/releases/tag/rocm-6.2.0) |
 |  |  | [rocminfo](https://rocm.docs.amd.com/projects/rocminfo/en/docs/6.2.0) | 1.0.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocminfo/releases/tag/rocm-6.2.0) |
 |  |  | [ROCm Data Center Tool](https://rocm.docs.amd.com/projects/rdc/en/docs/6.2.0) | 0.3.0&nbsp;&Rightarrow;&nbsp;[1.0.0](rocm-data-center-tool-1-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/rdc/releases/tag/rocm-6.2.0) |
 |  |  | [ROCm SMI](https://rocm.docs.amd.com/projects/rocm_smi_lib/en/docs/6.2.0) | 7.2.0 | [{fab}`github fa-lg`](https://github.com/ROCm/rocm_smi_lib/releases/tag/rocm-6.2.0) |
 |  |  | [ROCm Validation Suite](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/docs/6.2.0) | 1.0 | [{fab}`github fa-lg`](https://github.com/ROCm/ROCmValidationSuite/releases/tag/rocm-6.2.0) |
 |  |  | [TransferBench](https://rocm.docs.amd.com/projects/TransferBench/en/docs/6.2.0) | 1.5.0 | [{fab}`github fa-lg`](https://github.com/ROCm/TransferBench/releases/tag/rocm-6.2.0) |
 |  | **Compilers** | [hipCC](https://rocm.docs.amd.com/projects/hipCC/en/docs/6.2.0) | 1.0.0&nbsp;&Rightarrow;&nbsp;[1.1.1](hipcc-1-1-1) | [{fab}`github fa-lg`](https://github.com/ROCm/llvm-project/releases/tag/rocm-6.2.0) |
 |  |  | [llvm-project](https://rocm.docs.amd.com/projects/llvm-project/en/docs/6.2.0) | 17.0.0&nbsp;&Rightarrow;&nbsp;[18.0.0](llvm-project-18-0-0) | [{fab}`github fa-lg`](https://github.com/ROCm/llvm-project/releases/tag/rocm-6.2.0) |
 | **Runtimes** |  | [HIP](https://rocm.docs.amd.com/projects/HIP/en/docs/6.2.0) | 6.1&nbsp;&Rightarrow;&nbsp;[6.2](hip-6-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/HIP/releases/tag/rocm-6.2.0) |
 |  |  | [ROCr Runtime](https://rocm.docs.amd.com/projects/ROCr-Runtime/en/docs/6.2.0) | 6.1&nbsp;&Rightarrow;&nbsp;[6.2](hip-6-2-0) | [{fab}`github fa-lg`](https://github.com/ROCm/ROCR-Runtime/releases/tag/rocm-6.2.0) |
--- a/tools/autotag/README.md
+++ b/tools/autotag/README.md
@@ -11,11 +11,11 @@
  * RadeonOpenCompute
  * ROCmSoftwarePlatform
-## Updating the changelog
+## Updating the changelog and release notes
-> IMPORTANT: It is key to update the template Markdown files in `tools/autotag/templates/rocm_changes` (eg: `5.6.0.md`) and not the `CHANGELOG.md` itself to ensure that updates are not overwritten by the autotag script. The template should only have content from changelogs that are not included by the script to avoid duplicating data.
+> IMPORTANT: It is key to update the template Markdown files in `tools/autotag/templates/<name of change type>` (eg: `5.6.0.md`) and not the `CHANGELOG.md` or `RELEASE.md` itself to ensure that updates are not overwritten by the autotag script. The template should only have content from changelogs that are not included by the script to avoid duplicating data.
-* Add or update the release specific notes in `tools/autotag/templates/rocm_changes`
+* Add or update the release specific notes in `tools/autotag/templates/<name of change type>`
 * Ensure that all the repositories have their release specific branch with the updated changelogs
 * Run this for 5.6.0 (change for whatever version you require)
 * `GITHUB_ACCESS_TOKEN=my_token_here`
@@ -26,10 +26,10 @@ To generate the changelog from 5.0.0 up to and including 6.1.2:
 python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --do-previous --compile_file ../../CHANGELOG.md --branch release/rocm-rel-6.1 6.1.2
 ```
-To generate the changelog only for 6.1.2:
+To generate the release notes only for 6.1.2:
 ```sh
-python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --compile_file ../../CHANGELOG.md --branch release/rocm-rel-6.1 6.1.2
+python3 tag_script.py -t $GITHUB_ACCESS_TOKEN --no-release --no-pulls --compile_file ../../RELEASE.md --branch release/rocm-rel-6.1 6.1.2
 ```
 ### Notes
--- a/tools/autotag/components.xml
+++ b/tools/autotag/components.xml
@@ -0,0 +1,71 @@
 <?xml version="1.0" encoding="UTF-8"?>
 <manifest>
    <remote name="rocm-org" fetch="https://github.com/ROCm/" />
    <default revision="refs/tags/rocm-6.1.1"
     remote="rocm-org"
     sync-c="true"
     sync-j="4" />
 <!--list of projects for ROCm-->
    <project category="libs" group="ml" name="composable_kernel" />
    <project category="libs" group="ml" name="AMDMIGraphX" />
    <project category="libs" group="ml" name="MIOpen" />
    <project category="libs" group="ml" name="MIVisionX" />
    <!-- rocAL -->
    <project category="libs" group="ml" name="rocDecode" />
    <project category="libs" group="ml" name="rpp" />
    <project category="libs" group="communication" name="rccl" />
    <project category="libs" group="math" name="half" />
    <project category="libs" group="math" name="hipBLAS" />
    <project category="libs" group="math" name="hipBLASLt" />
    <project category="libs" group="math" name="hipFFT" />
    <project category="libs" group="math" name="hipfort" />
    <project category="libs" group="math" name="hipRAND" />
    <project category="libs" group="math" name="hipSOLVER" />
    <project category="libs" group="math" name="hipSPARSE" />
    <project category="libs" group="math" name="hipSPARSELt" />
    <project category="libs" group="math" name="rocALUTION" />
    <project category="libs" group="math" name="rocBLAS" />
    <project category="libs" group="math" name="rocFFT" />
    <project category="libs" group="math" name="rocRAND" />
    <project category="libs" group="math" name="rocSOLVER" />
    <project category="libs" group="math" name="rocSPARSE" />
    <project category="libs" group="math" name="rocWMMA" />
    <project category="libs" group="math" name="Tensile" />
    <project category="libs" group="primitives" name="hipCUB" />
    <project category="libs" group="primitives" name="hipTensor" />
    <project category="libs" group="primitives" name="rocPRIM" />
    <project category="libs" group="primitives" name="rocThrust" />
    <project category="tools" group="dev" name="HIPIFY" />
    <project category="tools" group="dev" name="ROCdbgapi" />
    <project category="tools" group="dev" name="rocm-cmake" />
    <project category="tools" group="dev" name="ROCgdb" />
    <project category="tools" group="dev" name="rocr_debug_agent" />
    <!-- omniperf, omnitrace -->
    <project category="tools" group="perf" name="rocm_bandwidth_test" />
    <project category="tools" group="perf" name="rocprofiler" />
    <project category="tools" group="perf" name="roctracer" />
    <project category="tools" group="system" name="amdsmi" />
    <project category="tools" group="system" name="rocminfo" />
    <project category="tools" group="system" name="rdc" />
    <project category="tools" group="system" name="rocm_smi_lib" />
    <project category="tools" group="system" name="ROCmValidationSuite" />
    <!-- transferbench -->
    <project category="compilers" name="llvm-project" />
    <project category="compilers" name="flang" path="openmp-extras/flang" />
    <project category="runtimes" name="clr" />
    <project category="runtimes" name="HIP" />
    <project category="runtimes" name="ROCR-Runtime" />
    <!--<project name="ROCK-Kernel-Driver" />-->
    <!--<project name="ROCT-Thunk-Interface" />-->
    <!--<project name="rocm-core" />-->
    <!--<project name="rocprofiler-register" />-->
    <!--<project name="clang-ocl" />-->
 <!--HIP Projects-->
    <!--<project name="hip-tests" />-->
    <!--<project name="HIP-Examples" />-->
    <!--<project name="hipother" />-->
 <!-- Projects for OpenMP-Extras -->
    <!--<project name="aomp" path="openmp-extras/aomp" />-->
    <!--<project name="aomp-extras" path="openmp-extras/aomp-extras" />-->
 </manifest>
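The `category` and `group` attributes in this manifest appear to be what the autotag script uses to organize components. As a minimal sketch, assuming only the attribute layout shown above, the following groups project names by `(category, group)`. The abbreviated manifest string and the `group_projects` helper are hypothetical and are not part of `tag_script.py`.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Abbreviated, hypothetical manifest in the same shape as components.xml.
MANIFEST = """
<manifest>
    <remote name="rocm-org" fetch="https://github.com/ROCm/" />
    <project category="libs" group="math" name="rocBLAS" />
    <project category="libs" group="math" name="rocFFT" />
    <project category="tools" group="perf" name="rocprofiler" />
    <project category="runtimes" name="HIP" />
</manifest>
"""


def group_projects(manifest_xml):
    """Map (category, group) to the list of project names in the manifest."""
    root = ET.fromstring(manifest_xml)
    grouped = defaultdict(list)
    for project in root.iterfind(".//project"):
        # Projects without a group attribute (compilers, runtimes) get None.
        key = (project.get("category"), project.get("group"))
        grouped[key].append(project.get("name"))
    return dict(grouped)


grouped = group_projects(MANIFEST)
for (category, group), names in grouped.items():
    print(category, group, names)
```

Projects that omit `group`, as the compiler and runtime entries do, end up under a `None` group in this sketch. A consumer would need to decide how to render that case.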
--- a/tools/autotag/tag_script.py
+++ b/tools/autotag/tag_script.py
@@ -188,7 +188,7 @@ def run_tagging():
    # Use the manifest included in the ROCm GitHub repository by default.
    if args.manifest_url is None:
        manifest_path = (
-            "./../../default.xml"
+            "./components.xml"
        )
    else:
        manifest_url = args.manifest_url
@@ -233,31 +233,26 @@ def run_tagging():
    )
    # Find all the math libraries and their remotes.
-    included_names = [
+    included_categories = [
-        "AMDMIGraphX",
+        "libs",
-        "HIPIFY", #
+        "tools",
-        "MIOpen",
+        "compilers",
-        "MIVisionX",
+        "runtimes",
        "ROCmValidationSuite", #
        "composable_kernel",
        "hipfort",
        "rocDecode",
        "rocm-cmake",
        "rpp",
    ]
    included_groups = [
        "mathlibs"
    ]
    projects = [ ]
    for project in manifest_tree.iterfind(".//project"):
-        include = str(project.get("name")) in included_names
+        if project.get("category") in included_categories:
        if (project.get("name") in included_names) or (project.get("groups") in included_groups):
            projects.append(project)
-    names_and_remotes = list((entry.get("name"), entry.get("remote")) for entry in projects)
+    component_information = list(
        (entry.get("name"), 
         entry.get("remote"),
         entry.get("group"),
         entry.get("category"),
        ) for entry in projects)
    # Get all the relevant ROCm releases, and only the last version if not doing previous.
    minimum_version = "5.0.0" if args.previous else args.version
-    releases = release_bundle_factory.create_data_dict(args.version, names_and_remotes, minimum_version)
+    releases = release_bundle_factory.create_data_dict(args.version, component_information, minimum_version)
    # Process the individual releases.
    failed: List[Tuple[str, str]] = []
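The hunk above appears to replace name-based filtering with category-based filtering and to widen each entry to a `(name, remote, group, category)` tuple. The following standalone sketch illustrates that idea under stated assumptions; it is not the code from `tag_script.py`. The `collect_component_information` helper, the sample manifest, and the fallback to the `<default>` element's `remote` attribute are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Categories taken from the diff above.
INCLUDED_CATEGORIES = {"libs", "tools", "compilers", "runtimes"}

# Hypothetical sample manifest; the last project has no category.
SAMPLE_MANIFEST = """
<manifest>
    <remote name="rocm-org" fetch="https://github.com/ROCm/" />
    <default revision="refs/tags/rocm-6.1.1" remote="rocm-org" />
    <project category="libs" group="math" name="rocBLAS" />
    <project category="compilers" name="llvm-project" />
    <project name="rocm-core" />
</manifest>
"""


def collect_component_information(manifest_xml):
    """Return (name, remote, group, category) for projects in a listed category."""
    root = ET.fromstring(manifest_xml)
    default = root.find("default")
    # Assumption: a project without its own remote uses the manifest default.
    default_remote = default.get("remote") if default is not None else None
    components = []
    for project in root.iterfind(".//project"):
        if project.get("category") in INCLUDED_CATEGORIES:
            components.append((
                project.get("name"),
                project.get("remote", default_remote),
                project.get("group"),
                project.get("category"),
            ))
    return components


component_information = collect_component_information(SAMPLE_MANIFEST)
print(component_information)
```

Filtering on `category` means a new component only needs a manifest entry to be picked up, rather than an edit to a hard-coded name list in the script.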
--- a/tools/autotag/templates/changelog.jinja
+++ b/tools/autotag/templates/changelog.jinja
@@ -23,19 +23,23 @@ This page contains the release notes for AMD ROCm™ Software.
 -------------------
 ## ROCm {{version}}
-
+{{- "\n\n" -}}
-{%- set rocm_changes = "./rocm_changes/" ~ version ~ ".md" %}
+{%- set highlights = "./highlights/" ~ version ~ ".md" %}
-{% include rocm_changes ignore missing %}
+{%- include highlights ignore missing -%}
 {{- "\n\n" -}}
 {%- set support = "./support/" ~ version ~ ".md" %}
 {%- include support ignore missing -%}
 ### Library changes in ROCm {{version}}
-| Library | Version |
+| Category | Group | Name | Version | Repository |
-|---------|---------|
+|----------|-------|------|---------|------------|
-{%- for lib_name, lib in release.libraries | dictsort %}
+{%- for lib_name in release.libraries %}
 {%- set lib = release.libraries[lib_name] %}
 {%- if rocm_ver_by_lib_ver[lib_name][lib.lib_version] == version and (prev_lib_ver[lib_name][lib.lib_version] | default([]) | length > 0) and lib.lib_version %}
-| {{ lib_name }} | {{prev_lib_ver[lib_name][lib.lib_version]}} ⇒ [{{ lib.lib_version }}]({{ lib.release_url }}) |
+| {{ lib.category }} | {{ lib.group }} | [{{ lib_name }}]({{ lib.documentation_page }}) | {{prev_lib_ver[lib_name][lib.lib_version]}} ⇒ [{{ lib.lib_version }}]({{ lib.release_url }}) | [ROCm/{{ lib_name }}]({{ lib.repository_url }}) |
 {%- elif lib.lib_version %}
-| {{ lib_name }} | [{{ lib.lib_version }}]({{ lib.release_url }}) |
+| {{ lib.category }} | {{ lib.group }} | [{{ lib_name }}]({{ lib.documentation_page }}) | [{{ lib.lib_version }}]({{ lib.release_url }}) | [ROCm/{{ lib_name }}]({{ lib.repository_url }}) |
 {%- endif %}
 {%- endfor %}
@@ -53,7 +57,17 @@ This page contains the release notes for AMD ROCm™ Software.
 {{change|trim|e}}
 {%- endfor %}{# change in lib.data.changes #}
-{%- endif  %}
+{%- endif -%}
 {%- endfor %}{# lib in release.libraries #}
 {{- "\n\n" -}}
 {%- set extra_components = "./extra_components/" ~ version ~ ".md" %}
 {%- include extra_components ignore missing -%}
 {{- "\n\n" -}}
 {%- set known_issues = "./known_issues/" ~ version ~ ".md" %}
 {%- include known_issues ignore missing -%}
 {{- "\n\n" -}}
 {%- set upcoming_changes = "./upcoming_changes/" ~ version ~ ".md" %}
 {%- include upcoming_changes ignore missing -%}
 {%- endfor %}{# release in releases #}
 {# EOF #}
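The five-column row that the template emits can be sketched in plain Python to make the cell layout easier to check. This is an illustrative approximation that assumes the `lib` fields named in the template (`category`, `group`, `documentation_page`, `lib_version`, `release_url`, `repository_url`). The `component_row` helper and the placeholder URLs are hypothetical.

```python
def component_row(name, lib, previous_version=None):
    """Build one Markdown table row in the column order used by the template."""
    version_cell = f"[{lib['lib_version']}]({lib['release_url']})"
    if previous_version:
        # Mirrors the "previous ⇒ current" form used when a version changed.
        version_cell = f"{previous_version} ⇒ {version_cell}"
    return (
        f"| {lib['category']} | {lib['group']} "
        f"| [{name}]({lib['documentation_page']}) "
        f"| {version_cell} "
        f"| [ROCm/{name}]({lib['repository_url']}) |"
    )


# Placeholder URLs; real values would come from the release data.
example_lib = {
    "category": "libs",
    "group": "math",
    "documentation_page": "https://example.invalid/docs",
    "lib_version": "4.2.0",
    "release_url": "https://example.invalid/release",
    "repository_url": "https://example.invalid/repo",
}

row = component_row("rocBLAS", example_lib, previous_version="4.1.0")
print(row)
```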
--- a/tools/autotag/templates/extra_components/6.1.2.md
+++ b/tools/autotag/templates/extra_components/6.1.2.md
@@ -1,10 +1,3 @@
 ROCm 6.1.2 includes enhancements to SMI tools and improvements to some libraries.
 ### OS support
 ROCm 6.1.2 has been tested against a pre-release version of Ubuntu 22.04.5 (kernel: 5.15 [GA], 6.8 [HWE]).
 ### AMD SMI
 AMD SMI for ROCm 6.1.2
@@ -42,16 +35,6 @@ AMD SMI for ROCm 6.1.2
 See the AMD SMI [detailed changelog](https://github.com/ROCm/amdsmi/blob/rocm-6.1.x/CHANGELOG.md) with code samples for more information.
 ```
 ### HIPCC
 HIPCC for ROCm 6.1.2
 #### Changes
 * **Upcoming:** a future release will enable use of compiled binaries `hipcc.bin` and `hipconfig.bin` by default. No action is needed by users; you may continue calling high-level Perl scripts `hipcc` and `hipconfig`. `hipcc.bin` and `hipconfig.bin` will be invoked by the high-level Perl scripts. To revert to the previous behavior and invoke `hipcc.pl` and `hipconfig.pl`, set the `HIP_USE_PERL_SCRIPTS` environment variable to `1`.
 * **Upcoming:** a subsequent release will remove high-level Perl scripts `hipcc` and `hipconfig`. This release will remove the `HIP_USE_PERL_SCRIPTS` environment variable. It will rename `hipcc.bin` and `hipconfig.bin` to `hipcc` and `hipconfig` respectively. No action is needed by the users. To revert to the previous behavior, invoke `hipcc.pl` and `hipconfig.pl` explicitly.
 * **Upcoming:** a subsequent release will remove `hipcc.pl` and `hipconfig.pl`.
 ### ROCm SMI
 ROCm SMI for ROCm 6.1.2
--- a/tools/autotag/templates/extra_components/6.2.0.md
+++ b/tools/autotag/templates/extra_components/6.2.0.md
--- a/tools/autotag/templates/rocm_changes/5.0.0.md
+++ b/tools/autotag/templates/rocm_changes/5.0.0.md
--- a/tools/autotag/templates/rocm_changes/5.0.1.md
+++ b/tools/autotag/templates/rocm_changes/5.0.1.md
--- a/tools/autotag/templates/rocm_changes/5.0.2.md
+++ b/tools/autotag/templates/rocm_changes/5.0.2.md
--- a/tools/autotag/templates/rocm_changes/5.1.0.md
+++ b/tools/autotag/templates/rocm_changes/5.1.0.md
--- a/tools/autotag/templates/rocm_changes/5.2.0.md
+++ b/tools/autotag/templates/rocm_changes/5.2.0.md
--- a/tools/autotag/templates/rocm_changes/5.2.3.md
+++ b/tools/autotag/templates/rocm_changes/5.2.3.md
--- a/tools/autotag/templates/rocm_changes/5.3.0.md
+++ b/tools/autotag/templates/rocm_changes/5.3.0.md
--- a/tools/autotag/templates/rocm_changes/5.3.2.md
+++ b/tools/autotag/templates/rocm_changes/5.3.2.md
--- a/tools/autotag/templates/rocm_changes/5.3.3.md
+++ b/tools/autotag/templates/rocm_changes/5.3.3.md
--- a/tools/autotag/templates/rocm_changes/5.4.0.md
+++ b/tools/autotag/templates/rocm_changes/5.4.0.md
--- a/tools/autotag/templates/rocm_changes/5.4.1.md
+++ b/tools/autotag/templates/rocm_changes/5.4.1.md
--- a/tools/autotag/templates/rocm_changes/5.4.2.md
+++ b/tools/autotag/templates/rocm_changes/5.4.2.md
--- a/tools/autotag/templates/rocm_changes/5.4.3.md
+++ b/tools/autotag/templates/rocm_changes/5.4.3.md
--- a/tools/autotag/templates/rocm_changes/5.5.0.md
+++ b/tools/autotag/templates/rocm_changes/5.5.0.md
--- a/tools/autotag/templates/rocm_changes/5.5.1.md
+++ b/tools/autotag/templates/rocm_changes/5.5.1.md
--- a/tools/autotag/templates/rocm_changes/5.6.0.md
+++ b/tools/autotag/templates/rocm_changes/5.6.0.md
--- a/tools/autotag/templates/rocm_changes/5.6.1.md
+++ b/tools/autotag/templates/rocm_changes/5.6.1.md
--- a/tools/autotag/templates/rocm_changes/5.7.0.md
+++ b/tools/autotag/templates/rocm_changes/5.7.0.md
--- a/tools/autotag/templates/rocm_changes/5.7.1.md
+++ b/tools/autotag/templates/rocm_changes/5.7.1.md
--- a/tools/autotag/templates/rocm_changes/6.0.0.md
+++ b/tools/autotag/templates/rocm_changes/6.0.0.md
--- a/tools/autotag/templates/rocm_changes/6.0.2.md
+++ b/tools/autotag/templates/rocm_changes/6.0.2.md
--- a/tools/autotag/templates/rocm_changes/6.1.0.md
+++ b/tools/autotag/templates/rocm_changes/6.1.0.md
--- a/tools/autotag/templates/rocm_changes/6.1.1.md
+++ b/tools/autotag/templates/rocm_changes/6.1.1.md
--- a/tools/autotag/templates/highlights/6.1.2.md
+++ b/tools/autotag/templates/highlights/6.1.2.md
@@ -0,0 +1 @@
 ROCm 6.1.2 includes enhancements to SMI tools and improvements to some libraries.
--- a/tools/autotag/templates/highlights/6.2.0.md
+++ b/tools/autotag/templates/highlights/6.2.0.md
@@ -0,0 +1,223 @@
 The release notes provide a comprehensive summary of changes since the previous ROCm release.
 - [Release highlights](release-highlights)
 - [Operating system and hardware support changes](operating-system-and-hardware-support-changes)
 - [ROCm components versioning](rocm-components)
 - [Detailed component changes](detailed-component-changes)
 - [ROCm known issues](rocm-known-issues)
 - [ROCm upcoming changes](rocm-upcoming-changes)
 The [Compatibility matrix](https://rocm.docs.amd.com/en/latest/release/docs/6.2.0/compatibility/compatibility-matrix)
 provides an overview of operating system, hardware, ecosystem, and ROCm component support across ROCm releases.
 Release notes for previous ROCm releases are available in earlier versions of the documentation.
 See the [ROCm documentation release history](https://rocm.docs.amd.com/en/latest/release/versions).
 ## Release highlights
 This section introduces notable new features and improvements in ROCm 6.2. See the
 [Detailed component changes](#detailed-component-changes) for individual component changes.
 ### New components
 ROCm 6.2.0 introduces the following new components to the ROCm software stack.
 - **Omniperf** -- A kernel-level profiling tool for machine learning and high-performance computing (HPC) workloads
  running on AMD Instinct accelerators. Omniperf offers comprehensive profiling and advanced analysis via command line
  or a GUI dashboard. For more information, see
  [Omniperf](https://rocm.docs.amd.com/projects/omniperf/en/latest).
 - **Omnitrace** -- A multi-purpose analysis tool for profiling and tracing applications running on the CPU or the CPU and GPU.
  It supports dynamic binary instrumentation, call-stack sampling, causal profiling, and other features for determining
  which function and line number are executing. For more information, see
  [Omnitrace](https://rocm.docs.amd.com/projects/omnitrace/en/latest).
 - **rocPyDecode** -- A tool to access rocDecode APIs in Python. It connects Python and C/C++ libraries,
  enabling function calling and data passing between the two languages. The `rocpydecode.so` wrapper library exposes
  the rocDecode APIs, which are written primarily in C/C++, to Python. For more information, see
  [rocPyDecode](https://rocm.docs.amd.com/projects/rocpydecode/en/latest).
 - **ROCprofiler-SDK** -- ROCprofiler-SDK is a profiling and tracing library for HIP and ROCm applications on AMD ROCm software
  used to identify application performance bottlenecks and optimize their performance. The new APIs add restrictions for more
  efficient implementations and improved thread safety. A new window restriction specifies the services the tool can use.
  ROCprofiler-SDK also provides a tool library to help you write your tool implementations. `rocprofv3` uses this tool library
  to profile and trace applications for performance bottlenecks. Examples include API tracing, kernel tracing, and so on.
  For more information, see [ROCprofiler-SDK](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest).
  ```{note}
  ROCprofiler-SDK for ROCm 6.2.0 is a beta release and subject to change.
  ```
 ### ROCm Offline Installer Creator introduced
 The new ROCm Offline Installer Creator creates an installation package for a preconfigured setup of ROCm, the AMDGPU
 driver, or a combination of the two on a target system without network access. This new tool customizes
 multiple unique configurations for use when installing ROCm on a target. Other notable features include:
 * A lightweight, easy-to-use user interface for configuring the creation of the installer
 * Support for multiple Linux distributions
 * Installer support for different ROCm releases and specific ROCm components
 * Optional driver or driver-only installer creation
 * Optional post-install preferences
 * Lightweight installer packages, which are unique to the preconfigured ROCm setup
 * Resolution and inclusion of dependency packages for offline installation
 For more information, see
 [ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/rocm-install-on-linux/en/latest/install/rocm-offline-installer.html).
 ### Math libraries default to Clang instead of HIPCC
 The default compiler used to build the math libraries on Linux changes from `hipcc` to `amdclang++`.
 Appropriate compiler flags are added to ensure these compilations build correctly. This change only applies when
 building the libraries. Applications using the libraries can continue to be compiled using `hipcc` or `amdclang++` as
 described in [ROCm compiler reference](https://rocm.docs.amd.com/projects/llvm-project/en/latest/reference/rocmcc.html).
 The math libraries can also be built with `hipcc` using any of the previously available methods (for example, the `CXX`
 environment variable, the `CMAKE_CXX_COMPILER` CMake variable, and so on). This change shouldn't affect performance or
 functionality.
 ### Framework and library changes
 This section highlights updates to supported deep learning frameworks and notable third-party library optimizations.
 #### Additional PyTorch and TensorFlow support
 ROCm 6.2.0 supports PyTorch versions 2.2 and 2.3 and TensorFlow version 2.16.
 See [Installing PyTorch for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html)
 and [Installing TensorFlow for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/tensorflow-install.html)
 for installation instructions.
 Refer to the
 [Third-party support matrix](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/3rd-party-support-matrix.html#deep-learning)
 for a comprehensive list of third-party frameworks and libraries supported by ROCm.
 #### Optimized framework support for OpenXLA
 PyTorch for ROCm and TensorFlow for ROCm now provide native support for OpenXLA. OpenXLA is an open-source ML compiler
 ecosystem that enables developers to compile and optimize models from all leading ML frameworks. For more information, see
 [Installing PyTorch for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html)
 and [Installing TensorFlow for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/tensorflow-install.html).
 #### PyTorch support for Autocast (automatic mixed precision)
 PyTorch now supports Autocast for recurrent neural networks (RNNs) on ROCm. This can help to reduce computational
 workloads and improve performance. Based on the information about the magnitude of values, Autocast can substitute the
 original `float32` linear layers and convolutions with their `float16` or `bfloat16` variants. For more information, see
 [Automatic mixed precision](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/train-a-model#automatic-mixed-precision-amp).
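As an illustration of the paragraph above, the following minimal sketch runs an LSTM forward pass under Autocast. It assumes a ROCm build of PyTorch with a visible GPU; on ROCm builds the device is addressed through the `"cuda"` device type. The function name, layer sizes, and tensor shapes are hypothetical and chosen only for the example.

```python
def autocast_lstm_dtype():
    """Run one LSTM forward pass under Autocast and return the output dtype."""
    # PyTorch is imported lazily so this module can be loaded on machines
    # that do not have PyTorch installed.
    import torch
    from torch import nn

    # Hypothetical sizes; any small RNN would serve the illustration.
    lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True).to("cuda")
    inputs = torch.randn(4, 10, 16, device="cuda")  # float32 inputs

    # Inside the context manager, eligible ops run in float16 instead of float32.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        output, _ = lstm(inputs)

    return output.dtype
```

Calling `autocast_lstm_dtype()` on a supported system shows which precision the RNN output was produced in; passing `dtype=torch.bfloat16` to `torch.autocast` selects the `bfloat16` variant instead.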
 #### Memory savings for bitsandbytes model quantization
 The [ROCm-aware bitsandbytes library](https://github.com/ROCm/bitsandbytes) is a lightweight Python wrapper around HIP
 custom functions, in particular 8-bit optimizer, matrix multiplication, and 8-bit and 4-bit quantization functions.
 ROCm 6.2.0 introduces the following bitsandbytes changes:
 - `Int8` matrix multiplication is enabled, and it includes the following functions:
  - `extract-outliers` – extracts rows and columns that have outliers in the inputs. They’re later used for matrix multiplication without quantization.
  - `transform` – row-to-column and column-to-row transformations are enabled, along with transpose operations. These are used before and after matmul computation.
  - `igemmlt` – new function for GEMM computation A*B^T. It uses
    [hipblasLtMatMul](https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/api-reference.html#hipblasltmatmul) and performs 8-bit GEMM operations.
  - `dequant_mm` – dequantizes output matrix to original data type using scaling factors from vector-wise quantization.
 - Blockwise quantization – input tensors are quantized for a fixed block size.
 - 4-bit quantization and dequantization functions – normalized `Float4` quantization, quantile estimation, and quantile quantization functions are enabled.
 - 8-bit and 32-bit optimizers are enabled.
 ```{note}
 These functions are included in bitsandbytes. They are not part of ROCm. However, ROCm 6.2.0 has enabled the fixes and
 features to run them.
 ```
 For more information, see [Model quantization techniques](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/model-quantization.html).
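To make the idea of blockwise quantization concrete, here is a conceptual sketch in plain Python. It is not the bitsandbytes implementation: it assumes a simple absolute-maximum (absmax) scaling scheme with signed 8-bit codes, and the function names are hypothetical. Each fixed-size block gets its own scale, so an outlier in one block does not reduce the precision of the others.

```python
from typing import List, Tuple


def quantize_blockwise(values: List[float], block_size: int) -> Tuple[List[int], List[float]]:
    """Quantize values to signed 8-bit codes, one absmax scale per block."""
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block)
        # Map the largest magnitude in the block to 127; avoid dividing by zero.
        scale = absmax / 127 if absmax > 0 else 1.0
        scales.append(scale)
        codes.extend(round(v / scale) for v in block)
    return codes, scales


def dequantize_blockwise(codes: List[int], scales: List[float], block_size: int) -> List[float]:
    """Recover approximate values using the scale of each code's block."""
    return [code * scales[index // block_size] for index, code in enumerate(codes)]
```

With a block size of 2, the input `[1.0, -2.0, 0.5, 4.0]` is split into two blocks whose scales are derived from `2.0` and `4.0` respectively, and the reconstruction error of each element is bounded by half of its block's scale.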
 #### Improved vLLM support
 ROCm 6.2.0 enhances vLLM support for inference on AMD Instinct accelerators, adding
 capabilities for `FP16`/`BF16` precision for LLMs, and `FP8` support for Llama.
 ROCm 6.2.0 adds support for the following vLLM features:
 - MP:
  Multi-GPU execution. Choose between MP and Ray using a flag. To set it to MP,
  use `--distributed-executor-backend=mp`. The default is still in flux and depends on the vLLM commit.
 - FP8 KV cache:
  Enhances computational efficiency and performance by significantly reducing memory usage and bandwidth requirements.
  The QUARK quantizer currently only supports Llama.
 - Triton Flash Attention:
  ROCm supports both Triton and Composable Kernel Flash Attention 2 in vLLM. The default is Triton, but you can change this
  setting using the `VLLM_USE_FLASH_ATTN_TRITON=False` environment variable.
 - PyTorch TunableOp:
  Improved optimization and tuning of GEMMs. It requires Docker with PyTorch 2.3 or later.
 For more information about enabling these features, see
 [vLLM inference](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.html#vllm-inference).
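The settings listed above can be collected into a launch command and a set of environment overrides. The sketch below only builds these values and does not start vLLM. It makes several assumptions: the OpenAI-compatible server entry point `vllm.entrypoints.openai.api_server`, the upstream flag spelling `--distributed-executor-backend`, and a hypothetical helper name and model identifier.

```python
from typing import Dict, List, Tuple


def build_vllm_launch(
    model: str,
    backend: str = "mp",
    use_triton_flash_attn: bool = True,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env_overrides) for a vLLM server launch."""
    if backend not in ("mp", "ray"):
        raise ValueError("backend must be 'mp' or 'ray'")

    argv = [
        "python", "-m", "vllm.entrypoints.openai.api_server",  # assumed entry point
        "--model", model,
        f"--distributed-executor-backend={backend}",
    ]

    env_overrides: Dict[str, str] = {}
    if not use_triton_flash_attn:
        # Switch from the Triton default to Composable Kernel Flash Attention.
        env_overrides["VLLM_USE_FLASH_ATTN_TRITON"] = "False"

    return argv, env_overrides
```

The returned values can be passed to `subprocess.run(argv, env={**os.environ, **env_overrides})` once a model and an installation of vLLM are available.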
 ROCm has a vLLM branch for experimental features. This includes performance improvements, accuracy, and correctness testing.
 These features include:
 - FP8 GEMMs: To improve the performance of FP8 quantization, work is underway on tuning the GEMM using the shapes used
  in the model's execution. Only Llama is supported because the QUARK quantizer currently supports only Llama.
 - Custom decode paged attention: Improves performance by efficiently managing memory and enabling faster attention
  computation in large-scale models. This benefits all workloads in `FP16` configurations.
 To enable these experimental new features, see
 [vLLM inference](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/llm-inference-frameworks.html#vllm-inference).
 Use the `rocm/vllm` branch when cloning the GitHub repo. The `vllm/ROCm_performance.md` document outlines
 all the accessible features, and the `vllm/Dockerfile.rocm` file can be used.
 ### Enhanced performance tuning on AMD Instinct accelerators
 ROCm is pretuned for high-performance computing workloads including large language models, generative AI, and scientific computing.
 The ROCm documentation provides comprehensive guidance on configuring your system for AMD Instinct accelerators. It includes
 detailed instructions on system settings and application tuning suggestions to help you fully leverage the capabilities of these
 accelerators for optimal performance. For more information, see
 [AMD MI300X tuning guides](https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/index.html) and
 [AMD MI300A system optimization](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300a.html).
 ### Removed clang-ocl
 As of version 6.2, ROCm no longer provides the `clang-ocl` package. The project will be archived in the future.
 See the [clang-ocl README](https://github.com/ROCm/clang-ocl).
 ### ROCm documentation changes
 The documentation for the ROCm components has been reorganized and reformatted in a standard look and feel. This
 improves the usability and readability of the documentation. For more information about the ROCm components, see
 [What is ROCm?](https://rocm.docs.amd.com/en/latest/what-is-rocm.html).
 Since the release of ROCm 6.1, the documentation has added some key topics including:
 - [AMD Instinct MI300X workload tuning guide](https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/workload.html)
 - [AMD Instinct MI300X system tuning guide](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300x.html)
 - [AMD Instinct MI300A system tuning guide](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300a.html)
 - [Using ROCm for AI](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/index.html)
 - [Using ROCm for HPC](https://rocm.docs.amd.com/en/latest/how-to/rocm-for-hpc/index.html)
 - [Fine-tuning LLMs and inference optimization](https://rocm.docs.amd.com/en/latest/how-to/llm-fine-tuning-optimization/index.html)
 - [LLVM reference documentation](https://rocm.docs.amd.com/projects/llvm-project/en/latest/)
 The following topics have been significantly improved, expanded, or both:
 - [HIP programming manual](https://rocm.docs.amd.com/projects/HIP/en/latest/)
 - [Compatibility matrix](https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html)
 ```{note}
 All ROCm projects are open source and available on GitHub. To contribute to ROCm documentation, see the
 [ROCm documentation contribution guidelines](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
 ```
--- a/tools/autotag/templates/support/6.1.2.md
+++ b/tools/autotag/templates/support/6.1.2.md
@@ -0,0 +1,3 @@
 ### OS support
 ROCm 6.1.2 has been tested against a pre-release version of Ubuntu 22.04.5 (kernel: 5.15 [GA], 6.8 [HWE]).
--- a/tools/autotag/templates/support/6.2.0.md
+++ b/tools/autotag/templates/support/6.2.0.md
@@ -0,0 +1,27 @@
 ## Operating system and hardware support changes
 ROCm 6.2.0 adds support for the following operating system and kernel versions.
 - Ubuntu 24.04 LTS (kernel: 6.8 [GA])
 - RHEL 8.10 (kernel: 4.18.0-544)
 - SLES 15 SP6 (kernel: 6.4)
 ROCm 6.2.0 marks the end of support (EoS) for:
 - Ubuntu 22.04.3
 - RHEL 9.2
 - RHEL 8.8
 - SLES 15 SP4
 - CentOS 7.9
 ROCm 6.2.0 has been tested against pre-release Ubuntu 22.04.5 (kernel: 6.5 [HWE]).
 See the [Compatibility matrix](https://rocm-stg.amd.com/en/docs/6.2.0/compatibility/compatibility-matrix.html) for an
 overview of supported operating systems and hardware architectures.
--- a/tools/autotag/templates/upcoming_changes/6.1.2.md
+++ b/tools/autotag/templates/upcoming_changes/6.1.2.md
@@ -0,0 +1,9 @@
 ### HIPCC
 HIPCC for ROCm 6.1.2
 #### Changes
 * **Upcoming:** a future release will enable use of compiled binaries `hipcc.bin` and `hipconfig.bin` by default. No action is needed by users; you may continue calling high-level Perl scripts `hipcc` and `hipconfig`. `hipcc.bin` and `hipconfig.bin` will be invoked by the high-level Perl scripts. To revert to the previous behavior and invoke `hipcc.pl` and `hipconfig.pl`, set the `HIP_USE_PERL_SCRIPTS` environment variable to `1`.
 * **Upcoming:** a subsequent release will remove high-level Perl scripts `hipcc` and `hipconfig`. This release will remove the `HIP_USE_PERL_SCRIPTS` environment variable. It will rename `hipcc.bin` and `hipconfig.bin` to `hipcc` and `hipconfig` respectively. No action is needed by the users. To revert to the previous behavior, invoke `hipcc.pl` and `hipconfig.pl` explicitly.
 * **Upcoming:** a subsequent release will remove `hipcc.pl` and `hipconfig.pl`.
--- a/tools/autotag/templates/upcoming_changes/6.2.0.md
+++ b/tools/autotag/templates/upcoming_changes/6.2.0.md
@@ -0,0 +1,79 @@
 ## ROCm known issues
 ROCm known issues are noted on [{fab}`github` GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). For known
 issues related to individual components, review the [Detailed component changes](detailed-component-changes).
 ### Default processor affinity behavior for helper threads 
 Processor affinity is a critical setting to ensure that ROCm helper threads run on the correct cores. By default, ROCm
 helper threads are spawned on all available cores, ignoring the parent thread’s processor affinity. This can lead to
 threads competing for available cores, which may result in suboptimal performance. This behavior occurs by default if
 the environment variable `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is not set or is set to `1`. If
 `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` is set to `0`, the ROCr runtime uses the parent process's core affinity mask when
 creating helper threads. The parent’s affinity mask should then be set to account for the presence of additional threads
 by ensuring the affinity mask contains enough cores. Depending on the affinity settings of the software environment,
 batch system, launch commands like `numactl`/`taskset`, or explicit mask manipulation by the application itself, changing
 the setting may be advantageous to performance.
 To ensure the parent's core affinity mask is honored by the ROCm helper threads, set the
 `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable as follows:
 ```{code} shell
 export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0
 ```
 To ensure ROCm helper threads run on all available cores, set the `HSA_OVERRIDE_CPU_AFFINITY_DEBUG` environment variable
 as follows:
 ``` shell
 export HSA_OVERRIDE_CPU_AFFINITY_DEBUG=1
 ```
 Or restore the default by unsetting the variable:
 ``` shell
 unset HSA_OVERRIDE_CPU_AFFINITY_DEBUG
 ```
 If unsure of the default processor affinity settings for your environment, run the following command from the shell:
 ``` shell
 bash -c "taskset -p \$\$"
 ```
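The same check can be made from inside a Python launcher script. The following sketch reads the variable as described above and reports the calling process's affinity mask. The function names are hypothetical, and treating every value other than `0` as the default behavior is an assumption of this sketch, since only `0`, `1`, and the unset case are documented here.

```python
import os
from typing import Mapping, Optional, Set

AFFINITY_VARIABLE = "HSA_OVERRIDE_CPU_AFFINITY_DEBUG"


def helper_threads_inherit_parent_mask(env: Mapping[str, str]) -> bool:
    """Return True when helper threads are expected to use the parent's mask."""
    # Unset or "1": helper threads are spawned on all available cores.
    # "0": the runtime uses the parent process's core affinity mask.
    return env.get(AFFINITY_VARIABLE) == "0"


def current_affinity_mask() -> Optional[Set[int]]:
    """Return the set of CPU indices this process may run on, if available."""
    # os.sched_getaffinity is only present on some platforms (for example, Linux).
    if hasattr(os, "sched_getaffinity"):
        return set(os.sched_getaffinity(0))
    return None
```

For example, `helper_threads_inherit_parent_mask(os.environ)` combined with `current_affinity_mask()` shows whether the mask set by `numactl` or `taskset` applies to the helper threads and how many cores that mask contains.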
 ### KFDTest failure on Instinct MI300X with Oracle Linux 8.9
 The `KFDEvictTest.QueueTest` is failing on the MI300X platform during KFD (Kernel Fusion Driver) tests, causing the full
 suite to not execute properly. This issue is suspected to be hardware-related.
 ### Bandwidth limitation in gang and non-gang modes on Instinct MI300A
 Expected target peak non-gang performance (~60GB/s) and target peak gang performance (~90GB/s) are not achieved. Both gang
 and non-gang performance are observed to be limited at 45GB/s.
 This issue will be addressed in a future ROCm release.
 ### rocm-llvm-alt
 ROCm provides an optional package -- `rocm-llvm-alt` -- that provides a closed-source compiler for
 users interested in additional closed-source CPU optimizations. This feature is not functional in
 the ROCm 6.2.0 release. Users who attempt to invoke the closed-source compiler will experience an
 LLVM consumer-producer mismatch and the compilation will fail. There is no workaround that allows
 use of the closed-source compiler. It is recommended to compile using the default open-source
 compiler, which generates high-quality AMD CPU and AMD GPU code.
 ## ROCm upcoming changes
 This section notes upcoming changes to the ROCm software stack. For upcoming changes related to individual components, review
 the [Detailed component changes](detailed-component-changes).
 ### rocm-llvm-alt
 The `rocm-llvm-alt` package will be removed in an upcoming release. Users relying on the
 functionality provided by the closed-source compiler should transition to the open-source compiler.
 Once the `rocm-llvm-alt` package is removed, any compilation requesting functionality provided by
 the closed-source compiler will result in a Clang warning: "*[AMD] proprietary optimization compiler
 has been removed*".
--- a/tools/autotag/util/changelog.py
+++ b/tools/autotag/util/changelog.py
@@ -16,7 +16,7 @@ class Changelog():
    def __init__(self, releases: Dict[str, ReleaseBundle]):
        self.releases = list(releases.items())
-        self.releases.sort(key=lambda x: Version(x[0]), reverse=True)
+        # self.releases.sort(key=lambda x: Version(x[0]), reverse=True)
        # For each library find the earliest ROCm release where it updated.
        rocm_ver_by_lib_ver: Dict[str, Dict[str, str]] = defaultdict(dict)
@@ -53,4 +53,4 @@ class Changelog():
            prev_lib_ver=self.prev_lib_ver
        )
-        output.write(content)
+        output.write(content)
--- a/tools/autotag/util/mappings.py
+++ b/tools/autotag/util/mappings.py
@@ -0,0 +1,20 @@
 category_mapping = {
    "libs": "Libraries",
    "tools": "Tools",
    "compilers": "Compilers",
    "runtimes": "Runtimes",
    "": "",
    None: "",
 }
 group_mapping = {
    "ml": "Machine Learning and Computer Vision",
    "communication": "Communication",
    "math": "Math",
    "primitives": "Primitives",
    "dev": "Development",
    "perf": "Performance",
    "system": "System",
    "": "",
    None: "",
 }
--- a/tools/autotag/util/release_data.py
+++ b/tools/autotag/util/release_data.py
@@ -1,28 +1,32 @@
 """Class to store data about a particular release."""
 from dataclasses import dataclass, field
 import os
 import re
 import shutil
-import sys
+from dataclasses import dataclass, field
-from typing import Optional, Union, Dict, List, Tuple
+from typing import Dict, List, Optional, Tuple, Union
-from github import Github, UnknownObjectException
+
 from github.Repository import Repository
 from github.Organization import Organization
 from github.NamedUser import NamedUser
 from git import Repo
 from git.cmd import Git
 from github import Github, UnknownObjectException
 from github.NamedUser import NamedUser
 from github.Organization import Organization
 from github.Repository import Repository
 from packaging.version import Version
 from util.util import get_yn_input
 from util.mappings import category_mapping, group_mapping
@dataclass
 class ReleaseData:
    """Store Github data for a release."""
    message: str = ""
    notes: str = ""
    changes: Dict[str, str] = field(default_factory=dict)
@dataclass
 class ReleaseLib:
    """Store data about a release for a particular library."""
@@ -34,6 +38,8 @@ class ReleaseLib:
    commit: str = ""
    rocm_version: str = ""
    lib_version: str = ""
    group: str = ""
    category: str = ""
    @property
    def qualified_repo(self) -> str:
@@ -64,6 +70,16 @@ class ReleaseLib:
    def release_url(self) -> str:
        """The Github URL of the release."""
        return f"https://github.com/{self.qualified_repo}/releases/tag/{self.tag}"
    @property
    def documentation_page(self) -> str:
        """The Read the Docs documentation site."""
        return f"https://rocm.docs.amd.com/projects/{self.qualified_repo}/en/latest"
    @property
    def repository_url(self) -> str:
        """The GitHub repository URL."""
        return f"https://github.com/ROCm/{self.qualified_repo}"
    @property
    def message(self) -> str:
@@ -92,9 +108,7 @@ class ReleaseLib:
        print(f"Release Message: '{self.data.message}'")
        print(f"Release Notes:\n{self.data.notes}")
        print(f"Release Commit: '{self.commit}'")
-        if get_yn_input(
+        if get_yn_input("Would you like to create this tag and release?", release_yn):
            "Would you like to create this tag and release?", release_yn
        ):
            try:
                print("Performing tag and release.")
                release = self.repo.create_git_tag_and_release(
@@ -142,9 +156,7 @@ class ReleaseLib:
            fork.push(f"refs/heads/release:refs/heads/{self.branch}")
        shutil.rmtree(repo_loc)
-        pr_title = (
+        pr_title = f"Hotfixes from {self.branch} at release {self.full_version}"
            f"Hotfixes from {self.branch} at release {self.full_version}"
        )
        pr_body = (
            "This is an autogenerated PR.\n This is intended to pull any"
            f" hotfixes for ROCm release {self.full_version} (including"
@@ -159,10 +171,11 @@ class ReleaseLib:
        print(f"Pull request created: {pr.html_url}")
        return pr
 class ReleaseDataFactory:
    """A factory for ReleaseData objects."""
-    lib_versions: Dict[str, str] = { }
+    lib_versions: Dict[str, str] = {}
    """A map of commit hashes to lib versions."""
    def __init__(
@@ -176,7 +189,9 @@ class ReleaseDataFactory:
        else:
            self.org, self.pr_org = self.get_org_or_user(org_name)
-    def get_org_or_user(self, name: str) -> Tuple[Union[NamedUser, Organization], Union[NamedUser, Organization]]:
+    def get_org_or_user(
        self, name: str
    ) -> Tuple[Union[NamedUser, Organization], Union[NamedUser, Organization]]:
        """Get a Github organization or user by name."""
        gh_ns: Union[NamedUser, Organization]
        pr_ns: Union[NamedUser, Organization]
@@ -188,12 +203,10 @@ class ReleaseDataFactory:
                gh_ns = self.gh.get_user(name)
                pr_ns = self.pr_gh.get_user(name)
            except UnknownObjectException as err:
-                raise ValueError(
+                raise ValueError(f"Could not find organization/user {name}.") from err
                    f"Could not find organization/user {name}."
                ) from err
        return gh_ns, pr_ns
-    def create_data(
+    def create_release_lib_data(
        self,
        name: str,
        commit: str,
@@ -219,6 +232,7 @@ class ReleaseDataFactory:
        )
        return data
@dataclass
 class ReleaseBundle:
    """Stores data about all the libraries bundled in this release."""
@@ -226,6 +240,7 @@ class ReleaseBundle:
    version: str = ""
    libraries: Dict[str, ReleaseLib] = field(default_factory=ReleaseLib)
 class ReleaseBundleFactory:
    gh: Github = None
@@ -234,16 +249,18 @@ class ReleaseBundleFactory:
    default_remote: str = ""
    """The default fallback remote."""
-    remotes: Dict[str, str] = { }
+    remotes: Dict[str, str] = {}
    """A dictionary translating the manifest remote shorthand to the full name."""
-    tags: Dict[str, Dict[Version, str]] = { }
+    tags: Dict[str, Dict[Version, str]] = {}
    """A dictionary with all the ROCm version numbers and commit sha for each library."""
-    orgs_and_users: Dict[str, Tuple[Union[NamedUser, Organization], Union[NamedUser, Organization]]] = { }
+    orgs_and_users: Dict[
        str, Tuple[Union[NamedUser, Organization], Union[NamedUser, Organization]]
    ] = {}
    """A dictionary containing the base and PR user or organization for each project."""
-    pr_repos: Dict[str, Tuple[Repo, Repo]] = { }
+    pr_repos: Dict[str, Tuple[Repo, Repo]] = {}
    """A dictionary containing the base and PR repo for each project."""
    def __init__(
@@ -253,15 +270,15 @@ class ReleaseBundleFactory:
        pr_gh: Github,
        default_remote: str,
        remotes: Dict[str, str],
-        branch: Optional[str]
+        branch: Optional[str],
    ):
        # Store Github data
-        self.gh    = gh
+        self.gh = gh
        self.pr_gh = pr_gh
        self.default_remote = default_remote
-        self.remotes        = remotes
+        self.remotes = remotes
-        self.branch         = branch
+        self.branch = branch
        # Get the main repository:
        self.rocm_repo = gh.get_repo(rocm_repo)
@@ -271,8 +288,10 @@ class ReleaseBundleFactory:
        if remote in self.remotes:
            return self.remotes[remote]
        return self.default_remote
-    
+
-    def get_org_or_user(self, remote: str) -> Tuple[Union[NamedUser, Organization], Union[NamedUser, Organization]]:
+    def get_org_or_user(
        self, remote: str
    ) -> Tuple[Union[NamedUser, Organization], Union[NamedUser, Organization]]:
        """Gets the base and PR organization or user associated to a remote."""
        if remote not in self.orgs_and_users:
            try:
@@ -329,7 +348,7 @@ class ReleaseBundleFactory:
    def fetch_tags(self, url: str) -> Dict[Version, str]:
        """Fetches a version-sha map for a given Git URL."""
-        result: Dict[Version, str] = { }
+        result: Dict[Version, str] = {}
        for line in Git().ls_remote("--tags", url).split("\n"):
            column = line.split("\t")
            sha = column[0]
@@ -344,20 +363,23 @@ class ReleaseBundleFactory:
            result[Version(rocm_ver)] = sha
        return result
-    def create_data(
+    def create_release_bundle_data(
        self,
        version: Version,
-        names_and_remotes: List[Tuple[str, str]],
+        component_info: List[Tuple[str, str]],
-        is_untagged: bool=False
+        is_untagged: bool = False,
    ) -> ReleaseBundle:
        """Create a release bundle of libraries."""
        tag_name = f"rocm-{version}"
-        libraries = { }
+        libraries = {}
        missing_branches = []
-        
+
        prev_group = None
        prev_category = None
        print(f"\nLibraries for rocm-{version}:")
-        for name, remote in names_and_remotes:
+        for name, remote, group, category in component_info:
            repo, pr_repo = self.get_repos(name, remote)
            # Find the tag and otherwise
@@ -375,20 +397,30 @@ class ReleaseBundleFactory:
                    print(f"  - Could not find branch : {self.branch}")
                    missing_branches.append(f"{self.branch} for {name}")
                    continue
-            
+
            if prev_group == group:
                group = ""
            else:
                prev_group = group
            if prev_category == category:
                category = ""
            else:
                prev_category = category
            libraries[name] = ReleaseLib(
                name=name,
                repo=repo,
                pr_repo=pr_repo,
                commit=commit,
                rocm_version=str(version),
                group=group_mapping[group],
                category=category_mapping[category],
            )
            print(f"- {name:11} {commit}")
-        data = ReleaseBundle(
+        data = ReleaseBundle(version=version, libraries=libraries)
            version=version,
            libraries=libraries
        )
        for missing in missing_branches:
            print(f"Could not find the following branch: {missing}")
@@ -398,8 +430,8 @@ class ReleaseBundleFactory:
    def create_data_dict(
        self,
        up_to_version: str,
-        names_and_remotes: List[Tuple[str, str]],
+        component_information: List[Tuple[str, str]],
-        min_version: str = "5.0.0"
+        min_version: str = "5.0.0",
    ) -> Dict[str, ReleaseBundle]:
        """Create a map of versions and release bundles."""
@@ -417,6 +449,8 @@ class ReleaseBundleFactory:
        for version in versions:
            if version >= Version(min_version) and version <= max_version:
                can_be_untagged = version == max_version
-                data[str(version)] = self.create_data(version, names_and_remotes, can_be_untagged)
+                data[str(version)] = self.create_release_bundle_data(
                    version, component_information, can_be_untagged
                )
        return data
@@ -1,2 +1,2 @@
 rocm-docs-core==1.6.1
 sphinx-reredirects