diff --git a/.wordlist.txt b/.wordlist.txt
index c6bf56501..28d35a018 100644
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -1,4 +1,11 @@
 GEMM
+FFTs
+GEMMs
+ROCk
+SIGQUIT
+backend
+conformant
+optimizers
 autogenerated
 cuFFT
 NVCC
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8d9685d04..d7be8bae3 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2072,7 +2072,7 @@ rocBLAS 2.46.0 for ROCm 5.4.0
 ##### Fixed
-- FORTRAN interfaces generalized for FORTRAN compilers other than gfortran
+- Fortran interfaces generalized for Fortran compilers other than gfortran
 - fix for trsm_strided_batched rocblas-bench performance gathering
 - Fix for rocm-smi path in commandrunner.py script to match ROCm 5.2 and above
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
index 8d9685d04..d7be8bae3 100644
--- a/docs/CHANGELOG.md
+++ b/docs/CHANGELOG.md
@@ -2072,7 +2072,7 @@ rocBLAS 2.46.0 for ROCm 5.4.0
 ##### Fixed
-- FORTRAN interfaces generalized for FORTRAN compilers other than gfortran
+- Fortran interfaces generalized for Fortran compilers other than gfortran
 - fix for trsm_strided_batched rocblas-bench performance gathering
 - Fix for rocm-smi path in commandrunner.py script to match ROCm 5.2 and above
diff --git a/docs/reference/index.md b/docs/reference/index.md
index 473f93184..e23ff8e73 100644
--- a/docs/reference/index.md
+++ b/docs/reference/index.md
@@ -22,7 +22,7 @@ HIP is both AMD's GPU programming language extension and the GPU runtime.
 HIP Math Libraries support the following domains:
 * [Linear Algebra Libraries](./libraries/gpu-libraries/math-linear-algebra.md)
-* [Fast Fourier Transforms](./libraries/gpu-libraries/math-fft.md)
+* [Fast Fourier transforms (FFTs)](./libraries/gpu-libraries/math-fft.md)
 * [Random Numbers](./libraries/gpu-libraries/rand.md)
 :::
diff --git a/docs/reference/libraries/gpu-libraries/math-fft.md b/docs/reference/libraries/gpu-libraries/math-fft.md
index f0cc0fd50..11bc2e710 100644
--- a/docs/reference/libraries/gpu-libraries/math-fft.md
+++ b/docs/reference/libraries/gpu-libraries/math-fft.md
@@ -1,4 +1,4 @@
-# Fast Fourier Transforms
+# Fast Fourier transforms
 ROCm libraries for FFT are as follows:
diff --git a/docs/reference/libraries/gpu-libraries/math.md b/docs/reference/libraries/gpu-libraries/math.md
index 6dc7f2bf8..0993f0e39 100644
--- a/docs/reference/libraries/gpu-libraries/math.md
+++ b/docs/reference/libraries/gpu-libraries/math.md
@@ -32,7 +32,7 @@ at compile-time of the hipLIB in question. For dynamic dispatch between vendor i
 :::
 :::{grid-item-card}
-**[Fast Fourier Transforms](./math-fft.md)**
+**[Fast Fourier transforms (FFTs)](./math-fft.md)**
 * {doc}`rocFFT `
 * {doc}`hipFFT `
diff --git a/docs/rocm-a-z.md b/docs/rocm-a-z.md
index fc76e734e..4a0f0e20d 100644
--- a/docs/rocm-a-z.md
+++ b/docs/rocm-a-z.md
@@ -5,42 +5,59 @@
 | ROCm product | Description |
 | :---------------- | :------------ |
+| [AMD Compute Language Runtimes (CLR)](https://github.com/ROCm-Developer-Tools/clr) | Contains source code for AMD's compute languages runtimes: {doc}`HIP ` and OpenCL |
 | [AMDMIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/) | A graph inference engine that accelerates machine learning model inference |
+| [AOMP](https://github.com/ROCm-Developer-Tools/aomp/) | A scripted build of [LLVM](https://github.com/RadeonOpenCompute/llvm-project) and supporting software |
+| [Asynchronous Task and Memory Interface (ATMI)](https://github.com/RadeonOpenCompute/atmi/) | A runtime framework for efficient task management in heterogeneous CPU-GPU systems |
 | [Composable Kernel](https://rocm.docs.amd.com/projects/composable_kernel/en/latest/) | A library that aims to provide a programming model for writing performance critical kernels for machine learning workloads across multiple architectures |
+| [Flang](https://github.com/ROCm-Developer-Tools/flang/) | An out-of-tree Fortran compiler targeting LLVM |
+| [Half-precision floating point library (half)](https://github.com/ROCmSoftwarePlatform/half/) | A C++ header-only library that provides an IEEE 754 conformant, 16-bit half-precision floating-point type along with corresponding arithmetic operators, type conversions, and common mathematical functions |
 | {doc}`HIP ` | AMD’s GPU programming language extension and the GPU runtime |
 | [hipBLAS](https://github.com/ROCmSoftwarePlatform/hipBLAS/) | A BLAS-marshaling library that supports [rocBLAS](https://rocm.docs.amd.com/projects/rocBLAS/en/latest/) and cuBLAS backends |
 | [HIPCC](https://rocm.docs.amd.com/projects/HIPCC/en/latest/) | A compiler driver utility that calls Clang or NVCC and passes the appropriate include and library options for the target compiler and HIP infrastructure |
 | [hipCUB](https://rocm.docs.amd.com/projects/hipCUB/en/latest/) | A thin header-only wrapper library on top of [rocPRIM](https://rocm.docs.amd.com/projects/rocPRIM/en/latest/) or CUB that allows project porting using the CUB library to the HIP layer |
 | [hipFFT](https://rocm.docs.amd.com/projects/hipFFT/en/latest/) | An FFT-marshalling library that supports rocFFT or cuFFT backends |
+| [hipfort](https://rocm.docs.amd.com/projects/hipfort/en/latest/) | A Fortran interface library for accessing GPU Kernels |
 | {doc}`HIPIFY ` | A set of tools for translating CUDA source code into portable HIP C++ |
 | [hipify-clang](https://rocm.docs.amd.com/projects/HIPIFY/en/latest/hipify-clang.html) | A Clang-based tool for translating CUDA sources into HIP sources |
 | [hipify-perl](https://rocm.docs.amd.com/projects/HIPIFY/en/latest/hipify-perl.html) | An autogenerated, perl-based script that translates CUDA source code into portable HIP C++ |
 | [hipSOLVER](https://rocm.docs.amd.com/projects/hipSOLVER/en/latest/) | A LAPACK-marshalling library that supports [rocSOLVER](https://rocm.docs.amd.com/projects/rocSOLVER/en/latest/) and cuSOLVER backends |
 | [hipSPARSE](https://rocm.docs.amd.com/projects/hipSPARSE/en/latest/) | A SPARSE-marshalling library that supports [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/latest/) and cuSPARSE backends |
+| [hipTensor](https://github.com/ROCmSoftwarePlatform/hipTensor) | AMD's C++ library for accelerating tensor primitives based on the composable kernel library |
+| [LLVM](https://github.com/RadeonOpenCompute/llvm-project) | A toolkit for the construction of highly optimized compilers, optimizers, and run-time environments |
 | [MIGraphX](https://rocm.docs.amd.com/projects/AMDMIGraphX/en/latest/) | A graph inference engine that accelerates machine learning model inference |
 | [MIOpen](https://rocm.docs.amd.com/projects/MIOpen/en/latest/) | An open source deep-learning library |
 | [MIOpenGEMM](https://github.com/ROCmSoftwarePlatform/MIOpenGEMM) | An OpenCL general matrix multiplication (GEMM) API and kernel generator |
 | [MIOpenTensile](https://github.com/ROCmSoftwarePlatform/MIOpenTensile) | Provides host-callable interfaces to Tensile library |
 | [MIVisionX](https://rocm.docs.amd.com/projects/MIVisionX/en/latest/doxygen/html/index.html) | A set of comprehensive computer vision and machine learning libraries, utilities, and applications |
-| [Radeon Compute Profiler (RCP)](https://github.com/GPUOpen-Tools/radeon_compute_profiler/) | A performance analysis tool that gathers data from the API run-time and GPU for OpenCL™ and ROCm/HSA applications |
+| [Radeon Compute Profiler (RCP)](https://github.com/GPUOpen-Tools/radeon_compute_profiler/) | A performance analysis tool that gathers data from the API run-time and GPU for OpenCL and ROCm/HSA applications |
 | [RCCL](https://rocm.docs.amd.com/projects/rccl/en/latest/) | A standalone library that provides multi-GPU and multi-node collective communication primitives |
 | [rocAL](https://rocm.docs.amd.com/projects/rocAL/en/latest/doxygen/html/index.html) | An augmentation library designed to decode and process images and videos |
+| [rocALUTION](https://rocm.docs.amd.com/projects/rocALUTION/en/latest/) | A sparse linear algebra library for exploring fine-grained parallelism on ROCm runtime and toolchains |
+| [RocBandwidthTest](https://github.com/RadeonOpenCompute/rocm_bandwidth_test/) | Captures the performance characteristics of buffer copying and kernel read/write operations |
 | [rocBLAS](https://rocm.docs.amd.com/projects/rocBLAS/en/latest/)| A BLAS implementation (in the HIP programming language) on ROCm's runtime and toolchains |
+| [rocFFT](https://rocm.docs.amd.com/projects/rocFFT/en/latest/) | A software library for computing fast Fourier transforms (FFTs) written in HIP |
 | [ROCK-Kernel-Driver](https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/) | An AMDGPU Driver with KFD that is used by ROCm |
 | [ROCmCC](https://rocm.docs.amd.com/en/latest/reference/rocmcc/rocmcc.html) | A Clang/LLVM-based compiler |
 | [ROCm cmake](https://github.com/RadeonOpenCompute/rocm-cmake) | A collection of CMake modules for common build and development tasks |
 | [ROCm Data Center Tool](https://rocm.docs.amd.com/projects/rdc/en/latest/) | Simplifies administration and addresses key infrastructure challenges in AMD GPUs in cluster and data-center environments |
+| [ROCm Debug Agent Library (ROCdebug-agent)](https://github.com/ROCm-Developer-Tools/rocr_debug_agent/) | A library that can print the state of all AMD GPU wavefronts that caused a queue error by sending a SIGQUIT signal to the process while the program is running |
 | [ROCm Debugger (ROCgdb)](https://rocm.docs.amd.com/projects/ROCgdb/en/latest/) | A source-level debugger for Linux, based on the GNU Debugger (GDB) |
 | [ROCm Debugger API (ROCdbgapi)](https://rocm.docs.amd.com/projects/ROCdbgapi/en/latest/) | The ROCm debugger library |
+| [rocminfo](https://github.com/RadeonOpenCompute/rocminfo/) | Reports system information |
 | [ROCm SMI](https://github.com/RadeonOpenCompute/rocm_smi_lib/) | A C library for Linux that provides a user space interface for applications to monitor and control GPU applications |
 | [ROCm Validation Suite](https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/latest/) | A tool for detecting and troubleshooting common problems affecting AMD GPUs running in a high-performance computing environment |
 | [rocPRIM](https://rocm.docs.amd.com/projects/rocPRIM/en/latest/) | A header-only library for HIP parallel primitives |
 | [ROCProfiler](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/profiler_home_page.html) | A profiling tool for HIP applications |
+| [rocRAND](https://rocm.docs.amd.com/projects/rocRAND/en/latest/) | Provides functions that generate pseudo-random and quasi-random numbers |
 | [ROCR-Runtime](https://github.com/RadeonOpenCompute/ROCR-Runtime/) | User-mode API interfaces and libraries necessary for host applications to launch compute kernels on available HSA ROCm kernel agents |
 | [rocSOLVER](https://rocm.docs.amd.com/projects/rocSOLVER/en/latest/) | An implementation of LAPACK routines on the ROCm platform, implemented in the HIP programming language and optimized for AMD’s latest discrete GPUs |
 | [rocSPARSE](https://rocm.docs.amd.com/projects/rocSPARSE/en/latest/) | Exposes a common interface that provides BLAS for sparse computation implemented on ROCm runtime and toolchains (in the HIP programming language) |
 | [rocThrust](https://rocm.docs.amd.com/projects/rocThrust/en/latest/) | A parallel algorithm library |
+| [ROCT-Thunk-Interface](https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/) | User-mode API interfaces used to interact with the ROCk driver |
 | [ROCTracer](https://rocm.docs.amd.com/projects/roctracer/en/latest/) | Intercepts runtime API calls and traces asynchronous activity |
+| [rocWMMA](https://rocm.docs.amd.com/projects/rocWMMA/en/latest/index.html) | A C++ library for accelerating mixed-precision matrix multiply-accumulate (MMA) operations |
+| [Tensile](https://github.com/ROCmSoftwarePlatform/Tensile) | A tool for creating benchmark-driven backend libraries for GEMMs, GEMM-like problems, and general N-dimensional tensor contractions |
 | [TransferBench](https://rocm.docs.amd.com/projects/TransferBench/en/latest/) | A utility to benchmark simultaneous transfers between user-specified devices (CPUs/GPUs) |
 :::
diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in
index 8c9da7736..f8ea72e9b 100644
--- a/docs/sphinx/_toc.yml.in
+++ b/docs/sphinx/_toc.yml.in
@@ -10,8 +10,8 @@ subtrees:
 title: What is ROCm?
 subtrees:
 - entries:
- - file: rocm-ai.md
- title: ROCm & AI
+ - file: rocm-a-z.md
+ title: ROCm A-Z
 - file: about/whats-new/whats-new.md
 title: What's new?
@@ -218,9 +218,6 @@ subtrees:
 - file: conceptual/ai-migraphx-optimization.md
 title: Inference optimization with MIGraphX
- - file: rocm-a-z.md
- title: ROCm A-Z
-
 - file: contribute/index.md
 title: Contributing
 subtrees: