diff --git a/docs/reference/gpu-arch-specs.rst b/docs/reference/gpu-arch-specs.rst
index df9068c48..8686e3303 100644
--- a/docs/reference/gpu-arch-specs.rst
+++ b/docs/reference/gpu-arch-specs.rst
@@ -1,9 +1,9 @@
 .. meta::
   :description: AMD Instinct™ accelerator, AMD Radeon PRO™, and AMD Radeon™ GPU architecture information
-  :keywords: Instinct, Radeon, accelerator, CDNA, GPU, architecture, VRAM, Compute Units, Cache, Registers, LDS, Register File
+  :keywords: Instinct, Radeon, accelerator, GCN, CDNA, RDNA, GPU, architecture, VRAM, Compute Units, Cache, Registers, LDS, Register File

 Accelerator and GPU hardware specifications
-######################################################
+===========================================

 The following tables provide an overview of the hardware specifications for AMD
 Instinct™ accelerators, and AMD Radeon™ PRO and Radeon™ GPUs.
@@ -659,4 +659,100 @@ The following tables provide an overview of the hardware specifications for AMD
      - 256
      - 12.5

-For more information on the terms used here, see the :ref:`specific documents and guides ` or :doc:`Understanding the HIP programming model`.
+Glossary
+========
+
+For more information about the terms used, see the
+:ref:`specific documents and guides `, or
+:doc:`Understanding the HIP programming model`.
+
+**LLVM target name**
+
+The argument to pass to ``clang`` in ``--offload-arch`` to compile code for
+the given architecture.
+
+**VRAM**
+
+Amount of memory available on the GPU.
+
+**Compute Units**
+
+Number of compute units on the GPU.
+
+**Wavefront Size**
+
+Number of work items that execute in parallel on a single compute unit. This
+is equivalent to the warp size in HIP.
+
+**LDS**
+
+The Local Data Share (LDS) is a low-latency, high-bandwidth scratch pad
+memory. It is local to a compute unit and can be shared by all work items
+in a work group. In HIP, the LDS is used for shared memory, which is
+shared by all threads in a block.
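+As a sketch of how the LDS is used from HIP: shared memory is declared with
+``__shared__`` and is visible to all threads in a block. The kernel below is
+illustrative only (the name, the block size of 64, and the layout are
+assumptions, not part of any particular API):
+
+.. code-block:: cpp
+
+   #include <hip/hip_runtime.h>
+
+   // Illustrative kernel: stage one block's input in LDS-backed shared
+   // memory so every thread in the block can read the whole tile.
+   __global__ void reverse_block(float* data)
+   {
+       __shared__ float tile[64];            // allocated in the LDS
+       const unsigned t = threadIdx.x;       // assumes blockDim.x == 64
+       tile[t] = data[blockIdx.x * blockDim.x + t];
+       __syncthreads();                      // wait for the whole work group
+       data[blockIdx.x * blockDim.x + t] = tile[blockDim.x - 1 - t];
+   }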
+
+**L3 Cache (CDNA/GCN only)**
+
+Size of the level 3 cache. Shared by all compute units on the same GPU. Caches
+data and instructions. Similar to the Infinity Cache on RDNA architectures.
+
+**Infinity Cache (RDNA only)**
+
+Size of the Infinity Cache. Shared by all compute units on the same GPU.
+Caches data and instructions. Similar to the L3 cache on CDNA/GCN
+architectures.
+
+**L2 Cache**
+
+Size of the level 2 cache. Shared by all compute units on the same GCD. Caches
+data and instructions.
+
+**Graphics L1 Cache (RDNA only)**
+
+An additional cache level that only exists in RDNA architectures. Local to a
+work group processor.
+
+**L1 Vector Cache (CDNA/GCN only)**
+
+Size of the level 1 vector data cache. Local to a compute unit. This is the L0
+vector cache in RDNA architectures.
+
+**L1 Scalar Cache (CDNA/GCN only)**
+
+Size of the level 1 scalar data cache. Usually shared by several compute
+units. This is the L0 scalar cache in RDNA architectures.
+
+**L1 Instruction Cache (CDNA/GCN only)**
+
+Size of the level 1 instruction cache. Usually shared by several compute
+units. This is the L0 instruction cache in RDNA architectures.
+
+**L0 Vector Cache (RDNA only)**
+
+Size of the level 0 vector data cache. Local to a compute unit. This is the L1
+vector cache in CDNA/GCN architectures.
+
+**L0 Scalar Cache (RDNA only)**
+
+Size of the level 0 scalar data cache. Usually shared by several compute
+units. This is the L1 scalar cache in CDNA/GCN architectures.
+
+**L0 Instruction Cache (RDNA only)**
+
+Size of the level 0 instruction cache. Usually shared by several compute
+units. This is the L1 instruction cache in CDNA/GCN architectures.
+
+**VGPR File**
+
+Size of the Vector General Purpose Register (VGPR) file. It holds data used in
+vector instructions.
+GPUs with matrix cores also have AccVGPRs (Accumulation Vector General
+Purpose Registers), which are used specifically in matrix instructions.
+
+**SGPR File**
+
+Size of the Scalar General Purpose Register (SGPR) file. Holds data used in
+scalar instructions.
+
+**GCD**
+
+Graphics Compute Die.
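+Several of the quantities above can be queried at run time through the HIP
+device-properties API. A minimal sketch, assuming a working ROCm install and
+at least one visible device (compiled with, for example,
+``hipcc --offload-arch=gfx90a``):
+
+.. code-block:: cpp
+
+   #include <cstdio>
+   #include <hip/hip_runtime.h>
+
+   int main()
+   {
+       hipDeviceProp_t prop;
+       if (hipGetDeviceProperties(&prop, 0) != hipSuccess)
+           return 1;
+       std::printf("LLVM target name: %s\n", prop.gcnArchName);
+       std::printf("compute units:    %d\n", prop.multiProcessorCount);
+       std::printf("wavefront size:   %d\n", prop.warpSize);
+       std::printf("LDS per block:    %zu\n", prop.sharedMemPerBlock);
+       std::printf("L2 cache size:    %d\n", prop.l2CacheSize);
+       return 0;
+   }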