Merge pull request #3163 from peterjunpark/docs/6.1.1

docs/6.1.1: Re-add glossary to hardware specification table (#3073)
Peter Park authored on 2024-05-27 14:15:42 -07:00; committed by GitHub.


@@ -1,9 +1,9 @@
.. meta::
:description: AMD Instinct™ accelerator, AMD Radeon PRO™, and AMD Radeon™ GPU architecture information
:keywords: Instinct, Radeon, accelerator, CDNA, GPU, architecture, VRAM, Compute Units, Cache, Registers, LDS, Register File
:keywords: Instinct, Radeon, accelerator, GCN, CDNA, RDNA, GPU, architecture, VRAM, Compute Units, Cache, Registers, LDS, Register File
Accelerator and GPU hardware specifications
######################################################
===========================================
The following tables provide an overview of the hardware specifications for AMD Instinct™ accelerators, and AMD Radeon™ PRO and Radeon™ GPUs.
@@ -659,4 +659,100 @@ The following tables provide an overview of the hardware specifications for AMD
- 256
- 12.5
For more information on the terms used here, see the :ref:`specific documents and guides <gpu-arch-documentation>` or :doc:`Understanding the HIP programming model<hip:understand/programming_model>`.
Glossary
========
For more information about the terms used, see the
:ref:`specific documents and guides <gpu-arch-documentation>`, or
:doc:`Understanding the HIP programming model<hip:understand/programming_model>`.
**LLVM target name**
The argument to pass to clang's ``--offload-arch`` option to compile code
for the given architecture.
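For example, code can be compiled for a specific accelerator by passing its
LLVM target name. A minimal sketch, assuming ``gfx90a`` (the MI200-series
target) and a hypothetical source file ``saxpy.hip``:

.. code-block:: cpp

   // saxpy.hip -- trivial kernel used to demonstrate targeting one
   // architecture. Hypothetical compile line:
   //   clang++ -x hip --offload-arch=gfx90a saxpy.hip -o saxpy
   #include <hip/hip_runtime.h>

   __global__ void saxpy(float a, const float* x, float* y, int n) {
       int i = blockIdx.x * blockDim.x + threadIdx.x;
       if (i < n) y[i] = a * x[i] + y[i];
   }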
**VRAM**
Amount of memory available on the GPU.
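How much of that memory is free can be queried at runtime. A minimal sketch
using the HIP runtime's ``hipMemGetInfo``, assuming the default device is
the one of interest:

.. code-block:: cpp

   // Query free and total VRAM on the current device.
   #include <hip/hip_runtime.h>
   #include <cstdio>

   int main() {
       size_t free_bytes = 0, total_bytes = 0;
       if (hipMemGetInfo(&free_bytes, &total_bytes) == hipSuccess) {
           std::printf("VRAM: %zu MiB free of %zu MiB total\n",
                       free_bytes >> 20, total_bytes >> 20);
       }
       return 0;
   }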
**Compute Units**
Number of compute units on the GPU.
**Wavefront Size**
Number of work items that execute in parallel on a single compute unit. This
is equivalent to the warp size in HIP.
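Rather than hard-coding the value, it can be read from the device
properties, since CDNA accelerators typically report 64 while RDNA GPUs
report 32. A minimal sketch:

.. code-block:: cpp

   // Print the wavefront (warp) size reported by device 0.
   #include <hip/hip_runtime.h>
   #include <cstdio>

   int main() {
       hipDeviceProp_t props;
       if (hipGetDeviceProperties(&props, 0) == hipSuccess) {
           std::printf("Wavefront size: %d\n", props.warpSize);
       }
       return 0;
   }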
**LDS**
The Local Data Share (LDS) is a low-latency, high-bandwidth scratchpad
memory. It is local to a compute unit and can be shared by all work items
in a work group. In HIP, the LDS backs shared memory, which is shared by
all threads in a block.
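As an illustration, the sketch below stages a tile of data in the LDS
through HIP's ``__shared__`` qualifier; the 256-element tile and the
assumption of 256 threads per block are arbitrary choices for the example:

.. code-block:: cpp

   // Each block loads a tile into LDS, synchronizes, and writes it back
   // reversed. Assumes blockDim.x == 256.
   #include <hip/hip_runtime.h>

   __global__ void reverse_tile(float* data) {
       __shared__ float tile[256];   // lives in the compute unit's LDS
       int t = threadIdx.x;
       int g = blockIdx.x * blockDim.x + t;
       tile[t] = data[g];
       __syncthreads();              // the whole work group sees the tile
       data[g] = tile[255 - t];
   }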
**L3 Cache (CDNA/GCN only)**
Size of the level 3 cache. Shared by all compute units on the same GPU. Caches
data and instructions. Similar to the Infinity Cache on RDNA architectures.
**Infinity Cache (RDNA only)**
Size of the Infinity Cache. Shared by all compute units on the same GPU. Caches
data and instructions. Similar to the L3 Cache on CDNA/GCN architectures.
**L2 Cache**
Size of the level 2 cache. Shared by all compute units on the same GCD. Caches
data and instructions.
**Graphics L1 Cache (RDNA only)**
An additional cache level that only exists in RDNA architectures. Local to a
work group processor.
**L1 Vector Cache (CDNA/GCN only)**
Size of the level 1 vector data cache. Local to a compute unit. This is the L0
vector cache in RDNA architectures.
**L1 Scalar Cache (CDNA/GCN only)**
Size of the level 1 scalar data cache. Usually shared by several compute
units. This is the L0 scalar cache in RDNA architectures.
**L1 Instruction Cache (CDNA/GCN only)**
Size of the level 1 instruction cache. Usually shared by several compute
units. This is the L0 instruction cache in RDNA architectures.
**L0 Vector Cache (RDNA only)**
Size of the level 0 vector data cache. Local to a compute unit. This is the L1
vector cache in CDNA/GCN architectures.
**L0 Scalar Cache (RDNA only)**
Size of the level 0 scalar data cache. Usually shared by several compute
units. This is the L1 scalar cache in CDNA/GCN architectures.
**L0 Instruction Cache (RDNA only)**
Size of the level 0 instruction cache. Usually shared by several compute
units. This is the L1 instruction cache in CDNA/GCN architectures.
**VGPR File**
Size of the Vector General Purpose Register (VGPR) file. It holds data used
in vector instructions.
GPUs with matrix cores also have AccVGPRs, or Accumulation Vector General
Purpose Registers, which are used specifically in matrix instructions.
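Because the VGPR file is shared by all wavefronts resident on a compute
unit, per-work-item register use affects occupancy. As a hedged
illustration, HIP's ``__launch_bounds__`` qualifier states a maximum block
size so the compiler can budget registers accordingly (the bound of 256
is an arbitrary example):

.. code-block:: cpp

   // __launch_bounds__ promises the kernel is never launched with more
   // than 256 threads per block, letting the compiler allocate VGPRs
   // per work item without lowering occupancy below its target.
   #include <hip/hip_runtime.h>

   __global__ void __launch_bounds__(256)
   scale(float* v, float a, int n) {
       int i = blockIdx.x * blockDim.x + threadIdx.x;
       if (i < n) v[i] *= a;
   }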
**SGPR File**
Size of the Scalar General Purpose Register (SGPR) file. Holds data used in
scalar instructions.
**GCD**
Graphics Compute Die.