mirror of
https://github.com/ROCm/ROCm.git
synced 2026-01-09 14:48:06 -05:00
Merge pull request #3163 from peterjunpark/docs/6.1.1
docs/6.1.1: Re-add glossary to hardware specification table (#3073)
@@ -1,9 +1,9 @@
.. meta::
   :description: AMD Instinct™ accelerator, AMD Radeon PRO™, and AMD Radeon™ GPU architecture information
   :keywords: Instinct, Radeon, accelerator, CDNA, GPU, architecture, VRAM, Compute Units, Cache, Registers, LDS, Register File
   :keywords: Instinct, Radeon, accelerator, GCN, CDNA, RDNA, GPU, architecture, VRAM, Compute Units, Cache, Registers, LDS, Register File

Accelerator and GPU hardware specifications
######################################################
===========================================

The following tables provide an overview of the hardware specifications for AMD Instinct™ accelerators, and AMD Radeon™ PRO and Radeon™ GPUs.

@@ -659,4 +659,100 @@ The following tables provide an overview of the hardware specifications for AMD
- 256
- 12.5

For more information on the terms used here, see the :ref:`specific documents and guides <gpu-arch-documentation>` or :doc:`Understanding the HIP programming model<hip:understand/programming_model>`.

Glossary
========

For more information about the terms used, see the
:ref:`specific documents and guides <gpu-arch-documentation>`, or
:doc:`Understanding the HIP programming model<hip:understand/programming_model>`.

**LLVM target name**

Argument to pass to clang in ``--offload-arch`` to compile code for the given
architecture.

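As a minimal sketch of how the target name is used (the ``gfx90a`` target and the ``saxpy.cpp`` file name are illustrative, not taken from this document):

```shell
# Compile a HIP source for one architecture by passing its LLVM target
# name to --offload-arch. gfx90a is only an example; substitute the
# target name listed for your accelerator in the tables above.
hipcc --offload-arch=gfx90a saxpy.cpp -o saxpy

# Equivalent direct clang invocation:
clang++ -x hip --offload-arch=gfx90a saxpy.cpp -o saxpy
```

Passing ``--offload-arch`` more than once compiles device code for each named architecture into the same binary.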
**VRAM**

Amount of memory available on the GPU.

**Compute Units**

Number of compute units on the GPU.

**Wavefront Size**

Number of work items that execute in parallel on a single compute unit. This
is equivalent to the warp size in HIP.

**LDS**

The Local Data Share (LDS) is a low-latency, high-bandwidth scratch pad
memory. It is local to the compute units, and can be shared by all work items
in a work group. In HIP, the LDS can be used for shared memory, which is
shared by all threads in a block.

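As an illustration of how the LDS surfaces in HIP, a hypothetical reduction kernel (the kernel name and tile size are invented for this sketch) that stages data in ``__shared__`` memory:

```cpp
#include <hip/hip_runtime.h>

// Illustrative only: each block stages its inputs in __shared__ memory,
// which the compiler places in the LDS on AMD GPUs.
__global__ void blockSum(const float* in, float* out)
{
    // One float per work item of the work group, allocated in the LDS.
    __shared__ float tile[256];

    int tid = threadIdx.x;
    tile[tid] = in[blockIdx.x * blockDim.x + tid];
    __syncthreads();  // all work items in the work group see the LDS writes

    // Tree reduction within the work group, entirely in the LDS.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            tile[tid] += tile[tid + stride];
        __syncthreads();
    }
    if (tid == 0)
        out[blockIdx.x] = tile[0];
}
```

Inside such a kernel, the built-in ``warpSize`` reports the wavefront size of the device the code is running on.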
**L3 Cache (CDNA/GCN only)**

Size of the level 3 cache. Shared by all compute units on the same GPU. Caches
data and instructions. Similar to the Infinity Cache on RDNA architectures.

**Infinity Cache (RDNA only)**

Size of the Infinity Cache. Shared by all compute units on the same GPU. Caches
data and instructions. Similar to the L3 Cache on CDNA/GCN architectures.

**L2 Cache**

Size of the level 2 cache. Shared by all compute units on the same GCD. Caches
data and instructions.

**Graphics L1 Cache (RDNA only)**

An additional cache level that only exists in RDNA architectures. Local to a
work group processor.

**L1 Vector Cache (CDNA/GCN only)**

Size of the level 1 vector data cache. Local to a compute unit. This is the L0
vector cache in RDNA architectures.

**L1 Scalar Cache (CDNA/GCN only)**

Size of the level 1 scalar data cache. Usually shared by several compute
units. This is the L0 scalar cache in RDNA architectures.

**L1 Instruction Cache (CDNA/GCN only)**

Size of the level 1 instruction cache. Usually shared by several compute
units. This is the L0 instruction cache in RDNA architectures.

**L0 Vector Cache (RDNA only)**

Size of the level 0 vector data cache. Local to a compute unit. This is the L1
vector cache in CDNA/GCN architectures.

**L0 Scalar Cache (RDNA only)**

Size of the level 0 scalar data cache. Usually shared by several compute
units. This is the L1 scalar cache in CDNA/GCN architectures.

**L0 Instruction Cache (RDNA only)**

Size of the level 0 instruction cache. Usually shared by several compute
units. This is the L1 instruction cache in CDNA/GCN architectures.

**VGPR File**

Size of the Vector General Purpose Register (VGPR) file. It holds data used in
vector instructions.
GPUs with matrix cores also have AccVGPRs, which are Accumulation Vector
General Purpose Registers, used specifically in matrix instructions.

**SGPR File**

Size of the Scalar General Purpose Register (SGPR) file. Holds data used in
scalar instructions.

**GCD**

Graphics Compute Die.