diff --git a/.wordlist.txt b/.wordlist.txt
index 6fb970a93..abbbf3c0f 100644
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -132,6 +132,7 @@ GDS
 GEMM
 GEMMs
 GFortran
+GFXIP
 Gemma
 GiB
 GIM
diff --git a/docs/reference/gpu-arch-specs.rst b/docs/reference/gpu-arch-specs.rst
index ef689fc58..002bcbc42 100644
--- a/docs/reference/gpu-arch-specs.rst
+++ b/docs/reference/gpu-arch-specs.rst
@@ -21,8 +21,6 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - Model
  - Architecture
  - LLVM target name
- - Device Major version
- - Device Minor version
  - VRAM (GiB)
  - Compute Units
  - Wavefront Size
@@ -34,12 +32,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - L1 Instruction Cache (KiB)
  - VGPR File (KiB)
  - SGPR File (KiB)
+ - GFXIP Major version
+ - GFXIP Minor version
  *
  - MI325X
  - CDNA3
  - gfx942
- - 9
- - 4
  - 256
  - 304 (38 per XCD)
  - 64
@@ -51,12 +49,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 64 per 2 CUs
  - 512
  - 12.5
+ - 9
+ - 4
  *
  - MI300X
  - CDNA3
  - gfx942
- - 9
- - 4
  - 192
  - 304 (38 per XCD)
  - 64
@@ -68,12 +66,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 64 per 2 CUs
  - 512
  - 12.5
+ - 9
+ - 4
  *
  - MI300A
  - CDNA3
  - gfx942
- - 9
- - 4
  - 128
  - 228 (38 per XCD)
  - 64
@@ -85,12 +83,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 64 per 2 CUs
  - 512
  - 12.5
+ - 9
+ - 4
  *
  - MI250X
  - CDNA2
  - gfx90a
- - 9
- - 0
  - 128
  - 220 (110 per GCD)
  - 64
@@ -102,12 +100,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 2 CUs
  - 512
  - 12.5
+ - 9
+ - 0
  *
  - MI250
  - CDNA2
  - gfx90a
- - 9
- - 0
  - 128
  - 208 (104 per GCD)
  - 64
@@ -119,12 +117,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 2 CUs
  - 512
  - 12.5
+ - 9
+ - 0
  *
  - MI210
  - CDNA2
  - gfx90a
- - 9
- - 0
  - 64
  - 104
  - 64
@@ -136,12 +134,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 2 CUs
  - 512
  - 12.5
+ - 9
+ - 0
  *
  - MI100
  - CDNA
  - gfx908
- - 9
- - 0
  - 32
  - 120
  - 64
@@ -153,12 +151,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 3 CUs
  - 256 VGPR and 256 AccVGPR
  - 12.5
+ - 9
+ - 0
  *
  - MI60
  - GCN5.1
  - gfx906
- - 9
- - 0
  - 32
  - 64
  - 64
@@ -170,12 +168,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 3 CUs
  - 256
  - 12.5
+ - 9
+ - 0
  *
  - MI50 (32GB)
  - GCN5.1
  - gfx906
- - 9
- - 0
  - 32
  - 60
  - 64
@@ -187,12 +185,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 3 CUs
  - 256
  - 12.5
+ - 9
+ - 0
  *
  - MI50 (16GB)
  - GCN5.1
  - gfx906
- - 9
- - 0
  - 16
  - 60
  - 64
@@ -204,12 +202,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 3 CUs
  - 256
  - 12.5
+ - 9
+ - 0
  *
  - MI25
  - GCN5.0
  - gfx900
- - 9
- - 0
  - 16
  - 64
  - 64
@@ -221,12 +219,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 3 CUs
  - 256
  - 12.5
+ - 9
+ - 0
  *
  - MI8
  - GCN3.0
  - gfx803
- - 8
- - 0
  - 4
  - 64
  - 64
@@ -238,12 +236,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 4 CUs
  - 256
  - 12.5
+ - 8
+ - 0
  *
  - MI6
  - GCN4.0
  - gfx803
- - 8
- - 0
  - 16
  - 36
  - 64
@@ -255,6 +253,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 4 CUs
  - 256
  - 12.5
+ - 8
+ - 0

  .. tab-item:: AMD Radeon PRO GPUs

@@ -266,8 +266,7 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - Model
  - Architecture
  - LLVM target name
- - Device Major version
- - Device Minor version
+
  - VRAM (GiB)
  - Compute Units
  - Wavefront Size
@@ -280,12 +279,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - L0 Instruction Cache (KiB)
  - VGPR File (KiB)
  - SGPR File (KiB)
+ - GFXIP Major version
+ - GFXIP Minor version
  *
  - Radeon PRO V710
  - RDNA3
  - gfx1101
- - 11
- - 0
  - 28
  - 54
  - 32
@@ -298,12 +297,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon PRO W7900 Dual Slot
  - RDNA3
  - gfx1100
- - 11
- - 0
  - 48
  - 96
  - 32
@@ -316,12 +315,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon PRO W7900
  - RDNA3
  - gfx1100
- - 11
- - 0
  - 48
  - 96
  - 32
@@ -334,12 +333,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon PRO W7800
  - RDNA3
  - gfx1100
- - 11
- - 0
  - 32
  - 70
  - 32
@@ -352,12 +351,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon PRO W7700
  - RDNA3
  - gfx1101
- - 11
- - 0
  - 16
  - 48
  - 32
@@ -370,12 +369,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon PRO W6800
  - RDNA2
  - gfx1030
- - 10
- - 3
  - 32
  - 60
  - 32
@@ -388,12 +387,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon PRO W6600
  - RDNA2
  - gfx1032
- - 10
- - 3
  - 8
  - 28
  - 32
@@ -406,12 +405,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon PRO V620
  - RDNA2
  - gfx1030
- - 10
- - 3
  - 32
  - 72
  - 32
@@ -424,12 +423,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon Pro W5500
  - RDNA
  - gfx1012
- - 10
- - 1
  - 8
  - 22
  - 32
@@ -442,12 +441,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 20
+ - 10
+ - 1
  *
  - Radeon Pro VII
  - GCN5.1
  - gfx906
- - 9
- - 0
  - 16
  - 60
  - 64
@@ -460,6 +459,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 3 CUs
  - 256
  - 12.5
+ - 9
+ - 0

  .. tab-item:: AMD Radeon GPUs

@@ -471,8 +472,6 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - Model
  - Architecture
  - LLVM target name
- - Device Major version
- - Device Minor version
  - VRAM (GiB)
  - Compute Units
  - Wavefront Size
@@ -485,12 +484,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - L0 Instruction Cache (KiB)
  - VGPR File (KiB)
  - SGPR File (KiB)
+ - GFXIP Major version
+ - GFXIP Minor version
  *
  - Radeon RX 7900 XTX
  - RDNA3
  - gfx1100
- - 11
- - 0
  - 24
  - 96
  - 32
@@ -503,12 +502,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon RX 7900 XT
  - RDNA3
  - gfx1100
- - 11
- - 0
  - 20
  - 84
  - 32
@@ -521,12 +520,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon RX 7900 GRE
  - RDNA3
  - gfx1100
- - 11
- - 0
  - 16
  - 80
  - 32
@@ -539,12 +538,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon RX 7800 XT
  - RDNA3
  - gfx1101
- - 11
- - 0
  - 16
  - 60
  - 32
@@ -557,12 +556,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon RX 7700 XT
  - RDNA3
  - gfx1101
- - 11
- - 0
  - 12
  - 54
  - 32
@@ -575,12 +574,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 768
  - 16
+ - 11
+ - 0
  *
  - Radeon RX 7600
  - RDNA3
  - gfx1102
- - 11
- - 0
  - 8
  - 32
  - 32
@@ -593,12 +592,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 11
+ - 0
  *
  - Radeon RX 6950 XT
  - RDNA2
  - gfx1030
- - 10
- - 3
  - 16
  - 80
  - 32
@@ -611,12 +610,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon RX 6900 XT
  - RDNA2
  - gfx1030
- - 10
- - 3
  - 16
  - 80
  - 32
@@ -629,12 +628,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon RX 6800 XT
  - RDNA2
  - gfx1030
- - 10
- - 3
  - 16
  - 72
  - 32
@@ -647,12 +646,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon RX 6800
  - RDNA2
  - gfx1030
- - 10
- - 3
  - 16
  - 60
  - 32
@@ -665,12 +664,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon RX 6750 XT
  - RDNA2
  - gfx1031
- - 10
- - 3
  - 12
  - 40
  - 32
@@ -683,12 +682,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon RX 6700 XT
  - RDNA2
  - gfx1031
- - 10
- - 3
  - 12
  - 40
  - 32
@@ -701,13 +700,13 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon RX 6700
  - RDNA2
  - gfx1031
  - 10
- - 3
- - 10
  - 36
  - 32
  - 128
@@ -719,12 +718,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon RX 6650 XT
  - RDNA2
  - gfx1032
- - 10
- - 3
  - 8
  - 32
  - 32
@@ -737,12 +736,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon RX 6600 XT
  - RDNA2
  - gfx1032
- - 10
- - 3
  - 8
  - 32
  - 32
@@ -755,12 +754,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon RX 6600
  - RDNA2
  - gfx1032
- - 10
- - 3
  - 8
  - 28
  - 32
@@ -773,12 +772,12 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32
  - 512
  - 16
+ - 10
+ - 3
  *
  - Radeon VII
  - GCN5.1
  - gfx906
- - 9
- - 0
  - 16
  - 60
  - 64
@@ -791,6 +790,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
  - 32 per 3 CUs
  - 256
  - 12.5
+ - 9
+ - 0

 Glossary
 ========
@@ -804,18 +805,6 @@ For more information about the terms used, see the
 Argument to pass to clang in ``--offload-arch`` to compile code for the given
 architecture.

-**Device major version**
-
-Indicates the core instruction set of the GPU architecture. For example, a value
-of 11 would correspond to Navi III (RDNA3).
-
-**Device minor version**
-
-Indicates a particular configuration, feature set, or variation within the group
-represented by the device compute version. For example, different models within
-the same major version might have varying levels of support for certain features
-or optimizations.
-
 **VRAM**

 Amount of memory available on the GPU.
@@ -898,6 +887,26 @@ Purpose Vector Registers, used specifically in matrix instructions.
 Size of the Scalar General Purpose Register (SGPR) file. Holds data used in
 scalar instructions.

+**GFXIP**
+
+GFXIP (Graphics IP) is a versioning system used by AMD to identify the GPU
+architecture and its instruction set. It helps categorize different generations
+of GPUs and their feature sets.
+
+**GFXIP major version**
+
+Defines the GPU's core instruction set and architecture, which determines
+compatibility with software stacks such as HIP and OpenCL. For example, a GFXIP
+11 major version corresponds to the RDNA 3 (Navi 3x) architecture, influencing
+driver support and available compute features.
+
+**GFXIP minor version**
+
+Represents specific variations within a GFXIP major version and affects feature sets,
+optimizations, and driver behavior in software stacks such as HIP and OpenCL. Different
+GPU models within the same major version can have unique capabilities, impacting
+performance and supported instructions.
+
 **GCD**

 Graphics Compute Die.
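
Review note: in every table row touched by this diff, the GFXIP major and minor values agree with the digits of the LLVM target name in the same row (for example, gfx942 is listed with major 9 and minor 4, and gfx1032 with major 10 and minor 3). The sketch below shows one way to cross-check the moved columns. It assumes the target name is `gfx` followed by a decimal major version, one hexadecimal minor digit, and one hexadecimal stepping digit. That layout, the helper name `split_gfxip`, and the third "stepping" value are illustrative assumptions, not something this diff defines.

```python
import re
from typing import Tuple

# Assumed layout (not stated in the diff): "gfx" + decimal major version
# + one hex digit for the minor version + one hex digit for the stepping.
_TARGET_RE = re.compile(r"^gfx(\d+)([0-9a-f])([0-9a-f])$")


def split_gfxip(target: str) -> Tuple[int, int, int]:
    """Split an LLVM target name into (major, minor, stepping).

    Hypothetical helper for checking the table rows; the last two
    characters are read as hexadecimal digits.
    """
    match = _TARGET_RE.match(target)
    if match is None:
        raise ValueError(f"unrecognized target name: {target!r}")
    major, minor, stepping = match.groups()
    return int(major), int(minor, 16), int(stepping, 16)


# Rows taken from the tables in the diff: (LLVM target name, major, minor).
rows = [("gfx942", 9, 4), ("gfx90a", 9, 0), ("gfx1100", 11, 0),
        ("gfx1032", 10, 3), ("gfx1012", 10, 1), ("gfx803", 8, 0)]

for name, major, minor in rows:
    assert split_gfxip(name)[:2] == (major, minor), name
```

A check like this would flag a row where the two relocated values were swapped or copied from a neighbouring row, which is the main risk when moving columns by hand.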