Compare commits

..

1 commit

.. list-table::
   :header-rows: 1

   * - Author
     - SHA1
     - Message
     - Date
   * - Adel Johar
     - 1499f74c22
     - Docs: Add Device Major/Minor Versions to gpu-arch-spec.rst
     - 2025-02-13 14:24:00 +01:00
3 changed files with 278 additions and 46 deletions

View File

@@ -677,7 +677,6 @@ namespace
namespaces
nanoGPT
num
numpy
numref
ocl
opencl

View File

@@ -55,7 +55,7 @@ Docker image compatibility
AMD validates and publishes ready-made `ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax>`_
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories represent the latest JAX version from the official Docker Hub and are validated for
associated inventories are validated for
`ROCm 6.3.1 <https://repo.radeon.com/rocm/apt/6.3.1/>`_. Click the |docker-icon|
icon to view the image on Docker Hub.
@@ -83,8 +83,7 @@ icon to view the image on Docker Hub.
AMD publishes `Community ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax-community>`_
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories represent the latest JAX version from the official Docker Hub and are
tested for `ROCm 6.3.1 <https://repo.radeon.com/rocm/apt/6.3.1/>`_.
associated inventories are tested for `ROCm 6.2.4 <https://repo.radeon.com/rocm/apt/6.2.4/>`_.
.. list-table:: JAX community Docker image components
:header-rows: 1
@@ -95,25 +94,25 @@ tested for `ROCm 6.3.1 <https://repo.radeon.com/rocm/apt/6.3.1/>`_.
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.3.1-jax0.5.0-py3.12.8/images/sha256-897d7471a954d9df7f79cd28b87ec515fbb94189fc3cf13e3a1588aa6b2a5fee?context_explore"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.2.4-jax0.4.35-py3.12.7/images/sha256-a6032d89c07573b84c44e42c637bf9752b1b7cd2a222d39344e603d8f4c63beb?context=explore"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
- `0.5.0 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.5.0>`_
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 22.04
- `3.12.8 <https://www.python.org/downloads/release/python-3128/>`_
- `3.12.7 <https://www.python.org/downloads/release/python-3127/>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.3.1-jax0.5.0-py3.11.11/images/sha256-f59243f324ee8da8dd54cd81b3649a860b2b454eaac8e4ce41d5c5f40e42b0e8?context_explore"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.2.4-jax0.4.35-py3.11.10/images/sha256-d462f7e445545fba2f3b92234a21beaa52fe6c5f550faabcfdcd1bf53486d991?context=explore"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
- `0.5.0 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.5.0>`_
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 22.04
- `3.11.10 <https://www.python.org/downloads/release/python-31110/>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.3.1-jax0.5.0-py3.10.16/images/sha256-6f12e5f6a3b5d033d2b1a43938b6804978d999978e68e402228d02984a69fb9d?context_explore"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.2.4-jax0.4.35-py3.10.15/images/sha256-6f2d4d0f529378d9572f0e8cfdcbc101d1e1d335bd626bb3336fff87814e9d60?context=explore"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
- `0.5.0 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.5.0>`_
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 22.04
- `3.10.16 <https://www.python.org/downloads/release/python-31016/>`_
- `3.10.15 <https://www.python.org/downloads/release/python-31015/>`_
Critical ROCm libraries for JAX
================================================================================
@@ -211,7 +210,7 @@ performance, and feature set available to developers.
Supported and unsupported features
===============================================================================
The following table maps the public JAX API modules to their supported
The following table maps GPU-accelerated JAX modules to their supported
ROCm and JAX versions.
.. list-table::
@@ -248,11 +247,21 @@ ROCm and JAX versions.
devices.
- 0.3.20
- 5.1.0
* - ``jax.dlpack``
- For exchanging tensor data between JAX and other libraries that support the
DLPack standard.
- 0.1.57
- 5.0.0
* - ``jax.distributed``
- Enables the scaling of computations across multiple devices on a single
machine or across multiple machines.
- 0.1.74
- 5.0.0
* - ``jax.dtypes``
- Provides utilities for working with and managing data types in JAX
arrays and computations.
- 0.1.66
- 5.0.0
* - ``jax.image``
- Contains image manipulation functions like resize, scale and translation.
- 0.1.57
@@ -266,10 +275,27 @@ ROCm and JAX versions.
array.
- 0.1.57
- 5.0.0
* - ``jax.profiler``
- Contains JAX's tracing and time profiling features.
- 0.1.57
- 5.0.0
* - ``jax.stages``
- Contains interfaces to stages of the compiled execution process.
- 0.3.4
- 5.0.0
* - ``jax.tree``
- Provides utilities for working with tree-like container data structures.
- 0.4.26
- 5.6.0
* - ``jax.tree_util``
- Provides utilities for working with nested data structures, or
``pytrees``.
- 0.1.65
- 5.0.0
* - ``jax.typing``
- Provides JAX-specific static type annotations.
- 0.3.18
- 5.1.0
* - ``jax.extend``
- Provides modules for access to JAX internal machinery. The
``jax.extend`` module defines a library view of some of JAX's internal
@@ -296,10 +322,10 @@ ROCm and JAX versions.
- jax_triton 0.2.0
- 6.2.4
jax.lax module
jax.scipy module
-------------------------------------------------------------------------------
A module for primitives operations.
A SciPy-like API for scientific computing.
.. list-table::
:header-rows: 1
@@ -307,14 +333,129 @@ A module for primitives operations.
* - Module
- Since JAX
- Since ROCm
* - ``jax.lax.linalg``
- 0.3.2
* - ``jax.scipy.cluster``
- 0.3.11
- 5.1.0
* - ``jax.scipy.fft``
- 0.1.71
- 5.0.0
* - ``jax.scipy.integrate``
- 0.4.15
- 5.5.0
* - ``jax.scipy.interpolate``
- 0.1.76
- 5.0.0
* - ``jax.scipy.linalg``
- 0.1.56
- 5.0.0
* - ``jax.scipy.ndimage``
- 0.1.56
- 5.0.0
* - ``jax.scipy.optimize``
- 0.1.57
- 5.0.0
* - ``jax.scipy.signal``
- 0.1.56
- 5.0.0
* - ``jax.scipy.spatial.transform``
- 0.4.12
- 5.4.0
* - ``jax.scipy.sparse.linalg``
- 0.1.56
- 5.0.0
* - ``jax.scipy.special``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats``
- 0.1.56
- 5.0.0
jax.scipy.stats module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. list-table::
:header-rows: 1
* - Module
- Since JAX
- Since ROCm
* - ``jax.scipy.stats.bernoulli``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.beta``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.betabinom``
- 0.1.61
- 5.0.0
* - ``jax.scipy.stats.binom``
- 0.4.14
- 5.4.0
* - ``jax.scipy.stats.cauchy``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.chi2``
- 0.1.61
- 5.0.0
* - ``jax.scipy.stats.dirichlet``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.expon``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.gamma``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.gennorm``
- 0.3.15
- 5.2.0
* - ``jax.scipy.stats.geom``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.laplace``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.logistic``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.multinomial``
- 0.3.18
- 5.1.0
* - ``jax.scipy.stats.multivariate_normal``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.nbinom``
- 0.1.72
- 5.0.0
* - ``jax.scipy.stats.norm``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.pareto``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.poisson``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.t``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.truncnorm``
- 0.4.0
- 5.3.0
* - ``jax.scipy.stats.uniform``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.vonmises``
- 0.4.2
- 5.3.0
* - ``jax.scipy.stats.wrapcauchy``
- 0.4.20
- 5.6.0
jax.extend module
-------------------------------------------------------------------------------
A module for primitives operations.
Modules for JAX extensions.
.. list-table::
:header-rows: 1
@@ -322,30 +463,18 @@ A module for primitives operations.
* - Module
- Since JAX
- Since ROCm
* - ``jax.extend.core``
* - ``jax.extend.ffi``
- 0.4.30
- 6.0.0
* - ``jax.extend.linear_util``
- 0.4.17
- 5.6.0
* - ``jax.extend.mlir``
- 0.4.26
- 5.6.0
* - ``jax.extend.random``
- 0.4.15
- 5.5.0
* - ``jax.extend.core.primitives``
- 0.4.32
- 5.5.0
jax.numpy module
-------------------------------------------------------------------------------
A module for primitives operations.
.. list-table::
:header-rows: 1
* - Module
- Since JAX
- Since ROCm
* - ``jax.numpy.fft``
- 0.3.20
- 5.1.0
* - ``jax.numpy.linalg``
- 0.3.20
- 5.1.0
jax.experimental module
-------------------------------------------------------------------------------
@@ -361,6 +490,9 @@ Experimental modules and APIs.
* - ``jax.experimental.checkify``
- 0.1.75
- 5.0.0
* - ``jax.experimental.compilation_cache.compilation_cache``
- 0.1.68
- 5.0.0
* - ``jax.experimental.custom_partitioning``
- 0.4.0
- 5.3.0
@@ -382,11 +514,8 @@ Experimental modules and APIs.
* - ``jax.experimental.pjit``
- 0.1.61
- 5.0.0
* - ``jax.experimental.roofline``
- 0.4.36
- 5.3.0
* - ``jax.experimental.rnn``
- 0.4.3
* - ``jax.experimental.serialize_executable``
- 0.4.0
- 5.3.0
* - ``jax.experimental.shard_map``
- 0.4.3
@@ -443,6 +572,9 @@ Experimental support for sparse matrix operations.
* - ``jax.experimental.sparse.linalg``
- 0.3.15
- 5.2.0
* - ``jax.experimental.sparse.sparsify``
- 0.3.25
- ❌
.. list-table::
:header-rows: 1
@@ -490,6 +622,9 @@ ROCm.
* - XLA int4 support
- 4-bit integer (int4) precision in the XLA compiler.
- 0.4.0
* - ``jax.experimental.sparsify``
- Converts a dense matrix to a sparse matrix representation.
- Experimental
Use cases and recommendations
================================================================================

View File

@@ -21,6 +21,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Model
- Architecture
- LLVM target name
- Device Major version
- Device Minor version
- VRAM (GiB)
- Compute Units
- Wavefront Size
@@ -36,6 +38,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI325X
- CDNA3
- gfx942
- 9
- 4
- 256
- 304 (38 per XCD)
- 64
@@ -51,6 +55,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI300X
- CDNA3
- gfx942
- 9
- 4
- 192
- 304 (38 per XCD)
- 64
@@ -66,6 +72,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI300A
- CDNA3
- gfx942
- 9
- 4
- 128
- 228 (38 per XCD)
- 64
@@ -81,6 +89,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI250X
- CDNA2
- gfx90a
- 9
- 0
- 128
- 220 (110 per GCD)
- 64
@@ -96,6 +106,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI250
- CDNA2
- gfx90a
- 9
- 0
- 128
- 208 (104 per GCD)
- 64
@@ -111,6 +123,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI210
- CDNA2
- gfx90a
- 9
- 0
- 64
- 104
- 64
@@ -126,6 +140,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI100
- CDNA
- gfx908
- 9
- 0
- 32
- 120
- 64
@@ -141,6 +157,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI60
- GCN5.1
- gfx906
- 9
- 0
- 32
- 64
- 64
@@ -156,6 +174,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI50 (32GB)
- GCN5.1
- gfx906
- 9
- 0
- 32
- 60
- 64
@@ -171,6 +191,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI50 (16GB)
- GCN5.1
- gfx906
- 9
- 0
- 16
- 60
- 64
@@ -186,6 +208,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI25
- GCN5.0
- gfx900
- 9
- 0
- 16 
- 64
- 64
@@ -201,6 +225,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI8
- GCN3.0
- gfx803
- 8
- 0
- 4
- 64
- 64
@@ -216,6 +242,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- MI6
- GCN4.0
- gfx803
- 8
- 0
- 16
- 36
- 64
@@ -238,6 +266,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Model
- Architecture
- LLVM target name
- Device Major version
- Device Minor version
- VRAM (GiB)
- Compute Units
- Wavefront Size
@@ -254,6 +284,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon PRO V710
- RDNA3
- gfx1101
- 11
- 0
- 28
- 54
- 32
@@ -270,6 +302,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon PRO W7900 Dual Slot
- RDNA3
- gfx1100
- 11
- 0
- 48
- 96
- 32
@@ -286,6 +320,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon PRO W7900
- RDNA3
- gfx1100
- 11
- 0
- 48
- 96
- 32
@@ -302,6 +338,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon PRO W7800
- RDNA3
- gfx1100
- 11
- 0
- 32
- 70
- 32
@@ -318,6 +356,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon PRO W7700
- RDNA3
- gfx1101
- 11
- 0
- 16
- 48
- 32
@@ -334,6 +374,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon PRO W6800
- RDNA2
- gfx1030
- 10
- 3
- 32
- 60
- 32
@@ -350,6 +392,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon PRO W6600
- RDNA2
- gfx1032
- 10
- 3
- 8
- 28
- 32
@@ -366,6 +410,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon PRO V620
- RDNA2
- gfx1030
- 10
- 3
- 32
- 72
- 32
@@ -382,6 +428,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon Pro W5500
- RDNA
- gfx1012
- 10
- 1
- 8
- 22
- 32
@@ -398,6 +446,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon Pro VII
- GCN5.1
- gfx906
- 9
- 0
- 16
- 60
- 64
@@ -421,6 +471,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Model
- Architecture
- LLVM target name
- Device Major version
- Device Minor version
- VRAM (GiB)
- Compute Units
- Wavefront Size
@@ -437,6 +489,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 7900 XTX
- RDNA3
- gfx1100
- 11
- 0
- 24
- 96
- 32
@@ -453,6 +507,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 7900 XT
- RDNA3
- gfx1100
- 11
- 0
- 20
- 84
- 32
@@ -469,6 +525,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 7900 GRE
- RDNA3
- gfx1100
- 11
- 0
- 16
- 80
- 32
@@ -485,6 +543,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 7800 XT
- RDNA3
- gfx1101
- 11
- 0
- 16
- 60
- 32
@@ -501,6 +561,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 7700 XT
- RDNA3
- gfx1101
- 11
- 0
- 12
- 54
- 32
@@ -517,6 +579,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 7600
- RDNA3
- gfx1102
- 11
- 0
- 8
- 32
- 32
@@ -533,6 +597,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 6950 XT
- RDNA2
- gfx1030
- 10
- 3
- 16
- 80
- 32
@@ -549,6 +615,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 6900 XT
- RDNA2
- gfx1030
- 10
- 3
- 16
- 80
- 32
@@ -565,6 +633,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 6800 XT
- RDNA2
- gfx1030
- 10
- 3
- 16
- 72
- 32
@@ -581,6 +651,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 6800
- RDNA2
- gfx1030
- 10
- 3
- 16
- 60
- 32
@@ -597,6 +669,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 6750 XT
- RDNA2
- gfx1031
- 10
- 3
- 12
- 40
- 32
@@ -613,6 +687,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 6700 XT
- RDNA2
- gfx1031
- 10
- 3
- 12
- 40
- 32
@@ -630,6 +706,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- RDNA2
- gfx1031
- 10
- 3
- 10
- 36
- 32
- 128
@@ -645,6 +723,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 6650 XT
- RDNA2
- gfx1032
- 10
- 3
- 8
- 32
- 32
@@ -661,6 +741,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 6600 XT
- RDNA2
- gfx1032
- 10
- 3
- 8
- 32
- 32
@@ -677,6 +759,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon RX 6600
- RDNA2
- gfx1032
- 10
- 3
- 8
- 28
- 32
@@ -693,6 +777,8 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon VII
- GCN5.1
- gfx906
- 9
- 0
- 16
- 60
- 64
@@ -710,7 +796,7 @@ Glossary
========
For more information about the terms used, see the
:ref:`specific documents and guides <gpu-arch-documentation>`, or
:ref:`specific documents and guides <gpu-arch-documentation>`, or
:doc:`Understanding the HIP programming model<hip:understand/programming_model>`.
**LLVM target name**
@@ -718,6 +804,18 @@ For more information about the terms used, see the
Argument to pass to clang in ``--offload-arch`` to compile code for the given
architecture.
**Device major version**
Indicates the core instruction set of the GPU architecture. For example, a value
of 11 would correspond to Navi III (RDNA3).
**Device minor version**
Indicates a particular configuration, feature set, or variation within the group
represented by the device major version. For example, different models within
the same major version might have varying levels of support for certain features
or optimizations.
**VRAM**
Amount of memory available on the GPU.
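
The relationship between the LLVM target name and the device major and minor versions in the tables above can be sketched in a few lines of Python. This is an illustration only: the helper name ``parse_gfx_target`` and the ``GfxVersion`` tuple are hypothetical, and the splitting rule (decimal major version, then one hexadecimal digit each for the minor version and the stepping) is an assumption inferred from the rows listed in this document, not an official API.

```python
import re
from typing import NamedTuple


class GfxVersion(NamedTuple):
    """Version components recovered from an LLVM target name (hypothetical type)."""
    major: int
    minor: int
    stepping: int


def parse_gfx_target(name: str) -> GfxVersion:
    """Split a target name such as 'gfx90a' into major, minor, and stepping.

    Assumption: the last two characters are single hexadecimal digits
    (minor, then stepping) and everything between 'gfx' and those two
    characters is the decimal major version.
    """
    match = re.fullmatch(r"gfx(\d+)([0-9a-f])([0-9a-f])", name)
    if match is None:
        raise ValueError(f"unrecognized LLVM target name: {name!r}")
    major, minor, stepping = match.groups()
    return GfxVersion(int(major), int(minor, 16), int(stepping, 16))


if __name__ == "__main__":
    # Target names taken from the specification tables in this document.
    for target in ("gfx942", "gfx90a", "gfx1030", "gfx1101", "gfx803"):
        version = parse_gfx_target(target)
        print(f"{target}: major={version.major} minor={version.minor}")
```

Under this assumption, ``gfx942`` yields major 9 and minor 4, and ``gfx1030`` yields major 10 and minor 3, which matches the corresponding table rows. Treat the tables, not this sketch, as the reference for any target name that is not listed.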