Compare commits

...

62 Commits

Author SHA1 Message Date
Jeffrey Novotny
3c3847f9f7 Merge pull request #5224 from amd-jnovotny/dlf-matt-docs643
Deep learning frameworks edits for scale (#5189)
2025-08-22 11:56:27 -04:00
Matt Williams
249bd177ec Deep learning frameworks edits for scale (#5189)
* Deep learning frameworks edits for scale

Based on https://ontrack-internal.amd.com/browse/ROCDOC-1809

* update table

table

* leo comments

* formatting

* format

* update table based on feedback

* header

* Update machine learning page

* headers

* Apply suggestions from code review

Co-authored-by: anisha-amd <anisha.sankar@amd.com>

* Update .wordlist.txt

* formatting

* Update docs/how-to/deep-learning-rocm.rst

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>

---------

Co-authored-by: Matt Williams <Matt.Williams+amdeng@amd.com>
Co-authored-by: anisha-amd <anisha.sankar@amd.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
(cherry picked from commit 1d42f7cc62)
2025-08-22 11:52:32 -04:00
Peter Park
b2ee8d4b2e docs: Add Primus (Megatron) training Docker documentation (#5218) (#5222)
(cherry picked from commit 98029db4ee)
2025-08-22 09:02:40 -04:00
Peter Park
3f834cf520 Fix documented VRAM for Radeon AI Pro R9700 (#5203) (#5204)
(cherry picked from commit c154b7e0a3)
2025-08-18 10:20:22 -04:00
Peter Park
70ba866c5b vLLM inference benchmark doc: add missing data field (#5199) (#5200)
(cherry picked from commit 55d0a88ec5)
2025-08-15 13:52:51 -04:00
Peter Park
320ec4669a [docs/6.4.3] Update vLLM benchmark doc for 20250812 Docker release (#5196) (#5198)
(cherry picked from commit 7ee22790ce)
2025-08-14 15:51:58 -04:00
anisha-amd
c9bd93b537 [Docs] 6.4.3: compatibility matrix frameworks support update (#5186) 2025-08-12 14:25:45 -04:00
Peter Park
a060550bcd Add Hunyuan Video to PyTorch inference benchmark models doc (#5094)
(cherry picked from commit 80f7dc79b9)
2025-08-12 11:59:51 -04:00
Parag Bhandari
c92cbaee66 Merge branch 'roc-6.4.x' into docs/6.4.3 2025-08-08 08:49:53 -04:00
Parag Bhandari
c84afacc8d Merge branch 'develop' into roc-6.4.x 2025-08-08 08:49:39 -04:00
Pratik Basyal
af48464844 6.4.3 version updated (#492) 2025-08-08 08:45:39 -04:00
Parag Bhandari
843fd1b3fb Merge branch 'roc-6.4.x' into docs/6.4.3 2025-08-08 07:21:31 -04:00
Parag Bhandari
82221c4e2d Merge branch 'develop' into roc-6.4.x 2025-08-08 07:18:53 -04:00
pbhandar-amd
5b724a3780 Merge pull request #5167 syncing internal develop into external for ROCm 7.0
Syncing internal develop into external for ROCm 7.0 release.
2025-08-08 07:17:50 -04:00
Parag Bhandari
ffd5575cd9 Merge branch 'develop-internal' into amd/pbhandar/internal_to_external_700 2025-08-08 05:35:20 -04:00
pbhandar-amd
cfb7bd1883 Update versions.md 2025-08-07 16:36:12 -04:00
pbhandar-amd
ae7b791b22 Update versions.md 2025-08-07 16:34:26 -04:00
Daniel Su
3573239728 [Ex CI] update rocprofiler-register branch name (#5162) 2025-08-07 11:46:44 -04:00
Daniel Su
ec566f9623 [Ex CI] update rocprofiler-register pipeline ID to monorepo (#5158) 2025-08-07 10:35:14 -04:00
Pratik Basyal
30a862c4b9 PRE GA links reset to public (#491) 2025-08-07 10:23:34 -04:00
Parag Bhandari
3d3cfae976 Merge branch 'develop' into develop-internal 2025-08-07 09:42:50 -04:00
pbhandar-amd
d0ebe126e7 Sync develop into docs/6.4.3 2025-08-07 09:07:38 -04:00
Daniel Su
14f5316ade [Ex CI] enable rocprofiler-register monorepo builds (#5155) 2025-08-06 13:44:55 -04:00
Pratik Basyal
00d814ccbf ROCm SMI 7.7.0 update included in 6.4.3 (#488)
* ROCm SMI 7.7.0 update included

* Feedback incorporated

* Review feedback added
2025-08-06 13:26:59 -04:00
Parag Bhandari
948a6a469b Merge branch 'develop' into develop-internal 2025-08-06 12:00:31 -04:00
pbhandar-amd
74610893a9 Merge pull request #5154 from ROCm/amd/pbhandar/merge_from_develop
Merge develop into docs/6.4.3
2025-08-06 11:25:50 -04:00
Parag Bhandari
afe3e21cad Merge branch 'develop' into docs/6.4.3 2025-08-05 16:24:07 -04:00
Dominic Widdows
698d7f1d58 Updating old link that has been changed (#5149) 2025-08-05 15:23:55 -04:00
Dominic Widdows
9ab5dcfa59 Remove reference to build instructions that were moved 2025-08-05 15:22:32 -04:00
Pratik Basyal
8ba712bff3 Ram's review feedback incorporated (#487) 2025-08-05 11:12:35 -04:00
Daniel Su
e91e712888 [Ex CI] make MIOpen CK script no longer partially succeed (#5141) 2025-08-02 14:42:12 -04:00
Joseph Macaranas
8f1b075a79 [External CI] Disable downstream solver builds (#5150)
- Disable while migration to monorepo is postponed.
2025-08-02 14:41:27 -04:00
Pratik Basyal
f0cc7c573d No changes related updated added (#486) 2025-08-01 14:38:14 -04:00
Pratik Basyal
b271c9af9d 6.4.3 Release changes to ROCm documentation (#482)
* Initial changes to 6.4.3 added

* Additional changes to RN

* Deep learning framework updated added

* Tutorials for AI dev added

* Install on linux toc pointed to internal

* 6.4.3 compatibility table updated

* Datatype support docs improvement added

* Manifest 642 GA added

* 643 RC1 manifest added

* 6.4.2 version table added

* 6.4.3 version table added

* Leo's feedback incorporated

* Deeplearning framework support synced

* Historical Compatibility matrix updated

* Release highlight added

* DGL compatibility updated
2025-08-01 14:21:05 -04:00
Daniel Su
885ab8438a [Ex CI] reduce pipeline size (#5140)
* new fft miopen pipeline ids
* remove all references to mainline builds
2025-08-01 11:54:59 -04:00
anisha-amd
3837fe8440 Updates to the compatibility matrix with DGL fix (#5143) 2025-08-01 11:17:34 -04:00
Alex Xu
ae2440772f upgrade rocm-docs-core to 1.22.0 2025-07-31 16:41:40 -04:00
anisha-amd
61f970a24d Cherry pick into 6.4.3 - Docs: Adding frameworks compatibility for Megablocks and Taichi (#5138) 2025-07-31 14:02:09 -04:00
anisha-amd
98530811b4 Update megablocks-compatibility.rst (#5136) 2025-07-31 13:30:39 -04:00
anisha-amd
266387d816 Docs: Adding frameworks compatibility for Megablocks and Taichi (#5133) 2025-07-31 13:00:31 -04:00
Daniel Su
2e93925311 [Ex CI] disable rocSPARSE to hipSOLVER downstream path (#5134) 2025-07-31 12:42:04 -04:00
Daniel Su
88c2a2877b [Ex CI] enable hipSOLVER monorepo builds (#5119)
For ROCm/rocm-libraries#942

Enables hipSOLVER monorepo builds.

Enables downstream paths:
rocSOLVER -> hipSOLVER
rocSPARSE -> hipSOLVER

Sample runs:
hipSOLVER: https://dev.azure.com/ROCm-CI/ROCm-CI/_build/results?buildId=40959&view=results
rocSOLVER: https://dev.azure.com/ROCm-CI/ROCm-CI/_build/results?buildId=40948&view=results
rocSPARSE: https://dev.azure.com/ROCm-CI/ROCm-CI/_build/results?buildId=40949&view=results
2025-07-31 10:31:55 -04:00
Daniel Su
85e0580b28 [Ex CI] enable FFT downstream jobs (#5126)
Monorepo support for FFTs was already implemented and trigger files already exist, so just need to enable their downstream jobs.

Enables downstream path:
hipRAND -> rocFFT -> hipFFT

Sample runs:
hipRAND: https://dev.azure.com/ROCm-CI/ROCm-CI/_build/results?buildId=41270&view=results
rocFFT: https://dev.azure.com/ROCm-CI/ROCm-CI/_build/results?buildId=41268&view=results
hipFFT: https://dev.azure.com/ROCm-CI/ROCm-CI/_build/results?buildId=41269&view=results
2025-07-31 10:31:35 -04:00
Peter Park
b61d6a021e Update PyT and TF Docker images in compatibility pages for 6.4.2 (#5129) 2025-07-31 09:55:46 -04:00
Istvan Kiss
fb30dafa29 Update precision support page part I. (#5127) 2025-07-31 15:22:19 +02:00
Joseph Macaranas
b2012cb0b9 External CI: rocm-libraries superbuild component yaml (#5125)
- Subset of the hipblaslt component yaml, deleting extra gpu targets and the testing component.
- Sparse checkout details removed.
- Basic build flags from top-level invocation added.
2025-07-30 17:50:46 -04:00
Daniel Su
45cf2b9a80 [Ex CI] rocprof-systems: add libsqlite3-dev (#5124)
Fixes rocprofiler-systems builds following ROCm/rocprofiler-systems@26ae543

Sample build:
https://dev.azure.com/ROCm-CI/ROCm-CI/_build/results?buildId=41257&view=results
2025-07-30 16:05:26 -04:00
Daniel Su
bee363995b [Ex CI] revert miopen-get-ck script change (#5123) 2025-07-30 12:03:51 -07:00
Daniel Su
3a031fad3a [Ex CI] disable MIOpen downstream jobs (#5122) 2025-07-30 13:12:11 -04:00
Daniel Su
46f6c4ff9a [Ex CI] enable MIOpen monorepo (#5117)
* init

* fix source dir

* miopen specify test build dir

* fix test build dir

* revert change

* fix test build again

* move to ultra temporarily

* miopen-get-ck, working dir

* exclude flaky test

* move back to high

* Add MIVisionX and AMDMIGraphX downstream jobs to MIOpen

* comment sparsecheckoutdir

* quote component names

* fix artifact name

* miopen ck script exit on fail

* add downstream checkout repos

* mivisionx, add aomp
2025-07-30 09:53:52 -07:00
Pratik Basyal
f632f2879f ROCm Software Stack image for 6.4.0 updated (#5112) 2025-07-28 14:51:19 -04:00
yugang-amd
cc5bc5a882 Add SGLang inference benchmark doc w/ initial support for DeepSeek-R1-Distill-Qwen-32B (#4870) 2025-07-25 12:42:40 -04:00
Daniel Su
2c9c3d0ba1 [Ex CI] switch hipBLAS/SPARSE pipeline IDs to monorepo (#5098) 2025-07-24 16:53:29 -04:00
Peter Park
14249f24d8 Use madengine instead of tools/run_models.py in docs (#5095) 2025-07-24 15:38:12 -04:00
Daniel Su
0e8045cca7 [Ex CI] enable hipBLAS monorepo (#5090) 2025-07-24 12:37:34 -04:00
Daniel Su
541fe92947 [Ex CI] update to 6.4.2 (#5087) 2025-07-23 14:10:40 -04:00
Daniel Su
628d5f8a19 [Ex CI] create Docker images for nightly builds (#5005) 2025-07-23 12:16:11 -04:00
Peter Park
984a91f008 Add DeepSeek Janus Pro 7B to PyTorch inference benchmark doc (#5071)
---------

Co-authored-by: yugang-amd <yugang.wang@amd.com>
2025-07-22 16:26:06 -04:00
amd-hsivasun
ae2cc6ab38 [EX CI] ROCR-Runtime: migrate from rocm-smi to amd-smi (#5088)
* Update ROCR-Runtime.yml

Migrate from rocmsmi to amdsmi

* Update ROCR-Runtime.yml

Removed libhwloc.so.5 install

* Update ROCR-Runtime.yml

Link to hwloc.so.5

* Update ROCR-Runtime.yml

Added link in the rocrtst step

* Update ROCR-Runtime.yml
2025-07-22 14:17:53 -04:00
Peter Park
15ee605d18 Fix branches for install docs in _toc.yml.in (#5083) 2025-07-22 11:03:40 -04:00
anisha-amd
ae54add299 Sphinx warning for ROCm fixed (#5077) (#5082)
* Sphinx warning for DGL fixed

* Update dgl-compatibility.rst

removed benchmark line and updated link

---------

Co-authored-by: Pratik Basyal <prbasyal@amd.com>
2025-07-22 10:51:15 -04:00
Peter Park
2269e9d25d Remove broken link to deprecated AMDGPU installer documentation (#5078)
* remove link to deprecated AMDGPU installation method

* add deep learning frameworks
2025-07-21 19:36:20 -04:00
71 changed files with 5269 additions and 1695 deletions

View File

@@ -1,42 +0,0 @@
variables:
- group: common
- template: /.azuredevops/variables-global.yml
resources:
repositories:
- repository: aomp_repo
type: github
endpoint: ROCm
name: ROCm/aomp
ref: amd-mainline
- repository: aomp-extras_repo
type: github
endpoint: ROCm
name: ROCm/aomp-extras
ref: amd-mainline
- repository: flang_repo
type: github
endpoint: ROCm
name: ROCm/flang
ref: amd-mainline
- repository: llvm-project_repo
type: github
endpoint: ROCm
name: ROCm/llvm-project
ref: amd-mainline
pipelines:
- pipeline: rocr-runtime_pipeline
source: \ROCR-Runtime
trigger:
branches:
include:
- amd-mainline
# this job will only be triggered after successful build sequence of llvm-project and ROCR-Runtime
trigger: none
pr: none
jobs:
- template: ${{ variables.CI_COMPONENT_PATH }}/aomp.yml
parameters:
checkoutRepo: aomp_repo

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: AMDMIGraphX
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
# - name: sparseCheckoutDir
# type: string
# default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -93,7 +112,11 @@ parameters:
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: AMDMIGraphX_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_ubuntu2204_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -121,6 +144,8 @@ jobs:
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -146,12 +171,12 @@ jobs:
gpuTarget: ${{ job.target }}
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: AMDMIGraphX_test_${{ job.target }}
dependsOn: AMDMIGraphX_build_${{ job.target }}
- job: ${{ parameters.componentName }}_test_ubuntu2204_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
@@ -183,6 +208,8 @@ jobs:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- task: CMake@1
displayName: MIGraphXTest CMake Flags
inputs:
@@ -199,7 +226,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: AMDMIGraphX
componentName: ${{ parameters.componentName }}
testExecutable: make
testParameters: -j$(nproc) check
testPublishResults: false

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: MIOpen
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -74,10 +93,31 @@ parameters:
target: gfx942
- gfx90a:
target: gfx90a
- name: downstreamComponentMatrix
type: object
default:
- MIVisionX:
name: MIVisionX
checkoutRepo: mivisionx_repo
sparseCheckoutDir: ''
skipUnifiedBuild: 'false'
buildDependsOn:
- MIOpen_build
- AMDMIGraphX:
name: AMDMIGraphX
checkoutRepo: amdmigraphx_repo
sparseCheckoutDir: ''
skipUnifiedBuild: 'false'
buildDependsOn:
- MIOpen_build
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: MIOpen_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_ubuntu2204_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -95,6 +135,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/miopen-get-ck-build.yml
parameters:
gpuTarget: ${{ job.target }}
@@ -104,11 +145,13 @@ jobs:
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- task: Bash@3
displayName: Build and install other dependencies
inputs:
targetType: inline
workingDirectory: $(Build.SourcesDirectory)
workingDirectory: $(Agent.BuildDirectory)/s
script: |
sed -i '/composable_kernel/d' requirements.txt
mkdir -p $(Agent.BuildDirectory)/miopen-deps
@@ -130,8 +173,10 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
gpuTarget: ${{ job.target }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
@@ -143,9 +188,9 @@ jobs:
- miopen-deps
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: MIOpen_test_${{ job.target }}
- job: ${{ parameters.componentName }}_test_ubuntu2204_${{ job.target }}
timeoutInMinutes: 180
dependsOn: MIOpen_build_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
@@ -169,6 +214,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/miopen-get-ck-build.yml
parameters:
@@ -178,11 +224,13 @@ jobs:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- task: Bash@3
displayName: Build and install other dependencies
inputs:
targetType: inline
workingDirectory: $(Build.SourcesDirectory)
workingDirectory: $(Agent.BuildDirectory)/s
script: |
sed -i '/composable_kernel/d' requirements.txt
mkdir -p $(Agent.BuildDirectory)/miopen-deps
@@ -193,7 +241,7 @@ jobs:
displayName: 'MIOpen Test CMake Flags'
inputs:
cmakeArgs: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm;$(Build.SourcesDirectory)/bin;$(Agent.BuildDirectory)/miopen-deps
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm;$(Agent.BuildDirectory)/s/bin;$(Agent.BuildDirectory)/miopen-deps
-DCMAKE_INSTALL_PREFIX=$(Agent.BuildDirectory)/rocm
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
-DCMAKE_C_COMPILER=$(Agent.BuildDirectory)/rocm/llvm/bin/amdclang
@@ -203,19 +251,19 @@ jobs:
-DBUILD_DEV=OFF
-DMIOPEN_USE_MLIR=ON
-DMIOPEN_GPU_SYNC=OFF
..
$(Agent.BuildDirectory)/s
- task: Bash@3
displayName: 'MIOpen Test Build'
inputs:
targetType: inline
workingDirectory: build
script: |
cmake --build . --target tests -- -j$(nproc)
workingDirectory: $(Build.SourcesDirectory)/build
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: MIOpen
testParameters: '--output-on-failure --force-new-ctest-process --output-junit test_output.xml --exclude-regex "test_rnn_seq_api|GPU_Conv2dTuningAsm_FP32"'
componentName: ${{ parameters.componentName }}
testParameters: '--output-on-failure --force-new-ctest-process --output-junit test_output.xml --exclude-regex "test_rnn_seq_api|GPU_Conv2dTuningAsm_FP32|GPU_Conv2dTuningAsmBwdWrw_FP32"'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
@@ -224,3 +272,15 @@ jobs:
gpuTarget: ${{ job.target }}
extraCopyDirectories:
- miopen-deps
# - ${{ if parameters.triggerDownstreamJobs }}:
# - ${{ each component in parameters.downstreamComponentMatrix }}:
# - ${{ if not(and(parameters.unifiedBuild, eq(component.skipUnifiedBuild, 'true'))) }}:
# - template: /.azuredevops/components/${{ component.name }}.yml@pipelines_repo
# parameters:
# checkoutRepo: ${{ component.checkoutRepo }}
# # sparseCheckoutDir: ${{ component.sparseCheckoutDir }}
# buildDependsOn: ${{ component.buildDependsOn }}
# downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ parameters.componentName }}
# triggerDownstreamJobs: true
# unifiedBuild: ${{ parameters.unifiedBuild }}

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: MIVisionX
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
# - name: sparseCheckoutDir
# type: string
# default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -60,6 +79,7 @@ parameters:
- name: rocmTestDependencies
type: object
default:
- aomp
- clr
- half
- hipBLAS-common
@@ -88,7 +108,11 @@ parameters:
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: MIVisionX_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_ubuntu2204_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -110,6 +134,8 @@ jobs:
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -131,12 +157,12 @@ jobs:
# gpuTarget: ${{ job.target }}
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: MIVisionX_test_${{ job.target }}
dependsOn: MIVisionX_build_${{ job.target }}
- job: ${{ parameters.componentName }}_test_ubuntu2204_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
@@ -161,6 +187,8 @@ jobs:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- task: Bash@3
displayName: Build MIVisionX tests
inputs:
@@ -174,7 +202,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: MIVisionX
componentName: ${{ parameters.componentName }}
testDir: 'mivisionx-tests'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:

View File

@@ -28,8 +28,8 @@ parameters:
- name: rocmTestDependencies
type: object
default:
- amdsmi
- llvm-project
- rocm_smi_lib
- rocprofiler-register
- name: jobMatrix
@@ -111,14 +111,6 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
packageManager: ${{ job.packageManager }}
- task: Bash@3
displayName: Install libhwloc5
inputs:
targetType: 'inline'
script: |
wget http://ftp.us.debian.org/debian/pool/main/h/hwloc/libhwloc5_1.11.12-3_amd64.deb
wget http://ftp.us.debian.org/debian/pool/main/h/hwloc/libhwloc-dev_1.11.12-3_amd64.deb
sudo apt install -y --allow-downgrades ./libhwloc5_1.11.12-3_amd64.deb ./libhwloc-dev_1.11.12-3_amd64.deb
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
@@ -161,6 +153,10 @@ jobs:
targetType: 'inline'
workingDirectory: $(Build.SourcesDirectory)/rocrtst/suites/test_common
script: |
echo $(Build.SourcesDirectory)/rocrtst/thirdparty/lib | sudo tee -a /etc/ld.so.conf.d/rocm-ci.conf
sudo cat /etc/ld.so.conf.d/rocm-ci.conf
sudo ldconfig -v
ldconfig -p
if [ -e /opt/rh/gcc-toolset-14/enable ]; then
source /opt/rh/gcc-toolset-14/enable
fi

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: hipBLAS
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -69,10 +88,30 @@ parameters:
target: gfx942
- gfx90a:
target: gfx90a
# MIOpen depends on both rocRAND and hipBLAS
# for a unified build, hipBLAS will be the one to call MIOpen
- name: downstreamComponentMatrix
type: object
default:
- MIOpen:
name: MIOpen
sparseCheckoutDir: projects/miopen
skipUnifiedBuild: 'false'
buildDependsOn:
- hipBLAS_build
unifiedBuild:
downstreamAggregateNames: hipBLAS+rocRAND
buildDependsOn:
- hipBLAS_build
- rocRAND_build
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: hipBLAS_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_ubuntu2204_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -88,6 +127,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aocl.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
@@ -95,6 +135,8 @@ jobs:
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -109,9 +151,12 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
componentName: ${{ parameters.componentName }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
@@ -121,46 +166,67 @@ jobs:
installAOCL: true
gpuTarget: ${{ job.target }}
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: hipBLAS_test_${{ job.target }}
dependsOn: hipBLAS_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: hipBLAS
testExecutable: $(Agent.BuildDirectory)/rocm/bin/hipblas-test
testParameters: '--yaml hipblas_smoke.yaml --gtest_output=xml:./test_output.xml --gtest_color=yes'
testDir: '$(Agent.BuildDirectory)/rocm/bin'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: ${{ job.target }}
- ${{ if eq(parameters.unifiedBuild, False) }}:
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_ubuntu2204_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- checkout: none
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
preTargetFilter: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
testExecutable: $(Agent.BuildDirectory)/rocm/bin/hipblas-test
testParameters: '--yaml hipblas_smoke.yaml --gtest_output=xml:./test_output.xml --gtest_color=yes'
testDir: '$(Agent.BuildDirectory)/rocm/bin'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: ${{ job.target }}
- ${{ if parameters.triggerDownstreamJobs }}:
- ${{ each component in parameters.downstreamComponentMatrix }}:
- ${{ if not(and(parameters.unifiedBuild, eq(component.skipUnifiedBuild, 'true'))) }}:
- template: /.azuredevops/components/${{ component.name }}.yml@pipelines_repo
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ component.sparseCheckoutDir }}
triggerDownstreamJobs: true
unifiedBuild: ${{ parameters.unifiedBuild }}
${{ if parameters.unifiedBuild }}:
buildDependsOn: ${{ component.unifiedBuild.buildDependsOn }}
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ component.unifiedBuild.downstreamAggregateNames }}
${{ else }}:
buildDependsOn: ${{ component.buildDependsOn }}
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ parameters.componentName }}

View File

@@ -80,11 +80,11 @@ parameters:
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: ${{ parameters.componentName }}_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_${{ job.target }} # todo: add OS
- ${{ build }}_ubuntu2204_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -141,12 +141,12 @@ jobs:
# gpuTarget: ${{ job.target }}
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_${{ job.target }}
- job: ${{ parameters.componentName }}_test_ubuntu2204_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:

View File

@@ -72,15 +72,15 @@ parameters:
testJobs:
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
# - name: downstreamComponentMatrix
# type: object
# default:
# - rocFFT:
# name: rocFFT
# sparseCheckoutDir: projects/rocfft
# skipUnifiedBuild: 'false'
# buildDependsOn:
# - hipRAND_build
- name: downstreamComponentMatrix
type: object
default:
- rocFFT:
name: rocFFT
sparseCheckoutDir: projects/rocfft
skipUnifiedBuild: 'false'
buildDependsOn:
- hipRAND_build
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
@@ -206,14 +206,14 @@ jobs:
environment: test
gpuTarget: ${{ job.target }}
# - ${{ if parameters.triggerDownstreamJobs }}:
# - ${{ each component in parameters.downstreamComponentMatrix }}:
# - ${{ if not(and(parameters.unifiedBuild, eq(component.skipUnifiedBuild, 'true'))) }}:
# - template: /.azuredevops/components/${{ component.name }}.yml@pipelines_repo
# parameters:
# checkoutRepo: ${{ parameters.checkoutRepo }}
# sparseCheckoutDir: ${{ component.sparseCheckoutDir }}
# buildDependsOn: ${{ component.buildDependsOn }}
# downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ parameters.componentName }}
# triggerDownstreamJobs: true
# unifiedBuild: ${{ parameters.unifiedBuild }}
- ${{ if parameters.triggerDownstreamJobs }}:
- ${{ each component in parameters.downstreamComponentMatrix }}:
- ${{ if not(and(parameters.unifiedBuild, eq(component.skipUnifiedBuild, 'true'))) }}:
- template: /.azuredevops/components/${{ component.name }}.yml@pipelines_repo
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ component.sparseCheckoutDir }}
buildDependsOn: ${{ component.buildDependsOn }}
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ parameters.componentName }}
triggerDownstreamJobs: true
unifiedBuild: ${{ parameters.unifiedBuild }}

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: hipSOLVER
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -66,7 +85,11 @@ parameters:
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: hipSOLVER_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_ubuntu2204_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -81,18 +104,21 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
# build external gtest and lapack
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: external
cmakeBuildDir: '$(Build.SourcesDirectory)/deps/build'
cmakeSourceDir: '$(Build.SourcesDirectory)/deps'
cmakeBuildDir: '$(Agent.BuildDirectory)/s/deps/build'
cmakeSourceDir: '$(Agent.BuildDirectory)/s/deps'
installDir: '$(Pipeline.Workspace)/deps-install'
extraBuildFlags: >-
-DBUILD_BOOST=OFF
@@ -111,8 +137,10 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
gpuTarget: ${{ job.target }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
# - template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
@@ -122,44 +150,49 @@ jobs:
# extraCopyDirectories:
# - deps-install
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: hipSOLVER_test_${{ job.target }}
dependsOn: hipSOLVER_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: hipSOLVER
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './hipsolver-test'
testParameters: '--gtest_filter="*checkin*" --gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}
- ${{ if eq(parameters.unifiedBuild, False) }}:
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_ubuntu2204_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ job.target }}_test_pool
workspace:
clean: all
steps:
- checkout: none
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
preTargetFilter: ${{ parameters.componentName }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: ${{ job.target }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: ${{ parameters.componentName }}
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './hipsolver-test'
testParameters: '--gtest_filter="*checkin*" --gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}

View File

@@ -104,17 +104,17 @@ parameters:
- rocBLAS_build
# rocSOLVER depends on both rocBLAS and rocPRIM
# for a unified build, rocBLAS will be the one to call rocSOLVER
# - rocSOLVER:
# name: rocSOLVER
# sparseCheckoutDir: projects/rocsolver
# skipUnifiedBuild: 'false'
# buildDependsOn:
# - rocBLAS_build
# unifiedBuild:
# downstreamAggregateNames: rocBLAS+rocPRIM
# buildDependsOn:
# - rocBLAS_build
# - rocPRIM_build
# - rocSOLVER:
# name: rocSOLVER
# sparseCheckoutDir: projects/rocsolver
# skipUnifiedBuild: 'false'
# buildDependsOn:
# - rocBLAS_build
# unifiedBuild:
# downstreamAggregateNames: rocBLAS+rocPRIM
# buildDependsOn:
# - rocBLAS_build
# - rocPRIM_build
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:

View File

@@ -78,19 +78,19 @@ parameters:
target: gfx942
- gfx90a:
target: gfx90a
# - name: downstreamComponentMatrix
# type: object
# default:
# - hipFFT:
# name: hipFFT
# sparseCheckoutDir: projects/hipfft
# skipUnifiedBuild: 'false'
# buildDependsOn:
# - rocFFT_build
- name: downstreamComponentMatrix
type: object
default:
- hipFFT:
name: hipFFT
sparseCheckoutDir: projects/hipfft
skipUnifiedBuild: 'false'
buildDependsOn:
- rocFFT_build
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: ${{ parameters.componentName }}_build_${{ job.target }}
- job: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
@@ -151,12 +151,12 @@ jobs:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: ${{ parameters.componentName }}_test_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_${{ job.target }}
- job: ${{ parameters.componentName }}_test_ubuntu2204_${{ job.target }}
dependsOn: ${{ parameters.componentName }}_build_ubuntu2204_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), variables['Build.DefinitionName'])),
not(containsValue(split(variables['DISABLED_${{ upper(job.target) }}_TESTS'], ','), '${{ parameters.componentName }}')),
eq(${{ parameters.aggregatePipeline }}, False)
)
variables:
@@ -196,14 +196,14 @@ jobs:
environment: test
gpuTarget: ${{ job.target }}
# - ${{ if parameters.triggerDownstreamJobs }}:
# - ${{ each component in parameters.downstreamComponentMatrix }}:
# - ${{ if not(and(parameters.unifiedBuild, eq(component.skipUnifiedBuild, 'true'))) }}:
# - template: /.azuredevops/components/${{ component.name }}.yml@pipelines_repo
# parameters:
# checkoutRepo: ${{ parameters.checkoutRepo }}
# sparseCheckoutDir: ${{ component.sparseCheckoutDir }}
# buildDependsOn: ${{ component.buildDependsOn }}
# downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ parameters.componentName }}
# triggerDownstreamJobs: true
# unifiedBuild: ${{ parameters.unifiedBuild }}
- ${{ if parameters.triggerDownstreamJobs }}:
- ${{ each component in parameters.downstreamComponentMatrix }}:
- ${{ if not(and(parameters.unifiedBuild, eq(component.skipUnifiedBuild, 'true'))) }}:
- template: /.azuredevops/components/${{ component.name }}.yml@pipelines_repo
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ component.sparseCheckoutDir }}
buildDependsOn: ${{ component.buildDependsOn }}
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ parameters.componentName }}
triggerDownstreamJobs: true
unifiedBuild: ${{ parameters.unifiedBuild }}

View File

@@ -91,12 +91,12 @@ parameters:
- rocPRIM_build
# rocSOLVER depends on both rocBLAS and rocPRIM
# for a unified build, rocBLAS will be the one to call rocSOLVER
# - rocSOLVER:
# name: rocSOLVER
# sparseCheckoutDir: projects/rocsolver
# skipUnifiedBuild: 'true'
# buildDependsOn:
# - rocPRIM_build
# - rocSOLVER:
# name: rocSOLVER
# sparseCheckoutDir: projects/rocsolver
# skipUnifiedBuild: 'true'
# buildDependsOn:
# - rocPRIM_build
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:

View File

@@ -79,6 +79,12 @@ parameters:
skipUnifiedBuild: 'false'
buildDependsOn:
- rocRAND_build
- MIOpen:
name: MIOpen
sparseCheckoutDir: projects/miopen
skipUnifiedBuild: 'true'
buildDependsOn:
- rocRAND_build
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:

View File

@@ -83,6 +83,28 @@ parameters:
testJobs:
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
- name: downstreamComponentMatrix
type: object
default:
- hipBLAS:
name: hipBLAS
sparseCheckoutDir: projects/hipblas
skipUnifiedBuild: 'false'
buildDependsOn:
- rocSOLVER_build
# hipSOLVER depends on both rocSOLVER and rocSPARSE
# for a unified build, rocSOLVER will be the one to call hipSOLVER
# - hipSOLVER:
# name: hipSOLVER
# sparseCheckoutDir: projects/hipsolver
# skipUnifiedBuild: 'false'
# buildDependsOn:
# - rocSOLVER_build
# unifiedBuild:
# downstreamAggregateNames: rocSOLVER+rocSPARSE
# buildDependsOn:
# - rocSOLVER_build
# - rocSPARSE_build
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
@@ -228,3 +250,19 @@ jobs:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: ${{ job.target }}
- ${{ if parameters.triggerDownstreamJobs }}:
- ${{ each component in parameters.downstreamComponentMatrix }}:
- ${{ if not(and(parameters.unifiedBuild, eq(component.skipUnifiedBuild, 'true'))) }}:
- template: /.azuredevops/components/${{ component.name }}.yml@pipelines_repo
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ component.sparseCheckoutDir }}
triggerDownstreamJobs: true
unifiedBuild: ${{ parameters.unifiedBuild }}
${{ if parameters.unifiedBuild }}:
buildDependsOn: ${{ component.unifiedBuild.buildDependsOn }}
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ component.unifiedBuild.downstreamAggregateNames }}
${{ else }}:
buildDependsOn: ${{ component.buildDependsOn }}
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}+${{ parameters.componentName }}

View File

@@ -0,0 +1,163 @@
parameters:
- name: componentName
type: string
default: rocm_libraries
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
type: boolean
default: false
- name: aptPackages
type: object
default:
- ccache
- gfortran
- git
- libdrm-dev
- libmsgpack-dev
- libnuma-dev
- ninja-build
- python3-pip
- python3-venv
- name: pipModules
type: object
default:
- joblib
- "packaging>=22.0"
- --upgrade
- name: rocmDependencies
type: object
default:
- aomp
- clr
- llvm-project
- rocminfo
- rocm-cmake
- rocm_smi_lib
- rocprofiler-register
- ROCR-Runtime
- roctracer
- name: rocmTestDependencies
type: object
default:
- aomp
- clr
- llvm-project
- rocminfo
- rocm_smi_lib
- rocprofiler-register
- ROCR-Runtime
- roctracer
- name: jobMatrix
type: object
default:
buildJobs:
- { pool: rocm-ci_ultra_build_pool, os: ubuntu2204, packageManager: apt, target: gfx942 }
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: ${{ parameters.componentName }}_build_${{ job.os }}_${{ job.target }}
timeoutInMinutes: 300
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_${{ job.os }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
- name: DAY_STRING
value: $[format('{0:ddMMyyyy}', pipeline.startTime)]
pool: ${{ job.pool }}
${{ if eq(job.os, 'almalinux8') }}:
container:
image: rocmexternalcicd.azurecr.io/manylinux228:latest
endpoint: ContainerService3
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
packageManager: ${{ job.packageManager }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-cmake-latest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-vendor.yml
parameters:
dependencyList:
- gtest
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
${{ if parameters.triggerDownstreamJobs }}:
downstreamAggregateNames: ${{ parameters.downstreamAggregateNames }}
- script: |
mkdir -p $(CCACHE_DIR)
echo "##vso[task.prependpath]/usr/lib/ccache"
displayName: Update path for ccache
- task: Cache@2
displayName: Ccache caching
inputs:
key: rocm-libraries | ${{ job.os }} | ${{ job.target }} | $(DAY_STRING) | $(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
path: $(CCACHE_DIR)
restoreKeys: |
rocm-libraries | ${{ job.os }} | ${{ job.target }} | $(DAY_STRING)
rocm-libraries | ${{ job.os }} | ${{ job.target }}
rocm-libraries | ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
os: ${{ job.os }}
extraBuildFlags: >-
-DROCM_LIBRARIES_SUPERBUILD=ON
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- ${{ if eq(job.os, 'ubuntu2204') }}:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: ${{ job.target }}
extraPaths: /home/user/workspace/rocm/llvm/bin:/home/user/workspace/rocm/bin
installLatestCMake: true
extraCopyDirectories:
- deps

View File

@@ -65,43 +65,19 @@ parameters:
type: object
default:
buildJobs:
- gfx942-staging:
name: gfx942_staging
- gfx942:
target: gfx942
dependencySource: staging
- gfx942-mainline:
name: gfx942_mainline
target: gfx942
dependencySource: mainline
- gfx90a-staging:
name: gfx90a_staging
- gfx90a:
target: gfx90a
dependencySource: staging
- gfx90a-mainline:
name: gfx90a_mainline
target: gfx90a
dependencySource: mainline
testJobs:
- gfx942-staging:
name: gfx942_staging
- gfx942:
target: gfx942
dependencySource: staging
- gfx942-mainline:
name: gfx942_mainline
target: gfx942
dependencySource: mainline
- gfx90a-staging:
name: gfx90a_staging
- gfx90a:
target: gfx90a
dependencySource: staging
- gfx90a-mainline:
name: gfx90a_mainline
target: gfx90a
dependencySource: mainline
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: rocprofiler_compute_build_${{ job.name }}
- job: rocprofiler_compute_build_${{ job.target }}
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -124,11 +100,9 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
artifactName: ${{ job.dependencySource }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
artifactName: ${{ job.dependencySource }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
# - template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
@@ -138,9 +112,9 @@ jobs:
# gpuTarget: ${{ job.target }}
- ${{ each job in parameters.jobMatrix.testJobs }}:
- job: rocprofiler_compute_test_${{ job.name }}
- job: rocprofiler_compute_test_${{ job.target }}
timeoutInMinutes: 120
dependsOn: rocprofiler_compute_build_${{ job.name }}
dependsOn: rocprofiler_compute_build_${{ job.target }}
condition:
and(succeeded(),
eq(variables['ENABLE_${{ upper(job.target) }}_TESTS'], 'true'),
@@ -166,14 +140,12 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
postTargetFilter: ${{ job.dependencySource }}
gpuTarget: ${{ job.target }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
dependencySource: ${{ job.dependencySource }}
gpuTarget: ${{ job.target }}
- task: Bash@3
displayName: Add en_US.UTF-8 locale

View File

@@ -1,10 +1,29 @@
parameters:
- name: componentName
type: string
default: rocprofiler-register
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
# monorepo related parameters
- name: sparseCheckoutDir
type: string
default: ''
- name: triggerDownstreamJobs
type: boolean
default: false
- name: downstreamAggregateNames
type: string
default: ''
- name: buildDependsOn
type: object
default: null
- name: unifiedBuild
type: boolean
default: false
# set to true if doing full build of ROCm stack
# and dependencies are pulled from same pipeline
- name: aggregatePipeline
@@ -27,6 +46,10 @@ parameters:
jobs:
- ${{ each job in parameters.jobMatrix.buildJobs }}:
- job: rocprofiler_register_${{ job.os }}
${{ if parameters.buildDependsOn }}:
dependsOn:
- ${{ each build in parameters.buildDependsOn }}:
- ${{ build }}_${{ job.os }}
pool:
${{ if eq(job.os, 'ubuntu2404') }}:
vmImage: 'ubuntu-24.04'
@@ -50,9 +73,10 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: rocprofiler-register
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
useAmdclang: false
extraBuildFlags: >-
@@ -62,12 +86,16 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: rocprofiler-register
componentName: ${{ parameters.componentName }}
testDir: $(Agent.BuildDirectory)/s/build
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
componentName: ${{ parameters.componentName }}
sparseCheckoutDir: ${{ parameters.sparseCheckoutDir }}
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
componentName: ${{ parameters.componentName }}
os: ${{ job.os }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
# - template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml

View File

@@ -37,6 +37,7 @@ parameters:
- libpfm4-dev
- libtool
- libopenmpi-dev
- libsqlite3-dev
- m4
- ninja-build
- openmpi-bin

View File

@@ -40,7 +40,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
dependencyList: ${{ parameters.rocmDependencies }}
dependencySource: staging
- task: Bash@3
displayName: Add ROCm binaries to PATH
inputs:

View File

@@ -219,7 +219,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
dependencyList: ${{ parameters.rocmDependencies }}
dependencySource: staging
gpuTarget: $(JOB_GPU_TARGET)
setupHIPLibrarySymlinks: true
- task: Bash@3
@@ -406,7 +405,6 @@ jobs:
parameters:
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
dependencySource: staging
# get sources to run test scripts
- task: Bash@3
displayName: git clone upstream pytorch

View File

@@ -3,21 +3,21 @@ parameters:
- name: jobList
type: object
default:
- { os: ubuntu2204, target: gfx942, source: staging }
- { os: ubuntu2204, target: gfx90a, source: staging }
- { os: ubuntu2204, target: gfx1201, source: staging }
- { os: ubuntu2204, target: gfx1100, source: staging }
- { os: ubuntu2204, target: gfx1030, source: staging }
- { os: ubuntu2404, target: gfx942, source: staging }
- { os: ubuntu2404, target: gfx90a, source: staging }
- { os: ubuntu2404, target: gfx1201, source: staging }
- { os: ubuntu2404, target: gfx1100, source: staging }
- { os: ubuntu2404, target: gfx1030, source: staging }
- { os: almalinux8, target: gfx942, source: staging }
- { os: almalinux8, target: gfx90a, source: staging }
- { os: almalinux8, target: gfx1201, source: staging }
- { os: almalinux8, target: gfx1100, source: staging }
- { os: almalinux8, target: gfx1030, source: staging }
- { os: ubuntu2204, packageManager: apt, target: gfx942 }
- { os: ubuntu2204, packageManager: apt, target: gfx90a }
- { os: ubuntu2204, packageManager: apt, target: gfx1201 }
- { os: ubuntu2204, packageManager: apt, target: gfx1100 }
- { os: ubuntu2204, packageManager: apt, target: gfx1030 }
- { os: ubuntu2404, packageManager: apt, target: gfx942 }
- { os: ubuntu2404, packageManager: apt, target: gfx90a }
- { os: ubuntu2404, packageManager: apt, target: gfx1201 }
- { os: ubuntu2404, packageManager: apt, target: gfx1100 }
- { os: ubuntu2404, packageManager: apt, target: gfx1030 }
- { os: almalinux8, packageManager: dnf, target: gfx942 }
- { os: almalinux8, packageManager: dnf, target: gfx90a }
- { os: almalinux8, packageManager: dnf, target: gfx1201 }
- { os: almalinux8, packageManager: dnf, target: gfx1100 }
- { os: almalinux8, packageManager: dnf, target: gfx1030 }
- name: rocmDependencies
type: object
default:
@@ -92,7 +92,8 @@ schedules:
jobs:
- ${{ each job in parameters.jobList }}:
- job: rocm_nightly_${{ job.os }}_${{ job.target }}_${{ job.source }}
- job: nightly_${{ job.os }}_${{ job.target }}
timeoutInMinutes: 90
variables:
- group: common
- template: /.azuredevops/variables-global.yml
@@ -115,7 +116,6 @@ jobs:
displayName: System disk space before ROCm
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
dependencySource: ${{ job.source }}
dependencyList: ${{ parameters.rocmDependencies }}
os: ${{ job.os }}
gpuTarget: ${{ job.target }}
@@ -131,7 +131,7 @@ jobs:
includeRootFolder: false
archiveType: tar
tarCompression: gz
archiveFile: $(Build.ArtifactStagingDirectory)/$(Build.DefinitionName)_$(Build.BuildNumber)_ubuntu2204_${{ job.target }}.tar.gz
archiveFile: $(Build.ArtifactStagingDirectory)/$(Build.DefinitionName)_$(Build.BuildNumber)_${{ job.os }}_${{ job.target }}.tar.gz
- script: du -sh $(Build.ArtifactStagingDirectory)
displayName: Compressed ROCm size
- task: PublishPipelineArtifact@1
@@ -144,5 +144,95 @@ jobs:
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: echo "$(Build.DefinitionName)_$(Build.BuildNumber)_ubuntu2204_${{ job.target }}.tar.gz" >> pipelineArtifacts.txt
script: echo "$(Build.DefinitionName)_$(Build.BuildNumber)_${{ job.os }}_${{ job.target }}.tar.gz" >> pipelineArtifacts.txt
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- ${{ if eq(job.packageManager, 'apt') }}:
- task: Bash@3
displayName: Create Dockerfile
inputs:
workingDirectory: $(Agent.BuildDirectory)
targetType: inline
script: |
cat <<'EOF' > Dockerfile
${{ iif(eq(job.os, 'ubuntu2204'), 'FROM ubuntu:22.04', '') }}
${{ iif(eq(job.os, 'ubuntu2404'), 'FROM ubuntu:24.04', '') }}
WORKDIR /root
RUN mkdir rocm
RUN apt update \
&& apt upgrade -y \
&& apt install -y cmake curl git gcc g++ gpg lsb-release lsof ninja-build pkg-config python3 python3-pip wget zip libdrm-dev libelf-dev libgtest-dev libhsakmt-dev libhwloc-dev libnuma-dev libstdc++-12-dev libtbb-dev jq \
&& apt clean all
RUN PACKAGE_NAME=$(curl -s https://repo.radeon.com/rocm/apt/latest/pool/main/h/hsa-amd-aqlprofile/ | grep -oP "href=\"\K[^\"]*$(lsb_release -rs)[^\"]*\.deb") \
&& wget -nv --retry-connrefused https://repo.radeon.com/rocm/apt/latest/pool/main/h/hsa-amd-aqlprofile/$PACKAGE_NAME \
&& mkdir hsa-amd-aqlprofile \
&& dpkg-deb -R $PACKAGE_NAME hsa-amd-aqlprofile \
&& cp -R hsa-amd-aqlprofile/opt/rocm-*/* rocm
RUN ARTIFACT_URL="https://dev.azure.com/ROCm-CI/ROCm-CI/_apis/build/builds/$(Build.BuildId)/artifacts?artifactName=nightly${{ job.os }}${{ job.target }}&api-version=7.1" \
&& DOWNLOAD_URL=$(curl -s $ARTIFACT_URL | jq ".resource.downloadUrl" | tr -d '"') \
&& wget -nv --retry-connrefused $DOWNLOAD_URL -O nightly.zip \
&& unzip nightly.zip \
&& tar -xf nightly${{ job.os }}${{ job.target }}/rocm-nightly*${{ job.os }}*${{ job.target }}*.tar.gz -C rocm
RUN echo /root/rocm/lib | tee /etc/ld.so.conf.d/rocm-ci.conf
RUN echo /root/rocm/llvm/lib | tee -a /etc/ld.so.conf.d/rocm-ci.conf
RUN echo /root/rocm/lib64 | tee -a /etc/ld.so.conf.d/rocm-ci.conf
RUN echo /root/rocm/llvm/lib64 | tee -a /etc/ld.so.conf.d/rocm-ci.conf
RUN ldconfig -v
ENV PATH="$PATH:/root/rocm/bin"
ENTRYPOINT ["/bin/bash"]
EOF
cat Dockerfile
- ${{ elseif eq(job.packageManager, 'dnf') }}:
- task: Bash@3
displayName: Create Dockerfile
inputs:
workingDirectory: $(Agent.BuildDirectory)
targetType: inline
script: |
cat <<'EOF' > Dockerfile
${{ iif(eq(job.os, 'almalinux8'), 'FROM almalinux:8', '') }}
WORKDIR /root
RUN mkdir rocm
RUN dnf install -y cmake curl git gcc gcc-c++ gnupg2 redhat-lsb-core lsof pkgconf python3 python3-pip wget zip libdrm-devel elfutils-libelf-devel numactl-devel libstdc++-devel tbb-devel jq \
&& dnf clean all
RUN PACKAGE_NAME=$(curl -s https://repo.radeon.com/rocm/rhel8/$(REPO_RADEON_VERSION)/main/ | grep -oP "hsa-amd-aqlprofile-[^\"]+\.rpm" | head -n1) \
&& wget -nv --retry-connrefused https://repo.radeon.com/rocm/rhel8/$(REPO_RADEON_VERSION)/main/$PACKAGE_NAME \
&& mkdir hsa-amd-aqlprofile \
&& dnf -y install rpm-build cpio \
&& rpm2cpio $PACKAGE_NAME | (cd hsa-amd-aqlprofile && cpio -idmv) \
&& cp -R hsa-amd-aqlprofile/opt/rocm-*/* rocm
RUN ARTIFACT_URL="https://dev.azure.com/ROCm-CI/ROCm-CI/_apis/build/builds/$(Build.BuildId)/artifacts?artifactName=nightly${{ job.os }}${{ job.target }}&api-version=7.1" \
&& DOWNLOAD_URL=$(curl -s $ARTIFACT_URL | jq ".resource.downloadUrl" | tr -d '"') \
&& wget -nv --retry-connrefused $DOWNLOAD_URL -O nightly.zip \
&& UNZIP_DISABLE_ZIPBOMB_DETECTION=TRUE unzip nightly.zip \
&& tar -xf nightly${{ job.os }}${{ job.target }}/rocm-nightly*${{ job.os }}*${{ job.target }}*.tar.gz -C rocm
RUN echo /root/rocm/lib | tee /etc/ld.so.conf.d/rocm-ci.conf
RUN echo /root/rocm/llvm/lib | tee -a /etc/ld.so.conf.d/rocm-ci.conf
RUN echo /root/rocm/lib64 | tee -a /etc/ld.so.conf.d/rocm-ci.conf
RUN echo /root/rocm/llvm/lib64 | tee -a /etc/ld.so.conf.d/rocm-ci.conf
RUN ldconfig -v
ENV PATH="$PATH:/root/rocm/bin"
ENTRYPOINT ["/bin/bash"]
EOF
cat Dockerfile
- task: Docker@2
displayName: Build and upload Docker image
inputs:
containerRegistry: ContainerService3
repository: 'nightly-${{ job.os }}-${{ job.target }}'
Dockerfile: '$(Agent.BuildDirectory)/Dockerfile'
buildContext: '$(Agent.BuildDirectory)'
- task: Bash@3
displayName: '!! Docker Run Command !!'
inputs:
targetType: inline
script: echo "docker run -it --network=host --device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined rocmexternalcicd.azurecr.io/nightly-${{ job.os }}-${{ job.target }}:$(Build.BuildId)" | tr '[:upper:]' '[:lower:]'

View File

@@ -3,13 +3,6 @@ parameters:
- name: checkoutRef
type: string
default: ''
- name: dependencySource # optional, overrides checkoutRef
type: string
default: null
values:
- null # empty strings aren't allowed as values, use null instead
- staging
- mainline
- name: dependencyList
type: object
default: []
@@ -38,309 +31,240 @@ parameters:
type: object
default:
AMDMIGraphX:
pipelineId: $(AMDMIGRAPHX_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: master
pipelineId: 113
developBranch: develop
hasGpuTarget: true
amdsmi:
pipelineId: $(AMDSMI_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 99
developBranch: amd-staging
hasGpuTarget: false
aomp-extras:
pipelineId: $(AOMP_EXTRAS_PIPELINE_ID)
stagingBranch: aomp-dev
mainlineBranch: aomp-dev
pipelineId: 111
developBranch: aomp-dev
hasGpuTarget: false
aomp:
pipelineId: $(AOMP_PIPELINE_ID)
stagingBranch: aomp-dev
mainlineBranch: amd-mainline
pipelineId: 115
developBranch: aomp-dev
hasGpuTarget: false
clr:
pipelineId: $(CLR_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 145
developBranch: amd-staging
hasGpuTarget: false
composable_kernel:
pipelineId: $(COMPOSABLE_KERNEL_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 86
developBranch: develop
hasGpuTarget: true
half:
pipelineId: $(HALF_PIPELINE_ID)
stagingBranch: rocm
mainlineBranch: rocm
pipelineId: 101
developBranch: rocm
hasGpuTarget: false
HIP:
pipelineId: $(HIP_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 93
developBranch: amd-staging
hasGpuTarget: false
hip-tests:
pipelineId: $(HIP_TESTS_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 233
developBranch: amd-staging
hasGpuTarget: false
hipBLAS:
pipelineId: $(HIPBLAS_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 317
developBranch: develop
hasGpuTarget: true
hipBLASLt:
pipelineId: $(HIPBLASLT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 301
developBranch: develop
hasGpuTarget: true
hipBLAS-common:
pipelineId: $(HIPBLAS_COMMON_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 300
developBranch: develop
hasGpuTarget: false
hipCUB:
pipelineId: $(HIPCUB_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: develop
pipelineId: 277
developBranch: develop
hasGpuTarget: true
hipFFT:
pipelineId: $(HIPFFT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 283
developBranch: develop
hasGpuTarget: true
hipfort:
pipelineId: $(HIPFORT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 102
developBranch: develop
hasGpuTarget: false
HIPIFY:
pipelineId: $(HIPIFY_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 92
developBranch: amd-staging
hasGpuTarget: false
hipRAND:
pipelineId: $(HIPRAND_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: develop
pipelineId: 275
developBranch: develop
hasGpuTarget: true
hipSOLVER:
pipelineId: $(HIPSOLVER_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 84
developBranch: develop
hasGpuTarget: true
hipSPARSE:
pipelineId: $(HIPSPARSE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 315
developBranch: develop
hasGpuTarget: true
hipSPARSELt:
pipelineId: $(HIPSPARSELT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 309
developBranch: develop
hasGpuTarget: true
hipTensor:
pipelineId: $(HIPTENSOR_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 105
developBranch: develop
hasGpuTarget: true
llvm-project:
pipelineId: $(LLVM_PROJECT_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 2
developBranch: amd-staging
hasGpuTarget: false
MIOpen:
pipelineId: $(MIOpen_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: amd-master
pipelineId: 320
developBranch: develop
hasGpuTarget: true
MIVisionX:
pipelineId: $(MIVISIONX_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: master
hasGpuTarget: true
omnitrace: # deprecated
pipelineId: $(OMNITRACE_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 80
developBranch: develop
hasGpuTarget: true
rccl:
pipelineId: $(RCCL_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 107
developBranch: develop
hasGpuTarget: true
rdc:
pipelineId: $(RDC_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 100
developBranch: amd-staging
hasGpuTarget: false
rocAL:
pipelineId: $(ROCAL_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 151
developBranch: develop
hasGpuTarget: true
rocALUTION:
pipelineId: $(ROCALUTION_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 89
developBranch: develop
hasGpuTarget: true
rocBLAS:
pipelineId: $(ROCBLAS_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 302
developBranch: develop
hasGpuTarget: true
ROCdbgapi:
pipelineId: $(ROCDBGAPI_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 135
developBranch: amd-staging
hasGpuTarget: false
rocDecode:
pipelineId: $(ROCDECODE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 79
developBranch: develop
hasGpuTarget: false
rocFFT:
pipelineId: $(ROCFFT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 282
developBranch: develop
hasGpuTarget: true
ROCgdb:
pipelineId: $(ROCGDB_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline-rocgdb-15
pipelineId: 134
developBranch: amd-staging
hasGpuTarget: false
rocJPEG:
pipelineId: $(ROCJPEG_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 262
developBranch: develop
hasGpuTarget: false
rocm-cmake:
pipelineId: $(ROCM_CMAKE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 6
developBranch: develop
hasGpuTarget: false
rocm-core:
pipelineId: $(ROCM_CORE_PIPELINE_ID)
stagingBranch: master
mainlineBranch: amd-master
pipelineId: 103
developBranch: master
hasGpuTarget: false
rocm-examples:
pipelineId: $(ROCM_EXAMPLES_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 216
developBranch: amd-staging
hasGpuTarget: true
rocminfo:
pipelineId: $(ROCMINFO_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 91
developBranch: amd-staging
hasGpuTarget: false
rocMLIR:
pipelineId: $(ROCMLIR_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 229
developBranch: develop
hasGpuTarget: false
ROCmValidationSuite:
pipelineId: $(ROCMVALIDATIONSUITE_PIPELINE_ID)
stagingBranch: master
mainlineBranch: master
pipelineId: 106
developBranch: master
hasGpuTarget: true
rocm_bandwidth_test:
pipelineId: $(ROCM_BANDWIDTH_TEST_PIPELINE_ID)
stagingBranch: master
mainlineBranch: master
pipelineId: 88
developBranch: master
hasGpuTarget: false
rocm_smi_lib:
pipelineId: $(ROCM_SMI_LIB_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 96
developBranch: amd-staging
hasGpuTarget: false
rocPRIM:
pipelineId: $(ROCPRIM_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: develop
pipelineId: 273
developBranch: develop
hasGpuTarget: true
rocprofiler:
pipelineId: $(ROCPROFILER_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-master
pipelineId: 143
developBranch: amd-staging
hasGpuTarget: true
rocprofiler-compute:
pipelineId: $(ROCPROFILER_COMPUTE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: amd-mainline
pipelineId: 257
developBranch: develop
hasGpuTarget: true
rocprofiler-register:
pipelineId: $(ROCPROFILER_REGISTER_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 327
developBranch: develop
hasGpuTarget: false
rocprofiler-sdk:
pipelineId: $(ROCPROFILER_SDK_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 246
developBranch: amd-staging
hasGpuTarget: true
rocprofiler-systems:
pipelineId: $(ROCPROFILER_SYSTEMS_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 255
developBranch: amd-staging
hasGpuTarget: true
rocPyDecode:
pipelineId: $(ROCPYDECODE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 239
developBranch: develop
hasGpuTarget: true
ROCR-Runtime:
pipelineId: $(ROCR_RUNTIME_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 10
developBranch: amd-staging
hasGpuTarget: false
rocRAND:
pipelineId: $(ROCRAND_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: develop
pipelineId: 274
developBranch: develop
hasGpuTarget: true
rocr_debug_agent:
pipelineId: $(ROCR_DEBUG_AGENT_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 136
developBranch: amd-staging
hasGpuTarget: false
rocSOLVER:
pipelineId: $(ROCSOLVER_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 81
developBranch: develop
hasGpuTarget: true
rocSPARSE:
pipelineId: $(ROCSPARSE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 314
developBranch: develop
hasGpuTarget: true
ROCT-Thunk-Interface: # deprecated
pipelineId: $(ROCT_THUNK_INTERFACE_PIPELINE_ID)
stagingBranch: master
mainlineBranch: master
hasGpuTarget: false
rocThrust:
pipelineId: $(ROCTHRUST_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: develop
pipelineId: 276
developBranch: develop
hasGpuTarget: true
roctracer:
pipelineId: $(ROCTRACER_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
pipelineId: 141
developBranch: amd-staging
hasGpuTarget: true
rocWMMA:
pipelineId: $(ROCWMMA_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 109
developBranch: develop
hasGpuTarget: true
rpp:
pipelineId: $(RPP_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 78
developBranch: develop
hasGpuTarget: true
TransferBench:
pipelineId: $(TRANSFERBENCH_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
pipelineId: 265
developBranch: develop
hasGpuTarget: true
steps:
@@ -356,72 +280,30 @@ steps:
parameters:
componentName: ${{ split(dependency, ':')[0] }}
pipelineId: ${{ parameters.componentVarList[split(dependency, ':')[0]].pipelineId }}
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].developBranch }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
extractAndDeleteFiles: false
${{ if parameters.componentVarList[split(dependency, ':')[0]].hasGpuTarget }}:
fileFilter: "${{ split(dependency, ':')[1] }}*_${{ parameters.os }}_${{ parameters.gpuTarget }}"
# dependencySource = staging
${{ if eq(parameters.dependencySource, 'staging')}}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].stagingBranch }}
# dependencySource = mainline
${{ elseif eq(parameters.dependencySource, 'mainline')}}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].mainlineBranch }}
# checkoutRef = staging
${{ elseif eq(parameters.checkoutRef, parameters.componentVarList[variables['Build.DefinitionName']].stagingBranch) }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].stagingBranch }}
# checkoutRef = mainline
${{ elseif eq(parameters.checkoutRef, parameters.componentVarList[variables['Build.DefinitionName']].mainlineBranch) }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].mainlineBranch }}
# SourceBranchName = staging
${{ elseif eq(variables['Build.SourceBranchName'], parameters.componentVarlist[variables['Build.DefinitionName']].stagingBranch) }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].stagingBranch }}
# SourceBranchName = mainline
${{ elseif eq(variables['Build.SourceBranchName'], parameters.componentVarlist[variables['Build.DefinitionName']].mainlineBranch) }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].mainlineBranch }}
# default = staging
${{ else }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].stagingBranch }}
# no colon (:) found in this item in the list
- ${{ elseif containsValue(split(parameters.downstreamAggregateNames, '+'), dependency) }}:
- template: local-artifact-download.yml
parameters:
${{ if parameters.componentVarList[dependency].hasGpuTarget }}:
gpuTarget: ${{ parameters.gpuTarget }}
buildType: current
preTargetFilter: ${{ dependency }}
os: ${{ parameters.os }}
buildType: current
${{ if parameters.componentVarList[dependency].hasGpuTarget }}:
gpuTarget: ${{ parameters.gpuTarget }}
- ${{ else }}:
- template: artifact-download.yml
parameters:
componentName: ${{ dependency }}
pipelineId: ${{ parameters.componentVarList[dependency].pipelineId }}
branchName: ${{ parameters.componentVarList[dependency].developBranch }}
aggregatePipeline: ${{ parameters.aggregatePipeline }}
extractAndDeleteFiles: false
${{ if parameters.componentVarList[dependency].hasGpuTarget }}:
fileFilter: ${{ parameters.os }}_${{ parameters.gpuTarget }}
${{ else }}:
fileFilter: ${{ parameters.os }}
# dependencySource = staging
${{ if eq(parameters.dependencySource, 'staging')}}:
branchName: ${{ parameters.componentVarList[dependency].stagingBranch }}
# dependencySource = mainline
${{ elseif eq(parameters.dependencySource, 'mainline')}}:
branchName: ${{ parameters.componentVarList[dependency].mainlineBranch }}
# checkoutRef = staging
${{ elseif eq(parameters.checkoutRef, parameters.componentVarList[variables['Build.DefinitionName']].stagingBranch) }}:
branchName: ${{ parameters.componentVarList[dependency].stagingBranch }}
# checkoutRef = mainline
${{ elseif eq(parameters.checkoutRef, parameters.componentVarList[variables['Build.DefinitionName']].mainlineBranch) }}:
branchName: ${{ parameters.componentVarList[dependency].mainlineBranch }}
# SourceBranchName = staging
${{ elseif eq(variables['Build.SourceBranchName'], parameters.componentVarlist[variables['Build.DefinitionName']].stagingBranch) }}:
branchName: ${{ parameters.componentVarList[dependency].stagingBranch }}
# SourceBranchName = mainline
${{ elseif eq(variables['Build.SourceBranchName'], parameters.componentVarlist[variables['Build.DefinitionName']].mainlineBranch) }}:
branchName: ${{ parameters.componentVarList[dependency].mainlineBranch }}
# default = staging
${{ else }}:
branchName: ${{ parameters.componentVarList[dependency].stagingBranch }}
- task: ExtractFiles@1
displayName: Extract ROCm artifacts
inputs:

View File

@@ -7,13 +7,12 @@ steps:
- task: Bash@3
name: downloadCKBuild
displayName: Download specific CK build
continueOnError: true
env:
CXX: $(Agent.BuildDirectory)/rocm/llvm/bin/amdclang++
CC: $(Agent.BuildDirectory)/rocm/llvm/bin/amdclang
inputs:
targetType: inline
workingDirectory: $(Build.SourcesDirectory)
workingDirectory: $(Agent.BuildDirectory)/s
script: |
AZ_API="https://dev.azure.com/ROCm-CI/ROCm-CI/_apis"
GH_API="https://api.github.com/repos/ROCm"
@@ -67,7 +66,19 @@ steps:
fi
echo "Downloading CK artifact from $ARTIFACT_URL"
wget --tries=5 --waitretry=10 --retry-connrefused -nv $ARTIFACT_URL -O $(System.ArtifactsDirectory)/ck.zip
RETRIES=0
MAX_RETRIES=5
until wget -nv $ARTIFACT_URL -O $(System.ArtifactsDirectory)/ck.zip; do
RETRIES=$((RETRIES+1))
if [[ $RETRIES -ge $MAX_RETRIES ]]; then
echo "Failed to download CK artifact after $MAX_RETRIES attempts."
exit 1
fi
echo "Download failed, retrying ($RETRIES/$MAX_RETRIES)..."
sleep 5
done
unzip $(System.ArtifactsDirectory)/ck.zip -d $(System.ArtifactsDirectory)
mkdir -p $(Agent.BuildDirectory)/rocm
tar -zxvf $(System.ArtifactsDirectory)/composable_kernel*/*.tar.gz -C $(Agent.BuildDirectory)/rocm
@@ -82,4 +93,3 @@ steps:
fi
echo "Instead used latest CK build $CK_BUILD_ID for commit $BUILD_COMMIT"
fi
exit $EXIT_CODE

View File

@@ -23,145 +23,25 @@ variables:
value: rocm-ci_high_build_pool
- name: ULTRA_BUILD_POOL
value: rocm-ci_ultra_build_pool
- name: ON_PREM_BUILD_POOL
value: rocm-ci_build_pool
- name: LARGE_DISK_BUILD_POOL
value: rocm-ci_larger_base_disk_pool
- name: GFX942_TEST_POOL
value: gfx942_test_pool
- name: GFX90A_TEST_POOL
value: gfx90a_test_pool
- name: LATEST_RELEASE_VERSION
value: 6.4.1
value: 6.4.2
- name: REPO_RADEON_VERSION
value: 6.4.1
value: 6.4.2
- name: NEXT_RELEASE_VERSION
value: 7.0.0
- name: LATEST_RELEASE_TAG
value: rocm-6.4.1
value: rocm-6.4.2
- name: DOCKER_SKIP_GFX
value: gfx90a
- name: AMDMIGRAPHX_PIPELINE_ID
value: 113
- name: AMDSMI_PIPELINE_ID
value: 99
- name: AOMP_EXTRAS_PIPELINE_ID
value: 111
- name: AOMP_PIPELINE_ID
value: 115
- name: CLR_PIPELINE_ID
value: 145
- name: COMPOSABLE_KERNEL_PIPELINE_ID
value: 86
- name: FLANG_LEGACY_PIPELINE_ID
value: 77
- name: HALF_PIPELINE_ID
value: 101
- name: HALF560_PIPELINE_ID
value: 68
- name: HALF560_BUILD_ID
value: 621
- name: HIP_PIPELINE_ID
value: 93
- name: HIP_TESTS_PIPELINE_ID
value: 233
- name: HIPBLAS_COMMON_PIPELINE_ID
value: 300
- name: HIPBLAS_PIPELINE_ID
value: 87
- name: HIPBLASLT_PIPELINE_ID
value: 301
- name: HIPCUB_PIPELINE_ID
value: 277
- name: HIPFFT_PIPELINE_ID
value: 121
- name: HIPFORT_PIPELINE_ID
value: 102
- name: HIPIFY_PIPELINE_ID
value: 92
- name: HIPRAND_PIPELINE_ID
value: 275
- name: HIPSOLVER_PIPELINE_ID
value: 84
- name: HIPSPARSE_PIPELINE_ID
value: 83
- name: HIPSPARSELT_PIPELINE_ID
value: 309
- name: HIPTENSOR_PIPELINE_ID
value: 105
- name: LLVM_PROJECT_PIPELINE_ID
value: 2
- name: MIOPEN_PIPELINE_ID
value: 108
- name: MIVISIONX_PIPELINE_ID
value: 80
- name: RCCL_PIPELINE_ID
value: 107
- name: RDC_PIPELINE_ID
value: 100
- name: ROCAL_PIPELINE_ID
value: 151
- name: ROCALUTION_PIPELINE_ID
value: 89
- name: ROCBLAS_PIPELINE_ID
value: 302
- name: ROCDBGAPI_PIPELINE_ID
value: 135
- name: ROCDECODE_PIPELINE_ID
value: 79
- name: ROCFFT_PIPELINE_ID
value: 120
- name: ROCGDB_PIPELINE_ID
value: 134
- name: ROCJPEG_PIPELINE_ID
value: 262
- name: ROCM_BANDWIDTH_TEST_PIPELINE_ID
value: 88
- name: ROCM_CMAKE_PIPELINE_ID
value: 6
- name: ROCM_CORE_PIPELINE_ID
value: 103
- name: ROCM_EXAMPLES_PIPELINE_ID
value: 216
- name: ROCM_SMI_LIB_PIPELINE_ID
value: 96
- name: ROCMINFO_PIPELINE_ID
value: 91
- name: ROCMLIR_PIPELINE_ID
value: 229
- name: ROCMVALIDATIONSUITE_PIPELINE_ID
value: 106
- name: ROCPRIM_PIPELINE_ID
value: 273
- name: ROCPROFILER_COMPUTE_PIPELINE_ID
value: 257
- name: ROCPROFILER_REGISTER_PIPELINE_ID
value: 1
- name: ROCPROFILER_SDK_PIPELINE_ID
value: 246
- name: ROCPROFILER_SYSTEMS_PIPELINE_ID
value: 255
- name: ROCPROFILER_PIPELINE_ID
value: 143
- name: ROCPYDECODE_PIPELINE_ID
value: 239
- name: ROCR_DEBUG_AGENT_PIPELINE_ID
value: 136
- name: ROCR_RUNTIME_PIPELINE_ID
value: 10
- name: ROCRAND_PIPELINE_ID
value: 274
- name: ROCSOLVER_PIPELINE_ID
value: 81
- name: ROCSPARSE_PIPELINE_ID
value: 314
- name: ROCTHRUST_PIPELINE_ID
value: 276
- name: ROCTRACER_PIPELINE_ID
value: 141
- name: ROCWMMA_PIPELINE_ID
value: 109
- name: RPP_PIPELINE_ID
value: 78
- name: TRANSFERBENCH_PIPELINE_ID
value: 265

View File

@@ -5,6 +5,7 @@ ACEs
ACS
AccVGPR
AccVGPRs
AITER
ALU
AllReduce
AMD
@@ -45,6 +46,7 @@ Bootloader
CAS
CCD
CDNA
CGUI
CHTML
CIFAR
CLI
@@ -114,12 +116,15 @@ Deprecations
DevCap
DirectX
Dockerfile
Dockerized
Doxygen
dropless
ELMo
ENDPGM
EPYC
ESXi
EoS
fas
FBGEMM
FFT
FFTs
@@ -176,6 +181,7 @@ HBM
HCA
HGX
HIPCC
hipDataType
HIPExtension
HIPIFY
HIPification
@@ -191,6 +197,7 @@ HWE
HWS
Haswell
Higgs
href
Hyperparameters
Huggingface
ICD
@@ -270,6 +277,7 @@ Makefiles
Matplotlib
Matrox
MaxText
Megablocks
Megatrends
Megatron
Mellanox
@@ -279,6 +287,7 @@ Miniconda
MirroredStrategy
Mixtral
MosaicML
MoEs
Mpops
Multicore
Multithreaded
@@ -355,6 +364,7 @@ PowerEdge
PowerShell
Pretrained
Pretraining
Primus
Profiler's
PyPi
Pytest
@@ -408,6 +418,7 @@ SDMA
SDPA
SDRAM
SENDMSG
SGLang
SGPR
SGPRs
SHA
@@ -452,6 +463,8 @@ TPS
TPU
TPUs
TSME
Taichi
Taichi's
Tagram
TensileLite
TensorBoard
@@ -516,6 +529,7 @@ Xilinx
Xnack
Xteam
YAML
YAMLs
YML
YModel
ZeRO
@@ -576,6 +590,7 @@ completers
composable
concretization
config
configs
conformant
constructible
convolutional
@@ -786,7 +801,9 @@ preprocessing
preprocessor
prequantized
prerequisites
pretrain
pretraining
primus
profiler
profilers
protobuf
@@ -863,6 +880,7 @@ seealso
sendmsg
seqs
serializers
sglang
shader
sharding
sigmoid

View File

@@ -4,6 +4,21 @@ This page is a historical overview of changes made to ROCm components. This
consolidated changelog documents key modifications and improvements across
different versions of the ROCm software stack and its components.
## ROCm 6.4.3
See the [ROCm 6.4.3 release notes](https://rocm.docs.amd.com/en/docs-6.4.3/about/release-notes.html)
for a complete overview of this release.
### **ROCm SMI** (7.7.0)
#### Added
- Support for getting the GPU Board voltage.
```{note}
See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-6.4/CHANGELOG.md) for details, examples, and in-depth descriptions.
```
## ROCm 6.4.2
See the [ROCm 6.4.2 release notes](https://rocm.docs.amd.com/en/docs-6.4.2/about/release-notes.html)

View File

@@ -23,9 +23,6 @@ source software compilers, debuggers, and libraries. ROCm is fully integrated in
> A new open source build platform for ROCm is under development at
> https://github.com/ROCm/TheRock, featuring a unified CMake build with bundled
> dependencies, Windows support, and more.
>
> The instructions below describe the prior process for building from source
> which will be replaced once TheRock is mature enough.
## Getting and Building ROCm from Source

View File

@@ -10,7 +10,7 @@
<!-- markdownlint-disable reference-links-images -->
<!-- markdownlint-disable no-missing-space-atx -->
<!-- spellcheck-disable -->
# ROCm 6.4.2 release notes
# ROCm 6.4.3 release notes
The release notes provide a summary of notable changes since the previous ROCm release.
@@ -24,8 +24,6 @@ The release notes provide a summary of notable changes since the previous ROCm r
- [ROCm known issues](#rocm-known-issues)
- [ROCm resolved issues](#rocm-resolved-issues)
- [ROCm upcoming changes](#rocm-upcoming-changes)
```{note}
@@ -35,70 +33,40 @@ documentation to verify compatibility and system requirements.
## Release highlights
The following are notable new features and improvements in ROCm 6.4.2. For changes to individual components, see
[Detailed component changes](#detailed-component-changes).
ROCm 6.4.3 is a quality release that resolves the following issues. For changes to individual components, see [Detailed component changes](#detailed-component-changes).
### ROCm Compute Profiler enhancements
### AMDGPU driver updates
[ROCm Compute Profiler](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/index.html) includes the following changes:
* Resolved an issue causing performance degradation in communication operations, caused by increased latency in certain RCCL applications. The fix prevents unnecessary queue eviction during the fork process.
* Fixed an issue in the AMDGPU drivers scheduler constraints that could cause queue preemption to fail during workload execution.
* The ``--roofline-data-type`` option now supports FP8, FP16, BF16, FP32, FP64, I8, I32, and I64 data types. This is dependent on the GPU architecture. For more information, see [Roofline options](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/docs-6.4.2/how-to/profile/mode.html#roofline-options).
* ROCm Compute Profiler now uses [AMD SMI](https://rocm.docs.amd.com/projects/amdsmi/en/latest/index.html) instead of [ROCm SMI](https://rocm.docs.amd.com/projects/rocm_smi_lib/en/latest/index.html). The AMD System Management Interface Library (AMD SMI) is a successor to ROCm SMI. It is a unified system management interface tool that provides a user-space interface for applications to monitor and control GPU applications and gives users the ability to query information about drivers and GPUs on the system. For more information, see [https://github.com/ROCm/amdsmi](https://github.com/ROCm/amdsmi) and the [AMD SMI documentation](https://rocm.docs.amd.com/projects/amdsmi/en/latest/index.html).
* ROCm Compute Profiler has added 8-bit floating point (FP8) metrics support for AMD Instinct MI300 series accelerators. For more information, see [System Speed-of-Light](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/docs-6.4.2/conceptual/system-speed-of-light.html).
### rocSOLVER enhancements
rocSOLVER has improved the performance of eigensolvers and singular value decomposition (SVD). For more information, see [rocSOLVER documentation](https://rocm.docs.amd.com/projects/rocSOLVER/en/docs-6.4.2/index.html).
### ROCm Offline Installer Creator updates
The ROCm Offline Installer Creator 6.4.2 includes the following features and improvements:
* Added support for Oracle Linux 8.10 and 9.6, and SLES 15 SP7.
* Additional package options for the Offline Installer Creator, including `amd-smi`, `rocdecode`, `rocjpeg`, and `rdc`.
* ROCm meta packages are now used for selecting ROCm components and use cases.
* Improved separation of kernel/driver and ROCm prerequisite packages to reduce the size of ROCm-only or driver-only offline installers.
In addition, the option to build an offline installer based on ROCm version 5.7.3 has been removed. To build an offline installer for ROCm 5.7.3, use the Offline Installer Creator from version 6.4.1 or earlier. See [ROCm Offline Installer Creator](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.2/install/rocm-offline-installer.html) for more information.
### ROCm Runfile Installer updates
The ROCm Runfile Installer 6.4.2 adds support for Oracle Linux 8.10 and 9.6 (using the RHEL 8 or 9 .run files), Debian 12 (using the Ubuntu 22.04 .run file), and SLES 15 SP7. It also fixes permission settings issues during ROCm and AMDGPU driver installation. For more information, see [ROCm Runfile Installer](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.2/install/rocm-runfile-installer.html).
### ROCm SMI update
* Fixed the failure to load GPU data like System Clock (SCLK) by adjusting the logic for retrieving GPU board voltage.
### ROCm documentation updates
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
* [Tutorials for AI developers](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/) have been expanded with the following four new tutorials:
* Inference tutorial: [AI agent with MCPs using vLLM and PydanticAI](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/build_airbnb_agent_mcp.html)
* GPU development and optimization tutorials:
* [Kernel development and optimization with Triton](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/gpu_dev_optimize/triton_kernel_dev.html)
* [Profiling Llama-4 inference with vLLM](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/gpu_dev_optimize/llama4_profiling_vllm.html)
* [FP8 quantization with AMD Quark for vLLM](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/gpu_dev_optimize/fp8_quantization_quark_vllm.html)
* [Tutorials for AI developers](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/) have been expanded with the following five new tutorials:
* Inference tutorials
* [ChatQnA vLLM deployment and performance evaluation](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/opea_deployment_and_evaluation.html)
* [Text-to-video generation with ComfyUI](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/t2v_comfyui_radeon.html)
* [DeepSeek Janus Pro on CPU or GPU](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/deepseek_janus_cpu_gpu.html)
* [DeepSeek-R1 with vLLM V1](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/vllm_v1_DSR1.html)
* GPU development and optimization tutorial: [MLA decoding kernel of AITER library](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/gpu_dev_optimize/aiter_mla_decode_kernel.html)
For more information about the changes, see [Changelog for the AI Developer Hub](https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/changelog.html).
* ROCm provides a comprehensive ecosystem for deep learning development. For more details, see [Deep learning frameworks for ROCm](https://rocm.docs.amd.com/en/docs-6.4.2/how-to/deep-learning-rocm.html). As of July 2025, AMD ROCm provides support for the following additional deep learning frameworks:
* ROCm provides a comprehensive ecosystem for deep learning development. For more details, see [Deep learning frameworks for ROCm](https://rocm.docs.amd.com/en/docs-6.4.3/how-to/deep-learning-rocm.html). AMD ROCm adds support for the following deep learning frameworks:
* Deep Graph Library is an easy-to-use, high-performance, and scalable Python package for deep learning on graphs. DGL is framework agnostic, meaning if a deep graph model is a component in an end-to-end application, the rest of the logic is implemented using PyTorch. It is currently supported on ROCm 6.4.0. For more information, see [DGL compatibility](https://rocm.docs.amd.com/en/docs-6.4.2/compatibility/ml-compatibility/dgl-compatibility.html).
* Stanford Megatron-LM is a large-scale language model training framework. Its designed to train massive transformer-based language models efficiently by model and data parallelism. It is currently supported on ROCm 6.3.0. For more information, see [Stanford Megatron-LM compatibility](https://rocm.docs.amd.com/en/docs-6.4.2/compatibility/ml-compatibility/stanford-megatron-lm-compatibility.html).
* Volcano Engine Reinforcement Learning for LLMs (verl) is a reinforcement learning framework designed for large language models (LLMs). verl offers a scalable, open-source fine-tuning solution optimized for AMD Instinct GPUs with full ROCm support. It is currently supported on ROCm 6.2.0. For more information, see [verl compatibility](https://rocm.docs.amd.com/en/docs-6.4.2/compatibility/ml-compatibility/verl-compatibility.html).
* Taichi is an open-source, imperative, and parallel programming language designed for high-performance numerical computation. Embedded in Python, it leverages just-in-time (JIT) compilation frameworks such as LLVM to accelerate compute-intensive Python code by compiling it to native GPU or CPU instructions. It is currently supported on ROCm 6.3.2. For more information, see [Taichi compatibility](https://rocm.docs.amd.com/en/docs-6.4.3/compatibility/ml-compatibility/taichi-compatibility.html).
* Megablocks is a light-weight library for mixture-of-experts (MoE) training. The core of the system is efficient "dropless-MoE" and standard MoE layers. Megablocks is integrated with Megatron-LM, where data and pipeline parallel training of MoEs is supported. It is currently supported on ROCm 6.3.0. For more information, see [Megablocks compatibility](https://rocm.docs.amd.com/en/docs-6.4.3/compatibility/ml-compatibility/megablocks-compatibility.html).
* Documentation for the new [ROCprof Compute Viewer](https://rocm.docs.amd.com/projects/rocprof-compute-viewer/en/docs-6.4.2/) was added in May 2025. This tool is used to visualize and analyze GPU thread trace data collected using [rocprofv3](https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/index.html). Note that [ROCprof Compute Viewer](https://rocm.docs.amd.com/projects/rocprof-compute-viewer/en/docs-6.4.2/) is in an early access state. Running production workloads is not recommended.
* The AMDGPU installer documentation has been removed to encourage the use of the package manager for ROCm installation. While the package manager is the recommended method, you can still install ROCm using the AMDGPU installer by following the [legacy process](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.1/install/install-methods/amdgpu-installer-index.html). Ensure to update the command with the intended ROCm version before running it. For more information, see [Installation via native package manager](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.2/install/install-methods/package-manager-index.html).
* The [Data types and precision support](https://rocm.docs.amd.com/en/latest/reference/precision-support.html) topic now includes new hardware and library support information.
## Operating system and hardware support changes
ROCm 6.4.2 adds support for SLES 15 SP7. For more information, see [SLES installation](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.2/install/install-methods/package-manager/package-manager-sles.html).
ROCm 6.4.2 marks the end of support (EoS) for RHEL 9.5.
ROCm 6.4.2 adds support for RDNA3 architecture-based [Radeon RX 7700 XT](https://www.amd.com/en/products/graphics/desktops/radeon/7000-series/amd-radeon-rx-7700-xt.html) GPU. This GPU is supported on Ubuntu 24.04.2 and RHEL 9.6.
For details, see the full list of [Supported GPUs
(Linux)](https://rocm.docs.amd.com/projects/install-on-linux/en/docs-6.4.2/reference/system-requirements.html#supported-gpus).
Operating system and hardware support remain unchanged in this release.
See the [Compatibility
matrix](../../docs/compatibility/compatibility-matrix.rst)
@@ -106,8 +74,7 @@ for more information about operating system and hardware compatibility.
## ROCm components
The following table lists the versions of ROCm components for ROCm 6.4.2, including any version
changes from 6.4.1 to 6.4.2. Click the component's updated version to go to a list of its changes.
The following table lists the versions of ROCm components for ROCm 6.4.3.
Click {fab}`github` to go to the component's source code on GitHub.
<div class="pst-scrollable-table-container">
@@ -129,47 +96,47 @@ Click {fab}`github` to go to the component's source code on GitHub.
<tr>
<th rowspan="9">Libraries</th>
<th rowspan="9">Machine learning and computer vision</th>
<td><a href="https://rocm.docs.amd.com/projects/composable_kernel/en/docs-6.4.2/index.html">Composable Kernel</a></td>
<td><a href="https://rocm.docs.amd.com/projects/composable_kernel/en/docs-6.4.3/index.html">Composable Kernel</a></td>
<td>1.1.0</td>
<td><a href="https://github.com/ROCm/composable_kernel"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/AMDMIGraphX/en/docs-6.4.2/index.html">MIGraphX</a></td>
<td><a href="https://rocm.docs.amd.com/projects/AMDMIGraphX/en/docs-6.4.3/index.html">MIGraphX</a></td>
<td>2.12.0</td>
<td><a href="https://github.com/ROCm/AMDMIGraphX"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/MIOpen/en/docs-6.4.2/index.html">MIOpen</a></td>
<td><a href="https://rocm.docs.amd.com/projects/MIOpen/en/docs-6.4.3/index.html">MIOpen</a></td>
<td>3.4.0</td>
<td><a href="https://github.com/ROCm/MIOpen"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/MIVisionX/en/docs-6.4.2/index.html">MIVisionX</a></td>
<td><a href="https://rocm.docs.amd.com/projects/MIVisionX/en/docs-6.4.3/index.html">MIVisionX</a></td>
<td>3.2.0</td>
<td><a href="https://github.com/ROCm/MIVisionX"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocAL/en/docs-6.4.2/index.html">rocAL</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocAL/en/docs-6.4.3/index.html">rocAL</a></td>
<td>2.2.0</td>
<td><a href="https://github.com/ROCm/rocAL"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocDecode/en/docs-6.4.2/index.html">rocDecode</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocDecode/en/docs-6.4.3/index.html">rocDecode</a></td>
<td>0.10.0</td>
<td><a href="https://github.com/ROCm/rocDecode"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocJPEG/en/docs-6.4.2/index.html">rocJPEG</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocJPEG/en/docs-6.4.3/index.html">rocJPEG</a></td>
<td>0.8.0</td>
<td><a href="https://github.com/ROCm/rocJPEG"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocPyDecode/en/docs-6.4.2/index.html">rocPyDecode</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocPyDecode/en/docs-6.4.3/index.html">rocPyDecode</a></td>
<td>0.3.1</td>
<td><a href="https://github.com/ROCm/rocPyDecode"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rpp/en/docs-6.4.2/index.html">RPP</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rpp/en/docs-6.4.3/index.html">RPP</a></td>
<td>1.9.10</td>
<td><a href="https://github.com/ROCm/rpp"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
@@ -178,13 +145,13 @@ Click {fab}`github` to go to the component's source code on GitHub.
<tr>
<th rowspan="2"></th>
<th rowspan="2">Communication</th>
<td><a href="https://rocm.docs.amd.com/projects/rccl/en/docs-6.4.2/index.html">RCCL</a></td>
<td>2.22.3&nbsp;&Rightarrow;&nbsp;<a href="#rccl-2-22-3">2.22.3</td>
<td><a href="https://rocm.docs.amd.com/projects/rccl/en/docs-6.4.3/index.html">RCCL</a></td>
<td>2.22.3</td>
<td><a href="https://github.com/ROCm/rccl"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocSHMEM/en/docs-6.4.2/index.html">rocSHMEM</a></td>
<td>2.0.0&nbsp;&Rightarrow;&nbsp;<a href="#rocshmem-2-0-1">2.0.1</td>
<td><a href="https://rocm.docs.amd.com/projects/rocSHMEM/en/docs-6.4.3/index.html">rocSHMEM</a></td>
<td>2.0.1</td>
<td><a href="https://github.com/ROCm/rocSHMEM"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
</tbody>
@@ -192,82 +159,82 @@ Click {fab}`github` to go to the component's source code on GitHub.
<tr>
<th rowspan="16"></th>
<th rowspan="16">Math</th>
<td><a href="https://rocm.docs.amd.com/projects/hipBLAS/en/docs-6.4.2/index.html">hipBLAS</a></td>
<td><a href="https://rocm.docs.amd.com/projects/hipBLAS/en/docs-6.4.3/index.html">hipBLAS</a></td>
<td>2.4.0</td>
<td><a href="https://github.com/ROCm/hipBLAS"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/hipBLASLt/en/docs-6.4.2/index.html">hipBLASLt</a></td>
<td>0.12.1&nbsp;&Rightarrow;&nbsp;<a href="#hipblaslt-0-12-1">0.12.1</td>
<td><a href="https://rocm.docs.amd.com/projects/hipBLASLt/en/docs-6.4.3/index.html">hipBLASLt</a></td>
<td>0.12.1</td>
<td><a href="https://github.com/ROCm/hipBLASLt"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/hipFFT/en/docs-6.4.2/index.html">hipFFT</a></td>
<td><a href="https://rocm.docs.amd.com/projects/hipFFT/en/docs-6.4.3/index.html">hipFFT</a></td>
<td>1.0.18</td>
<td><a href="https://github.com/ROCm/hipFFT"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/hipfort/en/docs-6.4.2/index.html">hipfort</a></td>
<td><a href="https://rocm.docs.amd.com/projects/hipfort/en/docs-6.4.3/index.html">hipfort</a></td>
<td>0.6.0</td>
<td><a href="https://github.com/ROCm/hipfort"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/hipRAND/en/docs-6.4.2/index.html">hipRAND</a></td>
<td><a href="https://rocm.docs.amd.com/projects/hipRAND/en/docs-6.4.3/index.html">hipRAND</a></td>
<td>2.12.0</td>
<td><a href="https://github.com/ROCm/hipRAND"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/hipSOLVER/en/docs-6.4.2/index.html">hipSOLVER</a></td>
<td><a href="https://rocm.docs.amd.com/projects/hipSOLVER/en/docs-6.4.3/index.html">hipSOLVER</a></td>
<td>2.4.0</td>
<td><a href="https://github.com/ROCm/hipSOLVER"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/hipSPARSE/en/docs-6.4.2/index.html">hipSPARSE</a></td>
<td><a href="https://rocm.docs.amd.com/projects/hipSPARSE/en/docs-6.4.3/index.html">hipSPARSE</a></td>
<td>3.2.0</td>
<td><a href="https://github.com/ROCm/hipSPARSE"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/hipSPARSELt/en/docs-6.4.2/index.html">hipSPARSELt</a></td>
<td><a href="https://rocm.docs.amd.com/projects/hipSPARSELt/en/docs-6.4.3/index.html">hipSPARSELt</a></td>
<td>0.2.3</td>
<td><a href="https://github.com/ROCm/hipSPARSELt"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocALUTION/en/docs-6.4.2/index.html">rocALUTION</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocALUTION/en/docs-6.4.3/index.html">rocALUTION</a></td>
<td>3.2.3</td>
<td><a href="https://github.com/ROCm/rocALUTION"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocBLAS/en/docs-6.4.2/index.html">rocBLAS</a></td>
<td>4.4.0&nbsp;&Rightarrow;&nbsp;<a href="#rocblas-4-4-1">4.4.1</td></td>
<td><a href="https://rocm.docs.amd.com/projects/rocBLAS/en/docs-6.4.3/index.html">rocBLAS</a></td>
<td>4.4.1</td></td>
<td><a href="https://github.com/ROCm/rocBLAS"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocFFT/en/docs-6.4.2/index.html">rocFFT</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocFFT/en/docs-6.4.3/index.html">rocFFT</a></td>
<td>1.0.32</td>
<td><a href="https://github.com/ROCm/rocFFT"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocRAND/en/docs-6.4.2/index.html">rocRAND</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocRAND/en/docs-6.4.3/index.html">rocRAND</a></td>
<td>3.3.0</td>
<td><a href="https://github.com/ROCm/rocRAND"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocSOLVER/en/docs-6.4.2/index.html">rocSOLVER</a></td>
<td>3.28.0&nbsp;&Rightarrow;&nbsp;<a href="#rocsolver-3-28-2">3.28.2</td>
<td><a href="https://rocm.docs.amd.com/projects/rocSOLVER/en/docs-6.4.3/index.html">rocSOLVER</a></td>
<td>3.28.2</td>
<td><a href="https://github.com/ROCm/rocSOLVER"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocSPARSE/en/docs-6.4.2/index.html">rocSPARSE</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocSPARSE/en/docs-6.4.3/index.html">rocSPARSE</a></td>
<td>3.4.0</td>
<td><a href="https://github.com/ROCm/rocSPARSE"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocWMMA/en/docs-6.4.2/index.html">rocWMMA</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocWMMA/en/docs-6.4.3/index.html">rocWMMA</a></td>
<td>1.7.0</td>
<td><a href="https://github.com/ROCm/rocWMMA"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/Tensile/en/docs-6.4.2/src/index.html">Tensile</a></td>
<td><a href="https://rocm.docs.amd.com/projects/Tensile/en/docs-6.4.3/src/index.html">Tensile</a></td>
<td>4.43.0</td>
<td><a href="https://github.com/ROCm/Tensile"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
@@ -276,22 +243,22 @@ Click {fab}`github` to go to the component's source code on GitHub.
<tr>
<th rowspan="4"></th>
<th rowspan="4">Primitives</th>
<td><a href="https://rocm.docs.amd.com/projects/hipCUB/en/docs-6.4.2/index.html">hipCUB</a></td>
<td><a href="https://rocm.docs.amd.com/projects/hipCUB/en/docs-6.4.3/index.html">hipCUB</a></td>
<td>3.4.0</td>
<td><a href="https://github.com/ROCm/hipCUB"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/hipTensor/en/docs-6.4.2/index.html">hipTensor</a></td>
<td><a href="https://rocm.docs.amd.com/projects/hipTensor/en/docs-6.4.3/index.html">hipTensor</a></td>
<td>1.5.0</td>
<td><a href="https://github.com/ROCm/hipTensor"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocPRIM/en/docs-6.4.2/index.html">rocPRIM</a></td>
<td>3.4.0&nbsp;&Rightarrow;&nbsp;<a href="#rocprim-3-4-1">3.4.1</td>
<td><a href="https://rocm.docs.amd.com/projects/rocPRIM/en/docs-6.4.3/index.html">rocPRIM</a></td>
<td>3.4.1</td>
<td><a href="https://github.com/ROCm/rocPRIM"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocThrust/en/docs-6.4.2/index.html">rocThrust</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocThrust/en/docs-6.4.3/index.html">rocThrust</a></td>
<td>3.3.0</td>
<td><a href="https://github.com/ROCm/rocThrust"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
@@ -300,28 +267,28 @@ Click {fab}`github` to go to the component's source code on GitHub.
<tr>
<th rowspan="7">Tools</th>
<th rowspan="7">System management</th>
<td><a href="https://rocm.docs.amd.com/projects/amdsmi/en/docs-6.4.2/index.html">AMD SMI</a></td>
<td>25.4.2&nbsp;&Rightarrow;&nbsp;<a href="#amd-smi-25-5-1">25.5.1</a></td>
<td><a href="https://rocm.docs.amd.com/projects/amdsmi/en/docs-6.4.3/index.html">AMD SMI</a></td>
<td>25.5.1</a></td>
<td><a href="https://github.com/ROCm/amdsmi"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rdc/en/docs-6.4.2/index.html">ROCm Data Center Tool</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rdc/en/docs-6.4.3/index.html">ROCm Data Center Tool</a></td>
<td>0.3.0</td>
<td><a href="https://github.com/ROCm/rdc"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocminfo/en/docs-6.4.2/index.html">rocminfo</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocminfo/en/docs-6.4.3/index.html">rocminfo</a></td>
<td>1.0.0</td>
<td><a href="https://github.com/ROCm/rocminfo"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocm_smi_lib/en/docs-6.4.2/index.html">ROCm SMI</a></td>
<td>7.5.0</td>
<td><a href="https://rocm.docs.amd.com/projects/rocm_smi_lib/en/docs-6.4.3/index.html">ROCm SMI</a></td>
<td>7.5.0&nbsp;&Rightarrow;&nbsp;<a href="#rocm-smi-7-7-0">7.7.0</td>
<td><a href="https://github.com/ROCm/rocm_smi_lib"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/docs-6.4.2/index.html">ROCm Validation Suite</a></td>
<td>1.1.0&nbsp;&Rightarrow;&nbsp;<a href="#rocm-validation-suite-1-1-0">1.1.0</td>
<td><a href="https://rocm.docs.amd.com/projects/ROCmValidationSuite/en/docs-6.4.3/index.html">ROCm Validation Suite</a></td>
<td>1.1.0</td>
<td><a href="https://github.com/ROCm/ROCmValidationSuite"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
</tbody>
@@ -329,38 +296,38 @@ Click {fab}`github` to go to the component's source code on GitHub.
<tr>
<th rowspan="6"></th>
<th rowspan="6">Performance</th>
<td><a href="https://rocm.docs.amd.com/projects/rocm_bandwidth_test/en/docs-6.4.2/index.html">ROCm Bandwidth
<td><a href="https://rocm.docs.amd.com/projects/rocm_bandwidth_test/en/docs-6.4.3/index.html">ROCm Bandwidth
Test</a></td>
<td>1.4.0</td>
<td><a href="https://github.com/ROCm/rocm_bandwidth_test/"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocprofiler-compute/en/docs-6.4.2/index.html">ROCm Compute Profiler</a></td>
<td>3.1.0&nbsp;&Rightarrow;&nbsp;<a href="#rocm-compute-profiler-3-1-1">3.1.1</td>
<td><a href="https://rocm.docs.amd.com/projects/rocprofiler-compute/en/docs-6.4.3/index.html">ROCm Compute Profiler</a></td>
<td>3.1.1</td>
<td><a href="https://github.com/ROCm/rocprofiler-compute"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocprofiler-systems/en/docs-6.4.2/index.html">ROCm Systems Profiler</a></td>
<td>1.0.1&nbsp;&Rightarrow;&nbsp;<a href="#rocm-systems-profiler-1-0-2">1.0.2</td>
<td><a href="https://rocm.docs.amd.com/projects/rocprofiler-systems/en/docs-6.4.3/index.html">ROCm Systems Profiler</a></td>
<td>1.0.2</td>
<td><a href="https://github.com/ROCm/rocprofiler-systems"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocprofiler/en/docs-6.4.2/index.html">ROCProfiler</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocprofiler/en/docs-6.4.3/index.html">ROCProfiler</a></td>
<td>2.0.0</td>
<td><a href="https://github.com/ROCm/ROCProfiler/"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/docs-6.4.2/index.html">ROCprofiler-SDK</a></td>
<td><a href="https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/docs-6.4.3/index.html">ROCprofiler-SDK</a></td>
<td>0.6.0</td>
<td><a href="https://github.com/ROCm/rocprofiler-sdk/"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr >
<td><a href="https://rocm.docs.amd.com/projects/roctracer/en/docs-6.4.2/index.html">ROCTracer</a></td>
<td><a href="https://rocm.docs.amd.com/projects/roctracer/en/docs-6.4.3/index.html">ROCTracer</a></td>
<td>4.1.0</td>
<td><a href="https://github.com/ROCm/ROCTracer/"><i
class="fab fa-github fa-lg"></i></a></td>
@@ -370,32 +337,32 @@ Click {fab}`github` to go to the component's source code on GitHub.
<tr>
<th rowspan="5"></th>
<th rowspan="5">Development</th>
<td><a href="https://rocm.docs.amd.com/projects/HIPIFY/en/docs-6.4.2/index.html">HIPIFY</a></td>
<td><a href="https://rocm.docs.amd.com/projects/HIPIFY/en/docs-6.4.3/index.html">HIPIFY</a></td>
<td>19.0.0</td>
<td><a href="https://github.com/ROCm/HIPIFY/"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/ROCdbgapi/en/docs-6.4.2/index.html">ROCdbgapi</a></td>
<td><a href="https://rocm.docs.amd.com/projects/ROCdbgapi/en/docs-6.4.3/index.html">ROCdbgapi</a></td>
<td>0.77.2</td>
<td><a href="https://github.com/ROCm/ROCdbgapi/"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/ROCmCMakeBuildTools/en/docs-6.4.2/index.html">ROCm CMake</a></td>
<td><a href="https://rocm.docs.amd.com/projects/ROCmCMakeBuildTools/en/docs-6.4.3/index.html">ROCm CMake</a></td>
<td>0.14.0</td>
<td><a href="https://github.com/ROCm/rocm-cmake/"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/ROCgdb/en/docs-6.4.2/index.html">ROCm Debugger (ROCgdb)</a>
<td><a href="https://rocm.docs.amd.com/projects/ROCgdb/en/docs-6.4.3/index.html">ROCm Debugger (ROCgdb)</a>
</td>
<td>15.2</td>
<td><a href="https://github.com/ROCm/ROCgdb/"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/rocr_debug_agent/en/docs-6.4.2/index.html">ROCr Debug Agent</a>
<td><a href="https://rocm.docs.amd.com/projects/rocr_debug_agent/en/docs-6.4.3/index.html">ROCr Debug Agent</a>
</td>
<td>2.0.4</td>
<td><a href="https://github.com/ROCm/rocr_debug_agent/"><i
@@ -405,13 +372,13 @@ Click {fab}`github` to go to the component's source code on GitHub.
<tbody class="rocm-components-compilers tbody-reverse-zebra">
<tr>
<th rowspan="2" colspan="2">Compilers</th>
<td><a href="https://rocm.docs.amd.com/projects/HIPCC/en/docs-6.4.2/index.html">HIPCC</a></td>
<td><a href="https://rocm.docs.amd.com/projects/HIPCC/en/docs-6.4.3/index.html">HIPCC</a></td>
<td>1.1.1</td>
<td><a href="https://github.com/ROCm/llvm-project/tree/amd-staging/amd/hipcc"><i
class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/llvm-project/en/docs-6.4.2/index.html">llvm-project</a></td>
<td><a href="https://rocm.docs.amd.com/projects/llvm-project/en/docs-6.4.3/index.html">llvm-project</a></td>
<td>19.0.0</td>
<td><a href="https://github.com/ROCm/llvm-project/"><i
class="fab fa-github fa-lg"></i></a></td>
@@ -420,12 +387,12 @@ Click {fab}`github` to go to the component's source code on GitHub.
<tbody class="rocm-components-runtimes tbody-reverse-zebra">
<tr>
<th rowspan="2" colspan="2">Runtimes</th>
<td><a href="https://rocm.docs.amd.com/projects/HIP/en/docs-6.4.2/index.html">HIP</a></td>
<td>6.4.1&nbsp;&Rightarrow;&nbsp;<a href="#hip-6-4-2">6.4.2</td>
<td><a href="https://rocm.docs.amd.com/projects/HIP/en/docs-6.4.3/index.html">HIP</a></td>
<td>6.4.3</td>
<td><a href="https://github.com/ROCm/HIP/"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
<tr>
<td><a href="https://rocm.docs.amd.com/projects/ROCR-Runtime/en/docs-6.4.2/index.html">ROCr Runtime</a></td>
<td><a href="https://rocm.docs.amd.com/projects/ROCR-Runtime/en/docs-6.4.3/index.html">ROCr Runtime</a></td>
<td>1.15.0</td>
<td><a href="https://github.com/ROCm/ROCR-Runtime/"><i class="fab fa-github fa-lg"></i></a></td>
</tr>
@@ -441,179 +408,21 @@ The following sections describe key changes to ROCm components.
For a historical overview of ROCm component updates, see the {doc}`ROCm consolidated changelog </release/changelog>`.
```
### **AMD SMI** (25.5.1)
### **ROCm SMI** (7.7.0)
#### Added
- Compute Unit Occupancy information per process.
- Support for getting the GPU Board voltage.
- New firmware PLDM_BUNDLE. `amd-smi firmware` can now show the PLDM Bundle on supported systems.
- `amd-smi ras --afid --cper-file <file_path>` to decode CPER records.
#### Changed
- Padded `asic_serial` in `amdsmi_get_asic_info` with 0s.
- Renamed field `COMPUTE_PARTITION` to `ACCELERATOR_PARTITION` in CLI call `amd-smi --partition`.
#### Resolved issues
- Corrected VRAM memory calculation in `amdsmi_get_gpu_process_list`. Previously, the VRAM memory usage reported by `amdsmi_get_gpu_process_list` was inaccurate and was calculated using KB instead of KiB.
```{note}
See the full [AMD SMI changelog](https://github.com/ROCm/amdsmi/blob/release/rocm-rel-6.4/CHANGELOG.md) for details, examples, and in-depth descriptions.
See the full [ROCm SMI changelog](https://github.com/ROCm/rocm_smi_lib/blob/release/rocm-rel-6.4/CHANGELOG.md) for details, examples, and in-depth descriptions.
```
### **HIP** (6.4.2)
#### Added
* HIP API implementation for `hipEventRecordWithFlags`, records an event in the specified stream with flags.
* Support for the pointer attribute `HIP_POINTER_ATTRIBUTE_CONTEXT`.
* Support for the flags `hipEventWaitDefault` and `hipEventWaitExternal`.
#### Optimized
* Improved implementation in `hipEventSynchronize`, HIP runtime now makes internal callbacks as non-blocking operations to improve performance.
#### Resolved issues
* Issue of dependency on `libgcc-s1` during rocm-dev install on Debian Buster. HIP runtime removed this Debian package dependency, and uses `libgcc1` instead for this distros.
* Building issue for `COMGR` dynamic load on Fedora and other Distros. HIP runtime now doesn't link against `libamd_comgr.so`.
* Failure in the API `hipStreamDestroy`, when stream type is `hipStreamLegacy`. The API now returns error code `hipErrorInvalidResourceHandle` on this condition.
* Kernel launch errors, such as `shared object initialization failed`, `invalid device function` or `kernel execution failure`. HIP runtime now loads `COMGR` properly considering the file with its name and mapped image.
* Memory access fault in some applications. HIP runtime fixed offset accumulation in memory address.
* The memory leak in virtual memory management (VMM). HIP runtime now uses the size of handle for allocated memory range instead of actual size for physical memory, which fixed the issue of address clash with VMM.
* Large memory allocation issue. HIP runtime now checks GPU video RAM and system RAM properly and sets size limits during memory allocation either on the host or the GPU device.
* Support of `hipDeviceMallocContiguous` flags in `hipExtMallocWithFlags()`. It now enables `HSA_AMD_MEMORY_POOL_CONTIGUOUS_FLAG` in the memory pool allocation on GPU device.
* Radom memory segmentation fault in handling `GraphExec` object release and `hipDeviceSyncronization`. HIP runtime now uses internal device synchronize function in `__hipUnregisterFatBinary`.
### **hipBLASLt** (0.12.1)
#### Added
* Support for gfx1151 on Linux, complementing the previous support in the HIP SDK for Windows.
### **RCCL** (2.22.3)
#### Added
* Added support for the LL128 protocol on gfx942.
### **rocBLAS** (4.4.1)
#### Resolved issues
* rocBLAS might have failed to produce correct results for cherk/zherk on gfx90a/gfx942 with problem sizes k > 500 due to the imaginary portion on the C matrix diagonal not being zeros. rocBLAS now zeros the imaginary portion.
### **ROCm Compute Profiler** (3.1.1)
#### Added
* 8-bit floating point (FP8) metrics support for AMD Instinct MI300 GPUs.
* Additional data types for roofline: FP8, FP16, BF16, FP32, FP64, I8, I32, I64 (dependent on the GPU architecture).
* Data type selection option ``--roofline-data-type / -R`` for roofline profiling. The default data type is FP32.
#### Changed
* Changed dependency from `rocm-smi` to `amd-smi`.
#### Resolved issues
* Fixed a crash related to Agent ID caused by the new format of the `rocprofv3` output CSV file.
### **ROCm Systems Profiler** (1.0.2)
#### Optimized
* Improved readability of the OpenMP target offload traces by showing on a single Perfetto track.
#### Resolved issues
* Fixed the file path to the script that merges Perfetto files from multi-process MPI runs. The script has also been renamed from `merge-multiprocess-output.sh` to `rocprof-sys-merge-output.sh`.
### **ROCm Validation Suite** (1.1.0)
#### Added
* NPS2/DPX and NPS4/CPX partition modes support for AMD Instinct MI300X.
### **rocPRIM** (3.4.1)
#### Upcoming changes
* Changes to the template parameters of warp and block algorithms will be made in an upcoming release.
* Due to an upcoming compiler change, the following symbols related to warp size have been marked as deprecated and will be removed in an upcoming major release:
* `rocprim::device_warp_size()`. This has been replaced by `rocprim::arch::wavefront::min_size()` and `rocprim::arch::wavefront::max_size()` for compile-time constants. Use these when allocating global or shared memory. For run-time constants, use `rocprim::arch::wavefront::size()`.
* `rocprim::warp_size()`
* `ROCPRIM_WAVEFRONT_SIZE`
* The default scan accumulator types for device-level scan algorithms will be changed in an upcoming release, resulting in a breaking change. Previously, the default accumulator type was set to the input type for the inclusive scans and to the initial value type for the exclusive scans. This could lead to unexpected overflow if the input or initial type was smaller than the output type when the accumulator type wasn't explicitly set using the `AccType` template parameter. The new default accumulator types will be set to the type that results when the input or initial value type is applied to the scan operator.
The following is the complete list of affected functions and how their default accumulator types are changing:
* `rocprim::inclusive_scan`
* current default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>`
* future default: `class AccType = rocprim::invoke_result_binary_op_t<typename std::iterator_traits<InputIterator>::value_type, BinaryFunction>`
* `rocprim::deterministic_inclusive_scan`
* current default: `class AccType = typename std::iterator_traits<InputIterator>::value_type>`
* future default: `class AccType = rocprim::invoke_result_binary_op_t<typename std::iterator_traits<InputIterator>::value_type, BinaryFunction>`
* `rocprim::exclusive_scan`
* current default: `class AccType = detail::input_type_t<InitValueType>>`
* future default: `class AccType = rocprim::invoke_result_binary_op_t<rocprim::detail::input_type_t<InitValueType>, BinaryFunction>`
* `rocprim::deterministic_exclusive_scan`
* current default: `class AccType = detail::input_type_t<InitValueType>>`
* future default: `class AccType = rocprim::invoke_result_binary_op_t<rocprim::detail::input_type_t<InitValueType>, BinaryFunction>`
* `rocprim::load_cs` and `rocprim::store_cs` are deprecated and will be removed in an upcoming release. Alternatively, you can use `rocprim::load_nontemporal` and `rocprim::store_nontemporal` to load and store values in specific conditions (like bypassing the cache) for `rocprim::thread_load` and `rocprim::thread_store`.
### **rocSHMEM** (2.0.1)
#### Resolved issues
* Incorrect output for `rocshmem_ctx_my_pe` and `rocshmem_ctx_n_pes`.
* Multi-team errors by providing team specific buffers in `rocshmem_ctx_wg_team_sync`.
* Missing implementation of `rocshmem_g` for IPC conduit.
### **rocSOLVER** (3.28.2)
#### Added
* Hybrid computation support for existing routines, such as STERF.
* SVD for general matrices based on Cuppen's Divide and Conquer algorithm:
- GESDD (with batched and strided\_batched versions)
#### Optimized
* Reduced the device memory requirements for STEDC, SYEVD/HEEVD, and SYGVD/HEGVD.
* Improved the performance of STEDC and divide and conquer Eigensolvers.
* Improved the performance of SYTRD, the initial step of the Eigensolvers that start with the tridiagonalization of the input matrix.
## ROCm known issues
ROCm known issues are noted on {fab}`github` [GitHub](https://github.com/ROCm/ROCm/labels/Verified%20Issue). For known
issues related to individual components, review the [Detailed component changes](#detailed-component-changes).
## ROCm resolved issues
The following are previously known issues resolved in this release. For resolved issues related to
individual components, review the [Detailed component changes](#detailed-component-changes).
### AMD SMI CLI: CPER entries not dumped continuously when using follow flag
An issue where CPER entries were not streamed continuously as intended when using the `--follow` flag with `amd-smi ras --cper` has been resolved. See [GitHub issue #4768](https://github.com/ROCm/ROCm/issues/4768).
### Instinct MI300X reports incorrect raw GPU timestamps
An issue where the command processor firmware reported incorrect raw GPU timestamps on MI300X accelerators has been resolved. See [GitHub issue #4079](https://github.com/ROCm/ROCm/issues/4079).
### MIOpen generates incorrect results for particular input with FP32 data type
An issue where MIOpen generated incorrect results on the `conv2dbackward` function for a particular input with 32-bit floating point (FP32) data types has been resolved. The issue was only specific to FP32 data types with 2 * 2 kernel size and dilation 2 * 1. See [GitHub issue #4606](https://github.com/ROCm/ROCm/issues/4606).
## ROCm upcoming changes
The following changes to the ROCm software stack are anticipated for future releases.
@@ -649,10 +458,10 @@ and will be disabled in a future release.
* The `__AMDGCN_WAVEFRONT_SIZE__` macro and `__AMDGCN_WAVEFRONT_SIZE` alias will be removed in an upcoming release.
It is recommended to remove any use of this macro. For more information, see
[AMDGPU support](https://rocm.docs.amd.com/projects/llvm-project/en/docs-6.4.2/LLVM/clang/html/AMDGPUSupport.html).
[AMDGPU support](https://rocm.docs.amd.com/projects/llvm-project/en/docs-6.4.3/LLVM/clang/html/AMDGPUSupport.html).
* `warpSize` will only be available as a non-`constexpr` variable. Where required,
the wavefront size should be queried via the `warpSize` variable in device code,
or via `hipGetDeviceProperties` in host code. Neither of these will result in a compile-time constant. For more information, see [warpSize](https://rocm.docs.amd.com/projects/HIP/en/docs-6.4.2/how-to/hip_cpp_language_extensions.html#warpsize).
or via `hipGetDeviceProperties` in host code. Neither of these will result in a compile-time constant. For more information, see [warpSize](https://rocm.docs.amd.com/projects/HIP/en/docs-6.4.3/how-to/hip_cpp_language_extensions.html#warpsize).
* For cases where compile-time evaluation of the wavefront size cannot be avoided,
uses of `__AMDGCN_WAVEFRONT_SIZE`, `__AMDGCN_WAVEFRONT_SIZE__`, or `warpSize`
can be replaced with a user-defined macro or `constexpr` variable with the wavefront

View File

@@ -1,129 +1,131 @@
ROCm Version,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.5, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
:ref:`Operating systems & kernels <OS-kernel-versions>`,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,,
,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2"
,,,,,,,,,,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
,"RHEL 9.6, 9.4","RHEL 9.6, 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2"
,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8"
,"SLES 15 SP7, SP6",SLES 15 SP6,SLES 15 SP6,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4"
,,,,,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9
,"Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,,
,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,,
,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,,,,,,,,,,,,
,.. _architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
:doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3
,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2
,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA
,RDNA4,RDNA4,,,,,,,,,,,,,,,
,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3
,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2
,.. _gpu-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,,
,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,,
,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,,
,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100
,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030
,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942 [#mi300_624-past-60]_,gfx942 [#mi300_622-past-60]_,gfx942 [#mi300_621-past-60]_,gfx942 [#mi300_620-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_611-past-60]_, gfx942 [#mi300_610-past-60]_, gfx942 [#mi300_602-past-60]_, gfx942 [#mi300_600-past-60]_
,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a
,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908
,,,,,,,,,,,,,,,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
:doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
:doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1"
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.4.35,0.4.35,0.4.35,0.4.31,0.4.31,0.4.31,0.4.31,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26
:doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`,N/A,N/A,N/A,N/A,N/A,85f95ae,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>`,2.4.0,2.4.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.3.0.post0,N/A,N/A,N/A,N/A,N/A,N/A
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.2,1.2,1.2,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1
,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,
THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
`UCC <https://github.com/ROCm/ucc>`_,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1
,,,,,,,,,,,,,,,,,
THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
Thrust,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
CUB,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
,,,,,,,,,,,,,,,,,
KMD & USER SPACE [#kfd_support-past-60]_,.. _kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
:doc:`KMD versions <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>`,"6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x"
,,,,,,,,,,,,,,,,,
ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
:doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0
:doc:`MIGraphX <amdmigraphx:index>`,2.12.0,2.12.0,2.12.0,2.11.0,2.11.0,2.11.0,2.11.0,2.10.0,2.10.0,2.10.0,2.10.0,2.9.0,2.9.0,2.9.0,2.9.0,2.8.0,2.8.0
:doc:`MIOpen <miopen:index>`,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`MIVisionX <mivisionx:index>`,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0
:doc:`rocAL <rocal:index>`,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0,2.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`rocDecode <rocdecode:index>`,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,N/A,N/A
:doc:`rocJPEG <rocjpeg:index>`,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`rocPyDecode <rocpydecode:index>`,0.3.1,0.3.1,0.3.1,0.2.0,0.2.0,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`RPP <rpp:index>`,1.9.10,1.9.10,1.9.10,1.9.1,1.9.1,1.9.1,1.9.1,1.8.0,1.8.0,1.8.0,1.8.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0
,,,,,,,,,,,,,,,,,
COMMUNICATION,.. _commlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
:doc:`RCCL <rccl:index>`,2.22.3,2.22.3,2.22.3,2.21.5,2.21.5,2.21.5,2.21.5,2.20.5,2.20.5,2.20.5,2.20.5,2.18.6,2.18.6,2.18.6,2.18.6,2.18.3,2.18.3
:doc:`rocSHMEM <rocshmem:index>`,2.0.1,2.0.0,2.0.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
,,,,,,,,,,,,,,,,,
MATH LIBS,.. _mathlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
`half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0
:doc:`hipBLAS <hipblas:index>`,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0
:doc:`hipBLASLt <hipblaslt:index>`,0.12.1,0.12.1,0.12.0,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.7.0,0.7.0,0.7.0,0.7.0,0.6.0,0.6.0
:doc:`hipFFT <hipfft:index>`,1.0.18,1.0.18,1.0.18,1.0.17,1.0.17,1.0.17,1.0.17,1.0.16,1.0.15,1.0.15,1.0.14,1.0.14,1.0.14,1.0.14,1.0.14,1.0.13,1.0.13
:doc:`hipfort <hipfort:index>`,0.6.0,0.6.0,0.6.0,0.5.1,0.5.1,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0
:doc:`hipRAND <hiprand:index>`,2.12.0,2.12.0,2.12.0,2.11.1,2.11.1,2.11.1,2.11.0,2.11.1,2.11.0,2.11.0,2.11.0,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16
:doc:`hipSOLVER <hipsolver:index>`,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.1,2.1.1,2.1.1,2.1.0,2.0.0,2.0.0
:doc:`hipSPARSE <hipsparse:index>`,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.1.1,3.1.1,3.1.1,3.1.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.3,0.2.3,0.2.3,0.2.2,0.2.2,0.2.2,0.2.2,0.2.1,0.2.1,0.2.1,0.2.1,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0
:doc:`rocALUTION <rocalution:index>`,3.2.3,3.2.3,3.2.2,3.2.1,3.2.1,3.2.1,3.2.1,3.2.1,3.2.0,3.2.0,3.2.0,3.1.1,3.1.1,3.1.1,3.1.1,3.0.3,3.0.3
:doc:`rocBLAS <rocblas:index>`,4.4.1,4.4.0,4.4.0,4.3.0,4.3.0,4.3.0,4.3.0,4.2.4,4.2.1,4.2.1,4.2.0,4.1.2,4.1.2,4.1.0,4.1.0,4.0.0,4.0.0
:doc:`rocFFT <rocfft:index>`,1.0.32,1.0.32,1.0.32,1.0.31,1.0.31,1.0.31,1.0.31,1.0.30,1.0.29,1.0.29,1.0.28,1.0.27,1.0.27,1.0.27,1.0.26,1.0.25,1.0.23
:doc:`rocRAND <rocrand:index>`,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.1,3.1.0,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,2.10.17
:doc:`rocSOLVER <rocsolver:index>`,3.28.2,3.28.0,3.28.0,3.27.0,3.27.0,3.27.0,3.27.0,3.26.2,3.26.0,3.26.0,3.26.0,3.25.0,3.25.0,3.25.0,3.25.0,3.24.0,3.24.0
:doc:`rocSPARSE <rocsparse:index>`,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.0.2,3.0.2
:doc:`rocWMMA <rocwmma:index>`,1.7.0,1.7.0,1.7.0,1.6.0,1.6.0,1.6.0,1.6.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0
:doc:`Tensile <tensile:src/index>`,4.43.0,4.43.0,4.43.0,4.42.0,4.42.0,4.42.0,4.42.0,4.41.0,4.41.0,4.41.0,4.41.0,4.40.0,4.40.0,4.40.0,4.40.0,4.39.0,4.39.0
,,,,,,,,,,,,,,,,,
PRIMITIVES,.. _primitivelibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
:doc:`hipCUB <hipcub:index>`,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`hipTensor <hiptensor:index>`,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0,1.3.0,1.3.0,1.2.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0
:doc:`rocPRIM <rocprim:index>`,3.4.1,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.2,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`rocThrust <rocthrust:index>`,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.1.1,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
,,,,,,,,,,,,,,,,,
SUPPORT LIBS,,,,,,,,,,,,,,,,,
`hipother <https://github.com/ROCm/hipother>`_,6.4.43483,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`rocm-core <https://github.com/ROCm/rocm-core>`_,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0,6.1.5,6.1.2,6.1.1,6.1.0,6.0.2,6.0.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,20240607.5.7,20240607.5.7,20240607.4.05,20240607.1.4246,20240125.5.08,20240125.5.08,20240125.5.08,20240125.3.30,20231016.2.245,20231016.2.245
,,,,,,,,,,,,,,,,,
SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
:doc:`AMD SMI <amdsmi:index>`,25.5.1,25.4.2,25.3.0,24.7.1,24.7.1,24.7.1,24.7.1,24.6.3,24.6.3,24.6.3,24.6.2,24.5.1,24.5.1,24.5.1,24.4.1,23.4.2,23.4.2
:doc:`ROCm Data Center Tool <rdc:index>`,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.5.0,7.5.0,7.5.0,7.4.0,7.4.0,7.4.0,7.4.0,7.3.0,7.3.0,7.3.0,7.3.0,7.2.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60105,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000
,,,,,,,,,,,,,,,,,
PERFORMANCE TOOLS,,,,,,,,,,,,,,,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.1.1,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.0.1,2.0.1,2.0.1,2.0.1,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.0.2,1.0.1,1.0.0,0.1.2,0.1.1,0.1.0,0.1.0,1.11.2,1.11.2,1.11.2,1.11.2,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60402,2.0.60401,2.0.60400,2.0.60303,2.0.60302,2.0.60301,2.0.60300,2.0.60204,2.0.60202,2.0.60201,2.0.60200,2.0.60105,2.0.60102,2.0.60101,2.0.60100,2.0.60002,2.0.60000
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCTracer <roctracer:index>`,4.1.60402,4.1.60401,4.1.60400,4.1.60303,4.1.60302,4.1.60301,4.1.60300,4.1.60204,4.1.60202,4.1.60201,4.1.60200,4.1.60105,4.1.60102,4.1.60101,4.1.60100,4.1.60002,4.1.60000
,,,,,,,,,,,,,,,,,
DEVELOPMENT TOOLS,,,,,,,,,,,,,,,,,
:doc:`HIPIFY <hipify:index>`,19.0.0,19.0.0,19.0.0,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.13.0,0.13.0,0.13.0,0.13.0,0.12.0,0.12.0,0.12.0,0.12.0,0.11.0,0.11.0
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.2,0.77.2,0.77.2,0.77.0,0.77.0,0.77.0,0.77.0,0.76.0,0.76.0,0.76.0,0.76.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,14.2.0,14.2.0,14.2.0,14.2.0,14.1.0,14.1.0,14.1.0,14.1.0,13.2.0,13.2.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.3.0,0.3.0,0.3.0,0.3.0,N/A,N/A
:doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.0.4,2.0.4,2.0.4,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3
,,,,,,,,,,,,,,,,,
COMPILERS,.. _compilers-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
`Flang <https://github.com/ROCm/flang>`_,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`llvm-project <llvm-project:index>`,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
,,,,,,,,,,,,,,,,,
RUNTIMES,.. _runtime-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,
:doc:`AMD CLR <hip:understand/amd_clr>`,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
:doc:`HIP <hip:index>`,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.15.0,1.15.0,1.15.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.13.0,1.13.0,1.13.0,1.13.0,1.13.0,1.12.0,1.12.0
ROCm Version,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.5, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
:ref:`Operating systems & kernels <OS-kernel-versions>`,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,,
,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2"
,,,,,,,,,,,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.6, 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2"
,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8"
,"SLES 15 SP7, SP6","SLES 15 SP7, SP6",SLES 15 SP6,SLES 15 SP6,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4"
,,,,,,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9
,"Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_","Oracle Linux 9, 8 [#mi300x-past-60]_",Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,,
,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,,
,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,Azure Linux 3.0 [#mi300x-past-60]_,,,,,,,,,,,,
,.. _architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
:doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3
,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2
,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA
,RDNA4,RDNA4,RDNA4,,,,,,,,,,,,,,,
,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3
,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2
,.. _gpu-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,gfx1201 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,,
,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,gfx1200 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,,
,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_,gfx1101 [#RDNA-OS-past-60]_,,,,,,,,,,,,,,,
,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100
,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030
,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942,gfx942 [#mi300_624-past-60]_,gfx942 [#mi300_622-past-60]_,gfx942 [#mi300_621-past-60]_,gfx942 [#mi300_620-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_611-past-60]_, gfx942 [#mi300_610-past-60]_, gfx942 [#mi300_602-past-60]_, gfx942 [#mi300_600-past-60]_
,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a
,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908
,,,,,,,,,,,,,,,,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
:doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
:doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1"
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.4.35,0.4.35,0.4.35,0.4.35,0.4.31,0.4.31,0.4.31,0.4.31,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26
:doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.3.0.post0,N/A,N/A,N/A,N/A,N/A
:doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>` [#stanford-megatron-lm_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,85f95ae,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_,N/A,N/A,N/A,2.4.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,
:doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>` [#megablocks_compat]_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.7.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_,N/A,N/A,N/A,N/A,N/A,1.8.0b1,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.2,1.2,1.2,1.2,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
`UCC <https://github.com/ROCm/ucc>`_,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1
,,,,,,,,,,,,,,,,,,
THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
Thrust,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
CUB,2.5.0,2.5.0,2.5.0,2.5.0,2.3.2,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
,,,,,,,,,,,,,,,,,,
KMD & USER SPACE [#kfd_support-past-60]_,.. _kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
:doc:`KMD versions <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>`,"6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x"
,,,,,,,,,,,,,,,,,,
ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
:doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0
:doc:`MIGraphX <amdmigraphx:index>`,2.12.0,2.12.0,2.12.0,2.12.0,2.11.0,2.11.0,2.11.0,2.11.0,2.10.0,2.10.0,2.10.0,2.10.0,2.9.0,2.9.0,2.9.0,2.9.0,2.8.0,2.8.0
:doc:`MIOpen <miopen:index>`,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`MIVisionX <mivisionx:index>`,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0
:doc:`rocAL <rocal:index>`,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0,2.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`rocDecode <rocdecode:index>`,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,N/A,N/A
:doc:`rocJPEG <rocjpeg:index>`,0.8.0,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`rocPyDecode <rocpydecode:index>`,0.3.1,0.3.1,0.3.1,0.3.1,0.2.0,0.2.0,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`RPP <rpp:index>`,1.9.10,1.9.10,1.9.10,1.9.10,1.9.1,1.9.1,1.9.1,1.9.1,1.8.0,1.8.0,1.8.0,1.8.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0
,,,,,,,,,,,,,,,,,,
COMMUNICATION,.. _commlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
:doc:`RCCL <rccl:index>`,2.22.3,2.22.3,2.22.3,2.22.3,2.21.5,2.21.5,2.21.5,2.21.5,2.20.5,2.20.5,2.20.5,2.20.5,2.18.6,2.18.6,2.18.6,2.18.6,2.18.3,2.18.3
:doc:`rocSHMEM <rocshmem:index>`,2.0.1,2.0.1,2.0.0,2.0.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
,,,,,,,,,,,,,,,,,,
MATH LIBS,.. _mathlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
`half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0
:doc:`hipBLAS <hipblas:index>`,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0
:doc:`hipBLASLt <hipblaslt:index>`,0.12.1,0.12.1,0.12.1,0.12.0,0.10.0,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.7.0,0.7.0,0.7.0,0.7.0,0.6.0,0.6.0
:doc:`hipFFT <hipfft:index>`,1.0.18,1.0.18,1.0.18,1.0.18,1.0.17,1.0.17,1.0.17,1.0.17,1.0.16,1.0.15,1.0.15,1.0.14,1.0.14,1.0.14,1.0.14,1.0.14,1.0.13,1.0.13
:doc:`hipfort <hipfort:index>`,0.6.0,0.6.0,0.6.0,0.6.0,0.5.1,0.5.1,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0
:doc:`hipRAND <hiprand:index>`,2.12.0,2.12.0,2.12.0,2.12.0,2.11.1,2.11.1,2.11.1,2.11.0,2.11.1,2.11.0,2.11.0,2.11.0,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16
:doc:`hipSOLVER <hipsolver:index>`,2.4.0,2.4.0,2.4.0,2.4.0,2.3.0,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.1,2.1.1,2.1.1,2.1.0,2.0.0,2.0.0
:doc:`hipSPARSE <hipsparse:index>`,3.2.0,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.1.1,3.1.1,3.1.1,3.1.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.3,0.2.3,0.2.3,0.2.3,0.2.2,0.2.2,0.2.2,0.2.2,0.2.1,0.2.1,0.2.1,0.2.1,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0
:doc:`rocALUTION <rocalution:index>`,3.2.3,3.2.3,3.2.3,3.2.2,3.2.1,3.2.1,3.2.1,3.2.1,3.2.1,3.2.0,3.2.0,3.2.0,3.1.1,3.1.1,3.1.1,3.1.1,3.0.3,3.0.3
:doc:`rocBLAS <rocblas:index>`,4.4.1,4.4.1,4.4.0,4.4.0,4.3.0,4.3.0,4.3.0,4.3.0,4.2.4,4.2.1,4.2.1,4.2.0,4.1.2,4.1.2,4.1.0,4.1.0,4.0.0,4.0.0
:doc:`rocFFT <rocfft:index>`,1.0.32,1.0.32,1.0.32,1.0.32,1.0.31,1.0.31,1.0.31,1.0.31,1.0.30,1.0.29,1.0.29,1.0.28,1.0.27,1.0.27,1.0.27,1.0.26,1.0.25,1.0.23
:doc:`rocRAND <rocrand:index>`,3.3.0,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.1,3.1.0,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,2.10.17
:doc:`rocSOLVER <rocsolver:index>`,3.28.2,3.28.2,3.28.0,3.28.0,3.27.0,3.27.0,3.27.0,3.27.0,3.26.2,3.26.0,3.26.0,3.26.0,3.25.0,3.25.0,3.25.0,3.25.0,3.24.0,3.24.0
:doc:`rocSPARSE <rocsparse:index>`,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.0.2,3.0.2
:doc:`rocWMMA <rocwmma:index>`,1.7.0,1.7.0,1.7.0,1.7.0,1.6.0,1.6.0,1.6.0,1.6.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0
:doc:`Tensile <tensile:src/index>`,4.43.0,4.43.0,4.43.0,4.43.0,4.42.0,4.42.0,4.42.0,4.42.0,4.41.0,4.41.0,4.41.0,4.41.0,4.40.0,4.40.0,4.40.0,4.40.0,4.39.0,4.39.0
,,,,,,,,,,,,,,,,,,
PRIMITIVES,.. _primitivelibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
:doc:`hipCUB <hipcub:index>`,3.4.0,3.4.0,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`hipTensor <hiptensor:index>`,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0,1.3.0,1.3.0,1.2.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0
:doc:`rocPRIM <rocprim:index>`,3.4.1,3.4.1,3.4.0,3.4.0,3.3.0,3.3.0,3.3.0,3.3.0,3.2.2,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`rocThrust <rocthrust:index>`,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.3.0,3.1.1,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
,,,,,,,,,,,,,,,,,,
SUPPORT LIBS,,,,,,,,,,,,,,,,,,
`hipother <https://github.com/ROCm/hipother>`_,6.4.43483,6.4.43483,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`rocm-core <https://github.com/ROCm/rocm-core>`_,6.4.3,6.4.2,6.4.1,6.4.0,6.3.3,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0,6.1.5,6.1.2,6.1.1,6.1.0,6.0.2,6.0.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,20240607.5.7,20240607.5.7,20240607.4.05,20240607.1.4246,20240125.5.08,20240125.5.08,20240125.5.08,20240125.3.30,20231016.2.245,20231016.2.245
,,,,,,,,,,,,,,,,,,
SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
:doc:`AMD SMI <amdsmi:index>`,25.5.1,25.5.1,25.4.2,25.3.0,24.7.1,24.7.1,24.7.1,24.7.1,24.6.3,24.6.3,24.6.3,24.6.2,24.5.1,24.5.1,24.5.1,24.4.1,23.4.2,23.4.2
:doc:`ROCm Data Center Tool <rdc:index>`,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.7.0,7.5.0,7.5.0,7.5.0,7.4.0,7.4.0,7.4.0,7.4.0,7.3.0,7.3.0,7.3.0,7.3.0,7.2.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60105,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000
,,,,,,,,,,,,,,,,,,
PERFORMANCE TOOLS,,,,,,,,,,,,,,,,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.1.1,3.1.1,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.0.1,2.0.1,2.0.1,2.0.1,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.0.2,1.0.2,1.0.1,1.0.0,0.1.2,0.1.1,0.1.0,0.1.0,1.11.2,1.11.2,1.11.2,1.11.2,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60403,2.0.60402,2.0.60401,2.0.60400,2.0.60303,2.0.60302,2.0.60301,2.0.60300,2.0.60204,2.0.60202,2.0.60201,2.0.60200,2.0.60105,2.0.60102,2.0.60101,2.0.60100,2.0.60002,2.0.60000
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCTracer <roctracer:index>`,4.1.60403,4.1.60402,4.1.60401,4.1.60400,4.1.60303,4.1.60302,4.1.60301,4.1.60300,4.1.60204,4.1.60202,4.1.60201,4.1.60200,4.1.60105,4.1.60102,4.1.60101,4.1.60100,4.1.60002,4.1.60000
,,,,,,,,,,,,,,,,,,
DEVELOPMENT TOOLS,,,,,,,,,,,,,,,,,,
:doc:`HIPIFY <hipify:index>`,19.0.0,19.0.0,19.0.0,19.0.0,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.14.0,0.13.0,0.13.0,0.13.0,0.13.0,0.12.0,0.12.0,0.12.0,0.12.0,0.11.0,0.11.0
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.2,0.77.2,0.77.2,0.77.2,0.77.0,0.77.0,0.77.0,0.77.0,0.76.0,0.76.0,0.76.0,0.76.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,15.2.0,14.2.0,14.2.0,14.2.0,14.2.0,14.1.0,14.1.0,14.1.0,14.1.0,13.2.0,13.2.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.3.0,0.3.0,0.3.0,0.3.0,N/A,N/A
:doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.0.4,2.0.4,2.0.4,2.0.4,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3
,,,,,,,,,,,,,,,,,,
COMPILERS,.. _compilers-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
`Flang <https://github.com/ROCm/flang>`_,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`llvm-project <llvm-project:index>`,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,19.0.0.25224,19.0.0.25224,19.0.0.25184,19.0.0.25133,18.0.0.25012,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
,,,,,,,,,,,,,,,,,,
RUNTIMES,.. _runtime-support-compatibility-matrix-past-60:,,,,,,,,,,,,,,,,,
:doc:`AMD CLR <hip:understand/amd_clr>`,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
:doc:`HIP <hip:index>`,6.4.43484,6.4.43484,6.4.43483,6.4.43482,6.3.42134,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.15.0,1.15.0,1.15.0,1.15.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.13.0,1.13.0,1.13.0,1.13.0,1.13.0,1.12.0,1.12.0
1 ROCm Version 6.4.3 6.4.2 6.4.1 6.4.0 6.3.3 6.3.2 6.3.1 6.3.0 6.2.4 6.2.2 6.2.1 6.2.0 6.1.5 6.1.2 6.1.1 6.1.0 6.0.2 6.0.0
2 :ref:`Operating systems & kernels <OS-kernel-versions>` Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.1, 24.04 Ubuntu 24.04.1, 24.04 Ubuntu 24.04.1, 24.04 Ubuntu 24.04
3 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3, 22.04.2 Ubuntu 22.04.4, 22.04.3, 22.04.2
4 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5
5 RHEL 9.6, 9.4 RHEL 9.6, 9.4 RHEL 9.6, 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.4, 9.3 RHEL 9.4, 9.3 RHEL 9.4, 9.3 RHEL 9.4, 9.3 RHEL 9.4, 9.3, 9.2 RHEL 9.4, 9.3, 9.2 RHEL 9.4, 9.3, 9.2 RHEL 9.4, 9.3, 9.2 RHEL 9.3, 9.2 RHEL 9.3, 9.2
6 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10, 8.9 RHEL 8.10, 8.9 RHEL 8.10, 8.9 RHEL 8.10, 8.9 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8
7 SLES 15 SP7, SP6 SLES 15 SP7, SP6 SLES 15 SP6 SLES 15 SP6 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4
8 CentOS 7.9 CentOS 7.9 CentOS 7.9 CentOS 7.9 CentOS 7.9
9 Oracle Linux 9, 8 [#mi300x-past-60]_ Oracle Linux 9, 8 [#mi300x-past-60]_ Oracle Linux 9, 8 [#mi300x-past-60]_ Oracle Linux 9, 8 [#mi300x-past-60]_ Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_
10 Debian 12 [#single-node-past-60]_ Debian 12 [#single-node-past-60]_ Debian 12 [#single-node-past-60]_ Debian 12 [#single-node-past-60]_ Debian 12 [#single-node-past-60]_ Debian 12 [#single-node-past-60]_ Debian 12 [#single-node-past-60]_
11 Azure Linux 3.0 [#mi300x-past-60]_ Azure Linux 3.0 [#mi300x-past-60]_ Azure Linux 3.0 [#mi300x-past-60]_ Azure Linux 3.0 [#mi300x-past-60]_ Azure Linux 3.0 [#mi300x-past-60]_ Azure Linux 3.0 [#mi300x-past-60]_
12 .. _architecture-support-compatibility-matrix-past-60: .. _architecture-support-compatibility-matrix-past-60:
13 :doc:`Architecture <rocm-install-on-linux:reference/system-requirements>` CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3
14 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2
15 CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA CDNA
16 RDNA4 RDNA4 RDNA4
17 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3
18 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2 RDNA2
19 .. _gpu-support-compatibility-matrix-past-60: .. _gpu-support-compatibility-matrix-past-60:
20 :doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>` gfx1201 [#RDNA-OS-past-60]_ gfx1201 [#RDNA-OS-past-60]_ gfx1201 [#RDNA-OS-past-60]_
21 gfx1200 [#RDNA-OS-past-60]_ gfx1200 [#RDNA-OS-past-60]_ gfx1200 [#RDNA-OS-past-60]_
22 gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_ gfx1101 [#RDNA-OS-past-60]_ [#7700XT-OS-past-60]_ gfx1101 [#RDNA-OS-past-60]_
23 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100 gfx1100
24 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030
25 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 gfx942 [#mi300_624-past-60]_ gfx942 [#mi300_622-past-60]_ gfx942 [#mi300_621-past-60]_ gfx942 [#mi300_620-past-60]_ gfx942 [#mi300_612-past-60]_ gfx942 [#mi300_612-past-60]_ gfx942 [#mi300_611-past-60]_ gfx942 [#mi300_610-past-60]_ gfx942 [#mi300_602-past-60]_ gfx942 [#mi300_600-past-60]_
26 gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a
27 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908
28
29 FRAMEWORK SUPPORT .. _framework-support-compatibility-matrix-past-60: .. _framework-support-compatibility-matrix-past-60:
30 :doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>` 2.6, 2.5, 2.4, 2.3 2.6, 2.5, 2.4, 2.3 2.6, 2.5, 2.4, 2.3 2.6, 2.5, 2.4, 2.3 2.4, 2.3, 2.2, 1.13 2.4, 2.3, 2.2, 1.13 2.4, 2.3, 2.2, 1.13 2.4, 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13
31 :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>` 2.18.1, 2.17.1, 2.16.2 2.18.1, 2.17.1, 2.16.2 2.18.1, 2.17.1, 2.16.2 2.18.1, 2.17.1, 2.16.2 2.17.0, 2.16.2, 2.15.1 2.17.0, 2.16.2, 2.15.1 2.17.0, 2.16.2, 2.15.1 2.17.0, 2.16.2, 2.15.1 2.16.1, 2.15.1, 2.14.1 2.16.1, 2.15.1, 2.14.1 2.16.1, 2.15.1, 2.14.1 2.16.1, 2.15.1, 2.14.1 2.15.0, 2.14.0, 2.13.1 2.15.0, 2.14.0, 2.13.1 2.15.0, 2.14.0, 2.13.1 2.15.0, 2.14.0, 2.13.1 2.14.0, 2.13.1, 2.12.1 2.14.0, 2.13.1, 2.12.1
32 :doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>` 0.4.35 0.4.35 0.4.35 0.4.35 0.4.31 0.4.31 0.4.31 0.4.31 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26
33 :doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>` :doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_ N/A N/A N/A N/A N/A N/A 85f95ae N/A N/A N/A N/A N/A N/A 0.3.0.post0 N/A N/A N/A N/A N/A
34 :doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` :doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>` [#stanford-megatron-lm_compat]_ N/A 2.4.0 N/A 2.4.0 N/A N/A N/A N/A N/A N/A 85f95ae N/A N/A N/A N/A N/A N/A N/A N/A N/A
35 :doc:`verl <../compatibility/ml-compatibility/verl-compatibility>` [#verl_compat]_ :doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>` [#dgl_compat]_ N/A N/A N/A N/A 2.4.0 N/A N/A N/A N/A N/A N/A 0.3.0.post0 N/A N/A N/A N/A N/A N/A N/A
36 `ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_ :doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>` [#megablocks_compat]_ N/A 1.2 N/A 1.2 N/A 1.2 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 0.7.0 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.17.3 N/A 1.14.1 N/A
37 :doc:`Taichi <../compatibility/ml-compatibility/taichi-compatibility>` [#taichi_compat]_ N/A N/A N/A N/A N/A 1.8.0b1 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
38 `ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_ 1.2 1.2 1.2 1.2 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.14.1 1.14.1
39 THIRD PARTY COMMS .. _thirdpartycomms-support-compatibility-matrix-past-60:
40 `UCC <https://github.com/ROCm/ucc>`_ >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.2.0 >=1.2.0
41 `UCX <https://github.com/ROCm/ucx>`_ THIRD PARTY COMMS .. _thirdpartycomms-support-compatibility-matrix-past-60: >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1
42 `UCC <https://github.com/ROCm/ucc>`_ >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.2.0 >=1.2.0
43 THIRD PARTY ALGORITHM `UCX <https://github.com/ROCm/ucx>`_ >=1.15.0 .. _thirdpartyalgorithm-support-compatibility-matrix-past-60: >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1
44 Thrust 2.5.0 2.5.0 2.5.0 2.3.2 2.3.2 2.3.2 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
45 CUB THIRD PARTY ALGORITHM .. _thirdpartyalgorithm-support-compatibility-matrix-past-60: 2.5.0 2.5.0 2.5.0 2.3.2 2.3.2 2.3.2 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
46 Thrust 2.5.0 2.5.0 2.5.0 2.5.0 2.3.2 2.3.2 2.3.2 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
47 KMD & USER SPACE [#kfd_support-past-60]_ CUB 2.5.0 .. _kfd-userspace-support-compatibility-matrix-past-60: 2.5.0 2.5.0 2.5.0 2.3.2 2.3.2 2.3.2 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
48 :doc:`KMD versions <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>` 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x
49 KMD & USER SPACE [#kfd_support-past-60]_ .. _kfd-userspace-support-compatibility-matrix-past-60:
50 ML & COMPUTER VISION :doc:`KMD versions <rocm-install-on-linux:reference/user-kernel-space-compat-matrix>` 6.4.x, 6.3.x, 6.2.x, 6.1.x .. _mllibs-support-compatibility-matrix-past-60: 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x
51 :doc:`Composable Kernel <composable_kernel:index>` 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0
52 :doc:`MIGraphX <amdmigraphx:index>` ML & COMPUTER VISION .. _mllibs-support-compatibility-matrix-past-60: 2.12.0 2.12.0 2.12.0 2.11.0 2.11.0 2.11.0 2.11.0 2.10.0 2.10.0 2.10.0 2.10.0 2.9.0 2.9.0 2.9.0 2.9.0 2.8.0 2.8.0
53 :doc:`MIOpen <miopen:index>` :doc:`Composable Kernel <composable_kernel:index>` 1.1.0 3.4.0 1.1.0 3.4.0 1.1.0 3.4.0 1.1.0 3.3.0 1.1.0 3.3.0 1.1.0 3.3.0 1.1.0 3.3.0 1.1.0 3.2.0 1.1.0 3.2.0 1.1.0 3.2.0 1.1.0 3.2.0 1.1.0 3.1.0 1.1.0 3.1.0 1.1.0 3.1.0 1.1.0 3.1.0 1.1.0 3.0.0 1.1.0 3.0.0 1.1.0
54 :doc:`MIVisionX <mivisionx:index>` :doc:`MIGraphX <amdmigraphx:index>` 2.12.0 3.2.0 2.12.0 3.2.0 2.12.0 3.2.0 2.12.0 3.1.0 2.11.0 3.1.0 2.11.0 3.1.0 2.11.0 3.1.0 2.11.0 3.0.0 2.10.0 3.0.0 2.10.0 3.0.0 2.10.0 3.0.0 2.10.0 2.5.0 2.9.0 2.5.0 2.9.0 2.5.0 2.9.0 2.5.0 2.9.0 2.5.0 2.8.0 2.5.0 2.8.0
55 :doc:`rocAL <rocal:index>` :doc:`MIOpen <miopen:index>` 3.4.0 2.2.0 3.4.0 2.2.0 3.4.0 2.2.0 3.4.0 2.1.0 3.3.0 2.1.0 3.3.0 2.1.0 3.3.0 2.1.0 3.3.0 2.0.0 3.2.0 2.0.0 3.2.0 2.0.0 3.2.0 1.0.0 3.2.0 1.0.0 3.1.0 1.0.0 3.1.0 1.0.0 3.1.0 1.0.0 3.1.0 1.0.0 3.0.0 1.0.0 3.0.0
56 :doc:`rocDecode <rocdecode:index>` :doc:`MIVisionX <mivisionx:index>` 3.2.0 0.10.0 3.2.0 0.10.0 3.2.0 0.10.0 3.2.0 0.8.0 3.1.0 0.8.0 3.1.0 0.8.0 3.1.0 0.8.0 3.1.0 0.6.0 3.0.0 0.6.0 3.0.0 0.6.0 3.0.0 0.6.0 3.0.0 0.6.0 2.5.0 0.6.0 2.5.0 0.5.0 2.5.0 0.5.0 2.5.0 N/A 2.5.0 N/A 2.5.0
57 :doc:`rocJPEG <rocjpeg:index>` :doc:`rocAL <rocal:index>` 2.2.0 0.8.0 2.2.0 0.8.0 2.2.0 0.8.0 2.2.0 0.6.0 2.1.0 0.6.0 2.1.0 0.6.0 2.1.0 0.6.0 2.1.0 N/A 2.0.0 N/A 2.0.0 N/A 2.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0
58 :doc:`rocPyDecode <rocpydecode:index>` :doc:`rocDecode <rocdecode:index>` 0.10.0 0.3.1 0.10.0 0.3.1 0.10.0 0.3.1 0.10.0 0.2.0 0.8.0 0.2.0 0.8.0 0.2.0 0.8.0 0.2.0 0.8.0 0.1.0 0.6.0 0.1.0 0.6.0 0.1.0 0.6.0 0.1.0 0.6.0 N/A 0.6.0 N/A 0.6.0 N/A 0.5.0 N/A 0.5.0 N/A N/A
59 :doc:`RPP <rpp:index>` :doc:`rocJPEG <rocjpeg:index>` 0.8.0 1.9.10 0.8.0 1.9.10 0.8.0 1.9.10 0.8.0 1.9.1 0.6.0 1.9.1 0.6.0 1.9.1 0.6.0 1.9.1 0.6.0 1.8.0 N/A 1.8.0 N/A 1.8.0 N/A 1.8.0 N/A 1.5.0 N/A 1.5.0 N/A 1.5.0 N/A 1.5.0 N/A 1.4.0 N/A 1.4.0 N/A
60 :doc:`rocPyDecode <rocpydecode:index>` 0.3.1 0.3.1 0.3.1 0.3.1 0.2.0 0.2.0 0.2.0 0.2.0 0.1.0 0.1.0 0.1.0 0.1.0 N/A N/A N/A N/A N/A N/A
61 COMMUNICATION :doc:`RPP <rpp:index>` 1.9.10 .. _commlibs-support-compatibility-matrix-past-60: 1.9.10 1.9.10 1.9.10 1.9.1 1.9.1 1.9.1 1.9.1 1.8.0 1.8.0 1.8.0 1.8.0 1.5.0 1.5.0 1.5.0 1.5.0 1.4.0 1.4.0
62 :doc:`RCCL <rccl:index>` 2.22.3 2.22.3 2.22.3 2.21.5 2.21.5 2.21.5 2.21.5 2.20.5 2.20.5 2.20.5 2.20.5 2.18.6 2.18.6 2.18.6 2.18.6 2.18.3 2.18.3
63 :doc:`rocSHMEM <rocshmem:index>` COMMUNICATION .. _commlibs-support-compatibility-matrix-past-60: 2.0.1 2.0.0 2.0.0 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
64 :doc:`RCCL <rccl:index>` 2.22.3 2.22.3 2.22.3 2.22.3 2.21.5 2.21.5 2.21.5 2.21.5 2.20.5 2.20.5 2.20.5 2.20.5 2.18.6 2.18.6 2.18.6 2.18.6 2.18.3 2.18.3
65 MATH LIBS :doc:`rocSHMEM <rocshmem:index>` 2.0.1 .. _mathlibs-support-compatibility-matrix-past-60: 2.0.1 2.0.0 2.0.0 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
66 `half <https://github.com/ROCm/half>`_ 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0
67 :doc:`hipBLAS <hipblas:index>` MATH LIBS .. _mathlibs-support-compatibility-matrix-past-60: 2.4.0 2.4.0 2.4.0 2.3.0 2.3.0 2.3.0 2.3.0 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.0 2.0.0
68 :doc:`hipBLASLt <hipblaslt:index>` `half <https://github.com/ROCm/half>`_ 1.12.0 0.12.1 1.12.0 0.12.1 1.12.0 0.12.0 1.12.0 0.10.0 1.12.0 0.10.0 1.12.0 0.10.0 1.12.0 0.10.0 1.12.0 0.8.0 1.12.0 0.8.0 1.12.0 0.8.0 1.12.0 0.8.0 1.12.0 0.7.0 1.12.0 0.7.0 1.12.0 0.7.0 1.12.0 0.7.0 1.12.0 0.6.0 1.12.0 0.6.0 1.12.0
69 :doc:`hipFFT <hipfft:index>` :doc:`hipBLAS <hipblas:index>` 2.4.0 1.0.18 2.4.0 1.0.18 2.4.0 1.0.18 2.4.0 1.0.17 2.3.0 1.0.17 2.3.0 1.0.17 2.3.0 1.0.17 2.3.0 1.0.16 2.2.0 1.0.15 2.2.0 1.0.15 2.2.0 1.0.14 2.2.0 1.0.14 2.1.0 1.0.14 2.1.0 1.0.14 2.1.0 1.0.14 2.1.0 1.0.13 2.0.0 1.0.13 2.0.0
70 :doc:`hipfort <hipfort:index>` :doc:`hipBLASLt <hipblaslt:index>` 0.12.1 0.6.0 0.12.1 0.6.0 0.12.1 0.6.0 0.12.0 0.5.1 0.10.0 0.5.1 0.10.0 0.5.0 0.10.0 0.5.0 0.10.0 0.4.0 0.8.0 0.4.0 0.8.0 0.4.0 0.8.0 0.4.0 0.8.0 0.4.0 0.7.0 0.4.0 0.7.0 0.4.0 0.7.0 0.4.0 0.7.0 0.4.0 0.6.0 0.4.0 0.6.0
71 :doc:`hipRAND <hiprand:index>` :doc:`hipFFT <hipfft:index>` 1.0.18 2.12.0 1.0.18 2.12.0 1.0.18 2.12.0 1.0.18 2.11.1 1.0.17 2.11.1 1.0.17 2.11.1 1.0.17 2.11.0 1.0.17 2.11.1 1.0.16 2.11.0 1.0.15 2.11.0 1.0.15 2.11.0 1.0.14 2.10.16 1.0.14 2.10.16 1.0.14 2.10.16 1.0.14 2.10.16 1.0.14 2.10.16 1.0.13 2.10.16 1.0.13
72 :doc:`hipSOLVER <hipsolver:index>` :doc:`hipfort <hipfort:index>` 0.6.0 2.4.0 0.6.0 2.4.0 0.6.0 2.4.0 0.6.0 2.3.0 0.5.1 2.3.0 0.5.1 2.3.0 0.5.0 2.3.0 0.5.0 2.2.0 0.4.0 2.2.0 0.4.0 2.2.0 0.4.0 2.2.0 0.4.0 2.1.1 0.4.0 2.1.1 0.4.0 2.1.1 0.4.0 2.1.0 0.4.0 2.0.0 0.4.0 2.0.0 0.4.0
73 :doc:`hipSPARSE <hipsparse:index>` :doc:`hipRAND <hiprand:index>` 2.12.0 3.2.0 2.12.0 3.2.0 2.12.0 3.2.0 2.12.0 3.1.2 2.11.1 3.1.2 2.11.1 3.1.2 2.11.1 3.1.2 2.11.0 3.1.1 2.11.1 3.1.1 2.11.0 3.1.1 2.11.0 3.1.1 2.11.0 3.0.1 2.10.16 3.0.1 2.10.16 3.0.1 2.10.16 3.0.1 2.10.16 3.0.0 2.10.16 3.0.0 2.10.16
74 :doc:`hipSPARSELt <hipsparselt:index>` :doc:`hipSOLVER <hipsolver:index>` 2.4.0 0.2.3 2.4.0 0.2.3 2.4.0 0.2.3 2.4.0 0.2.2 2.3.0 0.2.2 2.3.0 0.2.2 2.3.0 0.2.2 2.3.0 0.2.1 2.2.0 0.2.1 2.2.0 0.2.1 2.2.0 0.2.1 2.2.0 0.2.0 2.1.1 0.2.0 2.1.1 0.1.0 2.1.1 0.1.0 2.1.0 0.1.0 2.0.0 0.1.0 2.0.0
75 :doc:`rocALUTION <rocalution:index>` :doc:`hipSPARSE <hipsparse:index>` 3.2.0 3.2.3 3.2.0 3.2.3 3.2.0 3.2.2 3.2.0 3.2.1 3.1.2 3.2.1 3.1.2 3.2.1 3.1.2 3.2.1 3.1.2 3.2.1 3.1.1 3.2.0 3.1.1 3.2.0 3.1.1 3.2.0 3.1.1 3.1.1 3.0.1 3.1.1 3.0.1 3.1.1 3.0.1 3.1.1 3.0.1 3.0.3 3.0.0 3.0.3 3.0.0
76 :doc:`rocBLAS <rocblas:index>` :doc:`hipSPARSELt <hipsparselt:index>` 0.2.3 4.4.1 0.2.3 4.4.0 0.2.3 4.4.0 0.2.3 4.3.0 0.2.2 4.3.0 0.2.2 4.3.0 0.2.2 4.3.0 0.2.2 4.2.4 0.2.1 4.2.1 0.2.1 4.2.1 0.2.1 4.2.0 0.2.1 4.1.2 0.2.0 4.1.2 0.2.0 4.1.0 0.1.0 4.1.0 0.1.0 4.0.0 0.1.0 4.0.0 0.1.0
77 :doc:`rocFFT <rocfft:index>` :doc:`rocALUTION <rocalution:index>` 3.2.3 1.0.32 3.2.3 1.0.32 3.2.3 1.0.32 3.2.2 1.0.31 3.2.1 1.0.31 3.2.1 1.0.31 3.2.1 1.0.31 3.2.1 1.0.30 3.2.1 1.0.29 3.2.0 1.0.29 3.2.0 1.0.28 3.2.0 1.0.27 3.1.1 1.0.27 3.1.1 1.0.27 3.1.1 1.0.26 3.1.1 1.0.25 3.0.3 1.0.23 3.0.3
78 :doc:`rocRAND <rocrand:index>` :doc:`rocBLAS <rocblas:index>` 4.4.1 3.3.0 4.4.1 3.3.0 4.4.0 3.3.0 4.4.0 3.2.0 4.3.0 3.2.0 4.3.0 3.2.0 4.3.0 3.2.0 4.3.0 3.1.1 4.2.4 3.1.0 4.2.1 3.1.0 4.2.1 3.1.0 4.2.0 3.0.1 4.1.2 3.0.1 4.1.2 3.0.1 4.1.0 3.0.1 4.1.0 3.0.0 4.0.0 2.10.17 4.0.0
79 :doc:`rocSOLVER <rocsolver:index>` :doc:`rocFFT <rocfft:index>` 1.0.32 3.28.2 1.0.32 3.28.0 1.0.32 3.28.0 1.0.32 3.27.0 1.0.31 3.27.0 1.0.31 3.27.0 1.0.31 3.27.0 1.0.31 3.26.2 1.0.30 3.26.0 1.0.29 3.26.0 1.0.29 3.26.0 1.0.28 3.25.0 1.0.27 3.25.0 1.0.27 3.25.0 1.0.27 3.25.0 1.0.26 3.24.0 1.0.25 3.24.0 1.0.23
80 :doc:`rocSPARSE <rocsparse:index>` :doc:`rocRAND <rocrand:index>` 3.3.0 3.4.0 3.3.0 3.4.0 3.3.0 3.4.0 3.3.0 3.3.0 3.2.0 3.3.0 3.2.0 3.3.0 3.2.0 3.3.0 3.2.0 3.2.1 3.1.1 3.2.0 3.1.0 3.2.0 3.1.0 3.2.0 3.1.0 3.1.2 3.0.1 3.1.2 3.0.1 3.1.2 3.0.1 3.1.2 3.0.1 3.0.2 3.0.0 3.0.2 2.10.17
81 :doc:`rocWMMA <rocwmma:index>` :doc:`rocSOLVER <rocsolver:index>` 3.28.2 1.7.0 3.28.2 1.7.0 3.28.0 1.7.0 3.28.0 1.6.0 3.27.0 1.6.0 3.27.0 1.6.0 3.27.0 1.6.0 3.27.0 1.5.0 3.26.2 1.5.0 3.26.0 1.5.0 3.26.0 1.5.0 3.26.0 1.4.0 3.25.0 1.4.0 3.25.0 1.4.0 3.25.0 1.4.0 3.25.0 1.3.0 3.24.0 1.3.0 3.24.0
82 :doc:`Tensile <tensile:src/index>` :doc:`rocSPARSE <rocsparse:index>` 3.4.0 4.43.0 3.4.0 4.43.0 3.4.0 4.43.0 3.4.0 4.42.0 3.3.0 4.42.0 3.3.0 4.42.0 3.3.0 4.42.0 3.3.0 4.41.0 3.2.1 4.41.0 3.2.0 4.41.0 3.2.0 4.41.0 3.2.0 4.40.0 3.1.2 4.40.0 3.1.2 4.40.0 3.1.2 4.40.0 3.1.2 4.39.0 3.0.2 4.39.0 3.0.2
83 :doc:`rocWMMA <rocwmma:index>` 1.7.0 1.7.0 1.7.0 1.7.0 1.6.0 1.6.0 1.6.0 1.6.0 1.5.0 1.5.0 1.5.0 1.5.0 1.4.0 1.4.0 1.4.0 1.4.0 1.3.0 1.3.0
84 PRIMITIVES :doc:`Tensile <tensile:src/index>` 4.43.0 .. _primitivelibs-support-compatibility-matrix-past-60: 4.43.0 4.43.0 4.43.0 4.42.0 4.42.0 4.42.0 4.42.0 4.41.0 4.41.0 4.41.0 4.41.0 4.40.0 4.40.0 4.40.0 4.40.0 4.39.0 4.39.0
85 :doc:`hipCUB <hipcub:index>` 3.4.0 3.4.0 3.4.0 3.3.0 3.3.0 3.3.0 3.3.0 3.2.1 3.2.0 3.2.0 3.2.0 3.1.0 3.1.0 3.1.0 3.1.0 3.0.0 3.0.0
86 :doc:`hipTensor <hiptensor:index>` PRIMITIVES .. _primitivelibs-support-compatibility-matrix-past-60: 1.5.0 1.5.0 1.5.0 1.4.0 1.4.0 1.4.0 1.4.0 1.3.0 1.3.0 1.3.0 1.3.0 1.2.0 1.2.0 1.2.0 1.2.0 1.1.0 1.1.0
87 :doc:`rocPRIM <rocprim:index>` :doc:`hipCUB <hipcub:index>` 3.4.0 3.4.1 3.4.0 3.4.0 3.4.0 3.3.0 3.3.0 3.3.0 3.3.0 3.2.2 3.2.1 3.2.0 3.2.0 3.2.0 3.1.0 3.1.0 3.1.0 3.1.0 3.0.0 3.0.0
88 :doc:`rocThrust <rocthrust:index>` :doc:`hipTensor <hiptensor:index>` 1.5.0 3.3.0 1.5.0 3.3.0 1.5.0 3.3.0 1.5.0 3.3.0 1.4.0 3.3.0 1.4.0 3.3.0 1.4.0 3.3.0 1.4.0 3.1.1 1.3.0 3.1.0 1.3.0 3.1.0 1.3.0 3.0.1 1.3.0 3.0.1 1.2.0 3.0.1 1.2.0 3.0.1 1.2.0 3.0.1 1.2.0 3.0.0 1.1.0 3.0.0 1.1.0
89 :doc:`rocPRIM <rocprim:index>` 3.4.1 3.4.1 3.4.0 3.4.0 3.3.0 3.3.0 3.3.0 3.3.0 3.2.2 3.2.0 3.2.0 3.2.0 3.1.0 3.1.0 3.1.0 3.1.0 3.0.0 3.0.0
90 SUPPORT LIBS :doc:`rocThrust <rocthrust:index>` 3.3.0 3.3.0 3.3.0 3.3.0 3.3.0 3.3.0 3.3.0 3.3.0 3.1.1 3.1.0 3.1.0 3.0.1 3.0.1 3.0.1 3.0.1 3.0.1 3.0.0 3.0.0
91 `hipother <https://github.com/ROCm/hipother>`_ 6.4.43483 6.4.43483 6.4.43482 6.3.42134 6.3.42134 6.3.42133 6.3.42131 6.2.41134 6.2.41134 6.2.41134 6.2.41133 6.1.40093 6.1.40093 6.1.40092 6.1.40091 6.1.32831 6.1.32830
92 `rocm-core <https://github.com/ROCm/rocm-core>`_ SUPPORT LIBS 6.4.2 6.4.1 6.4.0 6.3.3 6.3.2 6.3.1 6.3.0 6.2.4 6.2.2 6.2.1 6.2.0 6.1.5 6.1.2 6.1.1 6.1.0 6.0.2 6.0.0
93 `ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_ `hipother <https://github.com/ROCm/hipother>`_ 6.4.43483 N/A [#ROCT-rocr-past-60]_ 6.4.43483 N/A [#ROCT-rocr-past-60]_ 6.4.43483 N/A [#ROCT-rocr-past-60]_ 6.4.43482 N/A [#ROCT-rocr-past-60]_ 6.3.42134 N/A [#ROCT-rocr-past-60]_ 6.3.42134 N/A [#ROCT-rocr-past-60]_ 6.3.42133 N/A [#ROCT-rocr-past-60]_ 6.3.42131 20240607.5.7 6.2.41134 20240607.5.7 6.2.41134 20240607.4.05 6.2.41134 20240607.1.4246 6.2.41133 20240125.5.08 6.1.40093 20240125.5.08 6.1.40093 20240125.5.08 6.1.40092 20240125.3.30 6.1.40091 20231016.2.245 6.1.32831 20231016.2.245 6.1.32830
94 `rocm-core <https://github.com/ROCm/rocm-core>`_ 6.4.3 6.4.2 6.4.1 6.4.0 6.3.3 6.3.2 6.3.1 6.3.0 6.2.4 6.2.2 6.2.1 6.2.0 6.1.5 6.1.2 6.1.1 6.1.0 6.0.2 6.0.0
95 SYSTEM MGMT TOOLS `ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_ N/A [#ROCT-rocr-past-60]_ .. _tools-support-compatibility-matrix-past-60: N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ 20240607.5.7 20240607.5.7 20240607.4.05 20240607.1.4246 20240125.5.08 20240125.5.08 20240125.5.08 20240125.3.30 20231016.2.245 20231016.2.245
96 :doc:`AMD SMI <amdsmi:index>` 25.5.1 25.4.2 25.3.0 24.7.1 24.7.1 24.7.1 24.7.1 24.6.3 24.6.3 24.6.3 24.6.2 24.5.1 24.5.1 24.5.1 24.4.1 23.4.2 23.4.2
97 :doc:`ROCm Data Center Tool <rdc:index>` SYSTEM MGMT TOOLS .. _tools-support-compatibility-matrix-past-60: 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0
98 :doc:`rocminfo <rocminfo:index>` :doc:`AMD SMI <amdsmi:index>` 25.5.1 1.0.0 25.5.1 1.0.0 25.4.2 1.0.0 25.3.0 1.0.0 24.7.1 1.0.0 24.7.1 1.0.0 24.7.1 1.0.0 24.7.1 1.0.0 24.6.3 1.0.0 24.6.3 1.0.0 24.6.3 1.0.0 24.6.2 1.0.0 24.5.1 1.0.0 24.5.1 1.0.0 24.5.1 1.0.0 24.4.1 1.0.0 23.4.2 1.0.0 23.4.2
99 :doc:`ROCm SMI <rocm_smi_lib:index>` :doc:`ROCm Data Center Tool <rdc:index>` 0.3.0 7.5.0 0.3.0 7.5.0 0.3.0 7.5.0 0.3.0 7.4.0 0.3.0 7.4.0 0.3.0 7.4.0 0.3.0 7.4.0 0.3.0 7.3.0 0.3.0 7.3.0 0.3.0 7.3.0 0.3.0 7.3.0 0.3.0 7.2.0 0.3.0 7.2.0 0.3.0 7.0.0 0.3.0 7.0.0 0.3.0 6.0.2 0.3.0 6.0.0 0.3.0
100 :doc:`ROCm Validation Suite <rocmvalidationsuite:index>` :doc:`rocminfo <rocminfo:index>` 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.1.0 1.0.0 1.0.60204 1.0.0 1.0.60202 1.0.0 1.0.60201 1.0.0 1.0.60200 1.0.0 1.0.60105 1.0.0 1.0.60102 1.0.0 1.0.60101 1.0.0 1.0.60100 1.0.0 1.0.60002 1.0.0 1.0.60000 1.0.0
101 :doc:`ROCm SMI <rocm_smi_lib:index>` 7.7.0 7.5.0 7.5.0 7.5.0 7.4.0 7.4.0 7.4.0 7.4.0 7.3.0 7.3.0 7.3.0 7.3.0 7.2.0 7.2.0 7.0.0 7.0.0 6.0.2 6.0.0
102 PERFORMANCE TOOLS :doc:`ROCm Validation Suite <rocmvalidationsuite:index>` 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.0.60204 1.0.60202 1.0.60201 1.0.60200 1.0.60105 1.0.60102 1.0.60101 1.0.60100 1.0.60002 1.0.60000
103 :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>` 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0
104 :doc:`ROCm Compute Profiler <rocprofiler-compute:index>` PERFORMANCE TOOLS 3.1.1 3.1.0 3.1.0 3.0.0 3.0.0 3.0.0 3.0.0 2.0.1 2.0.1 2.0.1 2.0.1 N/A N/A N/A N/A N/A N/A
105 :doc:`ROCm Systems Profiler <rocprofiler-systems:index>` :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>` 1.4.0 1.0.2 1.4.0 1.0.1 1.4.0 1.0.0 1.4.0 0.1.2 1.4.0 0.1.1 1.4.0 0.1.0 1.4.0 0.1.0 1.4.0 1.11.2 1.4.0 1.11.2 1.4.0 1.11.2 1.4.0 1.11.2 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0
106 :doc:`ROCProfiler <rocprofiler:index>` :doc:`ROCm Compute Profiler <rocprofiler-compute:index>` 3.1.1 2.0.60402 3.1.1 2.0.60401 3.1.0 2.0.60400 3.1.0 2.0.60303 3.0.0 2.0.60302 3.0.0 2.0.60301 3.0.0 2.0.60300 3.0.0 2.0.60204 2.0.1 2.0.60202 2.0.1 2.0.60201 2.0.1 2.0.60200 2.0.1 2.0.60105 N/A 2.0.60102 N/A 2.0.60101 N/A 2.0.60100 N/A 2.0.60002 N/A 2.0.60000 N/A
107 :doc:`ROCprofiler-SDK <rocprofiler-sdk:index>` :doc:`ROCm Systems Profiler <rocprofiler-systems:index>` 1.0.2 0.6.0 1.0.2 0.6.0 1.0.1 0.6.0 1.0.0 0.5.0 0.1.2 0.5.0 0.1.1 0.5.0 0.1.0 0.5.0 0.1.0 0.4.0 1.11.2 0.4.0 1.11.2 0.4.0 1.11.2 0.4.0 1.11.2 N/A N/A N/A N/A N/A N/A
108 :doc:`ROCTracer <roctracer:index>` :doc:`ROCProfiler <rocprofiler:index>` 2.0.60403 4.1.60402 2.0.60402 4.1.60401 2.0.60401 4.1.60400 2.0.60400 4.1.60303 2.0.60303 4.1.60302 2.0.60302 4.1.60301 2.0.60301 4.1.60300 2.0.60300 4.1.60204 2.0.60204 4.1.60202 2.0.60202 4.1.60201 2.0.60201 4.1.60200 2.0.60200 4.1.60105 2.0.60105 4.1.60102 2.0.60102 4.1.60101 2.0.60101 4.1.60100 2.0.60100 4.1.60002 2.0.60002 4.1.60000 2.0.60000
109 :doc:`ROCprofiler-SDK <rocprofiler-sdk:index>` 0.6.0 0.6.0 0.6.0 0.6.0 0.5.0 0.5.0 0.5.0 0.5.0 0.4.0 0.4.0 0.4.0 0.4.0 N/A N/A N/A N/A N/A N/A
110 DEVELOPMENT TOOLS :doc:`ROCTracer <roctracer:index>` 4.1.60403 4.1.60402 4.1.60401 4.1.60400 4.1.60303 4.1.60302 4.1.60301 4.1.60300 4.1.60204 4.1.60202 4.1.60201 4.1.60200 4.1.60105 4.1.60102 4.1.60101 4.1.60100 4.1.60002 4.1.60000
111 :doc:`HIPIFY <hipify:index>` 19.0.0 19.0.0 19.0.0 18.0.0.25012 18.0.0.25012 18.0.0.24491 18.0.0.24455 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
112 :doc:`ROCm CMake <rocmcmakebuildtools:index>` DEVELOPMENT TOOLS 0.14.0 0.14.0 0.14.0 0.14.0 0.14.0 0.14.0 0.14.0 0.13.0 0.13.0 0.13.0 0.13.0 0.12.0 0.12.0 0.12.0 0.12.0 0.11.0 0.11.0
113 :doc:`ROCdbgapi <rocdbgapi:index>` :doc:`HIPIFY <hipify:index>` 19.0.0 0.77.2 19.0.0 0.77.2 19.0.0 0.77.2 19.0.0 0.77.0 18.0.0.25012 0.77.0 18.0.0.25012 0.77.0 18.0.0.24491 0.77.0 18.0.0.24455 0.76.0 18.0.0.24392 0.76.0 18.0.0.24355 0.76.0 18.0.0.24355 0.76.0 18.0.0.24232 0.71.0 17.0.0.24193 0.71.0 17.0.0.24193 0.71.0 17.0.0.24154 0.71.0 17.0.0.24103 0.71.0 17.0.0.24012 0.71.0 17.0.0.23483
114 :doc:`ROCm Debugger (ROCgdb) <rocgdb:index>` :doc:`ROCm CMake <rocmcmakebuildtools:index>` 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 15.2.0 0.14.0 14.2.0 0.13.0 14.2.0 0.13.0 14.2.0 0.13.0 14.2.0 0.13.0 14.1.0 0.12.0 14.1.0 0.12.0 14.1.0 0.12.0 14.1.0 0.12.0 13.2.0 0.11.0 13.2.0 0.11.0
115 `rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_ :doc:`ROCdbgapi <rocdbgapi:index>` 0.77.2 0.4.0 0.77.2 0.4.0 0.77.2 0.4.0 0.77.2 0.4.0 0.77.0 0.4.0 0.77.0 0.4.0 0.77.0 0.4.0 0.77.0 0.4.0 0.76.0 0.4.0 0.76.0 0.4.0 0.76.0 0.4.0 0.76.0 0.3.0 0.71.0 0.3.0 0.71.0 0.3.0 0.71.0 0.3.0 0.71.0 N/A 0.71.0 N/A 0.71.0
116 :doc:`ROCr Debug Agent <rocr_debug_agent:index>` :doc:`ROCm Debugger (ROCgdb) <rocgdb:index>` 15.2.0 2.0.4 15.2.0 2.0.4 15.2.0 2.0.4 15.2.0 2.0.3 15.2.0 2.0.3 15.2.0 2.0.3 15.2.0 2.0.3 15.2.0 2.0.3 14.2.0 2.0.3 14.2.0 2.0.3 14.2.0 2.0.3 14.2.0 2.0.3 14.1.0 2.0.3 14.1.0 2.0.3 14.1.0 2.0.3 14.1.0 2.0.3 13.2.0 2.0.3 13.2.0
117 `rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_ 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.3.0 0.3.0 0.3.0 0.3.0 N/A N/A
118 COMPILERS :doc:`ROCr Debug Agent <rocr_debug_agent:index>` 2.0.4 .. _compilers-support-compatibility-matrix-past-60: 2.0.4 2.0.4 2.0.4 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3
119 `clang-ocl <https://github.com/ROCm/clang-ocl>`_ N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 0.5.0 0.5.0 0.5.0 0.5.0 0.5.0 0.5.0
120 :doc:`hipCC <hipcc:index>` COMPILERS .. _compilers-support-compatibility-matrix-past-60: 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.0.0 1.0.0 1.0.0 1.0.0 1.0.0 1.0.0
121 `Flang <https://github.com/ROCm/flang>`_ `clang-ocl <https://github.com/ROCm/clang-ocl>`_ N/A 19.0.0.25224 N/A 19.0.0.25184 N/A 19.0.0.25133 N/A 18.0.0.25012 N/A 18.0.0.25012 N/A 18.0.0.24491 N/A 18.0.0.24455 N/A 18.0.0.24392 N/A 18.0.0.24355 N/A 18.0.0.24355 N/A 18.0.0.24232 N/A 17.0.0.24193 0.5.0 17.0.0.24193 0.5.0 17.0.0.24154 0.5.0 17.0.0.24103 0.5.0 17.0.0.24012 0.5.0 17.0.0.23483 0.5.0
122 :doc:`llvm-project <llvm-project:index>` :doc:`hipCC <hipcc:index>` 1.1.1 19.0.0.25224 1.1.1 19.0.0.25184 1.1.1 19.0.0.25133 1.1.1 18.0.0.25012 1.1.1 18.0.0.25012 1.1.1 18.0.0.24491 1.1.1 18.0.0.24491 1.1.1 18.0.0.24392 1.1.1 18.0.0.24355 1.1.1 18.0.0.24355 1.1.1 18.0.0.24232 1.1.1 17.0.0.24193 1.0.0 17.0.0.24193 1.0.0 17.0.0.24154 1.0.0 17.0.0.24103 1.0.0 17.0.0.24012 1.0.0 17.0.0.23483 1.0.0
123 `OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_ `Flang <https://github.com/ROCm/flang>`_ 19.0.0.25224 19.0.0.25224 19.0.0.25184 19.0.0.25133 18.0.0.25012 18.0.0.25012 18.0.0.24491 18.0.0.24491 18.0.0.24455 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
124 :doc:`llvm-project <llvm-project:index>` 19.0.0.25224 19.0.0.25224 19.0.0.25184 19.0.0.25133 18.0.0.25012 18.0.0.25012 18.0.0.24491 18.0.0.24491 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
125 RUNTIMES `OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_ 19.0.0.25224 .. _runtime-support-compatibility-matrix-past-60: 19.0.0.25224 19.0.0.25184 19.0.0.25133 18.0.0.25012 18.0.0.25012 18.0.0.24491 18.0.0.24491 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
126 :doc:`AMD CLR <hip:understand/amd_clr>` 6.4.43484 6.4.43483 6.4.43482 6.3.42134 6.3.42134 6.3.42133 6.3.42131 6.2.41134 6.2.41134 6.2.41134 6.2.41133 6.1.40093 6.1.40093 6.1.40092 6.1.40091 6.1.32831 6.1.32830
127 :doc:`HIP <hip:index>` RUNTIMES .. _runtime-support-compatibility-matrix-past-60: 6.4.43484 6.4.43483 6.4.43482 6.3.42134 6.3.42134 6.3.42133 6.3.42131 6.2.41134 6.2.41134 6.2.41134 6.2.41133 6.1.40093 6.1.40093 6.1.40092 6.1.40091 6.1.32831 6.1.32830
128 `OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_ :doc:`AMD CLR <hip:understand/amd_clr>` 6.4.43484 2.0.0 6.4.43484 2.0.0 6.4.43483 2.0.0 6.4.43482 2.0.0 6.3.42134 2.0.0 6.3.42134 2.0.0 6.3.42133 2.0.0 6.3.42131 2.0.0 6.2.41134 2.0.0 6.2.41134 2.0.0 6.2.41134 2.0.0 6.2.41133 2.0.0 6.1.40093 2.0.0 6.1.40093 2.0.0 6.1.40092 2.0.0 6.1.40091 2.0.0 6.1.32831 2.0.0 6.1.32830
129 :doc:`ROCr Runtime <rocr-runtime:index>` :doc:`HIP <hip:index>` 6.4.43484 1.15.0 6.4.43484 1.15.0 6.4.43483 1.15.0 6.4.43482 1.14.0 6.3.42134 1.14.0 6.3.42134 1.14.0 6.3.42133 1.14.0 6.3.42131 1.14.0 6.2.41134 1.14.0 6.2.41134 1.14.0 6.2.41134 1.13.0 6.2.41133 1.13.0 6.1.40093 1.13.0 6.1.40093 1.13.0 6.1.40092 1.13.0 6.1.40091 1.12.0 6.1.32831 1.12.0 6.1.32830
130 `OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_ 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0
131 :doc:`ROCr Runtime <rocr-runtime:index>` 1.15.0 1.15.0 1.15.0 1.15.0 1.14.0 1.14.0 1.14.0 1.14.0 1.14.0 1.14.0 1.14.0 1.13.0 1.13.0 1.13.0 1.13.0 1.13.0 1.12.0 1.12.0

View File

@@ -23,14 +23,14 @@ compatibility and system requirements.
.. container:: format-big-table
.. csv-table::
:header: "ROCm Version", "6.4.2", "6.4.1", "6.3.0"
:header: "ROCm Version", "6.4.3", "6.4.2", "6.3.0"
:stub-columns: 1
:ref:`Operating systems & kernels <OS-kernel-versions>`,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2
,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5
,"RHEL 9.6, 9.4","RHEL 9.6, 9.5, 9.4","RHEL 9.5, 9.4"
,"RHEL 9.6, 9.4","RHEL 9.6, 9.4","RHEL 9.5, 9.4"
,RHEL 8.10,RHEL 8.10,RHEL 8.10
,"SLES 15 SP7, SP6",SLES 15 SP6,"SLES 15 SP6, SP5"
,"SLES 15 SP7, SP6","SLES 15 SP7, SP6","SLES 15 SP6, SP5"
,"Oracle Linux 9, 8 [#mi300x]_","Oracle Linux 9, 8 [#mi300x]_",Oracle Linux 8.10 [#mi300x]_
,Debian 12 [#single-node]_,Debian 12 [#single-node]_,
,Azure Linux 3.0 [#mi300x]_,Azure Linux 3.0 [#mi300x]_,
@@ -44,7 +44,7 @@ compatibility and system requirements.
,.. _gpu-support-compatibility-matrix:,,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx1201 [#RDNA-OS]_,gfx1201 [#RDNA-OS]_,
,gfx1200 [#RDNA-OS]_,gfx1200 [#RDNA-OS]_,
,gfx1101 [#RDNA-OS]_ [#7700XT-OS]_,gfx1101 [#RDNA-OS]_,
,gfx1101 [#RDNA-OS]_ [#7700XT-OS]_,gfx1101 [#RDNA-OS]_ [#7700XT-OS]_,
,gfx1100,gfx1100,gfx1100
,gfx1030,gfx1030,gfx1030
,gfx942,gfx942,gfx942
@@ -54,9 +54,9 @@ compatibility and system requirements.
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix:,,
:doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.6, 2.5, 2.4, 2.3","2.6, 2.5, 2.4, 2.3","2.4, 2.3, 2.2, 2.1, 2.0, 1.13"
:doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.18.1, 2.17.1, 2.16.2","2.18.1, 2.17.1, 2.16.2","2.17.0, 2.16.2, 2.15.1"
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.4.35,0.4.35,0.4.31
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.4.35,0.4.35,0.4.31
:doc:`Stanford Megatron-LM <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`,N/A,N/A,85f95ae
:doc:`DGL <../compatibility/ml-compatibility/dgl-compatibility>`,2.4.0,2.4.0,N/A
:doc:`Megablocks <../compatibility/ml-compatibility/megablocks-compatibility>`,N/A,N/A,0.7.0
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.2,1.2,1.17.3
,,,
THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix:,,
@@ -83,7 +83,7 @@ compatibility and system requirements.
,,,
COMMUNICATION,.. _commlibs-support-compatibility-matrix:,,
:doc:`RCCL <rccl:index>`,2.22.3,2.22.3,2.21.5
:doc:`rocSHMEM <rocshmem:index>`,2.0.1,2.0.0,N/A
:doc:`rocSHMEM <rocshmem:index>`,2.0.1,2.0.1,N/A
,,,
MATH LIBS,.. _mathlibs-support-compatibility-matrix:,,
`half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0
@@ -96,10 +96,10 @@ compatibility and system requirements.
:doc:`hipSPARSE <hipsparse:index>`,3.2.0,3.2.0,3.1.2
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.3,0.2.3,0.2.2
:doc:`rocALUTION <rocalution:index>`,3.2.3,3.2.3,3.2.1
:doc:`rocBLAS <rocblas:index>`,4.4.1,4.4.0,4.3.0
:doc:`rocBLAS <rocblas:index>`,4.4.1,4.4.1,4.3.0
:doc:`rocFFT <rocfft:index>`,1.0.32,1.0.32,1.0.31
:doc:`rocRAND <rocrand:index>`,3.3.0,3.3.0,3.2.0
:doc:`rocSOLVER <rocsolver:index>`,3.28.2,3.28.0,3.27.0
:doc:`rocSOLVER <rocsolver:index>`,3.28.2,3.28.2,3.27.0
:doc:`rocSPARSE <rocsparse:index>`,3.4.0,3.4.0,3.3.0
:doc:`rocWMMA <rocwmma:index>`,1.7.0,1.7.0,1.6.0
:doc:`Tensile <tensile:src/index>`,4.43.0,4.43.0,4.42.0
@@ -107,28 +107,28 @@ compatibility and system requirements.
PRIMITIVES,.. _primitivelibs-support-compatibility-matrix:,,
:doc:`hipCUB <hipcub:index>`,3.4.0,3.4.0,3.3.0
:doc:`hipTensor <hiptensor:index>`,1.5.0,1.5.0,1.4.0
:doc:`rocPRIM <rocprim:index>`,3.4.1,3.4.0,3.3.0
:doc:`rocPRIM <rocprim:index>`,3.4.1,3.4.1,3.3.0
:doc:`rocThrust <rocthrust:index>`,3.3.0,3.3.0,3.3.0
,,,
SUPPORT LIBS,,,
`hipother <https://github.com/ROCm/hipother>`_,6.4.43483,6.4.43483,6.3.42131
`rocm-core <https://github.com/ROCm/rocm-core>`_,6.4.2,6.4.1,6.3.0
`rocm-core <https://github.com/ROCm/rocm-core>`_,6.4.3,6.4.2,6.3.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr]_,N/A [#ROCT-rocr]_,N/A [#ROCT-rocr]_
,,,
SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix:,,
:doc:`AMD SMI <amdsmi:index>`,25.5.1,25.4.2,24.7.1
:doc:`AMD SMI <amdsmi:index>`,25.5.1,25.5.1,24.7.1
:doc:`ROCm Data Center Tool <rdc:index>`,0.3.0,0.3.0,0.3.0
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.5.0,7.5.0,7.4.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.7.0,7.5.0,7.4.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.1.0,1.1.0,1.1.0
,,,
PERFORMANCE TOOLS,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.1.1,3.1.0,3.0.0
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.0.2,1.0.1,0.1.0
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60402,2.0.60401,2.0.60300
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.1.1,3.1.1,3.0.0
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,1.0.2,1.0.2,0.1.0
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60403,2.0.60402,2.0.60300
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.6.0,0.6.0,0.5.0
:doc:`ROCTracer <roctracer:index>`,4.1.60402,4.1.60401,4.1.60300
:doc:`ROCTracer <roctracer:index>`,4.1.60403,4.1.60402,4.1.60300
,,,
DEVELOPMENT TOOLS,,,
:doc:`HIPIFY <hipify:index>`,19.0.0,19.0.0,18.0.0.24455
@@ -141,16 +141,17 @@ compatibility and system requirements.
COMPILERS,.. _compilers-support-compatibility-matrix:,,
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,N/A
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.1.1
`Flang <https://github.com/ROCm/flang>`_,19.0.0.25224,19.0.0.25184,18.0.0.24455
:doc:`llvm-project <llvm-project:index>`,19.0.0.25224,19.0.0.25184,18.0.0.24491
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,19.0.0.25224,19.0.0.25184,18.0.0.24491
`Flang <https://github.com/ROCm/flang>`_,19.0.0.25224,19.0.0.25224,18.0.0.24455
:doc:`llvm-project <llvm-project:index>`,19.0.0.25224,19.0.0.25224,18.0.0.24491
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,19.0.0.25224,19.0.0.25224,18.0.0.24491
,,,
RUNTIMES,.. _runtime-support-compatibility-matrix:,,
:doc:`AMD CLR <hip:understand/amd_clr>`,6.4.43484,6.4.43483,6.3.42131
:doc:`HIP <hip:index>`,6.4.43484,6.4.43483,6.3.42131
:doc:`AMD CLR <hip:understand/amd_clr>`,6.4.43484,6.4.43484,6.3.42131
:doc:`HIP <hip:index>`,6.4.43484,6.4.43484,6.3.42131
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.15.0,1.15.0,1.14.0
.. rubric:: Footnotes
.. [#mi300x] Oracle Linux and Azure Linux are supported only on AMD Instinct MI300X.
@@ -241,6 +242,10 @@ Expand for full historical view of:
.. [#mi300_602-past-60] **For ROCm 6.0.2** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
.. [#mi300_600-past-60] **For ROCm 6.0.0** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
.. [#verl_compat] verl is only supported on ROCm 6.2.0.
.. [#stanford-megatron-lm_compat] Stanford Megatron-LM is only supported on ROCm 6.3.0.
.. [#dgl_compat] DGL is only supported on ROCm 6.4.0.
.. [#megablocks_compat] Megablocks is only supported on ROCm 6.3.0.
.. [#taichi_compat] Taichi is only supported on ROCm 6.3.2.
.. [#kfd_support-past-60] As of ROCm 6.4.0, forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software is provided up to a year apart. For earlier ROCm releases, the compatibility is provided for +/- 2 releases. The tested user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html>`_.
.. [#ROCT-rocr-past-60] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.

View File

@@ -42,16 +42,16 @@ GAT, GCN and GraphSage. Using these we can support a variety of use-cases such a
- 1D (Temporal) and 2D (Image) Classification
- Drug Discovery
Refer to :doc:`ROCm DGL blog posts <https://rocm.blogs.amd.com/blog/tag/dgl.html>`
for examples and best practices to optimize your training workflows on AMD GPUs.
Multiple use cases of DGL have been tested and verified.
However, a recommended example follows a drug discovery pipeline using the ``SE3Transformer``.
Refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`_,
where you can search for DGL examples and best practices to optimize your training workflows on AMD GPUs.
Coverage includes:
- Single-GPU training/inference
- Multi-GPU training
Benchmarking details are included in the :doc:`Benchmarks` section.
.. _dgl-docker-compat:
@@ -252,4 +252,4 @@ Unsupported functions
* ``gather_mm_idx_b``
* ``pgexplainer``
* ``sample_labors_prob``
* ``sample_labors_noprob``
* ``sample_labors_noprob``

View File

@@ -97,7 +97,7 @@ Docker image compatibility
AMD validates and publishes ready-made `ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax>`_
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories represent the latest JAX version from the official Docker Hub and are validated for
`ROCm 6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`_. Click the |docker-icon|
`ROCm 6.4.2 <https://repo.radeon.com/rocm/apt/6.4.2/>`_. Click the |docker-icon|
icon to view the image on Docker Hub.
.. list-table:: JAX Docker image components
@@ -110,7 +110,7 @@ icon to view the image on Docker Hub.
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax/rocm6.4.1-jax0.4.35-py3.12/images/sha256-7a0745a2a2758bdf86397750bac00e9086cbf67d170cfdbb08af73f7c7d18a6a"><i class="fab fa-docker fa-lg"></i> rocm/jax</a>
<a href="https://hub.docker.com/layers/rocm/jax/rocm6.4.2-jax0.4.35-py3.12/images/sha256-8918fa806a172c1a10eb2f57131eb31b5d7c8fa1656b8729fe7d3d736112de83"><i class="fab fa-docker fa-lg"></i> rocm/jax</a>
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 24.04
@@ -118,7 +118,7 @@ icon to view the image on Docker Hub.
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax/rocm6.4.1-jax0.4.35-py3.10/images/sha256-5f9e8d6e6e69fdc9a1a3f2ba3b1234c3f46c53b7468538c07fd18b00899da54f"><i class="fab fa-docker fa-lg"></i> rocm/jax</a>
<a href="https://hub.docker.com/layers/rocm/jax/rocm6.4.2-jax0.4.35-py3.10/images/sha256-a394be13c67b7fc602216abee51233afd4b6cb7adaa57ca97e688fba82f9ad79"><i class="fab fa-docker fa-lg"></i> rocm/jax</a>
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 22.04

View File

@@ -0,0 +1,93 @@
:orphan:
.. meta::
:description: Megablocks compatibility
:keywords: GPU, megablocks, compatibility
.. version-set:: rocm_version latest
********************************************************************************
Megablocks compatibility
********************************************************************************
Megablocks is a light-weight library for mixture-of-experts (MoE) training.
The core of the system is efficient "dropless-MoE" and standard MoE layers.
Megablocks is integrated with `https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`_,
where data and pipeline parallel training of MoEs is supported.
* ROCm support for Megablocks is hosted in the official `https://github.com/ROCm/megablocks <https://github.com/ROCm/megablocks>`_ repository.
* Due to independent compatibility considerations, this location differs from the `https://github.com/stanford-futuredata/Megatron-LM <https://github.com/stanford-futuredata/Megatron-LM>`_ upstream repository.
* Use the prebuilt :ref:`Docker image <megablocks-docker-compat>` with ROCm, PyTorch, and Megablocks preinstalled.
* See the :doc:`ROCm Megablocks installation guide <rocm-install-on-linux:install/3rd-party/megablocks-install>` to install and get started.
.. note::
Megablocks is supported on ROCm 6.3.0.
Supported devices
================================================================================
- **Officially Supported**: AMD Instinct MI300X
- **Partially Supported** (functionality or performance limitations): AMD Instinct MI250X, MI210X
Supported models and features
================================================================================
This section summarizes the Megablocks features supported by ROCm.
* Distributed Pre-training
* Activation Checkpointing and Recomputation
* Distributed Optimizer
* Mixture-of-Experts
* dropless-Mixture-of-Experts
.. _megablocks-recommendations:
Use cases and recommendations
================================================================================
The `ROCm Megablocks blog posts <https://rocm.blogs.amd.com/artificial-intelligence/megablocks/README.html>`_
guide how to leverage the ROCm platform for pre-training using the Megablocks framework.
It features how to pre-process datasets and how to begin pre-training on AMD GPUs through:
* Single-GPU pre-training
* Multi-GPU pre-training
.. _megablocks-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes `ROCm Megablocks images <https://hub.docker.com/r/rocm/megablocks/tags>`_
with ROCm and Pytorch backends on Docker Hub. The following Docker image tags and associated
inventories represent the latest Megatron-LM version from the official Docker Hub.
The Docker images have been validated for `ROCm 6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`_.
Click |docker-icon| to view the image on Docker Hub.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker image
- ROCm
- Megablocks
- PyTorch
- Ubuntu
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/megablocks/megablocks-0.7.0_rocm6.3.0_ubuntu24.04_py3.12_pytorch2.4.0/images/sha256-372ff89b96599019b8f5f9db469c84add2529b713456781fa62eb9a148659ab4"><i class="fab fa-docker fa-lg"></i> rocm/megablocks</a>
- `6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`_
- `0.7.0 <https://github.com/databricks/megablocks/releases/tag/v0.7.0>`_
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 24.04
- `3.12.9 <https://www.python.org/downloads/release/python-3129/>`_

View File

@@ -95,7 +95,7 @@ Docker image compatibility
AMD validates and publishes `PyTorch images <https://hub.docker.com/r/rocm/pytorch>`__
with ROCm backends on Docker Hub. The following Docker image tags and associated
inventories were tested on `ROCm 6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__.
inventories were tested on `ROCm 6.4.2 <https://repo.radeon.com/rocm/apt/6.4.2/>`__.
Click |docker-icon| to view the image on Docker Hub.
.. list-table:: PyTorch Docker image components
@@ -112,127 +112,118 @@ Click |docker-icon| to view the image on Docker Hub.
- MAGMA
- UCX
- OMPI
- OFED
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.1_ubuntu24.04_py3.12_pytorch_release_2.6.0/images/sha256-c76af9bfb1c25b0f40d4c29e8652105c57250bf018d23ff595b06bd79666fdd7"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu24.04_py3.12_pytorch_release_2.6.0/images/sha256-6a287591500b4048a9556c1ecc92bc411fd3d552f6c8233bc399f18eb803e8d6"><i class="fab fa-docker fa-lg"></i></a>
- `2.6.0 <https://github.com/ROCm/pytorch/tree/release/2.6>`__
- 24.04
- `3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- `3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `1.6.0 <https://github.com/ROCm/apex/tree/release/1.6.0>`__
- `0.21.0 <https://github.com/pytorch/vision/tree/v0.21.0>`__
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.16.0 <https://github.com/openucx/ucx/tree/v1.16.0>`__
- `1.16.0+ds-5ubuntu1 <https://github.com/openucx/ucx/tree/v1.16.0>`__
- `4.1.6-7ubuntu2 <https://github.com/open-mpi/ompi/tree/v4.1.6>`__
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.1_ubuntu22.04_py3.10_pytorch_release_2.6.0/images/sha256-f9d226135d51831c810dcb1251636ec61f85c65fcdda03e188c053a5d4f6585b"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu22.04_py3.10_pytorch_release_2.6.0/images/sha256-06b967629ba6657709f04169832cd769a11e6b491e8b1394c361d42d7a0c8b43"><i class="fab fa-docker fa-lg"></i></a>
- `2.6.0 <https://github.com/ROCm/pytorch/tree/release/2.6>`__
- 22.04
- `3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `1.6.0 <https://github.com/ROCm/apex/tree/release/1.6.0>`__
- `0.21.0 <https://github.com/pytorch/vision/tree/v0.21.0>`__
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.12.1~rc2-1 <https://github.com/openucx/ucx/tree/v1.12.1>`__
- `4.1.2-2ubuntu1 <https://github.com/open-mpi/ompi/tree/v4.1.2>`__
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.1_ubuntu24.04_py3.12_pytorch_release_2.5.1/images/sha256-3490e74d4f43dcdb3351dd334108d1ccd47e5a687c0523a2424ac1bcdd3dd6dd"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu24.04_py3.12_pytorch_release_2.5.1/images/sha256-62022414217ef6de33ac5b1341e57db8a48e8573fa2ace12d48aa5edd4b99ef0"><i class="fab fa-docker fa-lg"></i></a>
- `2.5.1 <https://github.com/ROCm/pytorch/tree/release/2.5>`__
- 24.04
- `3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- `3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `1.5.0 <https://github.com/ROCm/apex/tree/release/1.5.0>`__
- `0.20.1 <https://github.com/pytorch/vision/tree/v0.20.1>`__
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.16.0+ds-5ubuntu1 <https://github.com/openucx/ucx/tree/v1.10.0>`__
- `4.1.6-7ubuntu2 <https://github.com/open-mpi/ompi/tree/v4.1.6>`__
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.1_ubuntu22.04_py3.10_pytorch_release_2.5.1/images/sha256-26c5dfffb4a54625884abca83166940f17dd27bc75f1b24f6e80fbcb7d4e9afb"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu22.04_py3.11_pytorch_release_2.5.1/images/sha256-469a7f74fc149aff31797e011ee41978f6a190adc69fa423b3c6a718a77bd985"><i class="fab fa-docker fa-lg"></i></a>
- `2.5.1 <https://github.com/ROCm/pytorch/tree/release/2.5>`__
- 22.04
- `3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `3.11 <https://www.python.org/downloads/release/python-31113/>`__
- `1.5.0 <https://github.com/ROCm/apex/tree/release/1.5.0>`__
- `0.20.1 <https://github.com/pytorch/vision/tree/v0.20.1>`__
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.12.1~rc2-1 <https://github.com/openucx/ucx/tree/v1.12.1>`__
- `4.1.2-2ubuntu1 <https://github.com/open-mpi/ompi/tree/v4.1.2>`__
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.1_ubuntu24.04_py3.12_pytorch_release_2.4.1/images/sha256-f378a24561fa6efc178b6dc93fc7d82e5b93653ecd59c89d4476674d29e1284d"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu22.04_py3.10_pytorch_release_2.5.1/images/sha256-37f41a1cd94019688669a1b20d33ea74156e0c129ef6b8270076ef214a6a1a2c"><i class="fab fa-docker fa-lg"></i></a>
- `2.5.1 <https://github.com/ROCm/pytorch/tree/release/2.5>`__
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `1.5.0 <https://github.com/ROCm/apex/tree/release/1.5.0>`__
- `0.20.1 <https://github.com/pytorch/vision/tree/v0.20.1>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.12.1~rc2-1 <https://github.com/openucx/ucx/tree/v1.12.1>`__
- `4.1.2-2ubuntu1 <https://github.com/open-mpi/ompi/tree/v4.1.2>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu24.04_py3.12_pytorch_release_2.4.1/images/sha256-60824ba83dc1b9d94164925af1f81c0235c105dd555091ec04c57e05177ead1b"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`__
- 24.04
- `3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- `3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`__
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`__
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.16.0+ds-5ubuntu1 <https://github.com/openucx/ucx/tree/v1.16.0>`__
- `4.1.6-7ubuntu2 <https://github.com/open-mpi/ompi/tree/v4.1.6>`__
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.1_ubuntu22.04_py3.10_pytorch_release_2.4.1/images/sha256-2308dbd0e650b7bf8d548575cbb6e2bdc021f9386384ce570da16d58ee684d22"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu22.04_py3.10_pytorch_release_2.4.1/images/sha256-fe944fe083312f901be6891ab4d3ffebf2eaf2cf4f5f0f435ef0b76ec714fabd"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.1 <https://github.com/ROCm/pytorch/tree/release/2.4>`__
- 22.04
- `3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`__
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`__
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`__
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.12.1~rc2-1 <https://github.com/openucx/ucx/tree/v1.12.1>`__
- `4.1.2-2ubuntu1 <https://github.com/open-mpi/ompi/tree/v4.1.2>`__
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.1_ubuntu24.04_py3.12_pytorch_release_2.3.0/images/sha256-eefd2ab019728f91f94c5e6a9463cb0ea900b3011458d18fe5d88e50c0b57d86"><i class="fab fa-docker fa-lg"></i></a>
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.2_ubuntu24.04_py3.12_pytorch_release_2.3.0/images/sha256-1d59251c47170c5b8960d1172a4dbe52f5793d8966edd778f168eaf32d56661a"><i class="fab fa-docker fa-lg"></i></a>
- `2.3.0 <https://github.com/ROCm/pytorch/tree/release/2.3>`__
- 24.04
- `3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- `3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `1.3.0 <https://github.com/ROCm/apex/tree/release/1.3.0>`__
- `0.18.0 <https://github.com/pytorch/vision/tree/v0.18.0>`__
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.16.0+ds-5ubuntu1 <https://github.com/openucx/ucx/tree/v1.16.0>`__
- `4.1.6-7ubuntu2 <https://github.com/open-mpi/ompi/tree/v4.1.6>`__
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.4.1_ubuntu22.04_py3.10_pytorch_release_2.3.0/images/sha256-473643226ab0e93a04720b256ed772619878abf9c42b9f84828cefed522696fd"><i class="fab fa-docker fa-lg"></i></a>
- `2.3.0 <https://github.com/ROCm/pytorch/tree/release/2.3>`__
- 22.04
- `3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `1.3.0 <https://github.com/ROCm/apex/tree/release/1.3.0>`__
- `0.18.0 <https://github.com/pytorch/vision/tree/v0.18.0>`__
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13>`__
- `master <https://bitbucket.org/icl/magma/src/master/>`__
- `1.12.1~rc2-1 <https://github.com/openucx/ucx/tree/v1.12.1>`__
- `4.1.2-2ubuntu1 <https://github.com/open-mpi/ompi/tree/v4.1.2>`__
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`__
Key ROCm libraries for PyTorch
================================================================================

View File

@@ -0,0 +1,76 @@
:orphan:
.. meta::
:description: Taichi compatibility
:keywords: GPU, Taichi compatibility
.. version-set:: rocm_version latest
*******************************************************************************
Taichi compatibility
*******************************************************************************
`Taichi <https://www.taichi-lang.org/>`_ is an open-source, imperative, and parallel
programming language designed for high-performance numerical computation.
Embedded in Python, it leverages just-in-time (JIT) compilation frameworks such as LLVM to accelerate
compute-intensive Python code by compiling it to native GPU or CPU instructions.
Taichi is widely used across various domains, including real-time physical simulation,
numerical computing, augmented reality, artificial intelligence, computer vision, robotics,
visual effects in film and gaming, and general-purpose computing.
* ROCm support for Taichi is hosted in the official `https://github.com/ROCm/taichi <https://github.com/ROCm/taichi>`_ repository.
* Due to independent compatibility considerations, this location differs from the `https://github.com/taichi-dev <https://github.com/taichi-dev>`_ upstream repository.
* Use the prebuilt :ref:`Docker image <taichi-docker-compat>` with ROCm, PyTorch, and Taichi preinstalled.
* See the :doc:`ROCm Taichi installation guide <rocm-install-on-linux:install/3rd-party/taichi-install>` to install and get started.
.. note::
Taichi is supported on ROCm 6.3.2.
Supported devices and features
===============================================================================
There is support through the ROCm software stack for all Taichi GPU features on AMD Instinct MI250X and MI210X series GPUs with the exception of Taichis GPU rendering system, CGUI.
AMD Instinct MI300X series GPUs will be supported by November.
.. _taichi-recommendations:
Use cases and recommendations
================================================================================
To fully leverage Taichi's performance capabilities in compute-intensive tasks, it is best to adhere to specific coding patterns and utilize Taichi decorators.
A collection of example use cases is available in the `https://github.com/ROCm/taichi_examples <https://github.com/ROCm/taichi_examples>`_ repository,
providing practical insights and foundational knowledge for working with the Taichi programming language.
You can also refer to the `AMD ROCm blog <https://rocm.blogs.amd.com/>`_ to search for Taichi examples and best practices to optimize your workflows on AMD GPUs.
.. _taichi-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `ROCm Taichi Docker images <https://hub.docker.com/r/rocm/taichi/tags>`_
with ROCm backends on Docker Hub. The following Docker image tags and associated inventories
represent the latest Taichi version from the official Docker Hub.
The Docker images have been validated for `ROCm 6.3.2 <https://rocm.docs.amd.com/en/docs-6.3.2/about/release-notes.html>`_.
Click |docker-icon| to view the image on Docker Hub.
.. list-table::
:header-rows: 1
:class: docker-image-compatibility
* - Docker image
- ROCm
- Taichi
- Ubuntu
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/taichi/taichi-1.8.0b1_rocm6.3.2_ubuntu22.04_py3.10.12/images/sha256-e016964a751e6a92199032d23e70fa3a564fff8555afe85cd718f8aa63f11fc6"><i class="fab fa-docker fa-lg"></i> rocm/taichi</a>
- `6.3.2 <https://repo.radeon.com/rocm/apt/6.3.2/>`_
- `1.8.0b1 <https://github.com/taichi-dev/taichi>`_
- 22.04
- `3.10.12 <https://www.python.org/downloads/release/python-31012/>`_

View File

@@ -56,7 +56,7 @@ Docker image compatibility
AMD validates and publishes ready-made `TensorFlow images
<https://hub.docker.com/r/rocm/tensorflow>`__ with ROCm backends on
Docker Hub. The following Docker image tags and associated inventories are
validated for `ROCm 6.4.1 <https://repo.radeon.com/rocm/apt/6.4.1/>`__. Click
validated for `ROCm 6.4.2 <https://repo.radeon.com/rocm/apt/6.4.2/>`__. Click
the |docker-icon| icon to view the image on Docker Hub.
.. list-table:: TensorFlow Docker image components
@@ -65,128 +65,61 @@ the |docker-icon| icon to view the image on Docker Hub.
* - Docker image
- TensorFlow
- Ubuntu
- Dev
- Python
- TensorBoard
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4-py3.12-tf2.18-dev/images/sha256-fa9cf5fa6c6079a7118727531ccd0056c6e3224a42c3d6e78a49e7781daafff4"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.12-tf2.18-dev/images/sha256-96754ce2d30f729e19b497279915b5212ba33d5e408e7e5dd3f2304d87e3441e"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.18.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/tensorflow_rocm-2.18.1-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- dev
- `tensorflow-rocm 2.18.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.18.1-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- 24.04
- `Python 3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- `Python 3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `TensorBoard 2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.12-tf2.18-runtime/images/sha256-d14d8c4989e7c9a60f4e72461b9e349de72347c6162dcd6897e6f4f80ffbb440"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.10-tf2.18-dev/images/sha256-fa741508d383858e86985a9efac85174529127408102558ae2e3a4ac894eea1e"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.18.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/tensorflow_rocm-2.18.1-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- runtime
- 24.04
- `Python 3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- `tensorflow-rocm 2.18.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.18.1-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- 22.04
- `Python 3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.10-tf2.18-dev/images/sha256-081e5bd6615a5dc17247ebd2ccc26895c3feeff086720400fa39b477e60a77c0"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.12-tf2.17-dev/images/sha256-3a0aef09f2a8833c2b64b85874dd9449ffc2ad257351857338ff5b706c03a418"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.18.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/tensorflow_rocm-2.18.1-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- dev
- 22.04
- `Python 3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.10-tf2.18-runtime/images/sha256-bf369637378264f4af6ddad5ca8b8611d3e372ffbea9ab7a06f1e122f0a0867b"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.18.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/tensorflow_rocm-2.18.1-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- runtime
- 22.04
- `Python 3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18.0>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.12-tf2.17-dev/images/sha256-5a502008c50d0b6508e6027f911bdff070a7493700ae064bed74e1d22b91ed50"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.17.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4/tensorflow_rocm-2.17.1-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- dev
- `tensorflow-rocm 2.17.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.17.1-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- 24.04
- `Python 3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- `Python 3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `TensorBoard 2.17.1 <https://github.com/tensorflow/tensorboard/tree/2.17.1>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.12-tf2.17-runtime/images/sha256-1ee5dfffceb71ac66617ada33de3a10de0cb74199cc4b82441192e5e92fa2ddf"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.10-tf2.17-dev/images/sha256-bc7341a41ebe7ab261aa100732874507c452421ef733e408ac4f05ed453b0bc5"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.18.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4/tensorflow_rocm-2.17.1-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- runtime
- 24.04
- `Python 3.12.10 <https://www.python.org/downloads/release/python-3124/>`__
- `TensorBoard 2.17.1 <https://github.com/tensorflow/tensorboard/tree/2.17.1>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.10-tf2.17-dev/images/sha256-109218ad92bfae83bbd2710475f7502166e1ed54ca0b9748a9cbc3f5a1d75af1"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.17.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/tensorflow_rocm-2.17.1-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- dev
- `tensorflow-rocm 2.17.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.17.1-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- 22.04
- `Python 3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `Python 3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.17.1 <https://github.com/tensorflow/tensorboard/tree/2.17.1>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.10-tf2.17-runtime/images/sha256-5d78bd5918d394f92263daa2990e88d695d27200dd90ed83ec64d20c7661c9c1"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.12-tf2.16-dev/images/sha256-4841a8df7c340dab79bf9362dad687797649a00d594e0832eb83ea6880a40d3b"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.17.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/tensorflow_rocm-2.17.1-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- runtime
- 22.04
- `Python 3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.17.1 <https://github.com/tensorflow/tensorboard/tree/2.17.1>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.12-tf2.16-dev/images/sha256-b09b1ad921c09c687b7c916141051e9fcf15539a5686e5aa67c689195a522719"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.16.2 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/tensorflow_rocm-2.16.2-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- dev
- `tensorflow-rocm 2.16.2 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.16.2-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- 24.04
- `Python 3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- `Python 3.12 <https://www.python.org/downloads/release/python-31210/>`__
- `TensorBoard 2.16.2 <https://github.com/tensorflow/tensorboard/tree/2.16.2>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.12-tf2.16-runtime/images/sha256-20dbd824e85558abfe33fc9283cc547d88cde3c623fe95322743a5082f883a64"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.2-py3.10-tf2.16-dev/images/sha256-883fa95aba960c58a3e46fceaa18f03ede2c7df89b8e9fd603ab2d47e0852897"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.16.2 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/tensorflow_rocm-2.16.2-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- runtime
- 24.04
- `Python 3.12.10 <https://www.python.org/downloads/release/python-31210/>`__
- `TensorBoard 2.16.2 <https://github.com/tensorflow/tensorboard/tree/2.16.2>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.10-tf2.16-dev/images/sha256-36c4fa047c86e2470ac473ec1429aea6d4b8934b90ffeb34d1afab40e7e5b377"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.16.2 <https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.10-tf2.16-dev/images/sha256-36c4fa047c86e2470ac473ec1429aea6d4b8934b90ffeb34d1afab40e7e5b377>`__
- dev
- `tensorflow-rocm 2.16.2 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.2/tensorflow_rocm-2.16.2-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- 22.04
- `Python 3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.16.2 <https://github.com/tensorflow/tensorboard/tree/2.16.2>`__
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.4.1-py3.10-tf2.16-runtime/images/sha256-a94150ffb81365234ebfa34e764db5474bc6ab7d141b56495eac349778dafcf3"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.16.2 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.4.1/tensorflow_rocm-2.16.2-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- runtime
- 22.04
- `Python 3.10.17 <https://www.python.org/downloads/release/python-31017/>`__
- `Python 3.10 <https://www.python.org/downloads/release/python-31017/>`__
- `TensorBoard 2.16.2 <https://github.com/tensorflow/tensorboard/tree/2.16.2>`__

View File

@@ -16,56 +16,25 @@ verl offers a scalable, open-source fine-tuning solution optimized for AMD Insti
* See the `verl documentation <https://verl.readthedocs.io/en/latest/>`_ for more information about verl.
* The official verl GitHub repository is `https://github.com/volcengine/verl <https://github.com/volcengine/verl>`_.
* Use the AMD-validated :ref:`Docker images <verl-docker-compat>` with ROCm and verl preinstalled.
* See the :doc:`ROCm verl installation guide <rocm-install-on-linux:install/3rd-party/verl-install>` to get started.
* See the :doc:`ROCm verl installation guide <rocm-install-on-linux:install/3rd-party/verl-install>` to install and get started.
.. note::
verl is supported on ROCm 6.2.0.
.. _verl-recommendations:
Use cases and recommendations
================================================================================
The benefits of verl in large-scale reinforcement leaning from human feedback (RLHF) are discussed in the `Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration <https://rocm.blogs.amd.com/artificial-intelligence/verl-large-scale/README.html>`_ blog.
.. _verl-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `ROCm verl Docker images <https://hub.docker.com/r/rocm/verl>`_
with ROCm backends on Docker Hub. The following Docker image tags and associated inventories represent the latest verl version from the official Docker Hub. The Docker images have been validated for `ROCm 6.2.0 <https://repo.radeon.com/rocm/apt/6.2/>`_.
.. list-table::
:header-rows: 1
* - Docker image
- verl
- Linux
- Pytorch
- Python
- vllm
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/verl/verl-0.3.0.post0_rocm6.2_vllm0.6.3/images/sha256-cbe423803fd7850448b22444176bee06f4dcf22cd3c94c27732752d3a39b04b2"><i class="fab fa-docker fa-lg"></i> rocm/verl</a>
- `0.3.0post0 <https://github.com/volcengine/verl/releases/tag/v0.3.0.post0>`_
- Ubuntu 20.04
- `2.5.0 <https://download.pytorch.org/whl/cu118/torch-2.5.0%2Bcu118-cp39-cp39-linux_x86_64.whl#sha256=1ee24b267418c37b297529ede875b961e382c1c365482f4142af2398b92ed127>`_
- `3.9.19 <https://www.python.org/downloads/release/python-3919/>`_
- `0.6.4 <https://github.com/vllm-project/vllm/releases/tag/v0.6.4>`_
The benefits of verl in large-scale reinforcement learning from human feedback (RLHF) are discussed in the `Reinforcement Learning from Human Feedback on AMD GPUs with verl and ROCm Integration <https://rocm.blogs.amd.com/artificial-intelligence/verl-large-scale/README.html>`_ blog.
.. _verl-supported_features:
Supported features
===============================================================================
The following table shows verl and ROCm support for GPU-accelerated modules.
The following table shows verl on ROCm support for GPU-accelerated modules.
.. list-table::
:header-rows: 1
@@ -77,9 +46,41 @@ The following table shows verl and ROCm support for GPU-accelerated modules.
* - ``FSDP``
- Training engine
- 0.3.0.post0
- 6.2
- 6.2.0
* - ``vllm``
- Inference engine
- 0.3.0.post0
- 6.2
- 6.2.0
.. _verl-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `ROCm verl Docker images <https://hub.docker.com/r/rocm/verl/tags>`_
with ROCm backends on Docker Hub. The following Docker image tags and associated inventories represent the available verl versions from the official Docker Hub.
.. list-table::
:header-rows: 1
* - Docker image
- ROCm
- verl
- Ubuntu
- Pytorch
- Python
- vllm
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/verl/verl-0.3.0.post0_rocm6.2_vllm0.6.3/images/sha256-cbe423803fd7850448b22444176bee06f4dcf22cd3c94c27732752d3a39b04b2"><i class="fab fa-docker fa-lg"></i> rocm/verl</a>
- `6.2.0 <https://repo.radeon.com/rocm/apt/6.2/>`_
- `0.3.0post0 <https://github.com/volcengine/verl/releases/tag/v0.3.0.post0>`_
- 20.04
- `2.5.0 <https://github.com/ROCm/pytorch/tree/release/2.5>`_
- `3.9.19 <https://www.python.org/downloads/release/python-3919/>`_
- `0.6.3 <https://github.com/vllm-project/vllm/releases/tag/v0.6.3>`_

View File

@@ -82,20 +82,25 @@ project = "ROCm Documentation"
project_path = os.path.abspath(".").replace("\\", "/")
author = "Advanced Micro Devices, Inc."
copyright = "Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved."
version = "6.4.2"
release = "6.4.2"
version = "6.4.3"
release = "6.4.3"
setting_all_article_info = True
all_article_info_os = ["linux", "windows"]
all_article_info_author = ""
# pages with specific settings
article_pages = [
{"file": "about/release-notes", "os": ["linux"], "date": "2025-07-21"},
{"file": "about/release-notes", "os": ["linux"], "date": "2025-08-07"},
{"file": "release/changelog", "os": ["linux"],},
{"file": "compatibility/compatibility-matrix", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/pytorch-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/tensorflow-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/jax-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/verl-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/stanford-megatron-lm-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/dgl-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/megablocks-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/taichi-compatibility", "os": ["linux"]},
{"file": "how-to/deep-learning-rocm", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/index", "os": ["linux"]},
@@ -142,6 +147,8 @@ article_pages = [
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.8.5-20250521", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.9.0.1-20250605", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.9.0.1-20250702", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.9.1-20250702", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/previous-versions/vllm-0.9.1-20250715", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/benchmark-docker/pytorch-inference", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/deploy-your-model", "os": ["linux"]},

View File

@@ -0,0 +1,163 @@
vllm_benchmark:
unified_docker:
latest:
# TODO: update me
pull_tag: rocm/vllm:rocm6.4.1_vllm_0.9.1_20250715
docker_hub_url: https://hub.docker.com/layers/rocm/vllm/rocm6.4.1_vllm_0.9.1_20250715/images/sha256-4a429705fa95a58f6d20aceab43b1b76fa769d57f32d5d28bd3f4e030e2a78ea
rocm_version: 6.4.1
vllm_version: 0.9.1 (0.9.2.dev364+gb432b7a28.rocm641)
pytorch_version: 2.7.0+gitf717b2a
hipblaslt_version: 0.15
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.1 8B
mad_tag: pyt_vllm_llama-3.1-8b
model_repo: meta-llama/Llama-3.1-8B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-8B
precision: float16
- model: Llama 3.1 70B
mad_tag: pyt_vllm_llama-3.1-70b
model_repo: meta-llama/Llama-3.1-70B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
precision: float16
- model: Llama 3.1 405B
mad_tag: pyt_vllm_llama-3.1-405b
model_repo: meta-llama/Llama-3.1-405B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct
precision: float16
- model: Llama 2 7B
mad_tag: pyt_vllm_llama-2-7b
model_repo: meta-llama/Llama-2-7b-chat-hf
url: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
precision: float16
- model: Llama 2 70B
mad_tag: pyt_vllm_llama-2-70b
model_repo: meta-llama/Llama-2-70b-chat-hf
url: https://huggingface.co/meta-llama/Llama-2-70b-chat-hf
precision: float16
- model: Llama 3.1 8B FP8
mad_tag: pyt_vllm_llama-3.1-8b_fp8
model_repo: amd/Llama-3.1-8B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-8B-Instruct-FP8-KV
precision: float8
- model: Llama 3.1 70B FP8
mad_tag: pyt_vllm_llama-3.1-70b_fp8
model_repo: amd/Llama-3.1-70B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-70B-Instruct-FP8-KV
precision: float8
- model: Llama 3.1 405B FP8
mad_tag: pyt_vllm_llama-3.1-405b_fp8
model_repo: amd/Llama-3.1-405B-Instruct-FP8-KV
url: https://huggingface.co/amd/Llama-3.1-405B-Instruct-FP8-KV
precision: float8
- group: Mistral AI
tag: mistral
models:
- model: Mixtral MoE 8x7B
mad_tag: pyt_vllm_mixtral-8x7b
model_repo: mistralai/Mixtral-8x7B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
precision: float16
- model: Mixtral MoE 8x22B
mad_tag: pyt_vllm_mixtral-8x22b
model_repo: mistralai/Mixtral-8x22B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
precision: float16
- model: Mistral 7B
mad_tag: pyt_vllm_mistral-7b
model_repo: mistralai/Mistral-7B-Instruct-v0.3
url: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3
precision: float16
- model: Mixtral MoE 8x7B FP8
mad_tag: pyt_vllm_mixtral-8x7b_fp8
model_repo: amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
precision: float8
- model: Mixtral MoE 8x22B FP8
mad_tag: pyt_vllm_mixtral-8x22b_fp8
model_repo: amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
precision: float8
- model: Mistral 7B FP8
mad_tag: pyt_vllm_mistral-7b_fp8
model_repo: amd/Mistral-7B-v0.1-FP8-KV
url: https://huggingface.co/amd/Mistral-7B-v0.1-FP8-KV
precision: float8
- group: Qwen
tag: qwen
models:
- model: Qwen2 7B
mad_tag: pyt_vllm_qwen2-7b
model_repo: Qwen/Qwen2-7B-Instruct
url: https://huggingface.co/Qwen/Qwen2-7B-Instruct
precision: float16
- model: Qwen2 72B
mad_tag: pyt_vllm_qwen2-72b
model_repo: Qwen/Qwen2-72B-Instruct
url: https://huggingface.co/Qwen/Qwen2-72B-Instruct
precision: float16
- model: QwQ-32B
mad_tag: pyt_vllm_qwq-32b
model_repo: Qwen/QwQ-32B
url: https://huggingface.co/Qwen/QwQ-32B
precision: float16
tunableop: true
- group: Databricks DBRX
tag: dbrx
models:
- model: DBRX Instruct
mad_tag: pyt_vllm_dbrx-instruct
model_repo: databricks/dbrx-instruct
url: https://huggingface.co/databricks/dbrx-instruct
precision: float16
- model: DBRX Instruct FP8
mad_tag: pyt_vllm_dbrx_fp8
model_repo: amd/dbrx-instruct-FP8-KV
url: https://huggingface.co/amd/dbrx-instruct-FP8-KV
precision: float8
- group: Google Gemma
tag: gemma
models:
- model: Gemma 2 27B
mad_tag: pyt_vllm_gemma-2-27b
model_repo: google/gemma-2-27b
url: https://huggingface.co/google/gemma-2-27b
precision: float16
- group: Cohere
tag: cohere
models:
- model: C4AI Command R+ 08-2024
mad_tag: pyt_vllm_c4ai-command-r-plus-08-2024
model_repo: CohereForAI/c4ai-command-r-plus-08-2024
url: https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024
precision: float16
- model: C4AI Command R+ 08-2024 FP8
mad_tag: pyt_vllm_command-r-plus_fp8
model_repo: amd/c4ai-command-r-plus-FP8-KV
url: https://huggingface.co/amd/c4ai-command-r-plus-FP8-KV
precision: float8
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek MoE 16B
mad_tag: pyt_vllm_deepseek-moe-16b-chat
model_repo: deepseek-ai/deepseek-moe-16b-chat
url: https://huggingface.co/deepseek-ai/deepseek-moe-16b-chat
precision: float16
- group: Microsoft Phi
tag: phi
models:
- model: Phi-4
mad_tag: pyt_vllm_phi-4
model_repo: microsoft/phi-4
url: https://huggingface.co/microsoft/phi-4
- group: TII Falcon
tag: falcon
models:
- model: Falcon 180B
mad_tag: pyt_vllm_falcon-180b
model_repo: tiiuae/falcon-180B
url: https://huggingface.co/tiiuae/falcon-180B
precision: float16

View File

@@ -1,6 +1,6 @@
pytorch_inference_benchmark:
unified_docker:
latest: &rocm-pytorch-docker-latest
latest:
pull_tag: rocm/pytorch:latest
docker_hub_url:
rocm_version:
@@ -39,3 +39,19 @@ pytorch_inference_benchmark:
model_repo: Wan-AI/Wan2.1-T2V-14B
url: https://huggingface.co/Wan-AI/Wan2.1-T2V-14B
precision: bfloat16
- group: Janus Pro
tag: janus-pro
models:
- model: Janus Pro 7B
mad_tag: pyt_janus_pro_inference
model_repo: deepseek-ai/Janus-Pro-7B
url: https://huggingface.co/deepseek-ai/Janus-Pro-7B
precision: bfloat16
- group: Hunyuan Video
tag: hunyuan
models:
- model: Hunyuan Video
mad_tag: pyt_hy_video
model_repo: tencent/HunyuanVideo
url: https://huggingface.co/tencent/HunyuanVideo
precision: float16

View File

@@ -0,0 +1,17 @@
sglang_benchmark:
unified_docker:
latest:
pull_tag: lmsysorg/sglang:v0.4.5-rocm630
docker_hub_url: https://hub.docker.com/layers/lmsysorg/sglang/v0.4.5-rocm630/images/sha256-63d2cb760a237125daf6612464cfe2f395c0784e21e8b0ea37d551cd10d3c951
rocm_version: 6.3.0
sglang_version: 0.4.5 (0.4.5-rocm)
pytorch_version: 2.6.0a0+git8d4926e
model_groups:
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-R1-Distill-Qwen-32B
mad_tag: pyt_sglang_deepseek-r1-distill-qwen-32b
model_repo: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
url: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
precision: bfloat16

View File

@@ -2,11 +2,11 @@ vllm_benchmark:
unified_docker:
latest:
# TODO: update me
pull_tag: rocm/vllm:rocm6.4.1_vllm_0.9.1_20250715
docker_hub_url: https://hub.docker.com/layers/rocm/vllm/rocm6.4.1_vllm_0.9.1_20250715/images/sha256-4a429705fa95a58f6d20aceab43b1b76fa769d57f32d5d28bd3f4e030e2a78ea
pull_tag: rocm/vllm:rocm6.4.1_vllm_0.10.0_20250812
docker_hub_url: https://hub.docker.com/layers/rocm/vllm/rocm6.4.1_vllm_0.10.0_20250812/images/sha256-4c277ad39af3a8c9feac9b30bf78d439c74d9b4728e788a419d3f1d0c30cacaa
rocm_version: 6.4.1
vllm_version: 0.9.1 (0.9.2.dev364+gb432b7a28.rocm641)
pytorch_version: 2.7.0+gitf717b2a
vllm_version: 0.10.0 (0.10.1.dev395+g340ea86df.rocm641)
pytorch_version: 2.7.0+gitf717b2a (2.7.0+gitf717b2a)
hipblaslt_version: 0.15
model_groups:
- group: Meta Llama
@@ -27,11 +27,6 @@ vllm_benchmark:
model_repo: meta-llama/Llama-3.1-405B-Instruct
url: https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct
precision: float16
- model: Llama 2 7B
mad_tag: pyt_vllm_llama-2-7b
model_repo: meta-llama/Llama-2-7b-chat-hf
url: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
precision: float16
- model: Llama 2 70B
mad_tag: pyt_vllm_llama-2-70b
model_repo: meta-llama/Llama-2-70b-chat-hf
@@ -65,11 +60,6 @@ vllm_benchmark:
model_repo: mistralai/Mixtral-8x22B-Instruct-v0.1
url: https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
precision: float16
- model: Mistral 7B
mad_tag: pyt_vllm_mistral-7b
model_repo: mistralai/Mistral-7B-Instruct-v0.3
url: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3
precision: float16
- model: Mixtral MoE 8x7B FP8
mad_tag: pyt_vllm_mixtral-8x7b_fp8
model_repo: amd/Mixtral-8x7B-Instruct-v0.1-FP8-KV
@@ -80,72 +70,15 @@ vllm_benchmark:
model_repo: amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
url: https://huggingface.co/amd/Mixtral-8x22B-Instruct-v0.1-FP8-KV
precision: float8
- model: Mistral 7B FP8
mad_tag: pyt_vllm_mistral-7b_fp8
model_repo: amd/Mistral-7B-v0.1-FP8-KV
url: https://huggingface.co/amd/Mistral-7B-v0.1-FP8-KV
precision: float8
- group: Qwen
tag: qwen
models:
- model: Qwen2 7B
mad_tag: pyt_vllm_qwen2-7b
model_repo: Qwen/Qwen2-7B-Instruct
url: https://huggingface.co/Qwen/Qwen2-7B-Instruct
precision: float16
- model: Qwen2 72B
mad_tag: pyt_vllm_qwen2-72b
model_repo: Qwen/Qwen2-72B-Instruct
url: https://huggingface.co/Qwen/Qwen2-72B-Instruct
precision: float16
- model: QwQ-32B
mad_tag: pyt_vllm_qwq-32b
model_repo: Qwen/QwQ-32B
url: https://huggingface.co/Qwen/QwQ-32B
precision: float16
tunableop: true
- group: Databricks DBRX
tag: dbrx
models:
- model: DBRX Instruct
mad_tag: pyt_vllm_dbrx-instruct
model_repo: databricks/dbrx-instruct
url: https://huggingface.co/databricks/dbrx-instruct
precision: float16
- model: DBRX Instruct FP8
mad_tag: pyt_vllm_dbrx_fp8
model_repo: amd/dbrx-instruct-FP8-KV
url: https://huggingface.co/amd/dbrx-instruct-FP8-KV
precision: float8
- group: Google Gemma
tag: gemma
models:
- model: Gemma 2 27B
mad_tag: pyt_vllm_gemma-2-27b
model_repo: google/gemma-2-27b
url: https://huggingface.co/google/gemma-2-27b
precision: float16
- group: Cohere
tag: cohere
models:
- model: C4AI Command R+ 08-2024
mad_tag: pyt_vllm_c4ai-command-r-plus-08-2024
model_repo: CohereForAI/c4ai-command-r-plus-08-2024
url: https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024
precision: float16
- model: C4AI Command R+ 08-2024 FP8
mad_tag: pyt_vllm_command-r-plus_fp8
model_repo: amd/c4ai-command-r-plus-FP8-KV
url: https://huggingface.co/amd/c4ai-command-r-plus-FP8-KV
precision: float8
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek MoE 16B
mad_tag: pyt_vllm_deepseek-moe-16b-chat
model_repo: deepseek-ai/deepseek-moe-16b-chat
url: https://huggingface.co/deepseek-ai/deepseek-moe-16b-chat
precision: float16
- group: Microsoft Phi
tag: phi
models:
@@ -153,11 +86,3 @@ vllm_benchmark:
mad_tag: pyt_vllm_phi-4
model_repo: microsoft/phi-4
url: https://huggingface.co/microsoft/phi-4
- group: TII Falcon
tag: falcon
models:
- model: Falcon 180B
mad_tag: pyt_vllm_falcon-180b
model_repo: tiiuae/falcon-180B
url: https://huggingface.co/tiiuae/falcon-180B
precision: float16

View File

@@ -1,26 +1,15 @@
dockers:
- pull_tag: rocm/megatron-lm:v25.6_py312
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.6_py312/images/sha256-482ff906532285bceabdf2bda629bd32cb6174d2d07f4243a736378001b28df0
- pull_tag: rocm/megatron-lm:v25.7_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.7_py310/images/sha256-6189df849feeeee3ae31bb1e97aef5006d69d2b90c134e97708c19632e20ab5a
components:
ROCm: 6.4.1
PyTorch: 2.8.0a0+git7d205b2
Python: 3.12
Transformer Engine: 2.1.0.dev0+8c4a512
hipBLASLt: 393e413
Triton: 3.3.0
RCCL: 2.23.4.7a84c5d
doc_name: Ubuntu 24.04 + Python 3.12
- pull_tag: rocm/megatron-lm:v25.6_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.6_py310/images/sha256-9627bd9378684fe26cb1a10c7dd817868f553b33402e49b058355b0f095568d6
components:
ROCm: 6.4.1
PyTorch: 2.8.0a0+git7d205b2
ROCm: 6.4.2
Primus: v0.1.0-rc1
PyTorch: 2.8.0a0+gitd06a406
Python: "3.10"
Transformer Engine: 2.1.0.dev0+8c4a512
hipBLASLt: 393e413
Transformer Engine: 2.1.0.dev0+ba586519
hipBLASLt: 37ba1d36
Triton: 3.3.0
RCCL: 2.23.4.7a84c5d
doc_name: Ubuntu 22.04 + Python 3.10
RCCL: 2.22.3
model_groups:
- group: Meta Llama
tag: llama

View File

@@ -0,0 +1,60 @@
dockers:
- pull_tag: rocm/megatron-lm:v25.6_py312
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.6_py312/images/sha256-482ff906532285bceabdf2bda629bd32cb6174d2d07f4243a736378001b28df0
components:
ROCm: 6.4.1
PyTorch: 2.8.0a0+git7d205b2
Python: 3.12
Transformer Engine: 2.1.0.dev0+8c4a512
hipBLASLt: 393e413
Triton: 3.3.0
RCCL: 2.23.4.7a84c5d
doc_name: Ubuntu 24.04 + Python 3.12
- pull_tag: rocm/megatron-lm:v25.6_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.6_py310/images/sha256-9627bd9378684fe26cb1a10c7dd817868f553b33402e49b058355b0f095568d6
components:
ROCm: 6.4.1
PyTorch: 2.8.0a0+git7d205b2
Python: "3.10"
Transformer Engine: 2.1.0.dev0+8c4a512
hipBLASLt: 393e413
Triton: 3.3.0
RCCL: 2.23.4.7a84c5d
doc_name: Ubuntu 22.04 + Python 3.10
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: pyt_megatron_lm_train_llama-3.3-70b
- model: Llama 3.1 8B
mad_tag: pyt_megatron_lm_train_llama-3.1-8b
- model: Llama 3.1 70B
mad_tag: pyt_megatron_lm_train_llama-3.1-70b
- model: Llama 3.1 70B (proxy)
mad_tag: pyt_megatron_lm_train_llama-3.1-70b-proxy
- model: Llama 2 7B
mad_tag: pyt_megatron_lm_train_llama-2-7b
- model: Llama 2 70B
mad_tag: pyt_megatron_lm_train_llama-2-70b
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: pyt_megatron_lm_train_deepseek-v3-proxy
- model: DeepSeek-V2-Lite
mad_tag: pyt_megatron_lm_train_deepseek-v2-lite-16b
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: pyt_megatron_lm_train_mixtral-8x7b
- model: Mixtral 8x22B (proxy)
mad_tag: pyt_megatron_lm_train_mixtral-8x22b-proxy
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: pyt_megatron_lm_train_qwen2.5-7b
- model: Qwen 2.5 72B
mad_tag: pyt_megatron_lm_train_qwen2.5-72b

View File

@@ -0,0 +1,58 @@
dockers:
- pull_tag: rocm/megatron-lm:v25.7_py310
docker_hub_url: https://hub.docker.com/layers/rocm/megatron-lm/v25.7_py310/images/sha256-6189df849feeeee3ae31bb1e97aef5006d69d2b90c134e97708c19632e20ab5a
components:
ROCm: 6.4.2
Primus: v0.1.0-rc1
PyTorch: 2.8.0a0+gitd06a406
Python: "3.10"
Transformer Engine: 2.1.0.dev0+ba586519
hipBLASLt: 37ba1d36
Triton: 3.3.0
RCCL: 2.22.3
model_groups:
- group: Meta Llama
tag: llama
models:
- model: Llama 3.3 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.3-70b
config_name: llama3.3_70B-pretrain.yaml
- model: Llama 3.1 70B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-70b
config_name: llama3.1_70B-pretrain.yaml
- model: Llama 3.1 8B
mad_tag: primus_pyt_megatron_lm_train_llama-3.1-8b
config_name: llama3.1_8B-pretrain.yaml
- model: Llama 2 7B
mad_tag: primus_pyt_megatron_lm_train_llama-2-7b
config_name: llama2_7B-pretrain.yaml
- model: Llama 2 70B
mad_tag: primus_pyt_megatron_lm_train_llama-2-70b
config_name: llama2_70B-pretrain.yaml
- group: DeepSeek
tag: deepseek
models:
- model: DeepSeek-V3 (proxy)
mad_tag: primus_pyt_megatron_lm_train_deepseek-v3-proxy
config_name: deepseek_v3-pretrain.yaml
- model: DeepSeek-V2-Lite
mad_tag: primus_pyt_megatron_lm_train_deepseek-v2-lite-16b
config_name: deepseek_v2_lite-pretrain.yaml
- group: Mistral AI
tag: mistral
models:
- model: Mixtral 8x7B
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x7b
config_name: mixtral_8x7B_v0.1-pretrain.yaml
- model: Mixtral 8x22B (proxy)
mad_tag: primus_pyt_megatron_lm_train_mixtral-8x22b-proxy
config_name: mixtral_8x22B_v0.1-pretrain.yaml
- group: Qwen
tag: qwen
models:
- model: Qwen 2.5 7B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-7b
config_name: primus_qwen2.5_7B-pretrain.yaml
- model: Qwen 2.5 72B
mad_tag: primus_pyt_megatron_lm_train_qwen2.5-72b
config_name: qwen2.5_72B-pretrain.yaml

Binary file not shown.

Before

Width:  |  Height:  |  Size: 1.2 MiB

After

Width:  |  Height:  |  Size: 1.1 MiB

View File

@@ -19,5 +19,5 @@ The general steps to build ROCm are:
#. Run the build command
Because the ROCm stack is constantly evolving, the most current instructions are stored with the source code in GitHub.
For detailed build instructions, see `Build ROCm from source <https://github.com/ROCm/ROCm?tab=readme-ov-file#build-rocm-from-source>`_
For detailed build instructions, see `Getting and Building ROCm from Source <https://github.com/ROCm/ROCm?tab=readme-ov-file#getting-and-building-rocm-from-source>`.

View File

@@ -2,54 +2,132 @@
:description: How to install deep learning frameworks for ROCm
:keywords: deep learning, frameworks, ROCm, install, PyTorch, TensorFlow, JAX, MAGMA, DeepSpeed, ML, AI
********************************************
Installing deep learning frameworks for ROCm
********************************************
**********************************
Deep learning frameworks for ROCm
**********************************
ROCm provides a comprehensive ecosystem for deep learning development, including
:ref:`libraries <artificial-intelligence-apis>` for optimized deep learning operations and ROCm-aware versions of popular
deep learning frameworks and libraries such as PyTorch, TensorFlow, and JAX. ROCm works closely with these
frameworks to ensure that framework-specific optimizations take advantage of AMD accelerator and GPU architectures.
Deep learning frameworks provide environments for machine learning, training, fine-tuning, inference, and performance optimization.
The following guides provide information on compatibility and supported
features for these ROCm-enabled deep learning frameworks.
ROCm offers a complete ecosystem for developing and running deep learning applications efficiently. It also provides ROCm-compatible versions of popular frameworks and libraries, such as PyTorch, TensorFlow, JAX, and others.
* :doc:`PyTorch compatibility <../compatibility/ml-compatibility/pytorch-compatibility>`
* :doc:`TensorFlow compatibility <../compatibility/ml-compatibility/tensorflow-compatibility>`
* :doc:`JAX compatibility <../compatibility/ml-compatibility/jax-compatibility>`
* :doc:`verl compatibility <../compatibility/ml-compatibility/verl-compatibility>`
* :doc:`Stanford Megatron-LM compatibility <../compatibility/ml-compatibility/stanford-megatron-lm-compatibility>`
* :doc:`DGL compatibility <../compatibility/ml-compatibility/dgl-compatibility>`
The AMD ROCm organization actively contributes to open-source development and collaborates closely with framework organizations. This collaboration ensures that framework-specific optimizations effectively leverage AMD GPUs and accelerators.
This chart steps through typical installation workflows for installing deep learning frameworks for ROCm.
The table below summarizes information about ROCm-enabled deep learning frameworks. It includes details on ROCm compatibility and third-party tool support, installation steps and options, and links to GitHub resources. For a complete list of supported framework versions on ROCm, see the :doc:`Compatibility matrix <../compatibility/compatibility-matrix>` topic.
.. image:: ../data/how-to/framework_install_2024_07_04.png
:alt: Flowchart for installing ROCm-aware machine learning frameworks
:align: center
.. list-table::
:header-rows: 1
:widths: 5 3 6 3
See the installation instructions to get started.
* - Framework
- Installation
- Installation options
- GitHub
* :doc:`PyTorch for ROCm <rocm-install-on-linux:install/3rd-party/pytorch-install>`
* :doc:`TensorFlow for ROCm <rocm-install-on-linux:install/3rd-party/tensorflow-install>`
* :doc:`JAX for ROCm <rocm-install-on-linux:install/3rd-party/jax-install>`
* :doc:`verl for ROCm <rocm-install-on-linux:install/3rd-party/verl-install>`
* :doc:`Stanford Megatron-LM for ROCm <rocm-install-on-linux:install/3rd-party/stanford-megatron-lm-install>`
* :doc:`DGL for ROCm <rocm-install-on-linux:install/3rd-party/dgl-install>`
* - `PyTorch <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/pytorch-compatibility.html>`_
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/pytorch-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/pytorch-install.html#using-a-docker-image-with-pytorch-pre-installed>`_
- `Wheels package <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/pytorch-install.html#using-a-wheels-package>`_
- `ROCm Base Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/pytorch-install.html#using-the-pytorch-rocm-base-docker-image>`_
- `Upstream Docker file <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/pytorch-install.html#using-the-pytorch-upstream-dockerfile>`_
- .. raw:: html
<a href="https://github.com/ROCm/pytorch"><i class="fab fa-github fa-lg"></i></a>
* - `TensorFlow <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/tensorflow-compatibility.html>`_
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/tensorflow-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/tensorflow-install.html#using-a-docker-image-with-tensorflow-pre-installed>`_
- `Wheels package <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/tensorflow-install.html#using-a-wheels-package>`_
.. note::
- .. raw:: html
<a href="https://github.com/ROCm/tensorflow-upstream"><i class="fab fa-github fa-lg"></i></a>
* - `JAX <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/jax-compatibility.html>`_
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/jax-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/jax-install.html#using-a-prebuilt-docker-image>`_
- .. raw:: html
<a href="https://github.com/ROCm/jax"><i class="fab fa-github fa-lg"></i></a>
* - `verl <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/verl-compatibility.html>`_
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/verl-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/verl-install.html#use-a-prebuilt-docker-image-with-verl-pre-installed>`_
- .. raw:: html
<a href="https://github.com/ROCm/verl"><i class="fab fa-github fa-lg"></i></a>
* - `Stanford Megatron-LM <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/stanford-megatron-lm-compatibility.html>`_
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/stanford-megatron-lm-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/stanford-megatron-lm-install.html#use-a-prebuilt-docker-image-with-stanford-megatron-lm-pre-installed>`_
- .. raw:: html
<a href="https://github.com/ROCm/Stanford-Megatron-LM"><i class="fab fa-github fa-lg"></i></a>
* - `DGL <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/dgl-compatibility.html>`_
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/dgl-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/dgl-install.html#use-a-prebuilt-docker-image-with-dgl-pre-installed>`_
- .. raw:: html
<a href="https://github.com/ROCm/dgl"><i class="fab fa-github fa-lg"></i></a>
* - `Megablocks <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/megablocks-compatibility.html>`_
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/megablocks-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/megablocks-install.html#using-a-prebuilt-docker-image-with-megablocks-pre-installed>`_
- .. raw:: html
<a href="https://github.com/ROCm/megablocks"><i class="fab fa-github fa-lg"></i></a>
* - `Taichi <https://rocm.docs.amd.com/en/latest/compatibility/ml-compatibility/taichi-compatibility.html>`_
- .. raw:: html
<a href="https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/taichi-install.html"><i class="fas fa-link fa-lg"></i></a>
-
- `Docker image <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/taichi-install.html#use-a-prebuilt-docker-image-with-taichi-pre-installed>`_
- `Wheels package <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/taichi-install.html#use-a-wheels-package>`_
- .. raw:: html
<a href="https://github.com/ROCm/taichi"><i class="fab fa-github fa-lg"></i></a>
For guidance on installing ROCm itself, refer to :doc:`ROCm installation for Linux <rocm-install-on-linux:index>`.
Learn how to use your ROCm deep learning environment for training, fine-tuning, inference, and performance optimization
through the following guides.
* :doc:`rocm-for-ai/index`
* :doc:`Training <rocm-for-ai/training/index>`
* :doc:`Use ROCm for training <rocm-for-ai/training/index>`
* :doc:`Use ROCm for fine-tuning LLMs <rocm-for-ai/fine-tuning/index>`
* :doc:`Use ROCm for AI inference <rocm-for-ai/inference/index>`
* :doc:`Use ROCm for AI inference optimization <rocm-for-ai/inference-optimization/index>`
* :doc:`Fine-tuning LLMs <rocm-for-ai/fine-tuning/index>`
* :doc:`Inference <rocm-for-ai/inference/index>`
* :doc:`Inference optimization <rocm-for-ai/inference-optimization/index>`

View File

@@ -0,0 +1,25 @@
:orphan:
****************************************************
SGLang inference performance testing version history
****************************************************
This table lists previous versions of the ROCm SGLang inference performance
testing environment. For detailed information about available models for
benchmarking, see the version-specific documentation.
.. list-table::
:header-rows: 1
* - Docker image tag
- Components
- Resources
* - ``lmsysorg/sglang:v0.4.5-rocm630``
-
* ROCm 6.3.0
* SGLang 0.4.5
* PyTorch 2.6.0
-
* :doc:`Documentation <../sglang>`
* `Docker Hub <https://hub.docker.com/layers/lmsysorg/sglang/v0.4.5-rocm630/images/sha256-63d2cb760a237125daf6612464cfe2f395c0784e21e8b0ea37d551cd10d3c951>`__

View File

@@ -14,7 +14,7 @@ vLLM inference performance testing
This documentation does not reflect the latest version of ROCm vLLM
inference performance documentation. See :doc:`../vllm` for the latest version.
.. _vllm-benchmark-unified-docker:
.. _vllm-benchmark-unified-docker-702:
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/inference/previous-versions/vllm_0.9.1_20250702-benchmark-models.yaml
@@ -77,7 +77,7 @@ vLLM inference performance testing
</div>
</div>
.. _vllm-benchmark-vllm:
.. _vllm-benchmark-vllm-702:
{% for model_group in model_groups %}
{% for model in model_group.models %}
@@ -159,7 +159,7 @@ vLLM inference performance testing
Once the setup is complete, choose between two options to reproduce the
benchmark results:
.. _vllm-benchmark-mad:
.. _vllm-benchmark-mad-702:
{% for model_group in model_groups %}
{% for model in model_group.models %}

View File

@@ -0,0 +1,450 @@
:orphan:
.. meta::
:description: Learn how to validate LLM inference performance on MI300X accelerators using AMD MAD and the
ROCm vLLM Docker image.
:keywords: model, MAD, automation, dashboarding, validate
**********************************
vLLM inference performance testing
**********************************
.. caution::
This documentation does not reflect the latest version of ROCm vLLM
inference performance documentation. See :doc:`../vllm` for the latest version.
.. _vllm-benchmark-unified-docker-715:
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/inference/previous-versions/vllm_0.9.1_20250715-benchmark_models.yaml
{% set unified_docker = data.vllm_benchmark.unified_docker.latest %}
{% set model_groups = data.vllm_benchmark.model_groups %}
The `ROCm vLLM Docker <{{ unified_docker.docker_hub_url }}>`_ image offers
a prebuilt, optimized environment for validating large language model (LLM)
inference performance on AMD Instinct™ MI300X series accelerators. This ROCm vLLM
Docker image integrates vLLM and PyTorch tailored specifically for MI300X series
accelerators and includes the following components:
.. list-table::
:header-rows: 1
* - Software component
- Version
* - `ROCm <https://github.com/ROCm/ROCm>`__
- {{ unified_docker.rocm_version }}
* - `vLLM <https://docs.vllm.ai/en/latest>`__
- {{ unified_docker.vllm_version }}
* - `PyTorch <https://github.com/ROCm/pytorch>`__
- {{ unified_docker.pytorch_version }}
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`__
- {{ unified_docker.hipblaslt_version }}
With this Docker image, you can quickly test the :ref:`expected
inference performance numbers <vllm-benchmark-performance-measurements>` for
MI300X series accelerators.
What's new
==========
The following is summary of notable changes since the :doc:`previous ROCm/vLLM Docker release <vllm-history>`.
* The ``--compilation-config-parameter`` is no longer required as its options are now enabled by default.
This parameter has been removed from the benchmarking script.
* Resolved Llama 3.1 405 B custom all-reduce issue, eliminating the need for ``--disable-custom-all-reduce``.
This parameter has been removed from the benchmarking script.
* Fixed a ``+rms_norm`` custom kernel issue.
* Added quick reduce functionality. Set ``VLLM_ROCM_QUICK_REDUCE_QUANTIZATION=FP`` to enable; supported modes are ``FP``, ``INT8``, ``INT6``, ``INT4``.
* Implemented a workaround to potentially mitigate GPU crashes experienced with the Command R+ model, pending a driver fix.
Supported models
================
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/inference/previous-versions/vllm_0.9.1_20250715-benchmark_models.yaml
{% set unified_docker = data.vllm_benchmark.unified_docker.latest %}
{% set model_groups = data.vllm_benchmark.model_groups %}
.. _vllm-benchmark-available-models-715:
The following models are supported for inference performance benchmarking
with vLLM and ROCm. Some instructions, commands, and recommendations in this
documentation might vary by model -- select one to get started.
.. raw:: html
<div id="vllm-benchmark-ud-params-picker" class="container-fluid">
<div class="row">
<div class="col-2 me-2 model-param-head">Model group</div>
<div class="row col-10">
{% for model_group in model_groups %}
<div class="col-3 model-param" data-param-k="model-group" data-param-v="{{ model_group.tag }}" tabindex="0">{{ model_group.group }}</div>
{% endfor %}
</div>
</div>
<div class="row mt-1">
<div class="col-2 me-2 model-param-head">Model</div>
<div class="row col-10">
{% for model_group in model_groups %}
{% set models = model_group.models %}
{% for model in models %}
{% if models|length % 3 == 0 %}
<div class="col-4 model-param" data-param-k="model" data-param-v="{{ model.mad_tag }}" data-param-group="{{ model_group.tag }}" tabindex="0">{{ model.model }}</div>
{% else %}
<div class="col-6 model-param" data-param-k="model" data-param-v="{{ model.mad_tag }}" data-param-group="{{ model_group.tag }}" tabindex="0">{{ model.model }}</div>
{% endif %}
{% endfor %}
{% endfor %}
</div>
</div>
</div>
.. _vllm-benchmark-vllm-715:
{% for model_group in model_groups %}
{% for model in model_group.models %}
.. container:: model-doc {{model.mad_tag}}
.. note::
See the `{{ model.model }} model card on Hugging Face <{{ model.url }}>`_ to learn more about your selected model.
Some models require access authorization prior to use via an external license agreement through a third party.
{% endfor %}
{% endfor %}
.. note::
vLLM is a toolkit and library for LLM inference and serving. AMD implements
high-performance custom kernels and modules in vLLM to enhance performance.
See :ref:`fine-tuning-llms-vllm` and :ref:`mi300x-vllm-optimization` for
more information.
.. _vllm-benchmark-performance-measurements-715:
Performance measurements
========================
To evaluate performance, the
`Performance results with AMD ROCm software <https://www.amd.com/en/developer/resources/rocm-hub/dev-ai/performance-results.html>`_
page provides reference throughput and latency measurements for inferencing popular AI models.
.. important::
The performance data presented in
`Performance results with AMD ROCm software <https://www.amd.com/en/developer/resources/rocm-hub/dev-ai/performance-results.html>`_
only reflects the latest version of this inference benchmarking environment.
The listed measurements should not be interpreted as the peak performance achievable by AMD Instinct MI325X and MI300X accelerators or ROCm software.
System validation
=================
Before running AI workloads, it's important to validate that your AMD hardware is configured
correctly and performing optimally.
If you have already validated your system settings, including aspects like NUMA auto-balancing, you
can skip this step. Otherwise, complete the procedures in the :ref:`System validation and
optimization <rocm-for-ai-system-optimization>` guide to properly configure your system settings
before starting training.
To test for optimal performance, consult the recommended :ref:`System health benchmarks
<rocm-for-ai-system-health-bench>`. This suite of tests will help you verify and fine-tune your
system's configuration.
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/inference/previous-versions/vllm_0.9.1_20250715-benchmark_models.yaml
{% set unified_docker = data.vllm_benchmark.unified_docker.latest %}
{% set model_groups = data.vllm_benchmark.model_groups %}
Pull the Docker image
=====================
Download the `ROCm vLLM Docker image <{{ unified_docker.docker_hub_url }}>`_.
Use the following command to pull the Docker image from Docker Hub.
.. code-block:: shell
docker pull {{ unified_docker.pull_tag }}
Benchmarking
============
Once the setup is complete, choose between two options to reproduce the
benchmark results:
.. _vllm-benchmark-mad-715:
{% for model_group in model_groups %}
{% for model in model_group.models %}
.. container:: model-doc {{model.mad_tag}}
.. tab-set::
.. tab-item:: MAD-integrated benchmarking
1. Clone the ROCm Model Automation and Dashboarding (`<https://github.com/ROCm/MAD>`__) repository to a local
directory and install the required packages on the host machine.
.. code-block:: shell
git clone https://github.com/ROCm/MAD
cd MAD
pip install -r requirements.txt
2. Use this command to run the performance benchmark test on the `{{model.model}} <{{ model.url }}>`_ model
using one GPU with the :literal:`{{model.precision}}` data type on the host machine.
.. code-block:: shell
export MAD_SECRETS_HFTOKEN="your personal Hugging Face token to access gated models"
madengine run \
--tags {{model.mad_tag}} \
--keep-model-dir \
--live-output \
--timeout 28800
MAD launches a Docker container with the name
``container_ci-{{model.mad_tag}}``. The latency and throughput reports of the
model are collected in the following path: ``~/MAD/reports_{{model.precision}}/``.
Although the :ref:`available models <vllm-benchmark-available-models>` are preconfigured
to collect latency and throughput performance data, you can also change the benchmarking
parameters. See the standalone benchmarking tab for more information.
{% if model.tunableop %}
.. note::
For improved performance, consider enabling :ref:`PyTorch TunableOp <mi300x-tunableop>`.
TunableOp automatically explores different implementations and configurations of certain PyTorch
operators to find the fastest one for your hardware.
By default, ``{{model.mad_tag}}`` runs with TunableOp disabled
(see
`<https://github.com/ROCm/MAD/blob/develop/models.json>`__).
To enable it, include the ``--tunableop on`` argument in your
run.
Enabling TunableOp triggers a two-pass run -- a warm-up followed
by the performance-collection run.
{% endif %}
.. tab-item:: Standalone benchmarking
.. rubric:: Download the Docker image and required scripts
1. Run the vLLM benchmark tool independently by starting the
`Docker container <{{ unified_docker.docker_hub_url }}>`_
as shown in the following snippet.
.. code-block:: shell
docker pull {{ unified_docker.pull_tag }}
docker run -it \
--device=/dev/kfd \
--device=/dev/dri \
--group-add video \
--shm-size 16G \
--security-opt seccomp=unconfined \
--security-opt apparmor=unconfined \
--cap-add=SYS_PTRACE \
-v $(pwd):/workspace \
--env HUGGINGFACE_HUB_CACHE=/workspace \
--name test \
{{ unified_docker.pull_tag }}
2. In the Docker container, clone the ROCm MAD repository and navigate to the
benchmark scripts directory at ``~/MAD/scripts/vllm``.
.. code-block:: shell
git clone https://github.com/ROCm/MAD
cd MAD/scripts/vllm
3. To start the benchmark, use the following command with the appropriate options.
.. dropdown:: Benchmark options
:open:
.. list-table::
:header-rows: 1
:align: center
* - Name
- Options
- Description
* - ``$test_option``
- latency
- Measure decoding token latency
* -
- throughput
- Measure token generation throughput
* -
- all
- Measure both throughput and latency
* - ``$num_gpu``
- 1 or 8
- Number of GPUs
* - ``$datatype``
- ``float16`` or ``float8``
- Data type
The input sequence length, output sequence length, and tensor parallel (TP) are
already configured. You don't need to specify them with this script.
Command:
.. code-block::
./vllm_benchmark_report.sh \
-s $test_option \
-m {{model.model_repo}} \
-g $num_gpu \
-d {{model.precision}}
.. note::
For best performance, it's recommend to run with ``VLLM_V1_USE_PREFILL_DECODE_ATTENTION=1``.
If you encounter the following error, pass your access-authorized Hugging
Face token to the gated models.
.. code-block::
OSError: You are trying to access a gated repo.
# pass your HF_TOKEN
export HF_TOKEN=$your_personal_hf_token
.. rubric:: Benchmarking examples
Here are some examples of running the benchmark with various options:
* Latency benchmark
Use this command to benchmark the latency of the {{model.model}} model on eight GPUs with :literal:`{{model.precision}}` precision.
.. code-block::
./vllm_benchmark_report.sh \
-s latency \
-m {{model.model_repo}} \
-g 8 \
-d {{model.precision}}
Find the latency report at ``./reports_{{model.precision}}_vllm_rocm{{unified_docker.rocm_version}}/summary/{{model.model_repo.split('/', 1)[1] if '/' in model.model_repo else model.model_repo}}_latency_report.csv``.
* Throughput benchmark
Use this command to benchmark the throughput of the {{model.model}} model on eight GPUs with :literal:`{{model.precision}}` precision.
.. code-block:: shell
./vllm_benchmark_report.sh \
-s throughput \
-m {{model.model_repo}} \
-g 8 \
-d {{model.precision}}
Find the throughput report at ``./reports_{{model.precision}}_vllm_rocm{{unified_docker.rocm_version}}/summary/{{model.model_repo.split('/', 1)[1] if '/' in model.model_repo else model.model_repo}}_throughput_report.csv``.
.. raw:: html
<style>
mjx-container[jax="CHTML"][display="true"] {
text-align: left;
margin: 0;
}
</style>
.. note::
Throughput is calculated as:
- .. math:: throughput\_tot = requests \times (\mathsf{\text{input lengths}} + \mathsf{\text{output lengths}}) / elapsed\_time
- .. math:: throughput\_gen = requests \times \mathsf{\text{output lengths}} / elapsed\_time
{% endfor %}
{% endfor %}
Advanced usage
==============
For information on experimental features and known issues related to ROCm optimization efforts on vLLM,
see the developer's guide at `<https://github.com/ROCm/vllm/tree/f94ec9beeca1071cc34f9d1e206d8c7f3ac76129/docs/dev-docker>`__.
Reproducing the Docker image
----------------------------
To reproduce this ROCm/vLLM Docker image release, follow these steps:
1. Clone the `vLLM repository <https://github.com/ROCm/vllm>`__.
.. code-block:: shell
git clone https://github.com/ROCm/vllm.git
2. Checkout the specific release commit.
.. code-block:: shell
cd vllm
git checkout b432b7a285aa0dcb9677380936ffa74931bb6d6f
3. Build the Docker image. Replace ``vllm-rocm`` with your desired image tag.
.. code-block:: shell
docker build -f docker/Dockerfile.rocm -t vllm-rocm .
Known issues and workarounds
============================
AITER does not support FP8 KV cache yet.
Further reading
===============
- To learn more about the options for latency and throughput benchmark scripts,
see `<https://github.com/ROCm/vllm/tree/main/benchmarks>`_.
- To learn more about MAD and the ``madengine`` CLI, see the `MAD usage guide <https://github.com/ROCm/MAD?tab=readme-ov-file#usage-guide>`__.
- To learn more about system settings and management practices to configure your system for
AMD Instinct MI300X series accelerators, see `AMD Instinct MI300X system optimization <https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`_.
- For application performance optimization strategies for HPC and AI workloads,
including inference with vLLM, see :doc:`/how-to/rocm-for-ai/inference-optimization/workload`.
- To learn how to run community models from Hugging Face on AMD GPUs, see
:doc:`Running models from Hugging Face </how-to/rocm-for-ai/inference/hugging-face-models>`.
- To learn how to fine-tune LLMs and optimize inference, see
:doc:`Fine-tuning LLMs and inference optimization </how-to/rocm-for-ai/fine-tuning/fine-tuning-and-inference>`.
- For a list of other ready-made Docker images for AI with ROCm, see
`AMD Infinity Hub <https://www.amd.com/en/developer/resources/infinity-hub.html#f-amd_hub_category=AI%20%26%20ML%20Models>`_.
Previous versions
=================
See :doc:`vllm-history` to find documentation for previous releases
of the ``ROCm/vllm`` Docker image.

View File

@@ -16,14 +16,23 @@ previous releases of the ``ROCm/vllm`` Docker image on `Docker Hub <https://hub.
- Components
- Resources
* - ``rocm/vllm:rocm6.4.1_vllm_0.9.1_20250715``
* - ``rocm/vllm:rocm6.4.1_vllm_0.10.0_20250812``
(latest)
-
* ROCm 6.4.1
* vLLM 0.10.0
* PyTorch 2.7.0
-
* :doc:`Documentation <../vllm>`
* `Docker Hub <https://hub.docker.com/layers/rocm/vllm/rocm6.4.1_vllm_0.10.0_20250812/images/sha256-4c277ad39af3a8c9feac9b30bf78d439c74d9b4728e788a419d3f1d0c30cacaa>`__
* - ``rocm/vllm:rocm6.4.1_vllm_0.9.1_20250715``
-
* ROCm 6.4.1
* vLLM 0.9.1
* PyTorch 2.7.0
-
* :doc:`Documentation <../vllm>`
* :doc:`Documentation <vllm-0.9.1-20250715>`
* `Docker Hub <https://hub.docker.com/layers/rocm/vllm/rocm6.4.1_vllm_0.9.1_20250715/images/sha256-4a429705fa95a58f6d20aceab43b1b76fa769d57f32d5d28bd3f4e030e2a78ea>`__
* - ``rocm/vllm:rocm6.4.1_vllm_0.9.1_20250702``

View File

@@ -103,7 +103,7 @@ PyTorch inference performance testing
The Chai-1 benchmark uses a specifically selected Docker image using ROCm 6.2.3 and PyTorch 2.3.0 to address an accuracy issue.
.. container:: model-doc pyt_clip_inference pyt_mochi_video_inference pyt_wan2.1_inference
.. container:: model-doc pyt_clip_inference pyt_mochi_video_inference pyt_wan2.1_inference pyt_janus_pro_inference pyt_hy_video
Use the following command to pull the `ROCm PyTorch Docker image <https://hub.docker.com/layers/rocm/pytorch/latest/images/sha256-05b55983e5154f46e7441897d0908d79877370adca4d1fff4899d9539d6c4969>`__ from Docker Hub.
@@ -140,22 +140,27 @@ PyTorch inference performance testing
.. code-block:: shell
export MAD_SECRETS_HFTOKEN="your personal Hugging Face token to access gated models"
python3 tools/run_models.py --tags {{model.mad_tag}} --keep-model-dir --live-output --timeout 28800
madengine run \
--tags {{model.mad_tag}} \
--keep-model-dir \
--live-output \
--timeout 28800
MAD launches a Docker container with the name
``container_ci-{{model.mad_tag}}``. The latency and throughput reports of the
model are collected in ``perf.csv``.
model are collected in ``perf_{{model.mad_tag}}.csv``.
{% if model.mad_tag != "pyt_janus_pro_inference" %}
.. note::
For improved performance, consider enabling TunableOp. By default,
``{{model.mad_tag}}`` runs with TunableOp disabled (see
`<https://github.com/ROCm/MAD/blob/develop/models.json>`__). To enable
it, edit the default run behavior in the ``tools/run_models.py``-- update the model's
run ``args`` by changing ``--tunableop off`` to ``--tunableop on``.
it, include the ``--tunableop on`` argument in your run.
Enabling TunableOp triggers a two-pass run -- a warm-up followed by the performance-collection run.
Although this might increase the initial training time, it can result in a performance gain.
{% endif %}
{% endfor %}
{% endfor %}
@@ -163,8 +168,10 @@ PyTorch inference performance testing
Further reading
===============
- To learn more about MAD and the ``madengine`` CLI, see the `MAD usage guide <https://github.com/ROCm/MAD?tab=readme-ov-file#usage-guide>`__.
- To learn more about system settings and management practices to configure your system for
MI300X accelerators, see `AMD Instinct MI300X system optimization <https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`_.
AMD Instinct MI300X series accelerators, see `AMD Instinct MI300X system optimization <https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`_.
- For application performance optimization strategies for HPC and AI workloads,
including inference with vLLM, see :doc:`../../inference-optimization/workload`.

View File

@@ -0,0 +1,280 @@
.. meta::
:description: Learn how to validate LLM inference performance on MI300X accelerators using AMD MAD and SGLang
:keywords: model, MAD, automation, dashboarding, validate
************************************
SGLang inference performance testing
************************************
.. _sglang-benchmark-unified-docker:
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/inference/sglang-benchmark-models.yaml
{% set unified_docker = data.sglang_benchmark.unified_docker.latest %}
`SGLang <https://docs.sglang.ai>`__ is a high-performance inference and
serving engine for large language models (LLMs) and vision models. The
ROCm-enabled `SGLang Docker image <{{ unified_docker.docker_hub_url }}>`__
bundles SGLang with PyTorch, optimized for AMD Instinct MI300X series
accelerators. It includes the following software components:
.. list-table::
:header-rows: 1
* - Software component
- Version
* - `ROCm <https://github.com/ROCm/ROCm>`__
- {{ unified_docker.rocm_version }}
* - `SGLang <https://docs.sglang.ai/index.html>`__
- {{ unified_docker.sglang_version }}
* - `PyTorch <https://github.com/pytorch/pytorch>`__
- {{ unified_docker.pytorch_version }}
System validation
=================
Before running AI workloads, it's important to validate that your AMD hardware is configured
correctly and performing optimally.
If you have already validated your system settings, including aspects like NUMA auto-balancing, you
can skip this step. Otherwise, complete the procedures in the :ref:`System validation and
optimization <rocm-for-ai-system-optimization>` guide to properly configure your system settings
before starting training.
To test for optimal performance, consult the recommended :ref:`System health benchmarks
<rocm-for-ai-system-health-bench>`. This suite of tests will help you verify and fine-tune your
system's configuration.
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/inference/sglang-benchmark-models.yaml
{% set unified_docker = data.sglang_benchmark.unified_docker.latest %}
{% set model_groups = data.sglang_benchmark.model_groups %}
Pull the Docker image
=====================
Download the `SGLang Docker image <{{ unified_docker.docker_hub_url }}>`__.
Use the following command to pull the Docker image from Docker Hub.
.. code-block:: shell
docker pull {{ unified_docker.pull_tag }}
Benchmarking
============
Once the setup is complete, choose one of the following methods to benchmark inference performance with
`DeepSeek-R1-Distill-Qwen-32B <https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B>`__.
.. _sglang-benchmark-mad:
{% for model_group in model_groups %}
{% for model in model_group.models %}
.. container:: model-doc {{model.mad_tag}}
.. tab-set::
.. tab-item:: MAD-integrated benchmarking
1. Clone the ROCm Model Automation and Dashboarding (`<https://github.com/ROCm/MAD>`__) repository to a local
directory and install the required packages on the host machine.
.. code-block:: shell
git clone https://github.com/ROCm/MAD
cd MAD
pip install -r requirements.txt
2. Use this command to run the performance benchmark test on the `{{model.model}} <{{ model.url }}>`_ model
using one GPU with the ``{{model.precision}}`` data type on the host machine.
.. code-block:: shell
export MAD_SECRETS_HFTOKEN="your personal Hugging Face token to access gated models"
madengine run \
--tags {{model.mad_tag}} \
--keep-model-dir \
--live-output \
--timeout 28800
MAD launches a Docker container with the name
``container_ci-{{model.mad_tag}}``. The latency and throughput reports of the
model are collected in the following path: ``~/MAD/perf_DeepSeek-R1-Distill-Qwen-32B.csv``.
Although the DeepSeek-R1-Distill-Qwen-32B is preconfigured
to collect latency and throughput performance data, you can also change the benchmarking
parameters. See the standalone benchmarking tab for more information.
.. tab-item:: Standalone benchmarking
.. rubric:: Download the Docker image and required scripts
1. Run the SGLang benchmark script independently by starting the
`Docker container <{{ unified_docker.docker_hub_url }}>`__
as shown in the following snippet.
.. code-block:: shell
docker pull {{ unified_docker.pull_tag }}
docker run -it \
--device=/dev/kfd \
--device=/dev/dri \
--group-add video \
--shm-size 16G \
--security-opt seccomp=unconfined \
--security-opt apparmor=unconfined \
--cap-add=SYS_PTRACE \
-v $(pwd):/workspace \
--env HUGGINGFACE_HUB_CACHE=/workspace \
--name test \
{{ unified_docker.pull_tag }}
2. In the Docker container, clone the ROCm MAD repository and navigate to the
benchmark scripts directory at ``~/MAD/scripts/sglang``.
.. code-block:: shell
git clone https://github.com/ROCm/MAD
cd MAD/scripts/sglang
3. To start the benchmark, use the following command with the appropriate options.
.. dropdown:: Benchmark options
:open:
.. list-table::
:header-rows: 1
:align: center
* - Name
- Options
- Description
* - ``$test_option``
- latency
- Measure decoding token latency
* -
- throughput
- Measure token generation throughput
* -
- all
- Measure both throughput and latency
* - ``$num_gpu``
- 8
- Number of GPUs
* - ``$datatype``
- ``bfloat16``
- Data type
* - ``$dataset``
- random
- Dataset
The input sequence length, output sequence length, and tensor parallel (TP) are
already configured. You don't need to specify them with this script.
Command:
.. code-block:: shell
./sglang_benchmark_report.sh -s $test_option -m {{model.model_repo}} -g $num_gpu -d $datatype [-a $dataset]
.. note::
If you encounter the following error, pass your access-authorized Hugging
Face token to the gated models.
.. code-block:: shell-session
OSError: You are trying to access a gated repo.
# pass your HF_TOKEN
export HF_TOKEN=$your_personal_hf_token
.. rubric:: Benchmarking examples
Here are some examples of running the benchmark with various options:
* Latency benchmark
Use this command to benchmark the latency of the {{model.model}} model on eight GPUs with ``{{model.precision}}`` precision.
.. code-block:: shell
./sglang_benchmark_report.sh \
-s latency \
-m {{model.model_repo}} \
-g 8 \
-d {{model.precision}}
Find the latency report at ``./reports_{{model.precision}}/summary/{{model.model_repo.split('/', 1)[1] if '/' in model.model_repo else model.model_repo}}_latency_report.csv``.
* Throughput benchmark
Use this command to benchmark the throughput of the {{model.model}} model on eight GPUs with ``{{model.precision}}`` precision.
.. code-block:: shell
./sglang_benchmark_report.sh \
-s throughput \
-m {{model.model_repo}} \
-g 8 \
-d {{model.precision}} \
-a random
Find the throughput report at ``./reports_{{model.precision}}/summary/{{model.model_repo.split('/', 1)[1] if '/' in model.model_repo else model.model_repo}}_throughput_report.csv``.
.. raw:: html
<style>
mjx-container[jax="CHTML"][display="true"] {
text-align: left;
margin: 0;
}
</style>
.. note::
Throughput is calculated as:
- .. math:: throughput\_tot = requests \times (\mathsf{\text{input lengths}} + \mathsf{\text{output lengths}}) / elapsed\_time
- .. math:: throughput\_gen = requests \times \mathsf{\text{output lengths}} / elapsed\_time
{% endfor %}
{% endfor %}
Further reading
===============
- To learn more about the options for latency and throughput benchmark scripts,
see `<https://github.com/sgl-project/sglang/tree/main/benchmark/blog_v0_2>`__.
- To learn more about MAD and the ``madengine`` CLI, see the `MAD usage guide <https://github.com/ROCm/MAD?tab=readme-ov-file#usage-guide>`__.
- To learn more about system settings and management practices to configure your system for
MI300X series accelerators, see `AMD Instinct MI300X system optimization <https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`__.
- For application performance optimization strategies for HPC and AI workloads,
including inference with vLLM, see :doc:`/how-to/rocm-for-ai/inference-optimization/workload`.
- To learn how to run community models from Hugging Face on AMD GPUs, see
:doc:`Running models from Hugging Face </how-to/rocm-for-ai/inference/hugging-face-models>`.
- To learn how to fine-tune LLMs and optimize inference, see
:doc:`Fine-tuning LLMs and inference optimization </how-to/rocm-for-ai/fine-tuning/fine-tuning-and-inference>`.
- For a list of other ready-made Docker images for AI with ROCm, see
`AMD Infinity Hub <https://www.amd.com/en/developer/resources/infinity-hub.html#f-amd_hub_category=AI%20%26%20ML%20Models>`_.
Previous versions
=================
See :doc:`previous-versions/sglang-history` to find documentation for previous releases
of SGLang inference performance testing.

View File

@@ -7,7 +7,7 @@
vLLM inference performance testing
**********************************
.. _vllm-benchmark-unified-docker:
.. _vllm-benchmark-unified-docker-812:
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/inference/vllm-benchmark-models.yaml
@@ -47,17 +47,11 @@ What's new
The following is summary of notable changes since the :doc:`previous ROCm/vLLM Docker release <previous-versions/vllm-history>`.
* The ``--compilation-config-parameter`` is no longer required as its options are now enabled by default.
This parameter has been removed from the benchmarking script.
* Upgraded to vLLM v0.10.
* Resolved Llama 3.1 405 B custom all-reduce issue, eliminating the need for ``--disable-custom-all-reduce``.
This parameter has been removed from the benchmarking script.
* FP8 KV cache support via AITER.
* Fixed a ``+rms_norm`` custom kernel issue.
* Added quick reduce functionality. Set ``VLLM_ROCM_QUICK_REDUCE_QUANTIZATION=FP`` to enable; supported modes are ``FP``, ``INT8``, ``INT6``, ``INT4``.
* Implemented a workaround to potentially mitigate GPU crashes experienced with the Command R+ model, pending a driver fix.
* Full graph capture support via AITER.
Supported models
================
@@ -67,7 +61,7 @@ Supported models
{% set unified_docker = data.vllm_benchmark.unified_docker.latest %}
{% set model_groups = data.vllm_benchmark.model_groups %}
.. _vllm-benchmark-available-models:
.. _vllm-benchmark-available-models-812:
The following models are supported for inference performance benchmarking
with vLLM and ROCm. Some instructions, commands, and recommendations in this
@@ -102,7 +96,7 @@ Supported models
</div>
</div>
.. _vllm-benchmark-vllm:
.. _vllm-benchmark-vllm-812:
{% for model_group in model_groups %}
{% for model in model_group.models %}
@@ -124,14 +118,14 @@ Supported models
See :ref:`fine-tuning-llms-vllm` and :ref:`mi300x-vllm-optimization` for
more information.
.. _vllm-benchmark-performance-measurements:
.. _vllm-benchmark-performance-measurements-812:
Performance measurements
========================
To evaluate performance, the
`Performance results with AMD ROCm software <https://www.amd.com/en/developer/resources/rocm-hub/dev-ai/performance-results.html>`_
page provides reference throughput and latency measurements for inferencing popular AI models.
page provides reference throughput and serving measurements for inferencing popular AI models.
.. important::
@@ -176,7 +170,7 @@ system's configuration.
Once the setup is complete, choose between two options to reproduce the
benchmark results:
.. _vllm-benchmark-mad:
.. _vllm-benchmark-mad-812:
{% for model_group in model_groups %}
{% for model in model_group.models %}
@@ -202,19 +196,22 @@ system's configuration.
.. code-block:: shell
export MAD_SECRETS_HFTOKEN="your personal Hugging Face token to access gated models"
python3 tools/run_models.py \
madengine run \
--tags {{model.mad_tag}} \
--keep-model-dir \
--live-output \
--timeout 28800
MAD launches a Docker container with the name
``container_ci-{{model.mad_tag}}``. The latency and throughput reports of the
model are collected in the following path: ``~/MAD/reports_{{model.precision}}/``.
``container_ci-{{model.mad_tag}}``. The throughput and serving reports of the
model are collected in the following paths: ``{{ model.mad_tag }}_throughput.csv``
and ``{{ model.mad_tag }}_serving.csv``.
Although the :ref:`available models <vllm-benchmark-available-models>` are preconfigured
to collect latency and throughput performance data, you can also change the benchmarking
parameters. See the standalone benchmarking tab for more information.
Although the :ref:`available models
<vllm-benchmark-available-models>` are preconfigured to collect
offline throughput and online serving performance data, you can
also change the benchmarking parameters. See the standalone
benchmarking tab for more information.
{% if model.tunableop %}
@@ -224,14 +221,12 @@ system's configuration.
TunableOp automatically explores different implementations and configurations of certain PyTorch
operators to find the fastest one for your hardware.
By default, ``{{model.mad_tag}}`` runs with TunableOp disabled
(see
`<https://github.com/ROCm/MAD/blob/develop/models.json>`__). To
enable it, edit the default run behavior in the ``models.json``
configuration before running inference -- update the model's run
``args`` by changing ``--tunableop off`` to ``--tunableop on``.
By default, ``{{model.mad_tag}}`` runs with TunableOp disabled (see
`<https://github.com/ROCm/MAD/blob/develop/models.json>`__). To enable it, include
the ``--tunableop on`` argument in your run.
Enabling TunableOp triggers a two-pass run -- a warm-up followed by the performance-collection run.
Enabling TunableOp triggers a two-pass run -- a warm-up followed by the
performance-collection run.
{% endif %}
@@ -269,6 +264,13 @@ system's configuration.
3. To start the benchmark, use the following command with the appropriate options.
.. code-block::
./run.sh \
--config $CONFIG_CSV \
--model_repo {{ model.model_repo }} \
<overrides>
.. dropdown:: Benchmark options
:open:
@@ -280,42 +282,40 @@ system's configuration.
- Options
- Description
* - ``$test_option``
- latency
- Measure decoding token latency
* - ``--config``
- ``configs/default.csv``
- Run configs from the CSV for the chosen model repo and benchmark.
* -
- throughput
- Measure token generation throughput
- ``configs/extended.csv``
-
* -
- all
- Measure both throughput and latency
- ``configs/performance.csv``
-
* - ``$num_gpu``
- 1 or 8
- Number of GPUs
* - ``--benchmark``
- ``throughput``
- Measure offline end-to-end throughput.
* - ``$datatype``
- ``float16`` or ``float8``
- Data type
* -
- ``serving``
- Measure online serving performance.
* -
- ``all``
- Measure both throughput and serving.
* - `<overrides>`
- See `run.sh <https://github.com/ROCm/MAD/blob/develop/scripts/vllm/run.sh>`__ for more info.
- Additional overrides to the config CSV.
The input sequence length, output sequence length, and tensor parallel (TP) are
already configured. You don't need to specify them with this script.
Command:
.. code-block::
./vllm_benchmark_report.sh \
-s $test_option \
-m {{model.model_repo}} \
-g $num_gpu \
-d {{model.precision}}
.. note::
For best performance, it's recommend to run with ``VLLM_V1_USE_PREFILL_DECODE_ATTENTION=1``.
For best performance, it's recommended to run with ``VLLM_V1_USE_PREFILL_DECODE_ATTENTION=1``.
If you encounter the following error, pass your access-authorized Hugging
Face token to the gated models.
@@ -331,33 +331,33 @@ system's configuration.
Here are some examples of running the benchmark with various options:
* Latency benchmark
Use this command to benchmark the latency of the {{model.model}} model on eight GPUs with :literal:`{{model.precision}}` precision.
.. code-block::
./vllm_benchmark_report.sh \
-s latency \
-m {{model.model_repo}} \
-g 8 \
-d {{model.precision}}
Find the latency report at ``./reports_{{model.precision}}_vllm_rocm{{unified_docker.rocm_version}}/summary/{{model.model_repo.split('/', 1)[1] if '/' in model.model_repo else model.model_repo}}_latency_report.csv``.
* Throughput benchmark
Use this command to benchmark the throughput of the {{model.model}} model on eight GPUs with :literal:`{{model.precision}}` precision.
.. code-block:: shell
./vllm_benchmark_report.sh \
-s throughput \
-m {{model.model_repo}} \
-g 8 \
-d {{model.precision}}
export MAD_MODEL_NAME={{ model.mad_tag }}
./run.sh \
--config configs/default.csv \
--model_repo {{model.model_repo}} \
--benchmark throughput
Find the throughput report at ``./reports_{{model.precision}}_vllm_rocm{{unified_docker.rocm_version}}/summary/{{model.model_repo.split('/', 1)[1] if '/' in model.model_repo else model.model_repo}}_throughput_report.csv``.
Find the throughput benchmark report at ``./{{ model.mad_tag }}_throughput.csv``.
* Serving benchmark
Use this command to benchmark the serving performance of the {{model.model}} model on eight GPUs with :literal:`{{model.precision}}` precision.
.. code-block::
export MAD_MODEL_NAME={{ model.mad_tag }}
./run.sh \
--config configs/default.csv \
--model_repo {{model.model_repo}} \
--benchmark serving
Find the serving benchmark report at ``./{{ model.mad_tag }}_serving.csv``.
.. raw:: html
@@ -400,7 +400,7 @@ To reproduce this ROCm/vLLM Docker image release, follow these steps:
.. code-block:: shell
cd vllm
git checkout b432b7a285aa0dcb9677380936ffa74931bb6d6f
git checkout 340ea86dfe5955d6f9a9e767d6abab5aacf2c978
3. Build the Docker image. Replace ``vllm-rocm`` with your desired image tag.
@@ -408,19 +408,16 @@ To reproduce this ROCm/vLLM Docker image release, follow these steps:
docker build -f docker/Dockerfile.rocm -t vllm-rocm .
Known issues and workarounds
============================
AITER does not support FP8 KV cache yet.
Further reading
===============
- To learn more about the options for latency and throughput benchmark scripts,
see `<https://github.com/ROCm/vllm/tree/main/benchmarks>`_.
- To learn more about MAD and the ``madengine`` CLI, see the `MAD usage guide <https://github.com/ROCm/MAD?tab=readme-ov-file#usage-guide>`__.
- To learn more about system settings and management practices to configure your system for
MI300X series accelerators, see `AMD Instinct MI300X system optimization <https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`_
AMD Instinct MI300X series accelerators, see `AMD Instinct MI300X system optimization <https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`_.
- For application performance optimization strategies for HPC and AI workloads,
including inference with vLLM, see :doc:`/how-to/rocm-for-ai/inference-optimization/workload`.

View File

@@ -24,4 +24,6 @@ training, fine-tuning, and inference. It leverages popular machine learning fram
- :doc:`PyTorch inference performance testing <benchmark-docker/pytorch-inference>`
- :doc:`SGLang inference performance testing <benchmark-docker/sglang>`
- :doc:`Deploying your model <deploy-your-model>`

View File

@@ -1,14 +1,14 @@
.. meta::
:description: How to install ROCm and popular machine learning frameworks.
:description: How to install ROCm and popular deep learning frameworks.
:keywords: ROCm, AI, LLM, train, fine-tune, FSDP, DeepSpeed, LLaMA, tutorial
.. _rocm-for-ai-install:
***********************************************
Installing ROCm and machine learning frameworks
***********************************************
********************************************
Installing ROCm and deep learning frameworks
********************************************
Before getting started, install ROCm and supported machine learning frameworks.
Before getting started, install ROCm and supported deep learning frameworks.
.. grid:: 1
@@ -24,12 +24,13 @@ If youre new to ROCm, refer to the :doc:`ROCm quick start install guide for L
If youre using a Radeon GPU for graphics-accelerated applications, refer to the
`Radeon installation instructions <https://rocm.docs.amd.com/projects/radeon/en/docs-6.1.3/docs/install/native_linux/install-radeon.html>`_.
ROCm supports multiple :doc:`installation methods <rocm-install-on-linux:install/install-overview>`:
You can install ROCm on :ref:`compatible systems <rocm-install-on-linux:reference/system-requirements>` via your Linux
distribution's package manager. See the following documentation resources to get started:
* :doc:`ROCm installation overview <rocm-install-on-linux:install/install-overview>`
* :doc:`Using your Linux distribution's package manager <rocm-install-on-linux:install/install-methods/package-manager-index>`
* :doc:`Using the AMDGPU installer <rocm-install-on-linux:install/install-methods/amdgpu-installer-index>`
* :ref:`Multi-version installation <rocm-install-on-linux:installation-types>`
.. grid:: 1
@@ -42,23 +43,16 @@ ROCm supports multiple :doc:`installation methods <rocm-install-on-linux:install
If you encounter any issues during installation, refer to the
:doc:`Installation troubleshooting <rocm-install-on-linux:reference/install-faq>` guide.
Machine learning frameworks
===========================
Deep learning frameworks
========================
ROCm supports popular machine learning frameworks and libraries including `PyTorch
ROCm supports deep learning frameworks and libraries including `PyTorch
<https://pytorch.org/blog/pytorch-for-amd-rocm-platform-now-available-as-python-package>`_, `TensorFlow
<https://tensorflow.org>`_, `JAX <https://jax.readthedocs.io/en/latest>`_, and `DeepSpeed
<https://cloudblogs.microsoft.com/opensource/2022/03/21/supporting-efficient-large-model-training-on-amd-instinct-gpus-with-deepspeed/>`_.
<https://tensorflow.org>`_, `JAX <https://jax.readthedocs.io/en/latest>`_, and more.
Review the framework installation documentation. For ease-of-use, it's recommended to use official ROCm prebuilt Docker
Review the :doc:`framework installation documentation <../deep-learning-rocm>`. For ease-of-use, it's recommended to use official ROCm prebuilt Docker
images with the framework pre-installed.
* :doc:`PyTorch for ROCm <rocm-install-on-linux:install/3rd-party/pytorch-install>`
* :doc:`TensorFlow for ROCm <rocm-install-on-linux:install/3rd-party/tensorflow-install>`
* :doc:`JAX for ROCm <rocm-install-on-linux:install/3rd-party/jax-install>`
Next steps
==========

View File

@@ -1,3 +1,5 @@
:orphan:
.. meta::
:description: How to train a model using Megatron-LM for ROCm.
:keywords: ROCm, AI, LLM, train, Megatron-LM, megatron, Llama, tutorial, docker, torch
@@ -6,6 +8,14 @@
Training a model with Megatron-LM for ROCm
******************************************
.. caution::
The ROCm Megatron-LM framework now has limited support with this Docker
environment; it now focuses on Primus with Megatron-Core. See :doc:`primus-megatron`.
To learn how to migrate your existing workloads to Primus with Megatron-Core,
see :doc:`previous-versions/megatron-lm-primus-migration-guide`.
The `Megatron-LM framework for ROCm <https://github.com/ROCm/Megatron-LM>`_ is
a specialized fork of the robust Megatron-LM, designed to enable efficient
training of large-scale language models on AMD GPUs. By leveraging AMD
@@ -20,13 +30,17 @@ essential components, including PyTorch, ROCm libraries, and Megatron-LM
utilities. It contains the following software components to accelerate training
workloads:
.. note::
This Docker environment is based on Python 3.10 and Ubuntu 22.04. For an alternative environment with
Python 3.12 and Ubuntu 24.04, see the :doc:`previous ROCm Megatron-LM v25.6 Docker release <previous-versions/megatron-lm-v25.6>`.
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/training/megatron-lm-benchmark-models.yaml
{% set dockers = data.dockers %}
{% if dockers|length > 1 %}
.. tab-set::
{% for docker in data.dockers %}
{% for docker in dockers %}
.. tab-item:: ``{{ docker.pull_tag }}``
:sync: {{ docker.pull_tag }}
@@ -42,28 +56,14 @@ workloads:
{% endfor %}
{% endfor %}
{% elif dockers|length == 1 %}
.. list-table::
:header-rows: 1
* - Software component
- Version
{% for component_name, component_version in docker.components %}
* - {{ component_name }}
- {{ component_version }}
{% endfor %}
{% endif %}
.. _amd-megatron-lm-model-support:
The following models are pre-optimized for performance on AMD Instinct MI300X series accelerators.
Supported models
================
The following models are supported for training performance benchmarking with Megatron-LM and ROCm.
The following models are supported for training performance benchmarking with Megatron-LM and ROCm
on AMD Instinct MI300X series accelerators.
Some instructions, commands, and training recommendations in this documentation might
vary by model -- select one to get started.
@@ -177,7 +177,7 @@ Download the Docker image
{% if dockers|length > 1 %}
.. tab-set::
{% for docker in data.dockers %}
{% for docker in dockers %}
.. tab-item:: {{ docker.doc_name }}
:sync: {{ docker.pull_tag }}
@@ -227,10 +227,17 @@ Download the Docker image
docker start megatron_training_env
docker exec -it megatron_training_env bash
The Docker container includes a pre-installed, verified version of the ROCm
Megatron-LM development branch
`<https://github.com/ROCm/Megatron-LM/tree/rocm_dev>`__, including necessary
training scripts.
4. **Megatron-LM backward compatibility setup** -- this Docker is primarily intended for use with Primus, but it maintains Megatron-LM compatibility with limited support.
To roll back to using Megatron-LM, follow these steps:
.. code-block:: shell
cd /workspace/Megatron-LM/
pip uninstall megatron-core
pip install -e .
The Docker container hosts
`<https://github.com/ROCm/Megatron-LM/tree/rocm_dev>`__ at verified commit ``e8e9edc``.
.. _amd-megatron-lm-environment-setup:

View File

@@ -73,7 +73,11 @@ document are not validated.
.. code-block:: shell
python3 tools/run_models.py --tags pyt_mpt30b_training --keep-model-dir --live-output --clean-docker-cache
madengine run \
--tags pyt_mpt30b_training \
--keep-model-dir \
--live-output \
--clean-docker-cache
.. tip::
@@ -90,7 +94,7 @@ document are not validated.
For improved performance (training throughput), consider enabling TunableOp.
By default, ``pyt_mpt30b_training`` runs with TunableOp disabled. To enable it,
run ``tools/run_models.py`` with the ``--tunableop on`` argument or edit the
run ``madengine run`` with the ``--tunableop on`` argument or edit the
``models.json`` configuration before running training.
Although this might increase the initial training time, it can result in a performance gain.
@@ -172,4 +176,13 @@ Key performance metrics include:
Overall training loss. A decreasing trend indicates the model is learning effectively.
Further reading
===============
- To learn more about MAD and the ``madengine`` CLI, see the `MAD usage guide <https://github.com/ROCm/MAD?tab=readme-ov-file#usage-guide>`__.
- To learn more about system settings and management practices to configure your system for
AMD Instinct MI300X series accelerators, see `AMD Instinct MI300X system optimization <https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`_.
- For a list of other ready-made Docker images for AI with ROCm, see
`AMD Infinity Hub <https://www.amd.com/en/developer/resources/infinity-hub.html#f-amd_hub_category=AI%20%26%20ML%20Models>`_.

View File

@@ -16,12 +16,20 @@ previous releases of the ``ROCm/megatron-lm`` Docker image on `Docker Hub <https
- Components
- Resources
* - v25.6 (latest)
* - v25.7 (latest)
-
* ROCm
* PyTorch
-
* :doc:`Documentation <../megatron-lm>`
* `Docker Hub (py310) <https://hub.docker.com/layers/rocm/megatron-lm/v25.7_py310/images/sha256-6189df849feeeee3ae31bb1e97aef5006d69d2b90c134e97708c19632e20ab5a>`__
* - v25.6
-
* ROCm 6.4.1
* PyTorch 2.8.0a0+git7d205b2
-
* :doc:`Documentation <../megatron-lm>`
* :doc:`Documentation <megatron-lm-v25.6>`
* `Docker Hub (py312) <https://hub.docker.com/layers/rocm/megatron-lm/v25.6_py312/images/sha256-482ff906532285bceabdf2bda629bd32cb6174d2d07f4243a736378001b28df0>`__
* `Docker Hub (py310) <https://hub.docker.com/layers/rocm/megatron-lm/v25.6_py310/images/sha256-9627bd9378684fe26cb1a10c7dd817868f553b33402e49b058355b0f095568d6>`__

View File

@@ -0,0 +1,175 @@
:orphan:
**********************************************************************
Migrating workloads to Primus (Megatron-Core backend) from Megatron-LM
**********************************************************************
Primus supports Megatron-Core as backend optimization library,
replacing ROCm Megatron-LM. This document outlines the steps to migrate
workload from ROCm Megatron-LM to Primus with the Megatron-Core backend.
Model architecture
==================
ROCm Megatron-LM defines model architecture parameters in the training scripts;
for example, the Llama 3 8B model parameters are defined in
`examples/llama/train_llama3.sh <https://github.com/ROCm/Megatron-LM/blob/rocm_dev/examples/llama/train_llama3.sh#L117>`__
as shown below:
.. code-block:: bash
HIDDEN_SIZE=4096
FFN_HIDDEN_SIZE=14336
NUM_LAYERS=32
NUM_HEADS=32
NUM_KV_HEADS=8
Primus defines the model architecture through model YAML configuration files
inside the ``primus/configs/models/megatron/`` repository. For example, Llama 3 8B
model architecture parameters are defined in
`primus/configs/models/megatron/llama3_8B.yaml <https://github.com/AMD-AIG-AIMA/Primus/blob/v0.1.0-rc1/primus/configs/models/megatron/llama3_8B.yaml>`__
as shown below:
.. code-block:: yaml
bases:
- llama3_base.yaml
tokenizer_type: Llama3Tokenizer
tokenizer_model: meta-llama/Llama-3.1-8B
ffn_hidden_size: 14336
hidden_size: 4096
num_attention_heads: 32
num_layers: 32
num_query_groups: 8
Primus' model config files follow a hierarchical design, meaning that new model
config YAMLs can inherit existing model config files by importing them as
bases. For example,
`llama3.1_8B.yaml <https://github.com/AMD-AIG-AIMA/Primus/blob/v0.1.0-rc1/primus/configs/models/megatron/llama3.1_8B.yaml>`__
uses ``llama3_8B.yaml`` as a base config and overrides few parameters, as shown below.
In this example, ``llama3.1_8B`` overrides the ``max_position_embeddings`` value:
.. code-block:: yaml
bases:
- llama3_8B.yaml
tokenizer_type: Llama3Tokenizer
tokenizer_model: meta-llama/Llama-3.1-8B
max_position_embeddings: 131072
.. tip::
Primus provides ``llama_base.yaml`` as the base configuration, which can be
used as bases for additional model architectures. For example,
`mixtral_base.yaml <https://github.com/AMD-AIG-AIMA/Primus/blob/v0.1.0-rc1/primus/configs/models/megatron/mixtral_base.yaml>`__
and
`deepseek_v3_base.yaml <https://github.com/AMD-AIG-AIMA/Primus/blob/v0.1.0-rc1/primus/configs/models/megatron/deepseek_v3_base.yaml>`__
define ``llama_base.yaml`` as its base.
.. code-block:: yaml
# Example mixtral_base.yaml:
bases:
- llama_base.yaml
init_method_std: 0.01
rotary_base: 1000000
qk_layernorm: false
group_query_attention: true
num_query_groups: 8
# moe parameters
num_experts: 8
moe_router_topk: 2
moe_router_load_balancing_type: aux_loss
moe_aux_loss_coeff: 1e-2
moe_grouped_gemm: true
moe_token_dispatcher_type: alltoall
It is recommended to add a new ``${MODEL_NAME}_base.yaml`` to add a new
category of model and define new models on top of it. For example, to add
Qwen2.5 models in Primus, we define
`qwen2.5_base.yaml <https://github.com/AMD-AIG-AIMA/Primus/blob/v0.1.0-rc1/primus/configs/models/megatron/qwen2.5_base.yaml>`__
and build
`qwen2.5_7B.yaml <https://github.com/AMD-AIG-AIMA/Primus/blob/v0.1.0-rc1/primus/configs/models/megatron/qwen2.5_7B.yaml>`__
and
`qwen2.5_72B.yaml <https://github.com/AMD-AIG-AIMA/Primus/blob/v0.1.0-rc1/primus/configs/models/megatron/qwen2.5_72B.yaml>`__
using ``qwen2.5_base.yaml`` as the base config.
Training parameters
===================
ROCm Megatron-LM also defines the training parameters, like batch size,
tensor-parallelism, precision, as so on, in the training scripts. For example,
Llama3 8B model parameters are defined in
`examples/llama/train_llama3.sh <https://github.com/ROCm/Megatron-LM/blob/rocm_dev/examples/llama/train_llama3.sh>`__
as shown below:
.. code-block:: bash
TP="${TP:-8}"
PP="${PP:-1}"
CP="${CP:-1}"
MBS="${MBS:-1}"
BS="${BS:-8}"
Primus defines the training parameters in top-level YAML files -- see
`examples/megatron/configs/
<https://github.com/AMD-AIG-AIMA/Primus/tree/v0.1.0-rc1/examples/megatron/configs>`__.
For example, the `llama3.1_8B-pretrain.yaml
<https://github.com/AMD-AIG-AIMA/Primus/blob/v0.1.0-rc1/examples/megatron/configs/llama3.1_8B-pretrain.yaml>`__
configuration imports the ``llama3.1_8B.yaml`` model architecture file. Users can then override
the default training parameters in ``llama3.1_8B-pretrain.yaml``.
.. code-block:: yaml
# model to run
model: llama3.1_8B.yaml # Model architecture yaml
overrides:
# log
# disable_wandb: false
# disable_tensorboard: false
stderr_sink_level: DEBUG
log_avg_skip_iterations: 2
log_avg_reset_interval: 50
train_iters: 50
micro_batch_size: 2
global_batch_size: 128
seq_length: 8192
max_position_embeddings: 8192
lr: 1.0e-5
min_lr: 0.0
lr_warmup_iters: 2
lr_decay_iters: null
lr_decay_style: cosine
weight_decay: 0.1
adam_beta1: 0.9
adam_beta2: 0.95
eod_mask_loss: true
init_method_std: 0.008
norm_epsilon: 1.0e-6
Backward compatibility with Megatron-LM
=======================================
The Dockerized environment used for Primus maintains compatibility with Megatron-LM with
limited support. To roll back to using Megatron-LM, follow these steps.
.. code-block:: shell
cd /workspace/Megatron-LM/
pip uninstall megatron-core
pip install -e .
Once Megatron-LM is installed, follow :doc:`the documentation <../megatron-lm>` to run workloads as
usual.

View File

@@ -0,0 +1,602 @@
.. meta::
:description: How to train a model using Megatron-LM for ROCm.
:keywords: ROCm, AI, LLM, train, Megatron-LM, megatron, Llama, tutorial, docker, torch
**********************************************
Training a model with Primus and Megatron-Core
**********************************************
`Primus <https://github.com/AMD-AIG-AIMA/Primus>`__ is a unified and flexible
LLM training framework designed to streamline training. It streamlines LLM
training on AMD Instinct accelerators using a modular, reproducible configuration paradigm.
Primus is backend-agnostic and supports multiple training engines -- including Megatron-Core.
.. note::
Primus with the Megatron-Core backend is intended to replace ROCm
Megatron-LM in this Dockerized training environment. To learn how to migrate
workloads from Megatron-LM to Primus with Megatron-Core, see
:doc:`previous-versions/megatron-lm-primus-migration-guide`.
For ease of use, AMD provides a ready-to-use Docker image for MI300 series accelerators
containing essential components for Primus and Megatron-Core.
.. note::
This Docker environment is based on Python 3.10 and Ubuntu 22.04. For an alternative environment with
Python 3.12 and Ubuntu 24.04, see the :doc:`previous ROCm Megatron-LM v25.6 Docker release <previous-versions/megatron-lm-v25.6>`.
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/training/primus-megatron-benchmark-models.yaml
{% set dockers = data.dockers %}
{% set docker = dockers[0] %}
.. list-table::
:header-rows: 1
* - Software component
- Version
{% for component_name, component_version in docker.components.items() %}
* - {{ component_name }}
- {{ component_version }}
{% endfor %}
.. _amd-primus-megatron-lm-model-support:
Supported models
================
The following models are pre-optimized for performance on AMD Instinct MI300X series accelerators.
Some instructions, commands, and training examples in this documentation might
vary by model -- select one to get started.
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/training/primus-megatron-benchmark-models.yaml
{% set model_groups = data.model_groups %}
.. raw:: html
<div id="vllm-benchmark-ud-params-picker" class="container-fluid">
<div class="row">
<div class="col-2 me-2 model-param-head">Model</div>
<div class="row col-10">
{% for model_group in model_groups %}
<div class="col-3 model-param" data-param-k="model-group" data-param-v="{{ model_group.tag }}" tabindex="0">{{ model_group.group }}</div>
{% endfor %}
</div>
</div>
<div class="row mt-1">
<div class="col-2 me-2 model-param-head">Model variant</div>
<div class="row col-10">
{% for model_group in model_groups %}
{% set models = model_group.models %}
{% for model in models %}
{% if models|length % 3 == 0 %}
<div class="col-4 model-param" data-param-k="model" data-param-v="{{ model.mad_tag }}" data-param-group="{{ model_group.tag }}" tabindex="0">{{ model.model }}</div>
{% else %}
<div class="col-6 model-param" data-param-k="model" data-param-v="{{ model.mad_tag }}" data-param-group="{{ model_group.tag }}" tabindex="0">{{ model.model }}</div>
{% endif %}
{% endfor %}
{% endfor %}
</div>
</div>
</div>
.. note::
Some models, such as Llama, require an external license agreement through
a third party (for example, Meta).
System validation
=================
Before running AI workloads, it's important to validate that your AMD hardware is configured
correctly and performing optimally.
If you have already validated your system settings, including aspects like NUMA auto-balancing, you
can skip this step. Otherwise, complete the procedures in the :ref:`System validation and
optimization <rocm-for-ai-system-optimization>` guide to properly configure your system settings
before starting training.
To test for optimal performance, consult the recommended :ref:`System health benchmarks
<rocm-for-ai-system-health-bench>`. This suite of tests will help you verify and fine-tune your
system's configuration.
.. _mi300x-amd-primus-megatron-lm-training:
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/training/primus-megatron-benchmark-models.yaml
{% set dockers = data.dockers %}
{% set docker = dockers[0] %}
Environment setup
=================
Use the following instructions to set up the environment, configure the script to train models, and
reproduce the benchmark results on MI300X series accelerators with the ``{{ docker.pull_tag }}`` image.
.. _amd-primus-megatron-lm-requirements:
Download the Docker image
-------------------------
1. Use the following command to pull the Docker image from Docker Hub.
.. code-block:: shell
docker pull {{ docker.pull_tag }}
2. Launch the Docker container.
.. code-block:: shell
docker run -it \
--device /dev/dri \
--device /dev/kfd \
--device /dev/infiniband \
--network host --ipc host \
--group-add video \
--cap-add SYS_PTRACE \
--security-opt seccomp=unconfined \
--privileged \
-v $HOME:$HOME \
--shm-size 128G \
--name primus_training_env \
{{ docker.pull_tag }}
3. Use these commands if you exit the ``primus_training_env`` container and need to return to it.
.. code-block:: shell
docker start primus_training_env
docker exec -it primus_training_env bash
The Docker container hosts verified release tag ``v0.1.0-rc1`` of the `Primus
<https://github.com/AMD-AIG-AIMA/Primus/tree/v0.1.0-rc1>`__ repository.
.. _amd-primus-megatron-lm-environment-setup:
Configuration
=============
Primus defines a training configuration in YAML for each model in
`examples/megatron/configs <https://github.com/AMD-AIG-AIMA/Primus/tree/v0.1.0-rc1/examples/megatron/configs>`__.
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/training/primus-megatron-benchmark-models.yaml
{% set model_groups = data.model_groups %}
{% for model_group in model_groups %}
{% for model in model_group.models %}
.. container:: model-doc {{ model.mad_tag }}
To update training parameters for {{ model.model }}, you can update ``examples/megatron/configs/{{ model.config_name }}``.
Note that training configuration YAML files for other models follow this naming convention.
{% endfor %}
{% endfor %}
.. note::
See :ref:`Key options <amd-primus-megatron-lm-benchmark-test-vars>` for more information on configuration options.
Dataset options
---------------
You can use either mock data or real data for training.
* Mock data can be useful for testing and validation. Use the ``mock_data`` field to toggle between mock and real data. The default
value is ``true`` for enabled.
.. code-block:: yaml
mock_data: true
* If you're using a real dataset, update the ``train_data_path`` field to point to the location of your dataset.
.. code-block:: bash
mock_data: false
train_data_path: /path/to/your/dataset
Ensure that the files are accessible inside the Docker container.
.. _amd-primus-megatron-lm-tokenizer:
Tokenizer
---------
In Primus, each model uses a tokenizer from Hugging Face. For example, Llama
3.1 8B model uses ``tokenizer_model: meta-llama/Llama-3.1-8B`` and
``tokenizer_type: Llama3Tokenizer`` defined in the `llama3.1-8B model
<https://github.com/AMD-AIG-AIMA/Primus/tree/v0.1.0-rc1/primus/configs/models/megatron/llama3.1_8B.yaml>`__
definition. As such, you need to set the ``HF_TOKEN`` environment variable with
right permissions to access the tokenizer for each model.
.. code-block:: bash
# Export your HF_TOKEN in the workspace
export HF_TOKEN=<your_hftoken>
.. _amd-primus-megatron-lm-run-training:
Run training
============
Use the following example commands to set up the environment, configure
:ref:`key options <amd-primus-megatron-lm-benchmark-test-vars>`, and run training on
MI300X series accelerators with the AMD Megatron-LM environment.
Single node training
--------------------
To run training on a single node, navigate to ``/workspace/Primus`` and use the following setup command:
.. code-block:: shell
pip install -r requirements.txt
export HSA_NO_SCRATCH_RECLAIM=1
export NVTE_CK_USES_BWD_V3=1
Once setup is complete, run the appropriate training command.
.. container:: model-doc primus_pyt_megatron_lm_train_llama-3.3-70b
To run pre-training for Llama 3.3 70B BF16, run:
.. code-block:: shell
EXP=examples/megatron/configs/llama3.3_70B-pretrain.yaml \
bash ./examples/run_pretrain.sh \
--micro_batch_size 2 \
--global_batch_size 16 \
--train_iters 50
.. container:: model-doc primus_pyt_megatron_lm_train_llama-3.1-8b
To run pre-training for Llama 3.1 8B FP8, run:
.. code-block:: shell
EXP=examples/megatron/configs/llama3.1_8B-pretrain.yaml \
bash ./examples/run_pretrain.sh \
--train_iters 50 \
--fp8 hybrid
For Llama 3.1 8B BF16, use the following command:
.. code-block:: shell
EXP=examples/megatron/configs/llama3.1_8B-pretrain.yaml \
bash ./examples/run_pretrain.sh --train_iters 50
.. container:: model-doc primus_pyt_megatron_lm_train_llama-3.1-70b
To run pre-training for Llama 3.1 70B BF16, run:
.. code-block:: shell
EXP=examples/megatron/configs/llama3.1_70B-pretrain.yaml \
bash ./examples/run_pretrain.sh \
--train_iters 50
To run the training on a single node for Llama 3.1 70B FP8 with proxy, use the following command:
.. code-block:: shell
EXP=examples/megatron/configs/llama3.1_70B-pretrain.yaml \
bash ./examples/run_pretrain.sh \
--train_iters 50 \
--num_layers 40 \
--fp8 hybrid \
--no_fp8_weight_transpose_cache true
.. note::
Use two or more nodes to run the *full* Llama 70B model with FP8 precision.
.. container:: model-doc primus_pyt_megatron_lm_train_llama-2-7b
To run pre-training for Llama 2 7B FP8, run:
.. code-block:: shell
EXP=examples/megatron/configs/llama2_7B-pretrain.yaml \
bash ./examples/run_pretrain.sh \
--train_iters 50 \
--fp8 hybrid
To run pre-training for Llama 2 7B BF16, run:
.. code-block:: shell
EXP=examples/megatron/configs/llama2_7B-pretrain.yaml \
bash ./examples/run_pretrain.sh --train_iters 50
.. container:: model-doc primus_pyt_megatron_lm_train_llama-2-70b
To run pre-training for Llama 2 70B BF16, run:
.. code-block:: shell
EXP=examples/megatron/configs/llama2_70B-pretrain.yaml \
bash ./examples/run_pretrain.sh --train_iters 50
.. container:: model-doc primus_pyt_megatron_lm_train_deepseek-v3-proxy
To run training on a single node for DeepSeek-V3 (MoE with expert parallel) with 3-layer proxy,
use the following command:
.. code-block:: shell
EXP=examples/megatron/configs/deepseek_v3-pretrain.yaml \
bash examples/run_pretrain.sh \
--num_layers 3 \
--moe_layer_freq 1 \
--train_iters 50
.. container:: model-doc primus_pyt_megatron_lm_train_deepseek-v2-lite-16b
To run training on a single node for DeepSeek-V2-Lite (MoE with expert parallel),
use the following command:
.. code-block:: shell
EXP=examples/megatron/configs/deepseek_v2_lite-pretrain.yaml \
bash examples/run_pretrain.sh \
--global_batch_size 256 \
--train_iters 50
.. container:: model-doc primus_pyt_megatron_lm_train_mixtral-8x7b
To run training on a single node for Mixtral 8x7B (MoE with expert parallel),
use the following command:
.. code-block:: shell
EXP=examples/megatron/configs/mixtral_8x7B_v0.1-pretrain.yaml \
bash examples/run_pretrain.sh --train_iters 50
.. container:: model-doc primus_pyt_megatron_lm_train_mixtral-8x22b-proxy
To run training on a single node for Mixtral 8x7B (MoE with expert parallel) with 4-layer proxy,
use the following command:
.. code-block:: shell
EXP=examples/megatron/configs/mixtral_8x22B_v0.1-pretrain.yaml \
bash examples/run_pretrain.sh \
--num_layers 4 \
--pipeline_model_parallel_size 1 \
--micro_batch_size 1 \
--global_batch_size 16 \
--train_iters 50
.. container:: model-doc primus_pyt_megatron_lm_train_qwen2.5-7b
To run training on a single node for Qwen 2.5 7B BF16, use the following
command:
.. code-block:: shell
EXP=examples/megatron/configs/qwen2.5_7B-pretrain.yaml \
bash examples/run_pretrain.sh --train_iters 50
For FP8, use the following command.
.. code-block:: shell
EXP=examples/megatron/configs/qwen2.5_7B-pretrain.yaml \
bash examples/run_pretrain.sh \
--train_iters 50 \
--fp8 hybrid
.. container:: model-doc primus_pyt_megatron_lm_train_qwen2.5-72b
To run the training on a single node for Qwen 2.5 72B BF16, use the following command.
.. code-block:: shell
EXP=examples/megatron/configs/qwen2.5_72B-pretrain.yaml \
bash examples/run_pretrain.sh --train_iters 50
Multi-node training examples
----------------------------
To run training on multiple nodes, you can use the
`run_slurm_pretrain.sh <https://github.com/AMD-AIG-AIMA/Primus/tree/v0.1.0-rc1/examples/run_slurm_pretrain.sh>`__
to launch the multi-node workload. Use the following steps to setup your environment:
.. datatemplate:yaml:: /data/how-to/rocm-for-ai/training/primus-megatron-benchmark-models.yaml
{% set dockers = data.dockers %}
{% set docker = dockers[0] %}
.. code-block:: shell
cd /workspace/Primus/
export DOCKER_IMAGE={{ docker.pull_tag }}
export HF_TOKEN=<your_HF_token>
export HSA_NO_SCRATCH_RECLAIM=1
export NVTE_CK_USES_BWD_V3=1
export NCCL_IB_HCA=<your_NCCL_IB_HCA> # specify which RDMA interfaces to use for communication
export NCCL_SOCKET_IFNAME=<your_NCCL_SOCKET_IFNAME> # your Network Interface
export GLOO_SOCKET_IFNAME=<your_GLOO_SOCKET_IFNAME> # your Network Interface
export NCCL_IB_GID_INDEX=3 # Set InfiniBand GID index for NCCL communication. Default is 3 for ROCE
.. note::
* Make sure correct network drivers are installed on the nodes. If inside a Docker, either install the drivers inside the Docker container or pass the network drivers from the host while creating Docker container.
* If ``NCCL_IB_HCA`` and ``NCCL_SOCKET_IFNAME`` are not set, Primus will try to auto-detect. However, since NICs can vary accross different cluster, it is encouraged to explicitly export your NCCL parameters for the cluster.
* To find your network interface, you can use ``ip a``.
* To find RDMA interfaces, you can use ``ibv_devices`` to get the list of all the RDMA/IB devices.
.. container:: model-doc primus_pyt_megatron_lm_train_llama-3.3-70b
To train Llama 3.3 70B FP8 on 8 nodes, run:
.. code-block:: shell
NNODES=8 EXP=examples/megatron/configs/llama3.3_70B-pretrain.yaml \
bash examples/run_slurm_pretrain.sh \
--micro_batch_size 4 \
--global_batch_size 256 \
--recompute_num_layers 80 \
--no_fp8_weight_transpose_cache true \
--fp8 hybrid
To train Llama 3.3 70B BF16 on 8 nodes, run:
.. code-block:: shell
NNODES=8 EXP=examples/megatron/configs/llama3.3_70B-pretrain.yaml \
bash examples/run_slurm_pretrain.sh \
--micro_batch_size 1 \
--global_batch_size 256 \
--recompute_num_layers 12
.. container:: model-doc primus_pyt_megatron_lm_train_llama-3.1-8b
To train Llama 3.1 8B FP8 on 8 nodes, run:
.. code-block:: shell
# Adjust the training parameters. For e.g., `global_batch_size: 8 * #single_node_bs` for 8 nodes in this case
NNODES=8 EXP=examples/megatron/configs/llama3.1_8B-pretrain.yaml \
bash ./examples/run_slurm_pretrain.sh \
--global_batch_size 1024 \
--fp8 hybrid
.. container:: model-doc primus_pyt_megatron_lm_train_llama-3.1-70b
To train Llama 3.1 70B FP8 on 8 nodes, run:
.. code-block:: shell
NNODES=8 EXP=examples/megatron/configs/llama3.1_70B-pretrain.yaml \
bash examples/run_slurm_pretrain.sh \
--micro_batch_size 4 \
--global_batch_size 256 \
--recompute_num_layers 80 \
--no_fp8_weight_transpose_cache true \
--fp8 hybrid
To train Llama 3.1 70B BF16 on 8 nodes, run:
.. code-block:: shell
NNODES=8 EXP=examples/megatron/configs/llama3.1_70B-pretrain.yaml \
bash examples/run_slurm_pretrain.sh \
--micro_batch_size 1 \
--global_batch_size 256 \
--recompute_num_layers 12
.. container:: model-doc primus_pyt_megatron_lm_train_llama-2-7b
To train Llama 2 8B FP8 on 8 nodes, run:
.. code-block:: shell
# Adjust the training parameters. For e.g., `global_batch_size: 8 * #single_node_bs` for 8 nodes in this case
NNODES=8 EXP=examples/megatron/configs/llama2_7B-pretrain.yaml bash ./examples/run_slurm_pretrain.sh --global_batch_size 2048 --fp8 hybrid
.. container:: model-doc primus_pyt_megatron_lm_train_llama-2-70b
To train Llama 2 70B FP8 on 8 nodes, run:
.. code-block:: shell
NNODES=8 EXP=examples/megatron/configs/llama2_70B-pretrain.yaml \
bash examples/run_slurm_pretrain.sh \
--micro_batch_size 10 \
--global_batch_size 640 \
--recompute_num_layers 80 \
--no_fp8_weight_transpose_cache true \
--fp8 hybrid
To train Llama 2 70B BF16 on 8 nodes, run:
.. code-block:: shell
NNODES=8 EXP=examples/megatron/configs/llama2_70B-pretrain.yaml \
bash ./examples/run_slurm_pretrain.sh \
--micro_batch_size 2 \
--global_batch_size 1536 \
--recompute_num_layers 12
.. container:: model-doc primus_pyt_megatron_lm_train_mixtral-8x7b
To train Mixtral 8x7B BF16 on 8 nodes, run:
.. code-block:: shell
NNODES=8 EXP=examples/megatron/configs/mixtral_8x7B_v0.1-pretrain.yaml \
bash examples/run_slurm_pretrain.sh \
--micro_batch_size 2 \
--global_batch_size 256
.. container:: model-doc primus_pyt_megatron_lm_train_qwen2.5-72b
To train Qwen2.5 72B FP8 on 8 nodes, run:
.. code-block:: shell
NNODES=8 EXP=examples/megatron/configs/qwen2.5_72B-pretrain.yaml \
bash examples/run_slurm_pretrain.sh \
--micro_batch_size 8 \
--global_batch_size 512 \
--recompute_num_layers 80 \
--no_fp8_weight_transpose_cache true \
--fp8 hybrid
.. _amd-primus-megatron-lm-benchmark-test-vars:
Key options
-----------
The following are key options to take note of
fp8
``hybrid`` enables FP8 GEMMs.
use_torch_fsdp2
``use_torch_fsdp2: 1`` enables torch fsdp-v2. If FSDP is enabled,
set ``use_distributed_optimizer`` and ``overlap_param_gather`` to ``false``.
profile
To enable PyTorch profiling, set these parameters:
.. code-block:: yaml
profile: true
use_pytorch_profiler: true
profile_step_end: 7
profile_step_start: 6
train_iters
The total number of iterations (default: 50).
mock_data
True by default.
micro_batch_size
Micro batch size.
global_batch_size
Global batch size.
recompute_granularity
For activation checkpointing.
num_layers
For using a reduced number of layers as with proxy models.
Previous versions
=================
See :doc:`previous-versions/megatron-lm-history` to find documentation for previous releases
of the ``ROCm/megatron-lm`` Docker image.
This training environment now uses Primus with Megatron as the primary
configuration. Limited support for the legacy ROCm Megatron-LM is still
available. For instructions on using ROCm Megatron-LM, see the
:doc:`megatron-lm` document.

View File

@@ -142,7 +142,11 @@ The following models are pre-optimized for performance on the AMD Instinct MI325
.. code-block:: shell
export MAD_SECRETS_HFTOKEN="your personal Hugging Face token to access gated models"
python3 tools/run_models.py --tags {{ model.mad_tag }} --keep-model-dir --live-output --timeout 28800
madengine run \
--tags {{ model.mad_tag }} \
--keep-model-dir \
--live-output \
--timeout 28800
MAD launches a Docker container with the name
``container_ci-{{ model.mad_tag }}``, for example. The latency and throughput reports of the
@@ -427,6 +431,17 @@ The following models are pre-optimized for performance on the AMD Instinct MI325
For examples of benchmarking commands, see `<https://github.com/ROCm/MAD/tree/develop/benchmark/pytorch_train#benchmarking-examples>`__.
Further reading
===============
- To learn more about MAD and the ``madengine`` CLI, see the `MAD usage guide <https://github.com/ROCm/MAD?tab=readme-ov-file#usage-guide>`__.
- To learn more about system settings and management practices to configure your system for
AMD Instinct MI300X series accelerators, see `AMD Instinct MI300X system optimization <https://instinct.docs.amd.com/projects/amdgpu-docs/en/latest/system-optimization/mi300x.html>`_.
- For a list of other ready-made Docker images for AI with ROCm, see
`AMD Infinity Hub <https://www.amd.com/en/developer/resources/infinity-hub.html#f-amd_hub_category=AI%20%26%20ML%20Models>`_.
Previous versions
=================

View File

@@ -21,6 +21,8 @@ In this guide, you'll learn about:
- Training a model
- :doc:`With Primus (Megatron-LM backend) <benchmark-docker/primus-megatron>`
- :doc:`With Megatron-LM <benchmark-docker/megatron-lm>`
- :doc:`With PyTorch <benchmark-docker/pytorch-training>`

View File

@@ -285,7 +285,7 @@ For more information about ROCm hardware compatibility, see the ROCm `Compatibil
- Radeon AI PRO R9700
- RDNA4
- gfx1201
- 16
- 32
- 64
- 32 or 64
- 128

View File

@@ -55,7 +55,7 @@ The floating-point types supported by ROCm are listed in the following table.
.. list-table::
:header-rows: 1
:widths: 15,15,70
:widths: 15,25,60
*
- Type name
@@ -63,18 +63,19 @@ The floating-point types supported by ROCm are listed in the following table.
- Description
*
- float8 (E4M3)
- ``__hip_fp8_e4m3_fnuz``
- An 8-bit floating-point number that mostly follows IEEE-754 conventions
and **S1E4M3** bit layout, as described in `8-bit Numerical Formats for Deep Neural Networks <https://arxiv.org/abs/2206.02915>`_,
with expanded range and no infinity or signed zero. NaN is represented
as negative zero.
- | ``__hip_fp8_e4m3_fnuz``,
| ``__hip_fp8_e4m3``
- An 8-bit floating-point number with **S1E4M3** bit layout, as described in :doc:`low precision floating point types page <hip:reference/low_fp_types>`.
The FNUZ variant has expanded range with no infinity or signed zero (NaN represented as negative zero),
while the OCP variant follows the Open Compute Project specification.
*
- float8 (E5M2)
- ``__hip_fp8_e5m2_fnuz``
- An 8-bit floating-point number mostly following IEEE-754 conventions and
**S1E5M2** bit layout, as described in `8-bit Numerical Formats for Deep Neural Networks <https://arxiv.org/abs/2206.02915>`_,
with expanded range and no infinity or signed zero. NaN is represented
as negative zero.
- | ``__hip_fp8_e5m2_fnuz``,
| ``__hip_fp8_e5m2``
- An 8-bit floating-point number with **S1E5M2** bit layout, as described in :doc:`low precision floating point types page <hip:reference/low_fp_types>`.
The FNUZ variant has expanded range with no infinity or signed zero (NaN represented as negative zero),
while the OCP variant follows the Open Compute Project specification.
*
- float16
- ``half``
@@ -107,9 +108,8 @@ The floating-point types supported by ROCm are listed in the following table.
* The float8 and tensorfloat32 types are internal types used in calculations
in Matrix Cores and can be stored in any type of the same size.
* The encodings for FP8 (E5M2) and FP8 (E4M3) that the
MI300 series natively supports differ from the FP8 (E5M2) and FP8 (E4M3)
encodings used in NVIDIA H100
* CNDA3 natively supports FP8 FNUZ (E4M3 and E5M2), which differs from the customised
FP8 format used in NVIDIA's H100
(`FP8 Formats for Deep Learning <https://arxiv.org/abs/2209.05433>`_).
* In some AMD documents and articles, float8 (E5M2) is referred to as bfloat8.
@@ -128,7 +128,7 @@ pages.
:header-rows: 1
*
- Icon
- Icon
- Definition
*
@@ -163,12 +163,137 @@ pages.
* Any type can be emulated by software, but this page does not cover such
cases.
Data type support by Hardware Architecture
Data type support by hardware architecture
==========================================
The MI200 series GPUs, which include MI210, MI250, and MI250X, are based on the
CDNA2 architecture. The MI300 series GPUs, consisting of MI300A, MI300X, and
MI325X, are based on the CDNA3 architecture.
AMD's GPU lineup spans multiple architecture generations:
* CDNA1 architecture: includes models such as MI100
* CDNA2 architecture: includes models such as MI210, MI250, and MI250X
* CDNA3 architecture: includes models such as MI300A, MI300X, and MI325X
* RDNA3 architecture: includes models such as RX 7900XT and RX 7900XTX
* RDNA4 architecture: includes models such as RX 9070 and RX 9070XT
HIP C++ type implementation support
-----------------------------------
The HIP C++ types available on different hardware platforms are listed in the
following table.
.. list-table::
:header-rows: 1
*
- HIP C++ Type
- CDNA1
- CDNA2
- CDNA3
- RDNA3
- RDNA4
*
- ``int8_t``, ``uint8_t``
-
-
-
-
-
*
- ``int16_t``, ``uint16_t``
-
-
-
-
-
*
- ``int32_t``, ``uint32_t``
-
-
-
-
-
*
- ``int64_t``, ``uint64_t``
-
-
-
-
-
*
- ``__hip_fp8_e4m3_fnuz``
-
-
-
-
-
*
- ``__hip_fp8_e5m2_fnuz``
-
-
-
-
-
*
- ``__hip_fp8_e4m3``
-
-
-
-
-
*
- ``__hip_fp8_e5m2``
-
-
-
-
-
*
- ``half``
-
-
-
-
-
*
- ``bfloat16``
-
-
-
-
-
*
- ``float``
-
-
-
-
-
*
- ``double``
-
-
-
-
-
.. note::
Library support for specific data types is contingent upon hardware support.
Even if a ROCm library indicates support for a particular data type, that type
will only be fully functional if the underlying hardware architecture (as shown
in the table above) also supports it. For example, fp8 types are only available
on architectures shown with a checkmark in the relevant rows.
Compute units support
---------------------
@@ -190,19 +315,33 @@ The following table lists data type support for compute units.
- int32
- int64
*
- MI100
- CDNA1
-
-
-
-
*
- MI200 series
- CDNA2
-
-
-
-
*
- MI300 series
- CDNA3
-
-
-
-
*
- RDNA3
-
-
-
-
*
- RDNA4
-
-
-
@@ -224,7 +363,7 @@ The following table lists data type support for compute units.
- float32
- float64
*
- MI100
- CDNA1
-
-
-
@@ -233,7 +372,7 @@ The following table lists data type support for compute units.
-
-
*
- MI200 series
- CDNA2
-
-
-
@@ -242,7 +381,27 @@ The following table lists data type support for compute units.
-
-
*
- MI300 series
- CDNA3
-
-
-
-
-
-
-
*
- RDNA3
-
-
-
-
-
-
-
*
- RDNA4
-
-
-
@@ -271,19 +430,33 @@ The following table lists data type support for AMD GPU matrix cores.
- int32
- int64
*
- MI100
- CDNA1
-
-
-
-
*
- MI200 series
- CDNA2
-
-
-
-
*
- MI300 series
- CDNA3
-
-
-
-
*
- RDNA3
-
-
-
-
*
- RDNA4
-
-
-
@@ -305,7 +478,7 @@ The following table lists data type support for AMD GPU matrix cores.
- float32
- float64
*
- MI100
- CDNA1
-
-
-
@@ -314,7 +487,7 @@ The following table lists data type support for AMD GPU matrix cores.
-
-
*
- MI200 series
- CDNA2
-
-
-
@@ -323,7 +496,7 @@ The following table lists data type support for AMD GPU matrix cores.
-
-
*
- MI300 series
- CDNA3
-
-
-
@@ -332,6 +505,26 @@ The following table lists data type support for AMD GPU matrix cores.
-
-
*
- RDNA3
-
-
-
-
-
-
-
*
- RDNA4
-
-
-
-
-
-
-
Atomic operations support
-------------------------
@@ -357,19 +550,33 @@ page.
- int32
- int64
*
- MI100
- CDNA1
-
-
-
-
*
- MI200 series
- CDNA2
-
-
-
-
*
- MI300 series
- CDNA3
-
-
-
-
*
- RDNA3
-
-
-
-
*
- RDNA4
-
-
-
@@ -391,7 +598,7 @@ page.
- float32
- float64
*
- MI100
- CDNA1
-
-
-
@@ -400,7 +607,7 @@ page.
-
-
*
- MI200 series
- CDNA2
-
-
-
@@ -409,7 +616,7 @@ page.
-
-
*
- MI300 series
- CDNA3
-
-
-
@@ -418,6 +625,26 @@ page.
-
-
*
- RDNA3
-
-
-
-
-
-
-
*
- RDNA4
-
-
-
-
-
-
-
.. note::
You can emulate atomic operations using software for cases that are not
@@ -452,36 +679,98 @@ detailed description.
- int16
- int32
- int64
*
- :doc:`hipSPARSELt <hipsparselt:reference/data-type-support>`
- :doc:`Composable Kernel <composable_kernel:reference/Composable_Kernel_supported_scalar_types>`
- ✅/✅
- ❌/❌
- ✅/✅
- ❌/❌
- ❌/❌
*
- :doc:`rocRAND <rocrand:api-reference/data-type-support>`
- NA/✅
- NA/✅
- NA/✅
- NA/✅
*
- :doc:`hipRAND <hiprand:api-reference/data-type-support>`
- NA/✅
- NA/✅
- NA/✅
- NA/✅
*
- :doc:`rocPRIM <rocprim:reference/data-type-support>`
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
*
- :doc:`hipCUB <hipcub:api-reference/data-type-support>`
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
*
- :doc:`hipRAND <hiprand:api-reference/data-type-support>`
- NA/✅
- NA/✅
- NA/✅
- NA/✅
*
- :doc:`hipSOLVER <hipsolver:reference/precision>`
- ❌/❌
- ❌/❌
- ❌/❌
- ❌/❌
*
- :doc:`hipSPARSELt <hipsparselt:reference/data-type-support>`
- ✅/✅
- ❌/❌
- ❌/❌
- ❌/❌
*
- :doc:`hipTensor <hiptensor:api-reference/api-reference>`
- ❌/❌
- ❌/❌
- ❌/❌
- ❌/❌
*
- :doc:`MIGraphX <amdmigraphx:reference/cpp>`
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
*
- :doc:`MIOpen <miopen:reference/datatypes>`
- ⚠️/⚠️
- ❌/❌
- ⚠️/⚠️
- ❌/❌
*
- :doc:`RCCL <rccl:api-reference/library-specification>`
- ✅/✅
- ❌/❌
- ✅/✅
- ✅/✅
*
- :doc:`rocFFT <rocfft:reference/api>`
- ❌/❌
- ❌/❌
- ❌/❌
- ❌/❌
*
- :doc:`rocPRIM <rocprim:reference/data-type-support>`
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
*
- :doc:`rocRAND <rocrand:api-reference/data-type-support>`
- NA/✅
- NA/✅
- NA/✅
- NA/✅
*
- :doc:`rocSOLVER <rocsolver:reference/precision>`
- ❌/❌
- ❌/❌
- ❌/❌
- ❌/❌
*
- :doc:`rocThrust <rocthrust:data-type-support>`
- ✅/✅
@@ -489,6 +778,14 @@ detailed description.
- ✅/✅
- ✅/✅
*
- :doc:`rocWMMA <rocwmma:api-reference/api-reference-guide>`
- ✅/✅
- ❌/❌
- ❌/✅
- ❌/❌
.. tab-item:: Floating-point types
:sync: floating-point-type
@@ -504,42 +801,17 @@ detailed description.
- tensorfloat32
- float32
- float64
*
- :doc:`hipSPARSELt <hipsparselt:reference/data-type-support>`
- ❌/❌
- ❌/❌
- :doc:`Composable Kernel <composable_kernel:reference/Composable_Kernel_supported_scalar_types>`
- ✅/✅
- ✅/✅
- ❌/❌
- ❌/❌
- ❌/❌
*
- :doc:`rocRAND <rocrand:api-reference/data-type-support>`
- NA/❌
- NA/❌
- NA/✅
- NA/❌
- NA/❌
- NA/✅
- NA/✅
*
- :doc:`hipRAND <hiprand:api-reference/data-type-support>`
- NA/❌
- NA/❌
- NA/✅
- NA/❌
- NA/❌
- NA/✅
- NA/✅
*
- :doc:`rocPRIM <rocprim:reference/data-type-support>`
- ❌/❌
- ❌/❌
- ✅/✅
- ✅/✅
- ❌/❌
- ✅/✅
- ✅/✅
*
- :doc:`hipCUB <hipcub:api-reference/data-type-support>`
- ❌/❌
@@ -549,6 +821,117 @@ detailed description.
- ❌/❌
- ✅/✅
- ✅/✅
*
- :doc:`hipRAND <hiprand:api-reference/data-type-support>`
- NA/❌
- NA/❌
- NA/✅
- NA/❌
- NA/❌
- NA/✅
- NA/✅
*
- :doc:`hipSOLVER <hipsolver:reference/precision>`
- ❌/❌
- ❌/❌
- ❌/❌
- ❌/❌
- ❌/❌
- ✅/✅
- ✅/✅
*
- :doc:`hipSPARSELt <hipsparselt:reference/data-type-support>`
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
- ❌/❌
- ❌/❌
- ❌/❌
*
- :doc:`hipTensor <hiptensor:api-reference/api-reference>`
- ❌/❌
- ❌/❌
- ✅/✅
- ✅/✅
- ❌/❌
- ✅/✅
- ✅/✅
*
- :doc:`MIGraphX <amdmigraphx:reference/cpp>`
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
*
- :doc:`MIOpen <miopen:reference/datatypes>`
- ⚠️/⚠️
- ⚠️/⚠️
- ✅/✅
- ⚠️/⚠️
- ❌/❌
- ✅/✅
- ⚠️/⚠️
*
- :doc:`RCCL <rccl:api-reference/library-specification>`
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
- ❌/❌
- ✅/✅
- ✅/✅
*
- :doc:`rocFFT <rocfft:reference/api>`
- ❌/❌
- ❌/❌
- ✅/✅
- ❌/❌
- ❌/❌
- ✅/✅
- ✅/✅
*
- :doc:`rocPRIM <rocprim:reference/data-type-support>`
- ❌/❌
- ❌/❌
- ✅/✅
- ✅/✅
- ❌/❌
- ✅/✅
- ✅/✅
*
- :doc:`rocRAND <rocrand:api-reference/data-type-support>`
- NA/❌
- NA/❌
- NA/✅
- NA/❌
- NA/❌
- NA/✅
- NA/✅
*
- :doc:`rocSOLVER <rocsolver:reference/precision>`
- ❌/❌
- ❌/❌
- ❌/❌
- ❌/❌
- ❌/❌
- ✅/✅
- ✅/✅
*
- :doc:`rocThrust <rocthrust:data-type-support>`
- ❌/❌
@@ -559,62 +942,123 @@ detailed description.
- ✅/✅
- ✅/✅
*
- :doc:`rocWMMA <rocwmma:api-reference/api-reference-guide>`
- ✅/❌
- ✅/❌
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
- ✅/✅
.. note::
As random number generation libraries, rocRAND and hipRAND only specify output
data types for the random values they generate, with no need for input data
types.
Libraries internal calculations type support
--------------------------------------------
hipDataType enumeration
-----------------------
The following tables list ROCm library support for specific internal data types.
Refer to the corresponding library data type support page for a detailed
description.
The ``hipDataType`` enumeration defines data precision types and is primarily
used when the data reference itself does not include type information, such as
in ``void*`` pointers. This enumeration is mainly utilized in BLAS libraries.
The HIP type equivalents of the ``hipDataType`` enumeration are listed in the
following table with descriptions and values.
.. tab-set::
.. list-table::
:header-rows: 1
:widths: 25,25,10,40
.. tab-item:: Integral types
:sync: integral-type
*
- hipDataType
- HIP type
- Value
- Description
.. list-table::
:header-rows: 1
*
- ``HIP_R_8I``
- ``int8_t``
- 3
- 8-bit real signed integer.
*
- Library internal data type name
- int8
- int16
- int32
- int64
*
- :doc:`hipSPARSELt <hipsparselt:reference/data-type-support>`
-
-
-
-
*
- ``HIP_R_8U``
- ``uint8_t``
- 8
- 8-bit real unsigned integer.
*
- ``HIP_R_16I``
- ``int16_t``
- 20
- 16-bit real signed integer.
.. tab-item:: Floating-point types
:sync: floating-point-type
*
- ``HIP_R_16U``
- ``uint16_t``
- 22
- 16-bit real unsigned integer.
.. list-table::
:header-rows: 1
*
- ``HIP_R_32I``
- ``int32_t``
- 10
- 32-bit real signed integer.
*
- Library internal data type name
- float8 (E4M3)
- float8 (E5M2)
- float16
- bfloat16
- tensorfloat32
- float32
- float64
*
- :doc:`hipSPARSELt <hipsparselt:reference/data-type-support>`
-
-
-
-
-
-
-
*
- ``HIP_R_32U``
- ``uint32_t``
- 12
- 32-bit real unsigned integer.
*
- ``HIP_R_32F``
- ``float``
- 0
- 32-bit real single precision floating-point.
*
- ``HIP_R_64F``
- ``double``
- 1
- 64-bit real double precision floating-point.
*
- ``HIP_R_16F``
- ``half``
- 2
- 16-bit real half precision floating-point.
*
- ``HIP_R_16BF``
- ``bfloat16``
- 14
- 16-bit real bfloat16 precision floating-point.
*
- ``HIP_R_8F_E4M3``
- ``__hip_fp8_e4m3``
- 28
- 8-bit real float8 precision floating-point (OCP version).
*
- ``HIP_R_8F_E5M2``
- ``__hip_fp8_e5m2``
- 29
- 8-bit real bfloat8 precision floating-point (OCP version).
*
- ``HIP_R_8F_E4M3_FNUZ``
- ``__hip_fp8_e4m3_fnuz``
- 1000
- 8-bit real float8 precision floating-point (FNUZ version).
*
- ``HIP_R_8F_E5M2_FNUZ``
- ``__hip_fp8_e5m2_fnuz``
- 1001
- 8-bit real bfloat8 precision floating-point (FNUZ version).
The full list of the ``hipDataType`` enumeration listed in `library_types.h <https://github.com/ROCm/hip/blob/amd-staging/include/hip/library_types.h>`_ .

View File

@@ -10,6 +10,7 @@
| Version | Release date |
| ------- | ------------ |
| [6.4.3](https://rocm.docs.amd.com/en/docs-6.4.3/) | August 7, 2025 |
| [6.4.2](https://rocm.docs.amd.com/en/docs-6.4.2/) | July 21, 2025 |
| [6.4.1](https://rocm.docs.amd.com/en/docs-6.4.1/) | May 21, 2025 |
| [6.4.0](https://rocm.docs.amd.com/en/docs-6.4.0/) | April 11, 2025 |

View File

@@ -19,14 +19,32 @@ subtrees:
- caption: Install
entries:
- url: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/
- url: https://rocm.docs.amd.com/projects/install-on-linux/en/${branch}/
title: ROCm on Linux
- url: https://rocm.docs.amd.com/projects/install-on-windows/en/${branch}/
- url: https://rocm.docs.amd.com/projects/install-on-windows/en/latest/
title: HIP SDK on Windows
- url: https://rocm.docs.amd.com/projects/radeon/en/latest/index.html
title: ROCm on Radeon GPUs
- file: how-to/deep-learning-rocm.md
title: Deep learning frameworks
subtrees:
- entries:
- file: compatibility/ml-compatibility/pytorch-compatibility.rst
title: PyTorch compatibility
- file: compatibility/ml-compatibility/tensorflow-compatibility.rst
title: TensorFlow compatibility
- file: compatibility/ml-compatibility/jax-compatibility.rst
title: JAX compatibility
- file: compatibility/ml-compatibility/verl-compatibility.rst
title: verl compatibility
- file: compatibility/ml-compatibility/stanford-megatron-lm-compatibility.rst
title: Stanford Megatron-LM compatibility
- file: compatibility/ml-compatibility/dgl-compatibility.rst
title: DGL compatibility
- file: compatibility/ml-compatibility/megablocks-compatibility.rst
title: Megablocks compatibility
- file: compatibility/ml-compatibility/taichi-compatibility.rst
title: Taichi compatibility
- file: how-to/build-rocm.rst
title: Build ROCm from source
@@ -44,8 +62,8 @@ subtrees:
title: Training
subtrees:
- entries:
- file: how-to/rocm-for-ai/training/benchmark-docker/megatron-lm.rst
title: Train a model with Megatron-LM
- file: how-to/rocm-for-ai/training/benchmark-docker/primus-megatron.rst
title: Train a model with Primus and Megatron-Core
- file: how-to/rocm-for-ai/training/benchmark-docker/pytorch-training.rst
title: Train a model with PyTorch
- file: how-to/rocm-for-ai/training/benchmark-docker/jax-maxtext.rst
@@ -82,6 +100,8 @@ subtrees:
title: vLLM inference performance testing
- file: how-to/rocm-for-ai/inference/benchmark-docker/pytorch-inference.rst
title: PyTorch inference performance testing
- file: how-to/rocm-for-ai/inference/benchmark-docker/sglang.rst
title: SGLang inference performance testing
- file: how-to/rocm-for-ai/inference/deploy-your-model.rst
title: Deploy your model

View File

@@ -1,4 +1,4 @@
rocm-docs-core==1.20.1
rocm-docs-core==1.22.0
sphinx-reredirects
sphinx-sitemap
sphinxcontrib.datatemplates==0.11.0

View File

@@ -23,7 +23,7 @@ beautifulsoup4==4.13.4
# via pydata-sphinx-theme
breathe==4.36.0
# via rocm-docs-core
certifi==2025.4.26
certifi==2025.7.14
# via requests
cffi==1.17.1
# via
@@ -35,18 +35,16 @@ click==8.2.1
# via
# jupyter-cache
# sphinx-external-toc
comm==0.2.2
comm==0.2.3
# via ipykernel
cryptography==45.0.3
cryptography==45.0.5
# via pyjwt
debugpy==1.8.14
debugpy==1.8.15
# via ipykernel
decorator==5.2.1
# via ipython
defusedxml==0.7.1
# via sphinxcontrib-datatemplates
deprecated==1.2.18
# via pygithub
docutils==0.21.2
# via
# myst-parser
@@ -62,7 +60,7 @@ fastjsonschema==2.21.1
# rocm-docs-core
gitdb==4.0.12
# via gitpython
gitpython==3.1.44
gitpython==3.1.45
# via rocm-docs-core
greenlet==3.2.3
# via sqlalchemy
@@ -74,7 +72,7 @@ importlib-metadata==8.7.0
# via
# jupyter-cache
# myst-nb
ipykernel==6.29.5
ipykernel==6.30.0
# via myst-nb
ipython==8.37.0
# via
@@ -86,7 +84,7 @@ jinja2==3.1.6
# via
# myst-parser
# sphinx
jsonschema==4.24.0
jsonschema==4.25.0
# via nbformat
jsonschema-specifications==2025.4.1
# via jsonschema
@@ -116,7 +114,7 @@ mdit-py-plugins==0.4.2
# via myst-parser
mdurl==0.1.2
# via markdown-it-py
myst-nb==1.2.0
myst-nb==1.3.0
# via rocm-docs-core
myst-parser==4.0.1
# via myst-nb
@@ -134,7 +132,6 @@ nest-asyncio==1.6.0
packaging==25.0
# via
# ipykernel
# pydata-sphinx-theme
# sphinx
parso==0.8.4
# via jedi
@@ -152,13 +149,13 @@ pure-eval==0.2.3
# via stack-data
pycparser==2.22
# via cffi
pydata-sphinx-theme==0.15.4
pydata-sphinx-theme==0.16.1
# via
# rocm-docs-core
# sphinx-book-theme
pygithub==2.6.1
pygithub==2.7.0
# via rocm-docs-core
pygments==2.19.1
pygments==2.19.2
# via
# accessible-pygments
# ipython
@@ -178,7 +175,7 @@ pyyaml==6.0.2
# rocm-docs-core
# sphinx-external-toc
# sphinxcontrib-datatemplates
pyzmq==26.4.0
pyzmq==27.0.0
# via
# ipykernel
# jupyter-client
@@ -190,9 +187,9 @@ requests==2.32.4
# via
# pygithub
# sphinx
rocm-docs-core==1.20.1
rocm-docs-core==1.22.0
# via -r requirements.in
rpds-py==0.25.1
rpds-py==0.26.0
# via
# jsonschema
# referencing
@@ -220,7 +217,7 @@ sphinx==8.1.3
# sphinx-reredirects
# sphinxcontrib-datatemplates
# sphinxcontrib-runcmd
sphinx-book-theme==1.1.4
sphinx-book-theme==1.1.3
# via rocm-docs-core
sphinx-copybutton==0.5.2
# via rocm-docs-core
@@ -252,7 +249,7 @@ sphinxcontrib-runcmd==0.2.0
# via sphinxcontrib-datatemplates
sphinxcontrib-serializinghtml==2.0.0
# via sphinx
sqlalchemy==2.0.41
sqlalchemy==2.0.42
# via jupyter-cache
stack-data==0.6.3
# via ipython
@@ -266,7 +263,6 @@ tornado==6.5.1
# jupyter-client
traitlets==5.14.3
# via
# comm
# ipykernel
# ipython
# jupyter-client
@@ -274,7 +270,7 @@ traitlets==5.14.3
# matplotlib-inline
# nbclient
# nbformat
typing-extensions==4.14.0
typing-extensions==4.14.1
# via
# beautifulsoup4
# exceptiongroup
@@ -290,7 +286,5 @@ urllib3==2.5.0
# requests
wcwidth==0.2.13
# via prompt-toolkit
wrapt==1.17.2
# via deprecated
zipp==3.23.0
# via importlib-metadata