Compare commits

..

37 Commits

Author SHA1 Message Date
randyh62
acb54bdcdd Update RELEASE.md
Add Optimized entry for HIP graph performance
2025-05-15 12:03:20 -07:00
Pratik Basyal
298ca30757 615 column added 630 (#4637)
* 6.1.5 compatibility column added
2025-04-17 11:49:13 -04:00
Peter Park
72970cda1f Merge pull request #4283 from ROCm/Llama-recipe-link-update
Fixed link for Llama recipe
2025-01-27 17:09:25 -05:00
Pratik Basyal
8edd5e550e Fixed link for Llama recipe 2025-01-21 16:12:36 -05:00
Pratik Basyal
b606b1577d HPC application list updated (#4066) (#4244)
* PETSc added

* List of HPC applications updated for 6.2.4

* Leo's feedback incorporated



* Review feedback incorporated

* vllm removed

---------

Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
2025-01-08 10:06:54 -05:00
Pratik Basyal
596e606edc AMDSMI Github reference updated in 6.3.0 Release notes (#4237) (#4239)
* AMDSMi github reference updated (#4237)

* Release version fixed
2025-01-07 10:52:04 -05:00
Alex Xu
cf241d7269 update rocm-docs-core to 1.12.1 2025-01-02 16:05:35 -05:00
alexxu-amd
6afbb33144 Change version variable to latest
Since gpu-cluster-networking gets moved to dcgpu. All versioning will be renamed.

(cherry picked from commit 027b2ea376)
2024-12-23 18:31:29 -05:00
alexxu-amd
22e71b4dce Update index.md
(cherry picked from commit fe69fc1bb4)
2024-12-23 18:08:27 -05:00
alexxu-amd
aeb073716b Update _toc.yml.in
(cherry picked from commit 4d31d717a6)
2024-12-23 18:08:24 -05:00
Istvan Kiss
6a4247156f Remove hipExtHostAlloc add at HIP section 2024-12-23 15:31:26 +01:00
Peter Park
511b4b3a9a Merge pull request #4170 from ROCm/hip_changelog_update
Remove upcoming changes section in HIP changes
2024-12-18 13:15:57 -05:00
Peter Park
cacd5a7845 Merge pull request #4169 from peterjunpark/docs/6.3.0
Hotfix compatibility matrix
2024-12-18 13:02:34 -05:00
Istvan Kiss
42da33d8b6 Remove upcoming changes section 2024-12-18 17:49:44 +01:00
Peter Park
33ca42f743 hotfix compat matrix 2024-12-18 11:22:18 -05:00
Peter Park
449c0e00b3 Merge pull request #4165 from peterjunpark/docs/6.3.0
[6.3] Add megatron training doc (#4159)
2024-12-16 13:50:27 -05:00
Peter Park
6c426ff9fa add megatron training doc (#4159)
* add megatron training doc

update toc

add images

update formatting and wording

formatting

update formatting

update conf.py

update formatting

update docker img

tweak formatting

Fix stuff

fix mock-data/data-path

add specific commit hash to checkout

update docker pull tag

fix docker run cmd and examples path

fix docker cmd

* wording

words

words

* improve title

(cherry picked from commit f9dbc1f21f)
2024-12-16 13:38:40 -05:00
Jeffrey Novotny
43399d4eed Change reference to kernel-mode GPU compute driver in ROCm (#4147) (#4156)
* Change reference to kernel-mode GPU compute driver in ROCm

* More changes for kernel-mode terminology

* Fix linting

(cherry picked from commit 04fdc08328)
2024-12-13 12:27:10 -05:00
spolifroni-amd
555f4d43ca Merge pull request #4151 from spolifroni-amd/spolifroni-amd/cherry-pick-RN-update
Cherry pick release note update
2024-12-12 11:27:12 -05:00
spolifroni-amd
8db294f215 Added MIGraphX changes (#4150)
* Added MIGraphX changes

* removed gfx support

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Update RELEASE.md

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Update RELEASE.md

* Update RELEASE.md

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

---------

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
(cherry picked from commit 2a7520f08a)
2024-12-12 11:21:59 -05:00
randyh62
309f3a19e6 Update index.md (#4144)
Remove Programming Guide topic from "How to"
2024-12-09 10:04:51 -08:00
Peter Park
8070f30a46 Merge pull request #4142 from peterjunpark/docs/6.3.0
[6.3] fix rccl hip streams section in workload tuning guide (#4140)
2024-12-09 11:10:55 -05:00
Peter Park
b5955a8d46 fix rccl hip streams section in workload tuning guide (#4140)
(cherry picked from commit 78f9adc6ec)
2024-12-09 11:07:40 -05:00
randyh62
5c25c3e797 Revert "Update RELEASE.md (#4127)" (#4141)
This reverts commit 7b57247b9a.
2024-12-09 07:46:47 -08:00
Sam Wu
6bbc7429df Merge pull request #4132 from ROCm/roc-6.3.x
Merge Roc 6.3.x into 6.3.0
2024-12-06 14:38:11 -07:00
Peter Park
31d24eb5cc Merge pull request #4129 from peterjunpark/docs/6.3.0
[6.3] Add @hongxiayang updates to MI300X workload tuning guide (#4123)
2024-12-06 12:30:08 -05:00
Peter Park
9f6757b71d Add @hongxiayang updates to MI300X workload tuning guide (#4123)
minor fixes to formatting

fix spelling errors

more spelling

fixes

quantization update

fix format

simplify wording in tunableops and format fix

Apply suggestions from code review

review feedback by Peter

Co-authored-by: Peter Park <peter.park@amd.com>

Apply suggestions from code review

addressing feedback

Co-authored-by: Peter Park <peter.park@amd.com>

Apply suggestions from code review

feedback again

Co-authored-by: Peter Park <peter.park@amd.com>

add hipblaslt yaml file figure

feedback and minor formatting

formatting

update wordlist.txt

remove outdated sentence regarding fsdp and rccl

(cherry picked from commit 87fa9fd83a2e623f6cab4e69d65f49e3db0a45f6)

update wordlist

Co-authored-by: hongxyan <hongxyan@amd.com>
(cherry picked from commit b0722b3228)
2024-12-06 12:21:16 -05:00
randyh62
7b57247b9a Update RELEASE.md (#4127)
Change __AMDGCN_WAVEFRONT_SIZE__ macro to __AMDGCN_WAVEFRONT_SIZE macro
2024-12-06 09:09:44 -08:00
Swati Rawat
1f25c77654 Merge pull request #4126 from SwRaw/docs/6.3.0
Update what-is-rocm.rst (#4122)
2024-12-06 21:13:19 +05:30
Swati Rawat
3d3d3cb1da Update what-is-rocm.rst (#4122)
(cherry picked from commit 5e6ddec385)
2024-12-06 21:08:51 +05:30
Peter Park
0f16b8eb29 remove programming guide from TOC (#4116)
(cherry picked from commit 1a4d54a4f1)
2024-12-05 17:23:16 -05:00
Peter Park
33a6c37f44 Merge pull request #4114 from ROCm/develop
[docs/6.3.0] fix stack image (#4112)
2024-12-04 22:01:55 -05:00
Sam Wu
3490079e2e Merge branch 'roc-6.3.x' into docs/6.3.0 2024-12-04 19:34:39 -07:00
Sam Wu
f9abf88965 Merge pull request #4110 from ROCm/roc-6.3.x
fix links to smi tools full changelog on GH (#4108) in 6.3.0 docs
2024-12-04 19:12:31 -07:00
Sam Wu
480c23a83e Merge pull request #4105 from ROCm/roc-6.3.x
Merge ROCm 6.3 release branch into 6.3.0 docs branch
2024-12-04 17:14:34 -07:00
randyh62
52c44cccca Update license.md (#4099)
Change ROCgdb license to point to GNU version 3
2024-12-04 11:03:57 -08:00
Sam Wu
e868fb6c19 Merge pull request #4094 from ROCm/roc-6.3.x
Merge Roc 6.3.x into 6.3.0 docs
2024-12-03 16:19:00 -07:00
254 changed files with 7138 additions and 10491 deletions

View File

@@ -23,40 +23,8 @@ ROCm-CI Azure DevOps Pipelines contains markup language files that orchestrate b
- [YAML schema](https://learn.microsoft.com/en-us/azure/devops/pipelines/yaml-schema/?view=azure-pipelines&viewFallbackFrom=azure-devops)
- [Azure Pipelines Task Index](https://learn.microsoft.com/en-us/azure/devops/pipelines/tasks/reference/?view=azure-pipelines)
## VMSS Setup
The Azure VMSS used for build jobs have docker installed during provisioning through the Azure Portal's settings.
Select the VMSS, then go to Settings, Operating system.
In the Custom Data section, select to Modify Custom Data. Enter the code block below and Apply.
```bash
#cloud-config
bootcmd:
- mkdir -p /etc/systemd/system/walinuxagent.service.d
- echo "[Unit]\nAfter=cloud-final.service" > /etc/systemd/system/walinuxagent.service.d/override.conf
- sed "s/After=multi-user.target//g" /lib/systemd/system/cloud-final.service > /etc/systemd/system/cloud-final.service
- systemctl daemon-reload
apt:
sources:
docker.list:
source: deb [arch=amd64] https://download.docker.com/linux/ubuntu $RELEASE stable
keyid: 9DC858229FC7DD38854AE2D88D81803C0EBFCD88
packages:
- docker-ce
- docker-ce-cli
groups:
- docker
runcmd:
- usermod -aG docker $USER
- systemctl restart docker
- systemctl enable docker
```
## Disclaimer
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions, and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard versionchanges, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. Any computer system has risks of security vulnerabilities that cannot be completely prevented or mitigated.AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.THIS INFORMATION IS PROVIDED AS IS.” AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS, OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY RELIANCE, DIRECT, INDIRECT, SPECIAL, OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
© 2024 Advanced Micro Devices, Inc. All Rights Reserved.

View File

@@ -94,16 +94,23 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
# half version should be fixed to 5.6.0
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
buildType: specific
definitionId: ${{ variables.HALF560_PIPELINE_ID }}
buildId: ${{ variables.HALF560_BUILD_ID }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencySource: fixed
fixedComponentName: half
fixedPipelineIdentifier: ${{ variables.HALF560_PIPELINE_ID }}
skipLibraryLinking: true
skipLlvmSymlink: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -121,12 +128,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
- job: AMDMIGraphX_testing
dependsOn: AMDMIGraphX
@@ -154,17 +155,29 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
# half version should be fixed to 5.6.0
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
buildType: specific
definitionId: ${{ variables.HALF560_PIPELINE_ID }}
buildId: ${{ variables.HALF560_BUILD_ID }}
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# half version should be fixed to 5.6.0
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
dependencySource: fixed
fixedComponentName: half
fixedPipelineIdentifier: ${{ variables.HALF560_PIPELINE_ID }}
skipLibraryLinking: true
skipLlvmSymlink: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: CMake@1
displayName: MIGraphXTest CMake Flags
inputs:
@@ -185,11 +198,3 @@ jobs:
testExecutable: make
testParameters: -j$(nproc) check
testPublishResults: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- MIGRAPHX_TRACE_BENCHMARKING:::1

View File

@@ -8,13 +8,11 @@ parameters:
- name: aptPackages
type: object
default:
- cmake
- libnuma-dev
- mesa-common-dev
- ocl-icd-libopencl1
- ocl-icd-opencl-dev
- opencl-headers
- python3-pip
- name: pipModules
type: object
default:
@@ -41,7 +39,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -63,8 +62,13 @@ jobs:
checkoutRepo: hipother_repo
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependenciesAMD }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# compile clr
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
@@ -84,19 +88,14 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
artifactName: amd
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: amd
# HIP with Nvidia backend
- job: hip_clr_combined_nvidia
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -119,8 +118,13 @@ jobs:
checkoutRepo: hipother_repo
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependenciesNvidia }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- script: 'ls -1R $(Agent.BuildDirectory)/rocm'
displayName: 'Artifact listing'
# compile clr
@@ -138,9 +142,3 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
artifactName: nvidia
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: nvidia

View File

@@ -37,9 +37,9 @@ jobs:
inputs:
targetType: inline
script: |
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo rm -f cuda-keyring_1.1-1_all.deb
sudo mkdir --parents --mode=0755 /etc/apt/keyrings
wget -q -O- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-archive-keyring.gpg | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/cuda-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/etc/apt/trusted.gpg.d/cuda-archive-keyring.gpg] https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /" | sudo tee /etc/apt/sources.list.d/cuda-ubuntu2204-x86_64.list
sudo apt update
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
@@ -106,14 +106,3 @@ jobs:
testExecutable: make
testParameters: test-hipify
testPublishResults: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: combined
registerCUDAPackages: true
extraCopyDirectories:
- llvm-project
extraEnvVars:
- UPSTREAM_LLVM_GIT_URL:::https://github.com/llvm/llvm-project.git
- UPSTREAM_LLVM_TAG:::llvmorg-18.1.2

View File

@@ -81,9 +81,14 @@ jobs:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: Build and install other dependencies
inputs:
@@ -113,17 +118,9 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
extraCopyDirectories:
- miopen-deps
- job: MIOpen_testing
timeoutInMinutes: 180
timeoutInMinutes: 90
dependsOn: MIOpen
condition: and(succeeded(), eq(variables.ENABLE_GFX942_TESTS, 'true'), not(containsValue(split(variables.DISABLED_GFX942_TESTS, ','), variables['Build.DefinitionName'])))
variables:
@@ -147,14 +144,22 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/miopen-get-ck-build.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: Build and install other dependencies
inputs:
@@ -192,12 +197,3 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: MIOpen
testParameters: '-VV --output-on-failure --force-new-ctest-process --output-junit test_output.xml --exclude-regex test_rnn_seq_api'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
extraCopyDirectories:
- miopen-deps

View File

@@ -24,7 +24,6 @@ parameters:
- libopencv-dev
- protobuf-compiler
- libprotoc-dev
- python3-pip
- name: pipModules
type: object
default:
@@ -70,7 +69,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -88,9 +88,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -104,12 +109,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
- job: MIVisionX_testing
dependsOn: MIVisionX
@@ -134,11 +133,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# anything in /opt may be persistent across runs
# so we need to remove the symlink if it already exists
- script: |
@@ -152,10 +159,3 @@ jobs:
parameters:
componentName: MIVisionX
testDir: 'mivisionx-tests'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
optSymLink: true

View File

@@ -8,13 +8,10 @@ parameters:
- name: aptPackages
type: object
default:
- cmake
- g++
- libdrm-dev
- libelf-dev
- libnuma-dev
- pkg-config
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -32,7 +29,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -45,8 +43,13 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -55,10 +58,6 @@ jobs:
-DCMAKE_BUILD_TYPE=Release
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- job: ROCR_Runtime_testing
dependsOn: ROCR_Runtime
@@ -89,11 +88,19 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
@@ -141,9 +148,3 @@ jobs:
testExecutable: ./rocrtst64
testParameters: '--gtest_filter="-rocrtstNeg.Memory_Negative_Tests:rocrtstFunc.Memory_Max_Mem" --gtest_output=xml:./test_output.xml --gtest_color=yes'
testDir: $(Build.SourcesDirectory)/rocrtst/suites/test_common/build/$(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
# docker image will be missing libhwloc5

View File

@@ -16,7 +16,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:

View File

@@ -10,7 +10,6 @@ parameters:
default:
- cmake
- ninja-build
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -24,7 +23,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -37,8 +37,13 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -47,7 +52,3 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}

View File

@@ -55,10 +55,20 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-autotools.yml
parameters:
configureFlags: >-
@@ -92,7 +102,6 @@ jobs:
sudo rm -rf /opt/rocm
sudo ln -s $(Agent.BuildDirectory)/rocm /opt/rocm
echo "##vso[task.prependpath]/opt/rocm/bin"
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- task: Bash@3
displayName: check-gdb
@@ -107,10 +116,3 @@ jobs:
targetType: inline
script: find -name gdb.log -exec cat {} \;
workingDirectory: $(Build.SourcesDirectory)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: combined
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- PKG_CONFIG_PATH:::/home/user/workspace/rocm/share/pkgconfig

View File

@@ -10,14 +10,12 @@ parameters:
default:
- cmake
- ninja-build
- libdrm-dev
- libyaml-cpp-dev
- libpci-dev
- libpci3
- libgst-dev
- libgtest-dev
- git
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -60,7 +58,8 @@ jobs:
value: $(Agent.BuildDirectory)/rocm
- name: HIP_INC_DIR
value: $(Agent.BuildDirectory)/rocm
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -77,9 +76,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -94,15 +98,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- ROCM_PATH:::/home/user/workspace/rocm
- HIP_INC_DIR:::/home/user/workspace/rocm
- job: ROCmValidationSuite_testing
dependsOn: ROCmValidationSuite
@@ -127,11 +122,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -140,8 +143,3 @@ jobs:
testParameters: ''
testDir: $(Agent.BuildDirectory)
testPublishResults: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -35,7 +35,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -49,8 +50,13 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- task: Bash@3
displayName: Create wheel file
@@ -71,23 +77,6 @@ jobs:
retryCountOnTaskFailure: 3
inputs:
targetPath: $(Build.ArtifactStagingDirectory)
- task: Bash@3
displayName: Save pipeline artifact file names
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "$(Build.DefinitionName)_$(Build.SourceBranchName)_$(Build.BuildId)_$(Build.BuildNumber)_ubuntu2204_drop_$(JOB_GPU_TARGET).tar.gz" >> pipelineArtifacts.txt
whlFile=$(find "$(Build.ArtifactStagingDirectory)" -type f -name "*.whl" | head -n 1)
if [ -n "$whlFile" ]; then
echo $(basename "$whlFile") >> pipelineArtifacts.txt
fi
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
- job: Tensile_testing
timeoutInMinutes: 90
@@ -119,11 +108,19 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: pip install
inputs:
@@ -178,13 +175,3 @@ jobs:
inputs:
targetType: inline
script: echo "##vso[task.setvariable variable=PATH]$(echo $PATH | sed -e 's;:$(Agent.BuildDirectory)/rocm/llvm/bin;;' -e 's;^/;;' -e 's;/$;;')"
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
optSymLink: true
pythonEnvVars: true
extraPaths: /home/user/workspace/rocm/llvm/bin:/home/user/workspace/rocm/bin
# docker image will not have python site-packages in path, but the env vars will make it easier

View File

@@ -1,123 +0,0 @@
parameters:
- name: checkoutRepo
type: string
default: 'self'
- name: checkoutRef
type: string
default: ''
- name: aptPackages
type: object
default:
- cmake
- libnuma-dev
- ninja-build
- python3-pip
- name: rocmDependencies
type: object
default:
- clr
- llvm-project
- rocm-cmake
- rocminfo
- ROCR-Runtime
- name: rocmTestDependencies
type: object
default:
- clr
- llvm-project
- rocm-cmake
- rocminfo
- rocprofiler-register
- ROCR-Runtime
jobs:
- job: TransferBench
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
workspace:
clean: all
strategy:
matrix:
gfx942:
JOB_GPU_TARGET: gfx942
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/bin/hipcc
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DROCM_PATH=$(Agent.BuildDirectory)/rocm
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: TransferBench_testing
dependsOn: TransferBench
condition: and(succeeded(), eq(variables.ENABLE_GFX942_TESTS, 'true'), not(containsValue(split(variables.DISABLED_GFX942_TESTS, ','), variables['Build.DefinitionName'])))
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: $(JOB_TEST_POOL)
workspace:
clean: all
strategy:
matrix:
gfx942:
JOB_GPU_TARGET: gfx942
JOB_TEST_POOL: ${{ variables.GFX942_TEST_POOL }}
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: TransferBench All-to-all
testDir: '$(Agent.BuildDirectory)'
testExecutable: './rocm/bin/TransferBench'
testParameters: 'a2a'
testPublishResults: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: TransferBench Peer-to-peer
testDir: '$(Agent.BuildDirectory)'
testExecutable: './rocm/bin/TransferBench'
testParameters: 'p2p'
testPublishResults: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -8,17 +8,15 @@ parameters:
- name: aptPackages
type: object
default:
- cmake
- libdrm-dev
- python3-pip
- pkg-config
jobs:
- job: amdsmi
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -35,10 +33,6 @@ jobs:
-DBUILD_TESTS=ON
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- job: amdsmi_testing
dependsOn: amdsmi
@@ -69,8 +63,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)'
testExecutable: './rocm/share/amd_smi/tests/amdsmitst'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -23,7 +23,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -36,8 +37,13 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: aomp-extras
@@ -50,7 +56,3 @@ jobs:
installDir: $(Build.BinariesDirectory)/llvm
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}

View File

@@ -97,8 +97,13 @@ jobs:
checkoutRepo: llvm-project_repo
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# Because clang is not being rebuilt and to separate downloaded ROCm
# dependencies from the new artifacts, we use temporary symbolic links
# for the compilation and installation to go through.
@@ -436,11 +441,6 @@ jobs:
RemoveDotFiles: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
optSymLink: true
- job: aomp_testing
dependsOn: aomp
@@ -464,8 +464,11 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: ROCm symbolic link
inputs:
@@ -515,9 +518,3 @@ jobs:
SKIP_TEST_PACKAGE: 1
MAINLINE_BUILD: 1
SUITE_LIST: smoke
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
optSymLink: true

View File

@@ -58,9 +58,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- script: |
mkdir -p $(CCACHE_DIR)
echo "##vso[task.prependpath]/usr/lib/ccache"
@@ -94,14 +99,8 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: composable_kernel_testing
timeoutInMinutes: 90
dependsOn: composable_kernel
condition: and(succeeded(), eq(variables.ENABLE_GFX942_TESTS, 'true'), not(containsValue(split(variables.DISABLED_GFX942_TESTS, ','), variables['Build.DefinitionName'])))
variables:
@@ -126,11 +125,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- task: Bash@3
displayName: Iterate through test scripts
@@ -141,8 +148,3 @@ jobs:
./$file | tee -a $(TEST_LOG_FILE)
done
workingDirectory: $(Agent.BuildDirectory)/rocm/bin
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -31,4 +31,3 @@ jobs:
sourceDir: $(Agent.BuildDirectory)/rocm
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml

View File

@@ -25,7 +25,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -38,8 +39,13 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -58,8 +64,3 @@ jobs:
make test03
./bin/test
workingDirectory: $(Build.SourcesDirectory)/test
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: combined

View File

@@ -62,8 +62,13 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# compile hip-tests
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
@@ -83,13 +88,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- job: hip_tests_testing
timeoutInMinutes: 240
@@ -115,11 +113,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: Symlink rocm_agent_enumerator
inputs:
@@ -139,9 +145,3 @@ jobs:
inputs:
targetType: inline
script: sudo rm -rf /opt/rocm
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
optSymLink: true

View File

@@ -29,7 +29,8 @@ jobs:
- name: ROCM_PATH
value: $(Agent.BuildDirectory)/rocm
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -40,10 +41,18 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
- ${{ if eq(parameters.checkoutRef, '') }}:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
dependencyList: ${{ parameters.rocmDependencies }}
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
- ${{ if ne(parameters.checkoutRef, '') }}:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
dependencyList: ${{ parameters.rocmDependencies }}
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -52,9 +61,3 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
extraEnvVars:
- ROCM_PATH:::/home/user/workspace/rocm

View File

@@ -75,9 +75,14 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aocl.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -96,13 +101,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
installAOCL: true
gpuTarget: $(JOB_GPU_TARGET)
- job: hipBLAS_testing
dependsOn: hipBLAS
@@ -128,11 +126,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -140,9 +146,3 @@ jobs:
testExecutable: $(Agent.BuildDirectory)/rocm/bin/hipblas-test
testParameters: '--yaml hipblas_smoke.yaml --gtest_output=xml:./test_output.xml --gtest_color=yes'
testDir: '$(Agent.BuildDirectory)/rocm/bin'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -22,8 +22,7 @@ parameters:
type: object
default:
- joblib
- packaging>=22.0
- --upgrade
- packaging
- name: rocmDependencies
type: object
default:
@@ -88,9 +87,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- script: sudo ln -s $(Agent.BuildDirectory)/rocm /opt/rocm
displayName: ROCm symbolic link
# Build and install gtest, lapack, hipBLAS-common
@@ -133,6 +137,7 @@ jobs:
-DAMDGPU_TARGETS=$(JOB_GPU_TARGET)
-DTensile_LOGIC=
-DTensile_CPU_THREADS=
-DTensile_CODE_OBJECT_VERSION=default
-DTensile_LIBRARY_FORMAT=msgpack
-DCMAKE_PREFIX_PATH="$(Agent.BuildDirectory)/rocm"
-DBUILD_CLIENTS_TESTS=ON
@@ -143,22 +148,7 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
extraPaths: /home/user/workspace/rocm/llvm/bin:/home/user/workspace/rocm/bin
installLatestCMake: true
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- TENSILE_ROCM_ASSEMBLER_PATH:::/home/user/workspace/rocm/llvm/bin/amdclang
- CMAKE_CXX_COMPILER:::/home/user/workspace/rocm/bin/hipcc
- TENSILE_ROCM_OFFLOAD_BUNDLER_PATH:::/home/user/workspace/rocm/llvm/bin/clang-offload-bundler
- TENSILE_ROCM_PATH:::/home/user/workspace/rocm/bin/hipcc
extraCopyDirectories:
- deps
- job: hipBLASLt_testing
dependsOn: hipBLASLt
@@ -184,11 +174,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -196,9 +194,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './hipblaslt-test'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes --gtest_filter=*pre_checkin*'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -53,9 +53,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -71,11 +76,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: hipCUB_testing
dependsOn: hipCUB
@@ -100,18 +100,21 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: hipCUB
testDir: '$(Agent.BuildDirectory)/rocm/bin/hipcub'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -47,7 +47,8 @@ jobs:
- template: /.azuredevops/variables-global.yml
- name: HIP_ROCCLR_HOME
value: $(Build.BinariesDirectory)/rocm
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -64,9 +65,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -88,11 +94,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: hipFFT_testing
dependsOn: hipFFT
@@ -117,11 +118,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -129,8 +138,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './hipfft-test'
testParameters: '--test_prob 0.002 --gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -12,7 +12,6 @@ parameters:
- ninja-build
- googletest
- git
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -38,7 +37,8 @@ jobs:
- template: /.azuredevops/variables-global.yml
- name: HIP_ROCCLR_HOME
value: $(Build.BinariesDirectory)/rocm
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -55,9 +55,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -75,13 +80,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- job: hipRAND_testing
dependsOn: hipRAND
@@ -106,18 +104,21 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: hipRAND
testDir: '$(Agent.BuildDirectory)/rocm/bin/hipRAND'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -15,7 +15,6 @@ parameters:
- git
- googletest
- libgtest-dev
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -50,7 +49,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -67,9 +67,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# build external gtest and lapack
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
@@ -96,13 +101,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraCopyDirectories:
- deps-install
- job: hipSOLVER_testing
dependsOn: hipSOLVER
@@ -127,11 +125,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -139,8 +145,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './hipsolver-test'
testParameters: '--gtest_filter="*checkin*" --gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -16,7 +16,6 @@ parameters:
- git
- gfortran
- libgtest-dev
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -45,7 +44,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -62,9 +62,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -91,12 +96,6 @@ jobs:
parameters:
artifactName: testMatrices
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
- job: hipSPARSE_testing
dependsOn: hipSPARSE
@@ -121,11 +120,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -133,8 +140,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './hipsparse-test'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -31,7 +31,6 @@ parameters:
- rocminfo
- rocprofiler-register
- ROCR-Runtime
- roctracer
- name: rocmTestDependencies
type: object
default:
@@ -44,7 +43,6 @@ parameters:
- rocminfo
- rocprofiler-register
- ROCR-Runtime
- roctracer
jobs:
- job: hipSPARSELt
@@ -80,9 +78,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# Build and install gtest and lapack
# $(Pipeline.Workspace)/deps is a temporary folder for the build process
# $(Pipeline.Workspace)/s/deps is part of the hipSPARSELt repo
@@ -108,6 +111,7 @@ jobs:
-DAMDGPU_TARGETS=$(JOB_GPU_TARGET)
-DTensile_LOGIC=
-DTensile_CPU_THREADS=
-DTensile_CODE_OBJECT_VERSION=default
-DTensile_LIBRARY_FORMAT=msgpack
-DCMAKE_PREFIX_PATH="$(Agent.BuildDirectory)/rocm"
-DROCM_PATH=$(Agent.BuildDirectory)/rocm
@@ -119,21 +123,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
extraCopyDirectories:
- deps
extraPaths: /home/user/workspace/rocm/llvm/bin:/home/user/workspace/rocm/bin
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- TENSILE_ROCM_ASSEMBLER_PATH:::/home/user/workspace/rocm/llvm/bin/clang
- CMAKE_CXX_COMPILER:::/home/user/workspace/rocm/llvm/bin/hipcc
- TENSILE_ROCM_OFFLOAD_BUNDLER_PATH:::/home/user/workspace/rocm/llvm/bin/clang-offload-bundler
installLatestCMake: true
- job: hipSPARSELt_testing
dependsOn: hipSPARSELt
@@ -159,9 +148,12 @@ jobs:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -169,9 +161,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './hipsparselt-test'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes --gtest_filter=*pre_checkin*'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -52,9 +52,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -71,11 +76,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: hipTensor_testing
timeoutInMinutes: 90
@@ -101,19 +101,22 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: hipTensor
testDir: '$(Agent.BuildDirectory)/rocm/bin/hiptensor'
testParameters: '-E ".*-extended" -VV --output-on-failure --force-new-ctest-process --output-junit test_output.xml'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -62,9 +62,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -87,12 +92,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
installLatestCMake: true
- job: hipfort_testing
dependsOn: hipfort
@@ -117,11 +116,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
@@ -140,9 +147,3 @@ jobs:
targetType: inline
script: PATH=$(Agent.BuildDirectory)/rocm/bin:$PATH make run_all
workingDirectory: $(Build.SourcesDirectory)/test
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
optSymLink: true

View File

@@ -42,9 +42,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
skipLlvmSymlink: true
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: rocm-llvm
@@ -121,7 +126,6 @@ jobs:
componentName: comgr
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH="$(Build.SourcesDirectory)/llvm/build;$(Build.SourcesDirectory)/amd/device-libs/build"
-DCOMGR_DISABLE_SPIRV=1
-DCMAKE_BUILD_TYPE=Release
cmakeBuildDir: 'amd/comgr/build'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
@@ -139,11 +143,3 @@ jobs:
cmakeBuildDir: 'amd/hipcc/build'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: combined
extraEnvVars:
- HIP_DEVICE_LIB_PATH:::/home/user/workspace/bin/amdgcn/bitcode
- HIP_PATH:::/home/user/workspace/rocm

View File

@@ -75,9 +75,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: ROCm symbolic link
inputs:

View File

@@ -8,31 +8,29 @@ parameters:
- name: aptPackages
type: object
default:
- cmake
- git
- googletest
- libboost-program-options-dev
- libdrm-dev
- libfftw3-dev
- libnuma-dev
- libstdc++-12-dev
- ninja-build
- python3-pip
- cmake
- libboost-program-options-dev
- googletest
- libfftw3-dev
- git
- ninja-build
- libstdc++-12-dev
- libnuma-dev
- name: rocmDependencies
type: object
default:
- rocm-cmake
- llvm-project
- ROCR-Runtime
- clr
- rocminfo
- rocm_smi_lib
- rocprofiler-register
- rocm-core
- HIPIFY
- aomp
- aomp-extras
- clr
- HIPIFY
- llvm-project
- rocm-cmake
- rocm-core
- rocm_smi_lib
- rocminfo
- rocprofiler-register
- ROCR-Runtime
- roctracer
- name: rocmTestDependencies
type: object
default:
@@ -47,7 +45,6 @@ parameters:
- rocminfo
- rocprofiler-register
- ROCR-Runtime
- roctracer
jobs:
- job: rccl
@@ -75,9 +72,14 @@ jobs:
submoduleBehaviour: recursive
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- script: chmod +x $(Agent.BuildDirectory)/rocm/bin/hipify-perl
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
@@ -97,14 +99,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
installLatestCMake: true
- job: rccl_testing
timeoutInMinutes: 120
@@ -130,11 +124,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -142,8 +144,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './rccl-UnitTests'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -37,7 +37,6 @@ parameters:
- ROCmValidationSuite
- rocprofiler
- rocprofiler-register
- rocprofiler-sdk
- ROCR-Runtime
- name: rocmTestDependencies
type: object
@@ -75,9 +74,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# Build grpc
- task: Bash@3
displayName: 'git clone grpc'
@@ -112,11 +116,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rdc_testing
dependsOn: rdc
@@ -141,11 +140,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: Setup test environment
inputs:
@@ -167,9 +174,3 @@ jobs:
--batch_mode
--start_rdcd
--unauth_comm
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
extraPaths: /home/user/workspace/rocm/bin

View File

@@ -21,8 +21,6 @@ parameters:
- libavcodec-dev
- libavformat-dev
- libavutil-dev
- libdlpack-dev
- libsndfile1-dev
- libswscale-dev
- libturbojpeg-dev
- libjpeg-turbo-official=3.0.2-20240124
@@ -38,15 +36,15 @@ parameters:
- name: rocmDependencies
type: object
default:
- aomp
- clr
- half
- llvm-project
- MIVisionX
- rocDecode
- rocm-cmake
- llvm-project
- ROCR-Runtime
- clr
- rocDecode
- half
- rpp
- MIVisionX
- aomp
- name: rocmTestDependencies
type: object
default:
@@ -55,7 +53,6 @@ parameters:
- half
- llvm-project
- MIVisionX
- rocDecode
- rocminfo
- rocprofiler-register
- ROCR-Runtime
@@ -66,7 +63,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -136,9 +134,14 @@ jobs:
workingDirectory: '$(Build.SourcesDirectory)/rapidjson/build'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -146,7 +149,6 @@ jobs:
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm;/opt/libjpeg-turbo
-DCMAKE_INSTALL_PREFIX_PYTHON=$Python3_STDARCH
-DCMAKE_BUILD_TYPE=Release
-DGPU_TARGETS=$(JOB_GPU_TARGET)
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
@@ -154,15 +156,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
registerJPEGPackages: true
gpuTarget: $(JOB_GPU_TARGET)
extraCopyDirectories:
- /opt/libjpeg-turbo
- job: rocAL_testing
dependsOn: rocAL
@@ -174,9 +167,7 @@ jobs:
value: $(Agent.BuildDirectory)/rocm
- name: CMAKE_INCLUDE_PATH
value: $(Agent.BuildDirectory)/rocm/include/rocal
pool:
name: $(JOB_TEST_POOL)
demands: firstRenderDeviceAccess
pool: $(JOB_TEST_POOL)
workspace:
clean: all
strategy:
@@ -203,11 +194,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: Link libjpeg-turbo
inputs:
@@ -236,13 +235,3 @@ jobs:
script: |
sudo rm /etc/ld.so.conf.d/libjpeg-turbo.conf
sudo ldconfig -v
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
registerJPEGPackages: true
environment: test
gpuTarget: $(JOB_GPU_TARGET)
extraCopyDirectories:
- /opt/libjpeg-turbo
# docker image will be missing the ldconfig to libjpeg-turbo

View File

@@ -15,7 +15,6 @@ parameters:
- git
- mpich
- ninja-build
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -53,7 +52,8 @@ jobs:
- template: /.azuredevops/variables-global.yml
- name: HIP_ROCCLR_HOME
value: $(Build.BinariesDirectory)/rocm
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -70,9 +70,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -91,13 +96,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- job: rocALUTION_testing
dependsOn: rocALUTION
@@ -122,11 +120,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -134,8 +140,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './rocalution-test'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -87,18 +87,23 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aocl.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
-DCMAKE_TOOLCHAIN_FILE=toolchain-linux.cmake
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm/llvm;$(Agent.BuildDirectory)/rocm
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm/llvm;$(Agent.BuildDirectory)/rocm;$(Pipeline.Workspace)/deps-install
-DCMAKE_BUILD_TYPE=Release
-DCMAKE_CXX_COMPILER=$(Agent.BuildDirectory)/rocm/bin/hipcc
-DCMAKE_C_COMPILER=$(Agent.BuildDirectory)/rocm/bin/hipcc
-DGPU_TARGETS=$(JOB_GPU_TARGET)
-DAMDGPU_TARGETS=$(JOB_GPU_TARGET)
-DTensile_CODE_OBJECT_VERSION=default
-DTensile_LOGIC=asm_full
-DTensile_SEPARATE_ARCHITECTURES=ON
@@ -115,18 +120,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
installAOCL: true
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- TENSILE_ROCM_ASSEMBLER_PATH:::/home/user/workspace/rocm/llvm/bin/clang
- CMAKE_CXX_COMPILER:::/home/user/workspace/rocm/bin/hipcc
- TENSILE_ROCM_OFFLOAD_BUNDLER_PATH:::/home/user/workspace/rocm/llvm/bin/clang-offload-bundler
- job: rocBLAS_testing
dependsOn: rocBLAS
@@ -152,11 +145,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -164,9 +165,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './rocblas-test'
testParameters: '--yaml rocblas_smoke.yaml --gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -20,7 +20,6 @@ parameters:
- libva-amdgpu-dev
- mesa-amdgpu-va-drivers
- libdrm-dev
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -45,7 +44,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -70,8 +70,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -81,11 +87,6 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
registerROCmPackages: true
- job: rocDecode_testing
dependsOn: rocDecode
@@ -122,11 +123,19 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# anything in /opt may be persistent across runs
# so we need to remove the symlink if it already exists
- script: |
@@ -142,9 +151,3 @@ jobs:
testDir: 'rocDecode-tests'
- script: sudo rm /opt/rocm
condition: always()
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
registerROCmPackages: true
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -65,9 +65,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -88,13 +93,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- job: rocFFT_testing
dependsOn: rocFFT
@@ -119,11 +117,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -131,8 +137,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './rocfft-test'
testParameters: '--test_prob 0.004 --gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -15,7 +15,6 @@ parameters:
- mesa-amdgpu-va-drivers
- ninja-build
- pkg-config
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -40,13 +39,10 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
matrix:
gfx942:
JOB_GPU_TARGET: gfx942
steps:
# Since mesa-amdgpu-multimedia-devel is not directly available from apt, register it
- task: Bash@3
@@ -69,25 +65,23 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, 'develop') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, 'develop') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
-DROCM_PATH=$(Agent.BuildDirectory)/rocm
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DCMAKE_BUILD_TYPE=Release
-DGPU_TARGETS=$(JOB_GPU_TARGET)
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
registerROCmPackages: true
- job: rocJPEG_testing
dependsOn: rocJPEG
@@ -124,11 +118,19 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, 'develop') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, 'develop') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, 'develop') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, 'develop') }}:
dependencySource: tag-builds
# anything in /opt may be persistent across runs
# so we need to remove the symlink if it already exists
- script: |
@@ -144,10 +146,3 @@ jobs:
testDir: 'rocJPEG-tests'
- script: sudo rm /opt/rocm
condition: always()
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
registerROCmPackages: true
optSymLink: true

View File

@@ -47,8 +47,13 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -60,11 +65,6 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
# compiling and running test on the test system together
- job: rocMLIR_testing
@@ -91,11 +91,21 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: ROCm symbolic link
inputs:
@@ -121,9 +131,3 @@ jobs:
testDir: $(Build.SourcesDirectory)/build
testExecutable: ninja
testParameters: check-rocmlir
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -52,9 +52,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -70,11 +75,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rocPRIM_testing
dependsOn: rocPRIM
@@ -99,18 +99,21 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: rocPRIM
testDir: '$(Agent.BuildDirectory)/rocm/bin/rocprim'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -41,7 +41,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -64,9 +65,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: 'ROCm symbolic link'
inputs:
@@ -88,7 +94,7 @@ jobs:
-DROCM_PATH=$(Agent.BuildDirectory)/rocm
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm;$(PYTHON_USER_SITE)/pybind11;$(PYTHON_DIST_PACKAGES)/pybind11;$(PYBIND11_PATH)
-DCMAKE_BUILD_TYPE=Release
-DGPU_TARGETS=$(JOB_GPU_TARGET)
-DAMDGPU_TARGETS=$(JOB_GPU_TARGET)
-DCMAKE_INSTALL_PREFIX_PYTHON=$(Build.BinariesDirectory)
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
@@ -118,23 +124,6 @@ jobs:
retryCountOnTaskFailure: 3
inputs:
targetPath: $(Build.ArtifactStagingDirectory)
- task: Bash@3
displayName: Save pipeline artifact file names
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
whlFile=$(find "$(Build.ArtifactStagingDirectory)" -type f -name "*.whl" | head -n 1)
if [ -n "$whlFile" ]; then
echo $(basename "$whlFile") >> pipelineArtifacts.txt
fi
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
optSymLink: true
- job: rocPyDecode_testing
dependsOn: rocPyDecode
@@ -183,12 +172,19 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
setupHIPLibrarySymlinks: true
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: pip install
inputs:
@@ -214,7 +210,7 @@ jobs:
-DROCM_PATH=$(Agent.BuildDirectory)/rocm
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm;$(PYTHON_USER_SITE)/pybind11;$(PYTHON_DIST_PACKAGES)/pybind11;$(PYBIND11_PATH)
-DCMAKE_BUILD_TYPE=Release
-DGPU_TARGETS=$(JOB_GPU_TARGET)
-DAMDGPU_TARGETS=$(JOB_GPU_TARGET)
..
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
@@ -230,11 +226,3 @@ jobs:
script: |
pip uninstall -y rocPyDecode
pip uninstall -y hip-python
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
pythonEnvVars: true
# note that this docker won't have hip-python installed via pip

View File

@@ -9,11 +9,10 @@ parameters:
type: object
default:
- cmake
- git
- ninja-build
- googletest
- libgtest-dev
- ninja-build
- python3-pip
- git
- name: rocmDependencies
type: object
default:
@@ -38,7 +37,8 @@ jobs:
- template: /.azuredevops/variables-global.yml
- name: HIP_ROCCLR_HOME
value: $(Build.BinariesDirectory)/rocm
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -55,9 +55,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -72,13 +77,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- job: rocRAND_testing
dependsOn: rocRAND
@@ -103,18 +101,21 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: rocRAND
testDir: '$(Agent.BuildDirectory)/rocm/bin/rocRAND'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -74,9 +74,14 @@ jobs:
workingDirectory: '$(Build.SourcesDirectory)'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: lapack
@@ -106,13 +111,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraCopyDirectories:
- deps-install
- job: rocSOLVER_testing
dependsOn: rocSOLVER
@@ -137,11 +135,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -149,8 +155,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './rocsolver-test'
testParameters: '--gtest_filter="*checkin*" --gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -66,9 +66,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -100,13 +105,6 @@ jobs:
parameters:
artifactName: testMatrices
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- job: rocSPARSE_testing
timeoutInMinutes: 90
@@ -132,11 +130,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -144,8 +150,4 @@ jobs:
testDir: '$(Agent.BuildDirectory)/rocm/bin'
testExecutable: './rocsparse-test'
testParameters: '--gtest_filter="*quick*" --gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -57,9 +57,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -75,11 +80,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rocThrust_testing
dependsOn: rocThrust
@@ -104,18 +104,21 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: rocThrust
testDir: '$(Agent.BuildDirectory)/rocm/bin/rocthrust'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -12,7 +12,6 @@ parameters:
- cmake
- ninja-build
- libboost-program-options-dev
- libdrm-dev
- libgtest-dev
- googletest
- libfftw3-dev
@@ -67,9 +66,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -78,7 +82,7 @@ jobs:
-DCMAKE_BUILD_TYPE=Release
-DROCWMMA_BUILD_TESTS=ON
-DROCWMMA_BUILD_SAMPLES=OFF
-DGPU_TARGETS=$(JOB_GPU_TARGET)
-DAMDGPU_TARGETS=$(JOB_GPU_TARGET)
-DCMAKE_BUILD_WITH_INSTALL_RPATH=ON
-GNinja
# gfx1030 not supported in documentation
@@ -88,11 +92,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rocWMMA_testing
timeoutInMinutes: 90
@@ -118,18 +117,21 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: rocWMMA
testDir: '$(Agent.BuildDirectory)/rocm/bin/rocwmma'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -8,11 +8,9 @@ parameters:
- name: aptPackages
type: object
default:
- cmake
- doxygen
- doxygen-doc
- ninja-build
- python3-pip
- python3-sphinx
- name: pipModules
type: object
@@ -25,7 +23,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -50,10 +49,3 @@ jobs:
componentName: rocm-cmake
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: combined
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -5,24 +5,17 @@ parameters:
- name: checkoutRef
type: string
default: ''
- name: aptPackages
type: object
default:
- cmake
- python3-pip
jobs:
- job: rocm_core
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
@@ -39,7 +32,3 @@ jobs:
-DROCM_VERSION="$(next-release)"
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}

View File

@@ -85,9 +85,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
# https://github.com/ROCm/HIP/issues/2203
@@ -111,11 +116,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rocm_examples_testing
dependsOn: rocm_examples
@@ -142,11 +142,19 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
# https://github.com/ROCm/HIP/issues/2203
@@ -162,8 +170,3 @@ jobs:
parameters:
componentName: rocm-examples
testDir: $(Build.SourcesDirectory)/build
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -40,7 +40,8 @@ jobs:
value: $(Agent.BuildDirectory)/rocm
- name: ROCR_LIB_DIR
value: $(Agent.BuildDirectory)/rocm
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -54,8 +55,13 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -65,14 +71,6 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
extraEnvVars:
- ROCR_INC_DIR:::/home/user/workspace/rocm
- ROCR_LIB_DIR:::/home/user/workspace/rocm
- job: rocm_bandwidth_test_testing
dependsOn: rocm_bandwidth_test
@@ -97,9 +95,12 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -108,9 +109,3 @@ jobs:
testExecutable: './rocm/bin/rocm-bandwidth-test'
testParameters: ''
testPublishResults: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -5,25 +5,17 @@ parameters:
- name: checkoutRef
type: string
default: ''
- name: aptPackages
type: object
default:
- cmake
- libdrm-dev
- python3-pip
jobs:
- job: rocm_smi_lib
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
@@ -35,11 +27,6 @@ jobs:
-DROCM_DEP_ROCMCORE=ON
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rocm_smi_lib_testing
dependsOn: rocm_smi_lib
@@ -67,8 +54,3 @@ jobs:
testDir: '$(Agent.BuildDirectory)'
testExecutable: './rocm/share/rocm_smi/rsmitst_tests/rsmitst'
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -5,13 +5,6 @@ parameters:
- name: checkoutRef
type: string
default: ''
- name: aptPackages
type: object
default:
- cmake
- libdrm-amdgpu-dev
- libdrm-dev
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -28,23 +21,25 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
registerRadeon: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
skipLlvmSymlink: true
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -52,10 +47,6 @@ jobs:
-DROCRTST_BLD_TYPE=release
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- job: rocminfo_testing
dependsOn: rocminfo
@@ -72,16 +63,16 @@ jobs:
JOB_GPU_TARGET: gfx942
JOB_TEST_POOL: ${{ variables.GFX942_TEST_POOL }}
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
parameters:
runRocminfo: false
@@ -99,8 +90,3 @@ jobs:
testExecutable: './rocm/bin/rocm_agent_enumerator'
testParameters: ''
testPublishResults: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -52,7 +52,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -69,11 +70,21 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
@@ -81,12 +92,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rocprofiler_compute_testing
timeoutInMinutes: 120
@@ -124,11 +129,19 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: Add ROCm binaries to PATH
inputs:
@@ -168,9 +181,3 @@ jobs:
inputs:
targetType: inline
script: echo "##vso[task.setvariable variable=PATH]$(echo $PATH | sed -e 's;:$(Agent.BuildDirectory)/rocm/llvm/bin;;' -e 's;^/;;' -e 's;/$;;')"
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -5,24 +5,17 @@ parameters:
- name: checkoutRef
type: string
default: ''
- name: aptPackages
type: object
default:
- cmake
- python3-pip
jobs:
- job: rocprofiler_register
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
@@ -43,8 +36,3 @@ jobs:
testDir: 'tests/build'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: combined

View File

@@ -8,13 +8,11 @@ parameters:
- name: aptPackages
type: object
default:
- build-essential
- libdrm-amdgpu-dev
- cmake
- python3-pip
- libdrm-dev
- libdw-dev
- libelf-dev
- pkg-config
- python3-pip
- name: pipModules
type: object
default:
@@ -29,7 +27,7 @@ parameters:
- pandas
- perfetto
- pycobertura
- pytest>=6.2.5
- pytest
- pyyaml
- name: rocmDependencies
type: object
@@ -42,13 +40,15 @@ parameters:
- rocminfo
- ROCR-Runtime
- rocprofiler-register
- roctracer
jobs:
- job: rocprofiler_sdk
- job: rocprofilersdk
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -60,98 +60,37 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
registerRadeon: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
- task: Bash@3
displayName: Add Python site-packages binaries to path
inputs:
targetType: inline
script: |
USER_BASE=$(python3 -m site --user-base)
echo "##vso[task.prependpath]$USER_BASE/bin"
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DROCPROFILER_BUILD_TESTS=ON
-DROCPROFILER_BUILD_SAMPLES=ON
-DROCPROFILER_BUILD_RELEASE=ON
-DGPU_TARGETS=$(JOB_GPU_TARGET)
multithreadFlag: -- -j4
-DROCPROFILER_BUILD_SAMPLES=OFF
-DAMDGPU_TARGETS=$(JOB_GPU_TARGET)
multithreadFlag: -- -j2
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rocprofiler_sdk_testing
dependsOn: rocprofiler_sdk
condition: and(succeeded(), eq(variables.ENABLE_GFX942_TESTS, 'true'), not(containsValue(split(variables.DISABLED_GFX942_TESTS, ','), variables['Build.DefinitionName'])))
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: $(JOB_TEST_POOL)
workspace:
clean: all
strategy:
matrix:
gfx942:
JOB_GPU_TARGET: gfx942
JOB_TEST_POOL: ${{ variables.GFX942_TEST_POOL }}
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
registerRadeon: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
- task: Bash@3
displayName: Add Python site-packages binaries to path
inputs:
targetType: inline
script: |
USER_BASE=$(python3 -m site --user-base)
echo "##vso[task.prependpath]$USER_BASE/bin"
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DROCPROFILER_BUILD_TESTS=ON
-DROCPROFILER_BUILD_SAMPLES=ON
-DROCPROFILER_BUILD_RELEASE=ON
-DGPU_TARGETS=$(JOB_GPU_TARGET)
multithreadFlag: -- -j16
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
componentName: rocprofiler-sdk
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -17,12 +17,7 @@ parameters:
- clang
- cmake
- environment-modules
- ffmpeg
- g++-12
- libavcodec-dev
- libavformat-dev
- libavutil-dev
- libdrm-amdgpu-dev
- libdrm-dev
- libfabric-dev
- libiberty-dev
@@ -32,9 +27,8 @@ parameters:
- libopenmpi-dev
- m4
- openmpi-bin
- pkg-config
- python3-pip
- software-properties-common
- python3-pip
- texinfo
- zlib1g-dev
- name: pipModules
@@ -50,13 +44,13 @@ parameters:
- clr
- llvm-project
- rccl
- rocDecode
- rocm-core
- rocm_smi_lib
- rocminfo
- ROCR-Runtime
- rocprofiler
- rocprofiler-register
- rocprofiler-sdk
- roctracer
jobs:
- job: rocprofiler_systems
@@ -75,17 +69,20 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
registerRadeon: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: ROCm symbolic link
inputs:
@@ -112,98 +109,13 @@ jobs:
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_BUILD_DYNINST=ON
-DROCPROFSYS_BUILD_LIBUNWIND=ON
-DROCPROFSYS_DISABLE_EXAMPLES="openmp-target"
-DDYNINST_BUILD_TBB=ON
-DDYNINST_BUILD_ELFUTILS=ON
-DDYNINST_BUILD_LIBIBERTY=ON
-DDYNINST_BUILD_BOOST=ON
-DROCPROFSYS_USE_PAPI=ON
-DROCPROFSYS_USE_MPI=ON
-DGPU_TARGETS=$(JOB_GPU_TARGET)
multithreadFlag: -- -j32
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
registerROCmPackages: true
optSymLink: true
extraPaths: /home/user/workspace/rocm/bin:/home/user/workspace/rocm/llvm/bin
- job: rocprofiler_systems_testing
dependsOn: rocprofiler_systems
condition: and(succeeded(), eq(variables.ENABLE_GFX942_TESTS, 'true'), not(containsValue(split(variables.DISABLED_GFX942_TESTS, ','), variables['Build.DefinitionName'])))
timeoutInMinutes: 180
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool:
name: $(JOB_TEST_POOL)
demands: firstRenderDeviceAccess
workspace:
clean: all
strategy:
matrix:
gfx942:
JOB_GPU_TARGET: gfx942
JOB_TEST_POOL: ${{ variables.GFX942_TEST_POOL }}
steps:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-other.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
registerRadeon: true
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/checkout.yml
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
- task: Bash@3
displayName: ROCm symbolic link
inputs:
targetType: inline
script: |
sudo rm -rf /opt/rocm
sudo ln -s $(Agent.BuildDirectory)/rocm /opt/rocm
- task: Bash@3
displayName: Add ROCm binaries to PATH
inputs:
targetType: inline
script: echo "##vso[task.prependpath]$(Agent.BuildDirectory)/rocm/bin"
- task: Bash@3
displayName: Add ROCm compilers to PATH
inputs:
targetType: inline
script: echo "##vso[task.prependpath]$(Agent.BuildDirectory)/rocm/llvm/bin"
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
# build flags reference: https://rocm.docs.amd.com/projects/omnitrace/en/latest/install/install.html
extraBuildFlags: >-
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DROCM_PATH=$(Agent.BuildDirectory)/rocm
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_BUILD_DYNINST=ON
-DROCPROFSYS_BUILD_LIBUNWIND=ON
-DROCPROFSYS_DISABLE_EXAMPLES="openmp-target"
-DDYNINST_BUILD_TBB=ON
-DDYNINST_BUILD_ELFUTILS=ON
-DDYNINST_BUILD_LIBIBERTY=ON
-DDYNINST_BUILD_BOOST=ON
-DROCPROFSYS_USE_PAPI=ON
-DROCPROFSYS_USE_MPI=ON
-DGPU_TARGETS=$(JOB_GPU_TARGET)
-DAMDGPU_TARGETS=$(JOB_GPU_TARGET)
multithreadFlag: -- -j32
- task: Bash@3
displayName: Set up rocprofiler-systems env
@@ -232,12 +144,3 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
registerROCmPackages: true
gpuTarget: $(JOB_GPU_TARGET)
optSymLink: true
extraPaths: /home/user/workspace/rocm/bin:/home/user/workspace/rocm/llvm/bin

View File

@@ -67,11 +67,21 @@ jobs:
parameters:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -88,15 +98,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
extraEnvVars:
- HIP_ROCCLR_HOME:::/home/user/workspace/rocm
- ROCM_PATH:::/home/user/workspace/rocm
- job: rocprofiler_testing
dependsOn: rocprofiler
@@ -121,11 +122,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- task: Bash@3
displayName: Setup test environment
inputs:
@@ -147,10 +156,3 @@ jobs:
testExecutable: LD_LIBRARY_PATH="$(Agent.BuildDirectory)/rocm/lib/rocprofiler:$(Agent.BuildDirectory)/rocm/share/rocprofiler/tests" share/rocprofiler/tests/runUnitTests
testParameters: '--gtest_output=xml:./test_output.xml --gtest_color=yes'
testDir: $(Agent.BuildDirectory)/rocm
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)
optSymLink: true

View File

@@ -14,7 +14,6 @@ parameters:
- libdw-dev
- libstdc++-12-dev
- python-is-python3
- python3-pip
- name: rocmDependencies
type: object
default:
@@ -40,7 +39,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
steps:
@@ -53,8 +53,13 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -65,11 +70,6 @@ jobs:
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rocr_debug_agent_testing
dependsOn: rocr_debug_agent
@@ -92,11 +92,19 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/local-artifact-download.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
componentName: rocr_debug_agent-tests
@@ -112,8 +120,3 @@ jobs:
parameters:
componentName: rocr_debug_agent
testDir: '$(Agent.BuildDirectory)/rocm/src/rocm-debug-agent-test'
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -9,10 +9,9 @@ parameters:
type: object
default:
- cmake
- ninja-build
- doxygen
- graphviz
- ninja-build
- python3-pip
- name: pipModules
type: object
default:
@@ -42,7 +41,8 @@ jobs:
- template: /.azuredevops/variables-global.yml
- name: HIP_ROCCLR_HOME
value: $(Build.BinariesDirectory)/rocm
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -60,9 +60,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -79,12 +84,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
- job: roctracer_testing
dependsOn: roctracer
@@ -109,11 +108,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/gpu-diagnostics.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/test.yml
parameters:
@@ -122,9 +129,3 @@ jobs:
testParameters: ''
testDir: $(Agent.BuildDirectory)
testPublishResults: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -8,14 +8,13 @@ parameters:
- name: aptPackages
type: object
default:
- clang
- cmake
- imagemagick
- libopencv-dev
- libsndfile1-dev
- libstdc++-12-dev
- imagemagick
- ninja-build
- python3-pip
- clang
- name: pipModules
type: object
default:
@@ -49,7 +48,8 @@ jobs:
variables:
- group: common
- template: /.azuredevops/variables-global.yml
pool: ${{ variables.LOW_BUILD_POOL }}
pool:
vmImage: ${{ variables.BASE_BUILD_POOL }}
workspace:
clean: all
strategy:
@@ -66,9 +66,14 @@ jobs:
checkoutRepo: ${{ parameters.checkoutRepo }}
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
# CI case: download latest default branch build
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
# manual build case: triggered by ROCm/ROCm repo
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/build-cmake.yml
parameters:
extraBuildFlags: >-
@@ -78,7 +83,7 @@ jobs:
-DCMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
-DHALF_INCLUDE_DIRS=$(Agent.BuildDirectory)/rocm/include
-DCMAKE_BUILD_TYPE=Release
-DGPU_TARGETS=$(JOB_GPU_TARGET)
-DAMDGPU_TARGETS=$(JOB_GPU_TARGET)
-GNinja
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
parameters:
@@ -86,12 +91,6 @@ jobs:
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-upload.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
gpuTarget: $(JOB_GPU_TARGET)
- job: rpp_testing
dependsOn: rpp
@@ -119,11 +118,19 @@ jobs:
parameters:
gpuTarget: $(JOB_GPU_TARGET)
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-aqlprofile.yml
parameters:
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
checkoutRef: ${{ parameters.checkoutRef }}
dependencyList: ${{ parameters.rocmTestDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
${{ if eq(parameters.checkoutRef, '') }}:
dependencySource: staging
${{ elseif ne(parameters.checkoutRef, '') }}:
dependencySource: tag-builds
# Dependencies from: https://github.com/ROCm/rpp/blob/develop/utilities/test_suite/README.md
- task: Bash@3
displayName: Build and install Turbo JPEG
@@ -132,7 +139,7 @@ jobs:
script: |
sudo apt-get install nasm
sudo apt-get install wget
git clone -b 3.0.2 https://github.com/libjpeg-turbo/libjpeg-turbo.git
git clone -b 3.0.2 https://github.com/libjpeg-turbo/libjpeg-turbo.git
cd libjpeg-turbo
mkdir build
cd build
@@ -149,7 +156,7 @@ jobs:
inputs:
targetType: 'inline'
script: |
git clone -b v3.0.1 https://github.com/NIFTI-Imaging/nifti_clib.git
git clone https://github.com/NIFTI-Imaging/nifti_clib.git
cd nifti_clib
mkdir build
cd build
@@ -175,9 +182,3 @@ jobs:
testDir: 'rpp-tests'
- script: sudo rm /opt/rocm
condition: always()
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/docker-container.yml
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
environment: test
gpuTarget: $(JOB_GPU_TARGET)

View File

@@ -142,10 +142,6 @@ parameters:
- binary_ufuncs
- autograd
# - inductor/torchinductor takes too long
# set to false to disable torchvision build and test
- name: includeVision
type: boolean
default: false
trigger: none
pr: none
@@ -177,6 +173,8 @@ jobs:
value: 6.3.0
- name: MKLROOT
value: /opt/intel
- name: AOTRITON_INSTALLED_PREFIX
value: /opt/rocm/aotriton
- name: DESIRED_PYTHON
value: 3.10
- name: PYTORCH_ROOT
@@ -239,12 +237,6 @@ jobs:
git clone https://github.com/pytorch/builder.git --depth=1 --recurse-submodules
sudo ln -s $(Build.SourcesDirectory)/builder /builder
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: Temporarily Patch CK Submodule
inputs:
targetType: inline
script: git pull origin develop
workingDirectory: $(Build.SourcesDirectory)/pytorch/third_party/composable_kernel
- task: Bash@3
displayName: Install patchelf
inputs:
@@ -273,6 +265,12 @@ jobs:
script: |
sudo PYTORCH_ROCM_ARCH=$(JOB_GPU_TARGET) MKLROOT=$(MKLROOT) bash pytorch/.ci/docker/common/install_rocm_magma.sh
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: Install AOTriton Shared Library
inputs:
targetType: inline
script: sudo bash ./common/install_aotriton.sh /opt/rocm
workingDirectory: $(Build.SourcesDirectory)/pytorch/.ci/docker
- task: Bash@3
displayName: Run ROCm Build Script
inputs:
@@ -285,6 +283,7 @@ jobs:
DESIRED_PYTHON=$(DESIRED_PYTHON)
PYTORCH_ROOT=$(PYTORCH_ROOT)
CMAKE_PREFIX_PATH=$(Agent.BuildDirectory)/rocm
AOTRITON_INSTALLED_PREFIX=$(AOTRITON_INSTALLED_PREFIX)
DESIRED_DEVTOOLSET=$(DESIRED_DEVTOOLSET)
TORCH_PACKAGE_NAME=torch.$(ROCM_BRANCH).$(JOB_GPU_TARGET)
PYTORCH_BUILD_VERSION=$(cat $(Build.SourcesDirectory)/pytorch/version.txt | cut -da -f1)
@@ -297,74 +296,59 @@ jobs:
sourceDir: /remote/wheelhouserocm$(ROCM_VERSION)
contentsString: '*.whl'
# common helper source for pytorch vision and audio
- ${{ if eq(parameters.includeVision, true) }}:
- task: Bash@3
displayName: git clone pytorch test-infra
inputs:
targetType: inline
script: git clone https://github.com/pytorch/test-infra.git --depth=1 --recurse-submodules
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: install package helper
inputs:
targetType: inline
script: python3 -m pip install test-infra/tools/pkg-helpers
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: pytorch pkg helpers
inputs:
targetType: inline
script: CU_VERSION=${CU_VERSION} CHANNEL=${CHANNEL} python -m pytorch_pkg_helpers
# get torch vision source and build
- task: Bash@3
displayName: git clone pytorch vision
inputs:
targetType: inline
script: git clone https://github.com/pytorch/vision.git --depth=1 --recurse-submodules
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: Build vision
inputs:
targetType: inline
script: >-
TORCH_PACKAGE_NAME=torch.$(ROCM_BRANCH).$(JOB_GPU_TARGET)
TORCHVISION_PACKAGE_NAME=torchvision.$(ROCM_BRANCH).$(JOB_GPU_TARGET)
PYTORCH_VERSION=$(cat $(Build.SourcesDirectory)/pytorch/version.txt | cut -da -f1)post$(date -u +%Y%m%d)
BUILD_VERSION=$(cat $(Build.SourcesDirectory)/vision/version.txt | cut -da -f1)post$(date -u +%Y%m%d)
python3 setup.py bdist_wheel
workingDirectory: $(Build.SourcesDirectory)/vision
- task: Bash@3
displayName: Relocate vision
inputs:
targetType: inline
script: python3 packaging/wheel/relocate.py
workingDirectory: $(Build.SourcesDirectory)/vision
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-prepare-package.yml
parameters:
sourceDir: $(Build.SourcesDirectory)/vision/dist
contentsString: '*.whl'
clean: false
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/manifest.yml
- task: Bash@3
displayName: git clone pytorch test-infra
inputs:
targetType: inline
script: git clone https://github.com/pytorch/test-infra.git --depth=1 --recurse-submodules
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: install package helper
inputs:
targetType: inline
script: python3 -m pip install test-infra/tools/pkg-helpers
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: pytorch pkg helpers
inputs:
targetType: inline
script: CU_VERSION=${CU_VERSION} CHANNEL=${CHANNEL} python -m pytorch_pkg_helpers
# get torch vision source and build
- task: Bash@3
displayName: git clone pytorch vision
inputs:
targetType: inline
script: git clone https://github.com/pytorch/vision.git --depth=1 --recurse-submodules
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: Build vision
inputs:
targetType: inline
script: >-
TORCH_PACKAGE_NAME=torch.$(ROCM_BRANCH).$(JOB_GPU_TARGET)
TORCHVISION_PACKAGE_NAME=torchvision.$(ROCM_BRANCH).$(JOB_GPU_TARGET)
PYTORCH_VERSION=$(cat $(Build.SourcesDirectory)/pytorch/version.txt | cut -da -f1)post$(date -u +%Y%m%d)
BUILD_VERSION=$(cat $(Build.SourcesDirectory)/vision/version.txt | cut -da -f1)post$(date -u +%Y%m%d)
python3 setup.py bdist_wheel
workingDirectory: $(Build.SourcesDirectory)/vision
- task: Bash@3
displayName: Relocate vision
inputs:
targetType: inline
script: python3 packaging/wheel/relocate.py
workingDirectory: $(Build.SourcesDirectory)/vision
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-prepare-package.yml
parameters:
gpuTarget: $(JOB_GPU_TARGET)
sourceDir: $(Build.SourcesDirectory)/vision/dist
contentsString: '*.whl'
clean: false
- task: PublishPipelineArtifact@1
displayName: 'wheel file Publish'
retryCountOnTaskFailure: 3
inputs:
targetPath: $(Build.BinariesDirectory)
- task: Bash@3
displayName: Save pipeline artifact file name
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
whlFile=$(find "$(Build.BinariesDirectory)" -type f -name "*.whl" | head -n 1)
if [ -n "$whlFile" ]; then
echo $(basename "$whlFile") >> pipelineArtifacts.txt
fi
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml
- job: pytorch_testing
- job: torchvision_testing
dependsOn: pytorch
condition: and(succeeded(), eq(variables.ENABLE_GFX942_TESTS, 'true'), not(containsValue(split(variables.DISABLED_GFX942_TESTS, ','), variables['Build.DefinitionName'])))
variables:
@@ -395,12 +379,6 @@ jobs:
parameters:
aptPackages: ${{ parameters.aptPackages }}
pipModules: ${{ parameters.pipModules }}
# pytorch tests require an updated version of click, even if requirements is not called outright
- task: Bash@3
displayName: 'pip update click'
inputs:
targetType: inline
script: pip install --upgrade click
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/preamble.yml
- task: DownloadPipelineArtifact@2
displayName: 'Download Pipeline Wheel Files'
@@ -423,13 +401,12 @@ jobs:
targetType: inline
script: git clone https://github.com/pytorch/pytorch.git --depth=1 --recurse-submodules
workingDirectory: $(Build.SourcesDirectory)
- ${{ if eq(parameters.includeVision, true) }}:
- task: Bash@3
displayName: git clone pytorch vision
inputs:
targetType: inline
script: git clone https://github.com/pytorch/vision.git --depth=1 --recurse-submodules
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: git clone pytorch vision
inputs:
targetType: inline
script: git clone https://github.com/pytorch/vision.git --depth=1 --recurse-submodules
workingDirectory: $(Build.SourcesDirectory)
- task: Bash@3
displayName: Install Wheel Files
inputs:
@@ -474,7 +451,7 @@ jobs:
python3 -c 'import torch; x = torch.rand(5, 3); print(x)'
# Test artifact build script has too many if statements for different environments
# Based off the snippet of interest for this environment, with some adjustments
# https://github.com/pytorch/pytorch/blob/main/.ci/pytorch/build.sh#L330-L371
# https://github.com/pytorch/pytorch/blob/main/.ci/pytorch/build.sh#L335-L375
# Removing in-line comments since it does not fit with the yaml markup
- task: Bash@3
displayName: Build Pytorch Test Artifacts
@@ -533,14 +510,13 @@ jobs:
script: pytest test/test_${{ torchTest }}.py
# Reference on what tests to run for torchvision found in private repo:
# https://github.com/ROCm/rocAutomation/blob/jenkins-pipelines/pytorch/pytorch_ci/test_torchvision.sh#L51
- ${{ if eq(parameters.includeVision, true) }}:
- task: Bash@3
displayName: Test vision/transforms
continueOnError: true
inputs:
targetType: inline
script: pytest test/test_transforms.py
workingDirectory: $(Build.SourcesDirectory)/vision
- task: Bash@3
displayName: Test vision/transforms
continueOnError: true
inputs:
targetType: inline
script: pytest test/test_transforms.py
workingDirectory: $(Build.SourcesDirectory)/vision
- task: Bash@3
displayName: Uninstall Wheel Files
inputs:

View File

@@ -1,10 +1,5 @@
parameters:
- name: dependencySource
type: string
default: staging
values:
- staging
- mainline
# currently excludes clr
- name: rocmDependencies
type: object
default:
@@ -65,7 +60,6 @@ parameters:
- roctracer
- rocWMMA
- rpp
- TransferBench
trigger: none
pr: none
@@ -105,11 +99,11 @@ jobs:
displayName: System disk space before ROCm
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/dependencies-rocm.yml
parameters:
dependencySource: ${{ parameters.dependencySource }}
dependencyList: ${{ parameters.rocmDependencies }}
gpuTarget: $(JOB_GPU_TARGET)
dependencySource: staging
skipLibraryLinking: true
skipLlvmSymlink: true
gpuTarget: $(JOB_GPU_TARGET)
- script: df -h
displayName: System disk space after ROCm
- script: du -sh $(Agent.BuildDirectory)/rocm
@@ -130,10 +124,3 @@ jobs:
retryCountOnTaskFailure: 3
inputs:
targetPath: '$(Build.ArtifactStagingDirectory)'
- task: Bash@3
displayName: Save pipeline artifact file name
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: echo "$(Build.DefinitionName)_$(Build.BuildNumber)_ubuntu2204_$(JOB_GPU_TARGET).tar.gz" >> pipelineArtifacts.txt
- template: ${{ variables.CI_TEMPLATE_PATH }}/steps/artifact-links.yml

View File

@@ -6,12 +6,152 @@ parameters:
- name: pipelineId
type: string
default: ''
- name: branchName
type: string
default: '$(Build.SourceBranchName)' # for tagged builds
- name: useDefaultBranch
type: boolean
default: true
# useMainlineBranch only processed if useDefaultBranch is false
- name: useMainlineBranch
type: boolean
default: false
- name: latestFromBranch
type: boolean
default: true
- name: extractToMnt
type: boolean
default: false
- name: fileFilter
type: string
default: ''
- name: defaultBranchList
type: object
default:
AMDMIGraphX: develop
amdsmi: amd-staging
aomp-extras: aomp-dev
aomp: aomp-dev
clr: amd-staging
composable_kernel: develop
half: rocm
HIP: amd-staging
hip-tests: amd-staging
hipBLAS: develop
hipBLASLt: develop
hipBLAS-common: develop
hipCUB: develop
hipFFT: develop
hipfort: develop
HIPIFY: amd-staging
hipRAND: develop
hipSOLVER: develop
hipSPARSE: develop
hipSPARSELt: develop
hipTensor: develop
llvm-project: amd-staging
MIOpen: develop
MIVisionX: develop
omniperf: amd-staging
omnitrace: amd-staging
rccl: develop
rdc: amd-staging
rocAL: develop
rocALUTION: develop
rocBLAS: develop
ROCdbgapi : amd-staging
rocDecode: develop
rocFFT: develop
ROCgdb: amd-staging
rocJPEG: develop
rocm-cmake: develop
rocm-core: master
rocm-examples: develop
rocminfo: amd-staging
rocMLIR: develop
ROCmValidationSuite: master
rocm_bandwidth_test: master
rocm_smi_lib: amd-staging
rocPRIM: develop
rocprofiler: amd-staging
rocprofiler-compute: amd-staging
rocprofiler-register: amd-staging
rocprofiler-sdk: amd-staging
rocprofiler-systems: amd-staging
rocPyDecode: develop
ROCR-Runtime: amd-staging
rocRAND: develop
rocr_debug_agent: amd-staging
rocSOLVER: develop
rocSPARSE: develop
rocThrust: develop
roctracer: amd-staging
rocWMMA: develop
rpp: develop
- name: mainlineBranchList
type: object
default:
AMDMIGraphX: mainline
amdsmi: amd-mainline
aomp-extras: amd-mainline-open
aomp: amd-mainline-open
clr: amd-mainline
composable_kernel: mainline
half: rocm
HIP: amd-mainline
hip-tests: amd-mainline
hipBLAS: mainline
hipBLASLt: mainline
hipBLAS-common: mainline
hipCUB: mainline
hipFFT: mainline
hipfort: mainline
HIPIFY: amd-mainline
hipRAND: mainline
hipSOLVER: mainline
hipSPARSE: mainline
hipSPARSELt: mainline
hipTensor: mainline
llvm-project: amd-mainline-open
MIOpen: mainline
MIVisionX: mainline
omniperf: amd-mainline
omnitrace: amd-mainline
rccl: mainline
rdc: amd-mainline
rocAL: master # needs the yaml file
rocALUTION: mainline
rocBLAS: mainline
ROCdbgapi : amd-mainline
rocDecode: mainline
rocFFT: mainline
ROCgdb: amd-mainline-rocgdb-15
rocJPEG: mainline
rocm-cmake: mainline
rocm-core: amd-master
rocm-examples: develop # no mainline
rocminfo: amd-master
rocMLIR: mainline # needs the yaml file
ROCmValidationSuite: mainline
rocm_bandwidth_test: master
rocm_smi_lib: amd-mainline
rocPRIM: mainline
rocprofiler: amd-master
rocprofiler-compute: amd-mainline
rocprofiler-register: amd-mainline
rocprofiler-sdk: amd-mainline
rocprofiler-systems: amd-mainline
rocPyDecode: mainline
ROCR-Runtime: amd-master
rocRAND: mainline
rocr_debug_agent: amd-mainline
rocSOLVER: mainline
rocSPARSE: mainline
rocThrust: mainline
roctracer: amd-master
rocWMMA: mainline
rpp: mainline
# BELOW REQUIRED IF useDefaultBranch false
- name: branchName
type: string
default: '$(Build.SourceBranchName)' # for tagged builds
steps:
- task: Bash@3
@@ -32,16 +172,25 @@ steps:
definition: ${{ parameters.pipelineId }}
specificBuildWithTriggering: true
itemPattern: '**/*${{ parameters.fileFilter }}*'
${{ if notIn(parameters.componentName, 'aomp') }}: # remove this once these pipelines are functional + up-to-date
buildVersionToDownload: latestFromBranch # default is 'latest'
branchName: refs/heads/${{ parameters.branchName }}
${{ if eq(parameters.latestFromBranch, true) }}:
${{ if notIn(parameters.componentName, 'aomp') }}: # remove this once these pipelines are functional + up-to-date
buildVersionToDownload: latestFromBranch # default is 'latest'
${{ if eq(parameters.useDefaultBranch, true) }}:
branchName: refs/heads/${{ parameters.defaultBranchList[parameters.componentName] }}
${{ elseif eq(parameters.useMainlineBranch, true) }}:
branchName: refs/heads/${{ parameters.mainlineBranchList[parameters.componentName] }}
${{ else }}:
branchName: ${{ parameters.branchName }}
allowPartiallySucceededBuilds: $(allowPartiallySucceededBuilds)
targetPath: '$(Pipeline.Workspace)/d'
- task: ExtractFiles@1
displayName: Extract ${{ parameters.componentName }}
inputs:
archiveFilePatterns: '$(Pipeline.Workspace)/d/**/*.tar.gz'
destinationFolder: '$(Agent.BuildDirectory)/rocm'
${{ if parameters.extractToMnt }}:
destinationFolder: '/mnt/rocm'
${{ else }}:
destinationFolder: '$(Agent.BuildDirectory)/rocm'
cleanDestinationFolder: false
overwriteExistingFiles: true
- task: DeleteFiles@1

View File

@@ -1,32 +0,0 @@
# Every publish artifact call should be coupled with a script that
# prints out the artifact name to a common text file
# This template parses that text file line by line and prints out the download URL
# for each artifact, so that they are easily accessible to the public
# replace trailing '=' with their count in the encoded string
steps:
- task: Bash@3
displayName: "!! Download Links !!"
condition: always()
continueOnError: true
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
URL_BEGIN="https://artprodcus3.artifacts.visualstudio.com/"
URL_MIDDLE="/_apis/artifact/"
URL_END="/content?format=file&subPath=%2F"
FORMATTED_JOB_NAME=$(echo $(Agent.JobName) | sed 's/ /./g; s/[-_]//g')
ARTIFACT_STRING="pipelineartifact://ROCm-CI/projectId/$(DOWNLOAD_PROJECT_ID)/buildId/$(Build.BuildId)/artifactName/${FORMATTED_JOB_NAME}"
ENCODED_STRING=$(echo -n "${ARTIFACT_STRING}" | base64 -w 0)
PADDING_COUNT=$(echo -n "${ENCODED_STRING}" | awk -F= '{print NF-1}')
if [ "$PADDING_COUNT" -gt 0 ]; then
FINAL_ENCODED_STRING=$(echo -n "${ENCODED_STRING}" | sed "s/=*$/${PADDING_COUNT}/")
else
FINAL_ENCODED_STRING="${ENCODED_STRING}0"
fi
while IFS= read -r fileName; do
echo "File Name:"
echo "$fileName"
printf "Download Link:\n%s%s/%s%s%s%s%s\n\n" "${URL_BEGIN}" "${DOWNLOAD_ORGANIZATION_ID}" "${DOWNLOAD_PROJECT_ID}" "${URL_MIDDLE}" "${FINAL_ENCODED_STRING}" "${URL_END}" "${fileName}"
done < pipelineArtifacts.txt
rm pipelineArtifacts.txt

View File

@@ -27,12 +27,6 @@ steps:
SourceFolder: '$(Build.BinariesDirectory)'
Contents: '/**/*'
RemoveDotFiles: true
- task: Bash@3
displayName: Save pipeline artifact file name
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: echo "$(Build.DefinitionName)_$(Build.SourceBranchName)_$(Build.BuildId)_$(Build.BuildNumber)_ubuntu2204_${{ parameters.artifactName }}_${{ parameters.gpuTarget }}.tar.gz" >> pipelineArtifacts.txt
# then publish it
- ${{ if parameters.publish }}:
- task: PublishPipelineArtifact@1

View File

@@ -1,16 +1,29 @@
parameters:
- name: dependencySource
type: string
default: staging
values:
- staging
- tag-builds
- name: repositoryUrl
type: object
default:
staging: https://repo.radeon.com/rocm/apt/latest/pool/main/h/hsa-amd-aqlprofile/ # end slash is important for curl!
tag-builds: https://repo.radeon.com/rocm/apt/$(TAGGED_RELEASE)/pool/main/h/hsa-amd-aqlprofile/
steps:
- task: Bash@3
displayName: Get aqlprofile package name
inputs:
targetType: inline
script: |
export packageName=$(curl -s https://repo.radeon.com/rocm/apt/latest/pool/main/h/hsa-amd-aqlprofile/ | grep -oP "href=\"\K[^\"]*$(lsb_release -rs)[^\"]*\.deb")
export packageName=$(curl -s ${{ parameters.repositoryUrl[parameters.dependencySource] }} | grep -oP "href=\"\K[^\"]*$(lsb_release -rs)[^\"]*\.deb")
echo "##vso[task.setvariable variable=packageName;isreadonly=true]$packageName"
- task: Bash@3
displayName: 'Download aqlprofile'
inputs:
targetType: inline
script: wget -nv https://repo.radeon.com/rocm/apt/latest/pool/main/h/hsa-amd-aqlprofile/$(packageName)
script: wget -nv ${{ parameters.repositoryUrl[parameters.dependencySource] }}$(packageName)
workingDirectory: '$(Pipeline.Workspace)'
- task: Bash@3
displayName: 'Extract aqlprofile'

View File

@@ -6,26 +6,10 @@ parameters:
- name: pipModules
type: object
default: []
- name: registerRadeon
type: boolean
default: false
steps:
- ${{ if eq(parameters.registerRadeon, true) }}:
- task: Bash@3
displayName: 'Register repo.radeon packages'
inputs:
targetType: inline
script: |
sudo mkdir --parents --mode=0755 /etc/apt/keyrings
wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/latest/ubuntu jammy main" | sudo tee /etc/apt/sources.list.d/amdgpu.list
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/latest jammy main" | sudo tee --append /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update
# firefox takes time to upgrade and is not needed for CI workloads, hold version
- task: Bash@3
continueOnError: true
displayName: 'sudo apt-mark hold firefox'
inputs:
targetType: inline

View File

@@ -1,21 +1,160 @@
# download and install rocm dependencies through pipeline builds in the project
# REQUIRED
parameters:
- name: checkoutRef
type: string
default: ''
- name: dependencySource # optional, overrides checkoutRef
type: string
default: null
values:
- null # empty strings aren't allowed as values, use null instead
- staging
- mainline
- name: dependencyList
type: object
default: []
- name: gpuTarget
- name: dependencySource
type: string
default: staging
values:
- staging
- mainline
- tag-builds
- fixed
- name: extractToMnt
type: boolean
default: false
# required values for fixed selection
- name: fixedPipelineIdentifier
type: string
default: 0
- name: fixedComponentName
type: string
default: ''
- name: latestFromBranch
type: boolean
default: true
# match case of the repo in this object for the left side of the maps
# should not need to replace these parameters
- name: stagingPipelineIdentifiers
type: object
default:
AMDMIGraphX: $(AMDMIGRAPHX_PIPELINE_ID)
amdsmi: $(AMDSMI_PIPELINE_ID)
aomp-extras: $(AOMP_EXTRAS_PIPELINE_ID)
aomp: $(AOMP_PIPELINE_ID)
clr: $(CLR_PIPELINE_ID)
composable_kernel: $(COMPOSABLE_KERNEL_PIPELINE_ID)
half: $(HALF_PIPELINE_ID)
HIP: $(HIP_PIPELINE_ID)
hip-tests: $(HIP_TESTS_PIPELINE_ID)
hipBLAS: $(HIPBLAS_PIPELINE_ID)
hipBLAS-common: $(HIPBLAS_COMMON_PIPELINE_ID)
hipBLASLt: $(HIPBLASLT_PIPELINE_ID)
hipCUB: $(HIPCUB_PIPELINE_ID)
hipFFT: $(HIPFFT_PIPELINE_ID)
hipfort: $(HIPFORT_PIPELINE_ID)
HIPIFY: $(HIPIFY_PIPELINE_ID)
hipRAND: $(HIPRAND_PIPELINE_ID)
hipSOLVER: $(HIPSOLVER_PIPELINE_ID)
hipSPARSE: $(HIPSPARSE_PIPELINE_ID)
hipSPARSELt: $(HIPSPARSELT_PIPELINE_ID)
hipTensor: $(HIPTENSOR_PIPELINE_ID)
llvm-project: $(LLVM_PROJECT_PIPELINE_ID)
MIOpen: $(MIOpen_PIPELINE_ID)
MIVisionX: $(MIVISIONX_PIPELINE_ID)
omniperf: $(OMNIPERF_PIPELINE_ID)
omnitrace: $(OMNITRACE_PIPELINE_ID)
rccl: $(RCCL_PIPELINE_ID)
rdc: $(RDC_PIPELINE_ID)
rocAL: $(ROCAL_PIPELINE_ID)
rocALUTION: $(ROCALUTION_PIPELINE_ID)
rocBLAS: $(ROCBLAS_PIPELINE_ID)
ROCdbgapi: $(ROCDBGAPI_PIPELINE_ID)
rocDecode: $(ROCDECODE_PIPELINE_ID)
rocFFT: $(ROCFFT_PIPELINE_ID)
ROCgdb: $(ROCGDB_PIPELINE_ID)
rocJPEG: $(ROCJPEG_PIPELINE_ID)
rocm-cmake: $(ROCM_CMAKE_PIPELINE_ID)
rocm-core: $(ROCM_CORE_PIPELINE_ID)
rocm-examples: $(ROCM_EXAMPLES_PIPELINE_ID)
rocminfo: $(ROCMINFO_PIPELINE_ID)
rocMLIR: $(ROCMLIR_PIPELINE_ID)
ROCmValidationSuite: $(ROCMVALIDATIONSUITE_PIPELINE_ID)
rocm_bandwidth_test: $(ROCM_BANDWIDTH_TEST_PIPELINE_ID)
rocm_smi_lib: $(ROCM_SMI_LIB_PIPELINE_ID)
rocPRIM: $(ROCPRIM_PIPELINE_ID)
rocprofiler-compute: $(ROCPROFILER_COMPUTE_PIPELINE_ID)
rocprofiler-register: $(ROCPROFILER_REGISTER_PIPELINE_ID)
rocprofiler-sdk: $(ROCPROFILER_SDK_PIPELINE_ID)
rocprofiler-systems: $(ROCPROFILER_SYSTEMS_PIPELINE_ID)
rocprofiler: $(ROCPROFILER_PIPELINE_ID)
rocPyDecode: $(ROCPYDECODE_PIPELINE_ID)
ROCR-Runtime: $(ROCR_RUNTIME_PIPELINE_ID)
rocRAND: $(ROCRAND_PIPELINE_ID)
rocr_debug_agent: $(ROCR_DEBUG_AGENT_PIPELINE_ID)
rocSOLVER: $(ROCSOLVER_PIPELINE_ID)
rocSPARSE: $(ROCSPARSE_PIPELINE_ID)
ROCT-Thunk-Interface: $(ROCT_THUNK_INTERFACE_PIPELINE_ID)
rocThrust: $(ROCTHRUST_PIPELINE_ID)
roctracer: $(ROCTRACER_PIPELINE_ID)
rocWMMA: $(ROCWMMA_PIPELINE_ID)
rpp: $(RPP_PIPELINE_ID)
- name: taggedPipelineIdentifiers
type: object
default:
AMDMIGraphX: $(AMDMIGRAPHX_TAGGED_PIPELINE_ID)
amdsmi: $(AMDSMI_TAGGED_PIPELINE_ID)
aomp-extras: $(AOMP_EXTRAS_TAGGED_PIPELINE_ID)
aomp: $(AOMP_TAGGED_PIPELINE_ID)
clr: $(CLR_TAGGED_PIPELINE_ID)
composable_kernel: $(COMPOSABLE_KERNEL_TAGGED_PIPELINE_ID)
half: $(HALF_TAGGED_PIPELINE_ID)
HIP: $(HIP_TAGGED_PIPELINE_ID)
hip-tests: $(HIP_TESTS_TAGGED_PIPELINE_ID)
hipBLAS: $(HIPBLAS_TAGGED_PIPELINE_ID)
hipBLAS-common: $(HIPBLAS_COMMON_TAGGED_PIPELINE_ID)
hipBLASLt: $(HIPBLASLT_TAGGED_PIPELINE_ID)
hipCUB: $(HIPCUB_TAGGED_PIPELINE_ID)
hipFFT: $(HIPFFT_TAGGED_PIPELINE_ID)
hipfort: $(HIPFORT_TAGGED_PIPELINE_ID)
HIPIFY: $(HIPIFY_TAGGED_PIPELINE_ID)
hipRAND: $(HIPRAND_TAGGED_PIPELINE_ID)
hipSOLVER: $(HIPSOLVER_TAGGED_PIPELINE_ID)
hipSPARSE: $(HIPSPARSE_TAGGED_PIPELINE_ID)
hipSPARSELt: $(HIPSPARSELT_TAGGED_PIPELINE_ID)
hipTensor: $(HIPTENSOR_TAGGED_PIPELINE_ID)
llvm-project: $(LLVM_PROJECT_TAGGED_PIPELINE_ID)
MIOpen: $(MIOpen_TAGGED_PIPELINE_ID)
MIVisionX: $(MIVISIONX_TAGGED_PIPELINE_ID)
omniperf: $(OMNIPERF_TAGGED_PIPELINE_ID)
omnitrace: $(OMNITRACE_TAGGED_PIPELINE_ID)
rccl: $(RCCL_TAGGED_PIPELINE_ID)
rdc: $(RDC_TAGGED_PIPELINE_ID)
rocAL: $(ROCAL_TAGGED_PIPELINE_ID)
rocALUTION: $(ROCALUTION_TAGGED_PIPELINE_ID)
rocBLAS: $(ROCBLAS_TAGGED_PIPELINE_ID)
ROCdbgapi: $(ROCDBGAPI_TAGGED_PIPELINE_ID)
rocDecode: $(ROCDECODE_TAGGED_PIPELINE_ID)
rocFFT: $(ROCFFT_TAGGED_PIPELINE_ID)
ROCgdb: $(ROCGDB_TAGGED_PIPELINE_ID)
rocJPEG: $(ROCJPEG_TAGGED_PIPELINE_ID)
rocm-cmake: $(ROCM_CMAKE_TAGGED_PIPELINE_ID)
rocm-core: $(ROCM_CORE_TAGGED_PIPELINE_ID)
rocm-examples: $(ROCM_EXAMPLES_TAGGED_PIPELINE_ID)
rocminfo: $(ROCMINFO_TAGGED_PIPELINE_ID)
rocMLIR: $(ROCMLIR_TAGGED_PIPELINE_ID)
ROCmValidationSuite: $(ROCMVALIDATIONSUITE_TAGGED_PIPELINE_ID)
rocm_bandwidth_test: $(ROCM_BANDWIDTH_TEST_TAGGED_PIPELINE_ID)
rocm_smi_lib: $(ROCM_SMI_LIB_TAGGED_PIPELINE_ID)
rocPRIM: $(ROCPRIM_TAGGED_PIPELINE_ID)
rocprofiler-compute: $(ROCPROFILER_COMPUTE_TAGGED_PIPELINE_ID)
rocprofiler-register: $(ROCPROFILER_REGISTER_TAGGED_PIPELINE_ID)
rocprofiler-sdk: $(ROCPROFILER_SDK_TAGGED_PIPELINE_ID)
rocprofiler-systems: $(ROCPROFILER_SYSTEMS_PIPELINE_ID)
rocprofiler: $(ROCPROFILER_TAGGED_PIPELINE_ID)
rocPyDecode: $(ROCPYDECODE_TAGGED_PIPELINE_ID)
ROCR-Runtime: $(ROCR_RUNTIME_TAGGED_PIPELINE_ID)
rocRAND: $(ROCRAND_TAGGED_PIPELINE_ID)
rocr_debug_agent: $(ROCR_DEBUG_AGENT_TAGGED_PIPELINE_ID)
rocSOLVER: $(ROCSOLVER_TAGGED_PIPELINE_ID)
rocSPARSE: $(ROCSPARSE_TAGGED_PIPELINE_ID)
ROCT-Thunk-Interface: $(ROCT_THUNK_INTERFACE_TAGGED_PIPELINE_ID)
rocThrust: $(ROCTHRUST_TAGGED_PIPELINE_ID)
roctracer: $(ROCTRACER_TAGGED_PIPELINE_ID)
rocWMMA: $(ROCWMMA_TAGGED_PIPELINE_ID)
rpp: $(RPP_TAGGED_PIPELINE_ID)
# set to true if you're calling this template file multiple files in same pipeline
# only leave last call false to optimize sequence
- name: skipLibraryLinking
@@ -26,320 +165,52 @@ parameters:
- name: skipLlvmSymlink
type: boolean
default: false
# set to true if dlopen calls for HIP libraries are causing failures
# because they do not follow shared library symlink convention
- name: setupHIPLibrarySymlinks
type: boolean
default: false
- name: componentVarList
# some ROCm components can specify GPU target and this will affect downloads
- name: gpuTarget
type: string
default: ''
# array of ROCm components that can specify GPU target or have dependency of such a component
# these components would have parallel build jobs per GPU target so they produce multiple artifacts
# only download artifact tied to the GPU target
- name: componentsWithGPUTarget
type: object
default:
AMDMIGraphX:
pipelineId: $(AMDMIGRAPHX_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
amdsmi:
pipelineId: $(AMDSMI_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
aomp-extras:
pipelineId: $(AOMP_EXTRAS_PIPELINE_ID)
stagingBranch: aomp-dev
mainlineBranch: aomp-dev
hasGpuTarget: false
aomp:
pipelineId: $(AOMP_PIPELINE_ID)
stagingBranch: aomp-dev
mainlineBranch: amd-mainline-open
hasGpuTarget: false
clr:
pipelineId: $(CLR_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
composable_kernel:
pipelineId: $(COMPOSABLE_KERNEL_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
half:
pipelineId: $(HALF_PIPELINE_ID)
stagingBranch: rocm
mainlineBranch: rocm
hasGpuTarget: false
HIP:
pipelineId: $(HIP_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
hip-tests:
pipelineId: $(HIP_TESTS_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
hipBLAS:
pipelineId: $(HIPBLAS_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
hipBLASLt:
pipelineId: $(HIPBLASLT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
hipBLAS-common:
pipelineId: $(HIPBLAS_COMMON_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: false
hipCUB:
pipelineId: $(HIPCUB_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
hipFFT:
pipelineId: $(HIPFFT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
hipfort:
pipelineId: $(HIPFORT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: false
HIPIFY:
pipelineId: $(HIPIFY_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
hipRAND:
pipelineId: $(HIPRAND_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
hipSOLVER:
pipelineId: $(HIPSOLVER_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
hipSPARSE:
pipelineId: $(HIPSPARSE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
hipSPARSELt:
pipelineId: $(HIPSPARSELT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
hipTensor:
pipelineId: $(HIPTENSOR_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
llvm-project:
pipelineId: $(LLVM_PROJECT_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline-open
hasGpuTarget: false
MIOpen:
pipelineId: $(MIOpen_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
MIVisionX:
pipelineId: $(MIVISIONX_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
omnitrace: # deprecated
pipelineId: $(OMNITRACE_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: true
rccl:
pipelineId: $(RCCL_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
rdc:
pipelineId: $(RDC_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
rocAL:
pipelineId: $(ROCAL_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: master
hasGpuTarget: true
rocALUTION:
pipelineId: $(ROCALUTION_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
rocBLAS:
pipelineId: $(ROCBLAS_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
ROCdbgapi:
pipelineId: $(ROCDBGAPI_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
rocDecode:
pipelineId: $(ROCDECODE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: false
rocFFT:
pipelineId: $(ROCFFT_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
ROCgdb:
pipelineId: $(ROCGDB_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline-rocgdb-15
hasGpuTarget: false
rocJPEG:
pipelineId: $(ROCJPEG_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: false
rocm-cmake:
pipelineId: $(ROCM_CMAKE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: false
rocm-core:
pipelineId: $(ROCM_CORE_PIPELINE_ID)
stagingBranch: master
mainlineBranch: amd-master
hasGpuTarget: false
rocm-examples:
pipelineId: $(ROCM_EXAMPLES_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: develop
hasGpuTarget: true
rocminfo:
pipelineId: $(ROCMINFO_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-master
hasGpuTarget: false
rocMLIR:
pipelineId: $(ROCMLIR_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: false
ROCmValidationSuite:
pipelineId: $(ROCMVALIDATIONSUITE_PIPELINE_ID)
stagingBranch: master
mainlineBranch: mainline
hasGpuTarget: true
rocm_bandwidth_test:
pipelineId: $(ROCM_BANDWIDTH_TEST_PIPELINE_ID)
stagingBranch: master
mainlineBranch: master
hasGpuTarget: false
rocm_smi_lib:
pipelineId: $(ROCM_SMI_LIB_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
rocPRIM:
pipelineId: $(ROCPRIM_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
rocprofiler:
pipelineId: $(ROCPROFILER_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-master
hasGpuTarget: true
rocprofiler-compute:
pipelineId: $(ROCPROFILER_COMPUTE_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: true
rocprofiler-register:
pipelineId: $(ROCPROFILER_REGISTER_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
rocprofiler-sdk:
pipelineId: $(ROCPROFILER_SDK_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: true
rocprofiler-systems:
pipelineId: $(ROCPROFILER_SYSTEMS_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: true
rocPyDecode:
pipelineId: $(ROCPYDECODE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
ROCR-Runtime:
pipelineId: $(ROCR_RUNTIME_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-master
hasGpuTarget: false
rocRAND:
pipelineId: $(ROCRAND_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
rocr_debug_agent:
pipelineId: $(ROCR_DEBUG_AGENT_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-mainline
hasGpuTarget: false
rocSOLVER:
pipelineId: $(ROCSOLVER_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
rocSPARSE:
pipelineId: $(ROCSPARSE_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
ROCT-Thunk-Interface: # deprecated
pipelineId: $(ROCT_THUNK_INTERFACE_PIPELINE_ID)
stagingBranch: master
mainlineBranch: master
hasGpuTarget: false
rocThrust:
pipelineId: $(ROCTHRUST_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
roctracer:
pipelineId: $(ROCTRACER_PIPELINE_ID)
stagingBranch: amd-staging
mainlineBranch: amd-master
hasGpuTarget: true
rocWMMA:
pipelineId: $(ROCWMMA_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
rpp:
pipelineId: $(RPP_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
TransferBench:
pipelineId: $(TRANSFERBENCH_PIPELINE_ID)
stagingBranch: develop
mainlineBranch: mainline
hasGpuTarget: true
- AMDMIGraphX
- composable_kernel
- hipBLASLt
- hipCUB
- hipFFT
- hipRAND
- hipSPARSELt
- hipTensor
- omnitrace
- rccl
- rocALUTION
- rocBLAS
- rocFFT
- rocm-examples
- rocPRIM
- rocprofiler-compute
- rocprofiler-sdk
- rocprofiler-systems
- rocprofiler
- rocPyDecode
- rocRAND
- rocSOLVER
- rocSPARSE
- rocThrust
- roctracer
- rocWMMA
- rpp
# list below do not have flag for gpu target but have dependencies of components that do, so build separately per gpu target
- hipBLAS
- hipSOLVER
- hipSPARSE
- MIOpen
- MIVision
- omniperf
- rocAL
- ROCmValidationSuite
steps:
# assuming artifact-download.yml template file in same directory
@@ -353,59 +224,47 @@ steps:
- template: artifact-download.yml
parameters:
componentName: ${{ split(dependency, ':')[0] }}
pipelineId: ${{ parameters.componentVarList[split(dependency, ':')[0]].pipelineId }}
${{ if parameters.componentVarList[split(dependency, ':')[0]].hasGpuTarget }}:
extractToMnt: ${{ parameters.extractToMnt }}
${{ if eq(parameters.dependencySource, 'staging') }}:
pipelineId: ${{ parameters.stagingPipelineIdentifiers[ split(dependency, ':')[0] ] }}
latestFromBranch: ${{ parameters.latestFromBranch }}
${{ elseif eq(parameters.dependencySource, 'mainline') }}:
pipelineId: ${{ parameters.stagingPipelineIdentifiers[ split(dependency, ':')[0] ] }}
useMainlineBranch: true
latestFromBranch: ${{ parameters.latestFromBranch }}
${{ elseif eq(parameters.dependencySource, 'tag-builds') }}:
pipelineId: ${{ parameters.taggedPipelineIdentifiers[ split(dependency, ':')[0] ] }}
latestFromBranch: false
${{ if containsValue( parameters.componentsWithGPUTarget, split(dependency, ':')[0] ) }}:
fileFilter: "${{ split(dependency, ':')[1] }}*${{ parameters.gpuTarget }}"
# dependencySource = staging
${{ if eq(parameters.dependencySource, 'staging')}}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].stagingBranch }}
# dependencySource = mainline
${{ elseif eq(parameters.dependencySource, 'mainline')}}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].mainlineBranch }}
# checkoutRef = staging
${{ elseif eq(parameters.checkoutRef, parameters.componentVarList[variables['Build.DefinitionName']].stagingBranch) }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].stagingBranch }}
# checkoutRef = mainline
${{ elseif eq(parameters.checkoutRef, parameters.componentVarList[variables['Build.DefinitionName']].mainlineBranch) }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].mainlineBranch }}
# SourceBranchName = staging
${{ elseif eq(variables['Build.SourceBranchName'], parameters.componentVarlist[variables['Build.DefinitionName']].stagingBranch) }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].stagingBranch }}
# SourceBranchName = mainline
${{ elseif eq(variables['Build.SourceBranchName'], parameters.componentVarlist[variables['Build.DefinitionName']].mainlineBranch) }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].mainlineBranch }}
# default = staging
${{ else }}:
branchName: ${{ parameters.componentVarList[split(dependency, ':')[0]].stagingBranch }}
fileFilter: ${{ split(dependency, ':')[1] }}
# no colon (:) found in this item in the list
- ${{ else }}:
- template: artifact-download.yml
parameters:
componentName: ${{ dependency }}
pipelineId: ${{ parameters.componentVarList[dependency].pipelineId }}
${{ if parameters.componentVarList[dependency].hasGpuTarget }}:
extractToMnt: ${{ parameters.extractToMnt }}
${{ if eq(parameters.dependencySource, 'staging') }}:
pipelineId: ${{ parameters.stagingPipelineIdentifiers[dependency] }}
latestFromBranch: ${{ parameters.latestFromBranch }}
${{ elseif eq(parameters.dependencySource, 'mainline') }}:
pipelineId: ${{ parameters.stagingPipelineIdentifiers[dependency] }}
useMainlineBranch: true
latestFromBranch: ${{ parameters.latestFromBranch }}
${{ elseif eq(parameters.dependencySource, 'tag-builds') }}:
pipelineId: ${{ parameters.taggedPipelineIdentifiers[dependency] }}
latestFromBranch: false
${{ if containsValue( parameters.componentsWithGPUTarget, dependency ) }}:
fileFilter: ${{ parameters.gpuTarget }}
# dependencySource = staging
${{ if eq(parameters.dependencySource, 'staging')}}:
branchName: ${{ parameters.componentVarList[dependency].stagingBranch }}
# dependencySource = mainline
${{ elseif eq(parameters.dependencySource, 'mainline')}}:
branchName: ${{ parameters.componentVarList[dependency].mainlineBranch }}
# checkoutRef = staging
${{ elseif eq(parameters.checkoutRef, parameters.componentVarList[variables['Build.DefinitionName']].stagingBranch) }}:
branchName: ${{ parameters.componentVarList[dependency].stagingBranch }}
# checkoutRef = mainline
${{ elseif eq(parameters.checkoutRef, parameters.componentVarList[variables['Build.DefinitionName']].mainlineBranch) }}:
branchName: ${{ parameters.componentVarList[dependency].mainlineBranch }}
# SourceBranchName = staging
${{ elseif eq(variables['Build.SourceBranchName'], parameters.componentVarlist[variables['Build.DefinitionName']].stagingBranch) }}:
branchName: ${{ parameters.componentVarList[dependency].stagingBranch }}
# SourceBranchName = mainline
${{ elseif eq(variables['Build.SourceBranchName'], parameters.componentVarlist[variables['Build.DefinitionName']].mainlineBranch) }}:
branchName: ${{ parameters.componentVarList[dependency].mainlineBranch }}
# default = staging
${{ else }}:
branchName: ${{ parameters.componentVarList[dependency].stagingBranch }}
# fixed case only accepts one component at a time, so no array input
- ${{ if eq(parameters.dependencySource, 'fixed') }}:
- template: artifact-download.yml
parameters:
componentName: ${{ parameters.fixedComponentName }}
pipelineId: ${{ parameters.fixedPipelineIdentifier }}
latestFromBranch: false
extractToMnt: ${{ parameters.extractToMnt }}
# Set link to redirect llvm folder
- ${{ if eq(parameters.skipLlvmSymlink, false) }}:
- task: Bash@3
@@ -421,51 +280,31 @@ steps:
for file in amdclang amdclang++ amdclang-cl amdclang-cpp amdflang amdlld aompcc mygpu mycpu offload-arch; do
sudo ln -s $(Agent.BuildDirectory)/rocm/llvm/bin/$file $(Agent.BuildDirectory)/rocm/bin/$file
done
# dlopen calls within a ctest or pytest sequence runs into issues when shared library symlink convention is not followed
# the convention is as follows:
# unversioned .so is a symlink to major version .so
# major version .so is a symlink to detailed version .so
# HIP libraries do not follow this convention, and each .so is a copy of each other
# changing the library structure to follow the symlink convention resolves some test failures
- ${{ if eq(parameters.setupHIPLibrarySymlinks, true) }}:
- task: Bash@3
displayName: Setup symlinks for hip libraries
inputs:
targetType: inline
workingDirectory: $(Agent.BuildDirectory)/rocm/lib
script: |
LIBRARIES=("libamdhip64" "libhiprtc-builtins" "libhiprtc")
for LIB_NAME in "${LIBRARIES[@]}"; do
VERSIONED_SO=$(ls ${LIB_NAME}.so.* 2>/dev/null | grep -E "${LIB_NAME}\.so\.[0-9]+\.[0-9]+\.[0-9]+(-.*)?" | sort -V | tail -n 1)
if [[ -z "$VERSIONED_SO" ]]; then
continue
fi
MAJOR_VERSION=$(echo "$VERSIONED_SO" | grep -oP "${LIB_NAME}\.so\.\K[0-9]+")
if [[ -e "${LIB_NAME}.so.${MAJOR_VERSION}" && ! -L "${LIB_NAME}.so.${MAJOR_VERSION}" ]]; then
rm -f "${LIB_NAME}.so.${MAJOR_VERSION}"
fi
if [[ -e "${LIB_NAME}.so" && ! -L "${LIB_NAME}.so" ]]; then
rm -f "${LIB_NAME}.so"
fi
ln -sf "$VERSIONED_SO" "${LIB_NAME}.so.${MAJOR_VERSION}"
ln -sf "${LIB_NAME}.so.${MAJOR_VERSION}" "${LIB_NAME}.so"
echo "Symlinks created for $LIB_NAME:"
ls -l ${LIB_NAME}.so*
done
- task: Bash@3
displayName: 'List downloaded ROCm files'
inputs:
targetType: inline
script: ls -1R $(Agent.BuildDirectory)/rocm
${{ if eq(parameters.extractToMnt, true) }}:
script: ls -1R /mnt/rocm
${{ else }}:
script: ls -1R $(Agent.BuildDirectory)/rocm
- ${{ if eq(parameters.skipLibraryLinking, false) }}:
- task: Bash@3
displayName: 'Link ROCm shared libraries'
inputs:
targetType: inline
# OS ignores if the ROCm lib folder shows up more than once
script: |
echo $(Agent.BuildDirectory)/rocm/lib | sudo tee /etc/ld.so.conf.d/rocm-ci.conf
echo $(Agent.BuildDirectory)/rocm/llvm/lib | sudo tee -a /etc/ld.so.conf.d/rocm-ci.conf
sudo cat /etc/ld.so.conf.d/rocm-ci.conf
sudo ldconfig -v
ldconfig -p
${{ if eq(parameters.extractToMnt, true) }}:
script: |
echo /mnt/rocm/lib | sudo tee /etc/ld.so.conf.d/rocm-ci.conf
echo /mnt/rocm/llvm/lib | sudo tee -a /etc/ld.so.conf.d/rocm-ci.conf
sudo cat /etc/ld.so.conf.d/rocm-ci.conf
sudo ldconfig -v
ldconfig -p
${{ else }}:
script: |
echo $(Agent.BuildDirectory)/rocm/lib | sudo tee /etc/ld.so.conf.d/rocm-ci.conf
echo $(Agent.BuildDirectory)/rocm/llvm/lib | sudo tee -a /etc/ld.so.conf.d/rocm-ci.conf
sudo cat /etc/ld.so.conf.d/rocm-ci.conf
sudo ldconfig -v
ldconfig -p

View File

@@ -1,348 +0,0 @@
parameters:
# this base set of packages should not be changed by calling script
# should be common across all pipelines
- name: baseAptPackages
type: object
default:
- build-essential
- ca-certificates
- curl
- file
- git
- gcc
- g++
- gpg
- kmod
- libdrm-dev
- libelf-dev
- libgtest-dev
- libhsakmt-dev
- libhwloc-dev
- libnuma-dev
- libstdc++-12-dev
- libtbb-dev
- lsb-release
- lsof
- ninja-build
- pkg-config
- python3-dev
- python3-pip
- wget
- zip
# optional array of additional apt packages to install
- name: aptPackages
type: object
default: []
# optional array of python modules to install
- name: pipModules
type: object
default: []
# optional array of workspace directories to install
# sources, binaries, and rocm directories are copied by default
- name: extraCopyDirectories
type: object
default: []
# optional string to specify gpuTarget for the docker image string
- name: gpuTarget
type: string
default: ''
# test environment involves gpu-related steps
# some jobs combine both build and test
# some jobs differentiate based on gpu vendor
- name: environment
type: string
default: build
values:
- build
- test
- combined
- amd
- nvidia
# optional boolean prerequisites before install extra apt packages
- name: registerROCmPackages
type: boolean
default: false
- name: registerCUDAPackages
type: boolean
default: false
- name: registerJPEGPackages
type: boolean
default: false
# optional boolean for special setup steps to accomodate some components
- name: installLatestCMake
type: boolean
default: false
- name: installAOCL
type: boolean
default: false
- name: aoclRepositoryUrl
type: string
default: https://download.amd.com/developer/eula/aocl/aocl-4-2
- name: aoclPackageName
type: string
default: aocl-linux-gcc-4.2.0_1_amd64.deb
- name: optSymLink
type: boolean
default: false
- name: pythonEnvVars
type: boolean
default: false
# optional string to add to PATH
- name: extraPaths
type: string
default: ''
# optional array of environment variables to set
# each array element expected to be in format of
# key:value
- name: extraEnvVars
type: object
default: []
# force the docker to be created, regardless of failure condition
- name: forceDockerCreation
type: boolean
default: false
steps:
# these steps should only be run if there was a failure or warning
# dynamically write to a Dockerfile
# first is to do base setup of users, groups
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Create start of Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "FROM ubuntu:22.04" > Dockerfile
echo "ARG USERNAME=user" >> Dockerfile
echo "ARG USER_UID=1000" >> Dockerfile
echo "ARG USER_GID=\$USER_UID" >> Dockerfile
echo "RUN groupadd --gid \$USER_GID \$USERNAME" >> Dockerfile
echo "RUN useradd --uid \$USER_UID --gid \$USER_GID -m \$USERNAME" >> Dockerfile
echo "RUN DEBIAN_FRONTEND=noninteractive apt-get --yes update" >> Dockerfile
echo "RUN DEBIAN_FRONTEND=noninteractive apt-get --yes install sudo" >> Dockerfile
echo "RUN echo \$USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/\$USERNAME" >> Dockerfile
echo "RUN chmod 0440 /etc/sudoers.d/\$USERNAME" >> Dockerfile
# for test jobs, setup GPU-related users and group
- ${{ if eq(parameters.environment, 'test') }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: GPU setup of Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "RUN groupadd render" >> Dockerfile
echo "RUN usermod -aG render,video \$USERNAME" >> Dockerfile
# now install a common set of packages through apt
- ${{ if gt(length(parameters.baseAptPackages), 0) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Base Apt Packages to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: echo "RUN DEBIAN_FRONTEND=noninteractive apt-get --yes install ${{ join(' ', parameters.baseAptPackages) }}" >> Dockerfile
# iterate through possible apt repos that might need to be added to the docker container
- ${{ if eq(parameters.registerROCmPackages, true) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Register ROCm packages to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "RUN mkdir --parents --mode=0755 /etc/apt/keyrings" >> Dockerfile
echo "RUN wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | gpg --dearmor | tee /etc/apt/keyrings/rocm.gpg > /dev/null" >> Dockerfile
echo "RUN echo \"deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/latest/ubuntu jammy main\" | tee /etc/apt/sources.list.d/amdgpu.list" >> Dockerfile
echo "RUN echo \"deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/latest jammy main\" | tee --append /etc/apt/sources.list.d/rocm.list" >> Dockerfile
echo "RUN printf 'Package: *\\nPin: release o=repo.radeon.com\\nPin-Priority: 600' > /etc/apt/preferences.d/rocm-pin-600" >> Dockerfile
echo "RUN DEBIAN_FRONTEND=noninteractive apt-get --yes update" >> Dockerfile
- ${{ if eq(parameters.registerCUDAPackages, true) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Register CUDA packages to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb" >> Dockerfile
echo "RUN dpkg -i cuda-keyring_1.1-1_all.deb" >> Dockerfile
echo "RUN rm -f cuda-keyring_1.1-1_all.deb" >> Dockerfile
echo 'RUN DEBIAN_FRONTEND=noninteractive apt-get --yes update' >> Dockerfile
- ${{ if eq(parameters.registerJPEGPackages, true) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Register libjpeg-turbo packages to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "RUN mkdir --parents --mode=0755 /etc/apt/keyrings" >> Dockerfile
echo "RUN wget https://packagecloud.io/dcommander/libjpeg-turbo/gpgkey -O - | gpg --dearmor | tee /etc/apt/trusted.gpg.d/libjpeg-turbo.gpg > /dev/null" >> Dockerfile
echo "RUN echo \"deb [signed-by=/etc/apt/trusted.gpg.d/libjpeg-turbo.gpg] https://packagecloud.io/dcommander/libjpeg-turbo/any/ any main\" | sudo tee /etc/apt/sources.list.d/libjpeg-turbo.list" >> Dockerfile
echo "RUN DEBIAN_FRONTEND=noninteractive apt-get --yes update" >> Dockerfile
# install AOCL to docker container, if needed
- ${{ if eq(parameters.installAOCL, true) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: aocl install to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "RUN wget -nv ${{ parameters.aoclRepositoryUrl }}/${{ parameters.aoclPackageName }}" >> Dockerfile
echo "RUN DEBIAN_FRONTEND=noninteractive apt-get --yes install ./${{ parameters.aoclPackageName }}" >> Dockerfile
echo "RUN rm -f ${{ parameters.aoclPackageName }}" >> Dockerfile
# since apt repo list is updated, install the extra apt packages
- ${{ if gt(length(parameters.aptPackages), 0) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Extra Apt Packages to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: echo "RUN DEBIAN_FRONTEND=noninteractive apt-get --yes install ${{ join(' ', parameters.aptPackages) }}" >> Dockerfile
# install latest cmake to docker container, if needed
- ${{ if eq(parameters.installLatestCMake, true) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: latest cmake install to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "RUN DEBIAN_FRONTEND=noninteractive apt-get --yes purge cmake" >> Dockerfile
echo "RUN pip install cmake --upgrade" >> Dockerfile
# setup workspace where binaries, sources, and dependencies from the job will be copied to
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Workspace setup of Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "USER \$USERNAME" >> Dockerfile
echo "WORKDIR /home/user" >> Dockerfile
echo "RUN mkdir -p /home/user/workspace" >> Dockerfile
# pip install is done here as non-root
- ${{ if gt(length(parameters.pipModules), 0) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Extra Python Modules to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: echo "RUN pip install -v ${{ join(' ', parameters.pipModules) }}" >> Dockerfile
# copy common directories
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Copy base directories to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
if [ -d "$(Agent.BuildDirectory)/rocm" ]; then
echo "COPY rocm /home/user/workspace/rocm" >> Dockerfile
fi
if [ -d "$(Build.SourcesDirectory)" ] && [ "$(Build.SourcesDirectory)" != "" ]; then
echo "COPY s /home/user/workspace/src" >> Dockerfile
fi
if [ -d "$(Build.BinariesDirectory)" ] && [ "$(Build.BinariesDirectory)" != "" ]; then
echo "COPY b /home/user/workspace/bin" >> Dockerfile
fi
# copy extra directories, if applicable to the job
- ${{ each extraCopyDirectory in parameters.extraCopyDirectories }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Copy ${{ extraCopyDirectory }} to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
if [ -d "${{ extraCopyDirectory }}" ]; then
echo "COPY ${{ extraCopyDirectory }} /home/user/workspace/${{ extraCopyDirectory }}" >> Dockerfile
fi
# setup ldconfig
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: ldconfig to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "USER root" >> Dockerfile
echo "RUN echo /home/user/workspace/rocm/lib | tee /etc/ld.so.conf.d/rocm-ci.conf" >> Dockerfile
echo "RUN echo /home/user/workspace/rocm/llvm/lib | tee -a /etc/ld.so.conf.d/rocm-ci.conf" >> Dockerfile
echo "RUN cat /etc/ld.so.conf.d/rocm-ci.conf" >> Dockerfile
echo "RUN ldconfig -v" >> Dockerfile
# create /opt/rocm symbolic link, if needed
- ${{ if eq(parameters.optSymLink, true) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: /opt/rocm symbolic link to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "USER root" >> Dockerfile
echo "RUN ln -s /home/user/workspace/rocm /opt/rocm" >> Dockerfile
# set environment variables needed for some python-based components
- ${{ if eq(parameters.pythonEnvVars, true) }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: python environment variables
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "USER root" >> Dockerfile
echo "ENV PYTHON_USER_SITE=$(python3 -m site --user-site)" >> Dockerfile
echo "ENV PYTHON_DIST_PACKAGES=$(python3 -c 'import sysconfig; print(sysconfig.get_paths()[\"purelib\"])')" >> Dockerfile
echo "ENV PYBIND11_PATH=$(python3 -c 'import pybind11; print(pybind11.get_cmake_dir())')" >> Dockerfile
# add to PATH environment variable
- ${{ if ne(parameters.extraPaths, '') }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Add to PATH in Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: echo "ENV PATH='${{ parameters.extraPaths }}:\$PATH'" >> Dockerfile
# set extra environment variables, if applicable to the job
# use ::: as delimiter to allow for colons to be in the environment variable values
- ${{ each extraEnvVar in parameters.extraEnvVars }}:
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Set ${{ extraEnvVar }} to Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: echo "ENV ${{ split(extraEnvVar, ':::')[0] }}='${{ split(extraEnvVar, ':::')[1] }}'" >> Dockerfile
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: Print Dockerfile
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: cat Dockerfile
- task: Docker@2
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
inputs:
containerRegistry: 'ContainerService'
${{ if ne(parameters.gpuTarget, '') }}:
repository: '$(Build.DefinitionName)-${{ parameters.environment }}-${{ parameters.gpuTarget }}'
${{ else }}:
repository: '$(Build.DefinitionName)-${{ parameters.environment }}'
Dockerfile: '$(Pipeline.Workspace)/Dockerfile'
buildContext: '$(Pipeline.Workspace)'
- task: Bash@3
condition: or(failed(), ${{ eq(parameters.forceDockerCreation, true) }})
displayName: "!! Docker Image URL !!"
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
${{ if ne(parameters.gpuTarget, '') }}:
script: echo "rocmexternalcicd.azurecr.io/$(Build.DefinitionName)-${{ parameters.environment }}-${{ parameters.gpuTarget }}:$(Build.BuildId)"
${{ else }}:
script: echo "rocmexternalcicd.azurecr.io/$(Build.DefinitionName)-${{ parameters.environment }}:$(Build.BuildId)"

View File

@@ -138,13 +138,3 @@ steps:
inputs:
tabName: Manifest
reportDir: $(Build.ArtifactStagingDirectory)/manifest_$(Build.DefinitionName)_$(Build.SourceBranchName)_$(Build.BuildId)_$(Build.BuildNumber)_ubuntu2204_${{ parameters.artifactName }}_${{ parameters.gpuTarget }}.html
- task: Bash@3
displayName: Save manifest artifact file name
condition: always()
continueOnError: true
inputs:
workingDirectory: $(Pipeline.Workspace)
targetType: inline
script: |
echo "manifest_$(Build.DefinitionName)_$(Build.SourceBranchName)_$(Build.BuildId)_$(Build.BuildNumber)_ubuntu2204_${{ parameters.artifactName }}_${{ parameters.gpuTarget }}.html" >> pipelineArtifacts.txt
echo "manifest_$(Build.DefinitionName)_$(Build.SourceBranchName)_$(Build.BuildId)_$(Build.BuildNumber)_ubuntu2204_${{ parameters.artifactName }}_${{ parameters.gpuTarget }}.json" >> pipelineArtifacts.txt

View File

@@ -36,7 +36,7 @@ steps:
if [ -z "$CK_BUILD_ID" ]; then
echo "Did not find specific CK build ID"
LATEST_BUILD_URL="$AZ_API/build/builds?definitions=$(COMPOSABLE_KERNEL_PIPELINE_ID)&statusFilter=completed&resultFilter=succeeded&\$top=1&api-version=7.1"
LATEST_BUILD_URL="$AZ_API/build/builds?definitions=$(COMPOSABLE_KERNEL_PIPELINE_ID)&status=completed&result=succeeded&\$top=1&api-version=7.1"
CK_BUILD_ID=$(curl -s $LATEST_BUILD_URL | jq '.value[0].id')
echo "Found latest CK build ID: $CK_BUILD_ID"
EXIT_CODE=1

View File

@@ -35,7 +35,6 @@ parameters:
- MIVisionX
- rocm-cmake
- rocm_smi_lib
- rocprofiler-sdk
- roctracer
steps:

View File

@@ -74,9 +74,9 @@ variables:
- name: HALF_TAGGED_PIPELINE_ID
value: 11
- name: HALF560_PIPELINE_ID
value: 68
- name: HALF560_BUILD_ID
value: 621
value: 66
- name: HALF560_TAGGED_PIPELINE_ID
value: 66
- name: HIP_PIPELINE_ID
value: 93
- name: HIP_TAGGED_PIPELINE_ID
@@ -343,9 +343,5 @@ variables:
value: 78
- name: RPP_TAGGED_PIPELINE_ID
value: 39
- name: TRANSFERBENCH_PIPELINE_ID
value: 265
- name: TRANSFERBENCH_TAGGED_PIPELINE_ID
value: 266
- name: BOOST_DEPENDENCY_PIPELINE_ID
value: 250

View File

@@ -2,13 +2,13 @@ name: Linting
on:
push:
branches:
branches:
- develop
- main
- 'docs/*'
- 'roc**'
pull_request:
branches:
branches:
- develop
- main
- 'docs/*'

View File

@@ -1,10 +0,0 @@
matrix:
- name: Markdown
sources:
- ['tools/autotag/templates/**/*.md', '!tools/autotag/templates/**/5*.md', '!tools/autotag/templates/**/6.0*.md', '!tools/autotag/templates/**/6.1*.md']
- name: reST
sources:
- []
- name: Cpp
sources:
- []

View File

@@ -26,7 +26,6 @@ ASm
ATI
AddressSanitizer
AlexNet
Andrej
Arb
Autocast
BARs
@@ -74,7 +73,6 @@ Conda
ConnectX
CuPy
Dashboarding
DBRX
DDR
DF
DGEMM
@@ -92,8 +90,6 @@ Dask
DataFrame
DataLoader
DataParallel
Debian
DeepSeek
DeepSpeed
Dependabot
Deprecations
@@ -110,14 +106,12 @@ FFT
FFTs
FFmpeg
FHS
FIXME
FMA
FP
FX
Filesystem
FindDb
Flang
FluxBenchmark
Fortran
Fuyu
GALB
@@ -132,12 +126,10 @@ GDS
GEMM
GEMMs
GFortran
Gemma
GiB
GIM
GL
GLXT
Gloo
GMI
GPG
GPR
@@ -155,10 +147,7 @@ HCA
HGX
HIPCC
HIPExtension
HIPification
HIPIFY
HIPification
HIPify
HPC
HPCG
HPE
@@ -194,17 +183,15 @@ Interop
Intersphinx
Intra
Ioffe
JAX's
Jinja
JSON
Jupyter
KFD
KFDTest
KiB
KMD
KV
KVM
Karpathy's
KiB
Keras
Khronos
LAPACK
@@ -225,7 +212,6 @@ MiB
MIGraphX
MIOpen
MIOpenGEMM
MIOpen's
MIVisionX
MLM
MMA
@@ -255,8 +241,6 @@ MyEnvironment
MyST
NBIO
NBIOs
NCCL
NCF
NIC
NICs
NLI
@@ -298,12 +282,10 @@ OpenVX
OpenXLA
Oversubscription
PagedAttention
Pallas
PCC
PCI
PCIe
PEFT
PEQT
PIL
PILImage
POR
@@ -318,7 +300,6 @@ PipelineParallel
PnP
PowerEdge
PowerShell
Pretraining
Profiler's
PyPi
Pytest
@@ -334,7 +315,6 @@ RDMA
RDNA
README
RHEL
RMW
RNN
RNNs
ROC
@@ -351,7 +331,6 @@ ROCmSoftwarePlatform
ROCmValidationSuite
ROCprofiler
ROCr
RPP
RST
RW
Radeon
@@ -359,7 +338,6 @@ RelWithDebInfo
Req
Rickle
RoCE
Runfile
Ryzen
SALU
SBIOS
@@ -372,7 +350,6 @@ SENDMSG
SGPR
SGPRs
SHA
SHARK's
SIGQUIT
SIMD
SIMDs
@@ -417,14 +394,9 @@ TensorFlow
TensorParallel
ToC
TorchAudio
torchaudio
TorchElastic
TorchMIGraphX
torchrec
TorchScript
TorchServe
torchserve
torchtext
TorchVision
TransferBench
TrapStatus
@@ -443,7 +415,6 @@ Unittests
Unhandled
VALU
VBIOS
VCN
VGPR
VGPRs
VM
@@ -532,9 +503,6 @@ copyable
cpp
csn
cuBLAS
cuda
cuDNN
cudnn
cuFFT
cuLIB
cuRAND
@@ -551,7 +519,6 @@ dbgapi
de
deallocation
debuggability
debian
denoise
denoised
denoises
@@ -589,7 +556,6 @@ gRPC
galb
gcc
gdb
gemm
gfortran
gfx
githooks
@@ -605,7 +571,6 @@ hipBLASLt's
hipblaslt
hipCUB
hipFFT
hipFORT
hipLIB
hipRAND
hipSOLVER
@@ -649,13 +614,11 @@ jax
kdb
kfd
kv
lang
latencies
len
libfabric
libjpeg
libs
linalg
linearized
linter
linux
@@ -678,7 +641,6 @@ mutex
mvffr
namespace
namespaces
nanoGPT
num
numref
ocl
@@ -690,9 +652,7 @@ optimizers
os
oversubscription
pageable
pallas
parallelization
parallelizing
parameterization
passthrough
perfcounter
@@ -705,7 +665,6 @@ prebuilt
precompiled
preconditioner
preconfigured
preemptible
prefetch
prefetchable
prefill
@@ -717,19 +676,15 @@ preprocessing
preprocessor
prequantized
prerequisites
pretraining
profiler
profilers
protobuf
pseudorandom
py
recommender
recommenders
quantile
quantizer
quasirandom
queueing
radeon
rccl
rdc
rdma
@@ -751,7 +706,6 @@ rocALUTION
rocBLAS
rocDecode
rocFFT
rocHPCG
rocJPEG
rocLIB
rocMLIR
@@ -783,7 +737,6 @@ runtimes
sL
scalability
scalable
scipy
seealso
sendmsg
seqs
@@ -808,7 +761,6 @@ submodules
supercomputing
symlink
symlinks
sys
td
tensorfloat
th

View File

@@ -1,6 +1,6 @@
MIT License
Copyright (c) 2023 - 2025 Advanced Micro Devices, Inc. All rights reserved.
Copyright (c) 2023 - 2024 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

View File

@@ -50,8 +50,7 @@ The following example shows how to use the repo tool to download the ROCm source
```bash
mkdir -p ~/ROCm/
cd ~/ROCm/
export ROCM_VERSION=6.3.2
~/bin/repo init -u http://github.com/ROCm/ROCm.git -b roc-6.3.x -m tools/rocm-build/rocm-${ROCM_VERSION}.xml
~/bin/repo init -u http://github.com/ROCm/ROCm.git -b roc-6.3.x
~/bin/repo sync
```
@@ -77,8 +76,8 @@ The Build time will reduce significantly if we limit the GPU Architecture/s agai
mkdir -p ~/WORKSPACE/ # Or any folder name other than WORKSPACE
cd ~/WORKSPACE/
export ROCM_VERSION=6.3.2
~/bin/repo init -u http://github.com/ROCm/ROCm.git -b develop -m tools/rocm-build/rocm-${ROCM_VERSION}.xml
export ROCM_VERSION=6.3.0
~/bin/repo init -u http://github.com/ROCm/ROCm.git -b roc-6.3.x -m tools/rocm-build/rocm-${ROCM_VERSION}.xml
~/bin/repo sync
# --------------------------------------
@@ -88,11 +87,11 @@ export ROCM_VERSION=6.3.2
# Option 1: Start a docker container
# Pulling required base docker images:
# Ubuntu20.04 built from ROCm/tools/rocm-build/docker/ubuntu20/Dockerfile
docker pull rocm/rocm-build-ubuntu-20.04:6.3
docker pull rocm/rocm-build-ubuntu-20.04:6.2
# Ubuntu22.04 built from ROCm/tools/rocm-build/docker/ubuntu22/Dockerfile
docker pull rocm/rocm-build-ubuntu-22.04:6.3
docker pull rocm/rocm-build-ubuntu-22.04:6.2
# Ubuntu24.04 built from ROCm/tools/rocm-build/docker/ubuntu24/Dockerfile
docker pull rocm/rocm-build-ubuntu-24.04:6.3
docker pull rocm/rocm-build-ubuntu-24.04:6.2
# Start docker container and mount the source code folder:
docker run -ti \

1626
RELEASE.md

File diff suppressed because it is too large Load Diff

View File

@@ -1,14 +1,17 @@
<?xml version="1.0" encoding="UTF-8"?>
<manifest>
<remote name="rocm-org" fetch="https://github.com/ROCm/" />
<default revision="refs/tags/rocm-6.3.2"
<default revision="refs/tags/rocm-6.3.0"
remote="rocm-org"
sync-c="true"
sync-j="4" />
<!--list of projects for ROCm-->
<project name="ROCK-Kernel-Driver" />
<project name="ROCR-Runtime" />
<project name="ROCT-Thunk-Interface" />
<project name="amdsmi" />
<project name="omniperf" />
<project name="omnitrace" />
<project name="rdc" />
<project name="rocm_bandwidth_test" />
<project name="rocm_smi_lib" />
@@ -18,8 +21,6 @@
<project name="rocprofiler" />
<project name="rocprofiler-register" />
<project name="rocprofiler-sdk" />
<project name="rocprofiler-compute" />
<project name="rocprofiler-systems" />
<project name="roctracer" />
<!--HIP Projects-->
<project name="HIP" />
@@ -41,7 +42,6 @@
<project groups="mathlibs" name="ROCmValidationSuite" />
<project groups="mathlibs" name="Tensile" />
<project groups="mathlibs" name="composable_kernel" />
<project groups="mathlibs" name="hipBLAS-common" />
<project groups="mathlibs" name="hipBLAS" />
<project groups="mathlibs" name="hipBLASLt" />
<project groups="mathlibs" name="hipCUB" />
@@ -57,7 +57,6 @@
<project groups="mathlibs" name="rocALUTION" />
<project groups="mathlibs" name="rocBLAS" />
<project groups="mathlibs" name="rocDecode" />
<project groups="mathlibs" name="rocJPEG" />
<project groups="mathlibs" name="rocPyDecode" />
<project groups="mathlibs" name="rocFFT" />
<project groups="mathlibs" name="rocPRIM" />
@@ -68,7 +67,6 @@
<project groups="mathlibs" name="rocWMMA" />
<project groups="mathlibs" name="rocm-cmake" />
<project groups="mathlibs" name="rpp" />
<project groups="mathlibs" name="TransferBench" />
<!-- Projects for OpenMP-Extras -->
<project name="aomp" path="openmp-extras/aomp" />
<project name="aomp-extras" path="openmp-extras/aomp-extras" />

View File

@@ -25,15 +25,15 @@ additional licenses. Please review individual repositories for more information.
<!-- spellcheck-disable -->
| Component | License |
|:---------------------|:-------------------------|
| [AMD Compute Language Runtime (CLR)](https://github.com/ROCm/clr) | [MIT](https://github.com/ROCm/clr/blob/amd-staging/LICENCE) |
| [AMD SMI](https://github.com/ROCm/amdsmi) | [MIT](https://github.com/ROCm/amdsmi/blob/amd-staging/LICENSE) |
| [AMD Compute Language Runtime (CLR)](https://github.com/ROCm/clr) | [MIT](https://github.com/ROCm/clr/blob/develop/LICENCE) |
| [AMD SMI](https://github.com/ROCm/amdsmi) | [MIT](https://github.com/ROCm/amdsmi/blob/develop/LICENSE) |
| [aomp](https://github.com/ROCm/aomp/) | [Apache 2.0](https://github.com/ROCm/aomp/blob/aomp-dev/LICENSE) |
| [aomp-extras](https://github.com/ROCm/aomp-extras/) | [MIT](https://github.com/ROCm/aomp-extras/blob/aomp-dev/LICENSE) |
| [Code Object Manager (Comgr)](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/comgr) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/comgr/LICENSE.txt) |
| [Composable Kernel](https://github.com/ROCm/composable_kernel) | [MIT](https://github.com/ROCm/composable_kernel/blob/develop/LICENSE) |
| [half](https://github.com/ROCm/half/) | [MIT](https://github.com/ROCm/half/blob/rocm/LICENSE.txt) |
| [HIP](https://github.com/ROCm/HIP/) | [MIT](https://github.com/ROCm/HIP/blob/amd-staging/LICENSE.txt) |
| [hipamd](https://github.com/ROCm/clr/tree/amd-staging/hipamd) | [MIT](https://github.com/ROCm/clr/blob/amd-staging/hipamd/LICENSE.txt) |
| [HIP](https://github.com/ROCm/HIP/) | [MIT](https://github.com/ROCm/HIP/blob/develop/LICENSE.txt) |
| [hipamd](https://github.com/ROCm/clr/tree/develop/hipamd) | [MIT](https://github.com/ROCm/clr/blob/develop/hipamd/LICENSE.txt) |
| [hipBLAS](https://github.com/ROCm/hipBLAS/) | [MIT](https://github.com/ROCm/hipBLAS/blob/develop/LICENSE.md) |
| [hipBLASLt](https://github.com/ROCm/hipBLASLt/) | [MIT](https://github.com/ROCm/hipBLASLt/blob/develop/LICENSE.md) |
| [HIPCC](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/hipcc) | [MIT](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/hipcc/LICENSE.txt) |
@@ -58,29 +58,29 @@ additional licenses. Please review individual repositories for more information.
| [ROCdbgapi](https://github.com/ROCm/ROCdbgapi/) | [MIT](https://github.com/ROCm/ROCdbgapi/blob/amd-staging/LICENSE.txt) |
| [rocDecode](https://github.com/ROCm/rocDecode) | [MIT](https://github.com/ROCm/rocDecode/blob/develop/LICENSE) |
| [rocFFT](https://github.com/ROCm/rocFFT/) | [MIT](https://github.com/ROCm/rocFFT/blob/develop/LICENSE.md) |
| [ROCgdb](https://github.com/ROCm/ROCgdb/) | [GNU General Public License v3.0](https://github.com/ROCm/ROCgdb/blob/amd-staging/COPYING3) |
| [ROCgdb](https://github.com/ROCm/ROCgdb/) | [GNU General Public License v3.0](https://github.com/ROCm/ROCgdb/blob/amd-master/COPYING3) |
| [rocJPEG](https://github.com/ROCm/rocJPEG/) | [MIT](https://github.com/ROCm/rocJPEG/blob/develop/LICENSE) |
| [ROCK-Kernel-Driver](https://github.com/ROCm/ROCK-Kernel-Driver/) | [GPL 2.0 WITH Linux-syscall-note](https://github.com/ROCm/ROCK-Kernel-Driver/blob/master/COPYING) |
| [rocminfo](https://github.com/ROCm/rocminfo/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocminfo/blob/amd-staging/License.txt) |
| [ROCm Bandwidth Test](https://github.com/ROCm/rocm_bandwidth_test/) | [MIT](https://github.com/ROCm/rocm_bandwidth_test/blob/master/LICENSE.txt) |
| [ROCm Bandwidth Test](https://github.com/ROCm/rocm_bandwidth_test/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocm_bandwidth_test/blob/master/LICENSE.txt) |
| [ROCm CMake](https://github.com/ROCm/rocm-cmake/) | [MIT](https://github.com/ROCm/rocm-cmake/blob/develop/LICENSE) |
| [ROCm Communication Collectives Library (RCCL)](https://github.com/ROCm/rccl/) | [Custom](https://github.com/ROCm/rccl/blob/develop/LICENSE.txt) |
| [ROCm-Core](https://github.com/ROCm/rocm-core) | [MIT](https://github.com/ROCm/rocm-core/blob/master/copyright) |
| [ROCm Compute Profiler](https://github.com/ROCm/rocprofiler-compute) | [MIT](https://github.com/ROCm/rocprofiler-compute/blob/amd-staging/LICENSE) |
| [ROCm Data Center (RDC)](https://github.com/ROCm/rdc/) | [MIT](https://github.com/ROCm/rdc/blob/amd-staging/LICENSE) |
| [ROCm Data Center (RDC)](https://github.com/ROCm/rdc/) | [MIT](https://github.com/ROCm/rdc/blob/develop/LICENSE) |
| [ROCm-Device-Libs](https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs) | [The University of Illinois/NCSA](https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/LICENSE.TXT) |
| [ROCm-OpenCL-Runtime](https://github.com/ROCm/clr/tree/amd-staging/opencl) | [MIT](https://github.com/ROCm/clr/blob/amd-staging/opencl/LICENSE.txt) |
| [ROCm-OpenCL-Runtime](https://github.com/ROCm/clr/tree/develop/opencl) | [MIT](https://github.com/ROCm/clr/blob/develop/opencl/LICENSE.txt) |
| [ROCm Performance Primitives (RPP)](https://github.com/ROCm/rpp) | [MIT](https://github.com/ROCm/rpp/blob/develop/LICENSE) |
| [ROCm SMI Lib](https://github.com/ROCm/rocm_smi_lib/) | [MIT](https://github.com/ROCm/rocm_smi_lib/blob/amd-staging/License.txt) |
| [ROCm SMI Lib](https://github.com/ROCm/rocm_smi_lib/) | [MIT](https://github.com/ROCm/rocm_smi_lib/blob/develop/License.txt) |
| [ROCm Systems Profiler](https://github.com/ROCm/rocprofiler-systems) | [MIT](https://github.com/ROCm/rocprofiler-systems/blob/amd-staging/LICENSE) |
| [ROCm Validation Suite](https://github.com/ROCm/ROCmValidationSuite/) | [MIT](https://github.com/ROCm/ROCmValidationSuite/blob/master/LICENSE) |
| [rocPRIM](https://github.com/ROCm/rocPRIM/) | [MIT](https://github.com/ROCm/rocPRIM/blob/develop/LICENSE.txt) |
| [ROCProfiler](https://github.com/ROCm/rocprofiler/) | [MIT](https://github.com/ROCm/rocprofiler/blob/amd-staging/LICENSE) |
| [ROCProfiler](https://github.com/ROCm/rocprofiler/) | [MIT](https://github.com/ROCm/rocprofiler/blob/amd-master/LICENSE) |
| [ROCprofiler-SDK](https://github.com/ROCm/rocprofiler-sdk) | [MIT](https://github.com/ROCm/rocprofiler-sdk/blob/amd-mainline/LICENSE) |
| [rocPyDecode](https://github.com/ROCm/rocPyDecode) | [MIT](https://github.com/ROCm/rocPyDecode/blob/develop/LICENSE) |
| [rocRAND](https://github.com/ROCm/rocRAND/) | [MIT](https://github.com/ROCm/rocRAND/blob/develop/LICENSE.txt) |
| [ROCr Debug Agent](https://github.com/ROCm/rocr_debug_agent/) | [The University of Illinois/NCSA](https://github.com/ROCm/rocr_debug_agent/blob/amd-staging/LICENSE.txt) |
| [ROCR-Runtime](https://github.com/ROCm/ROCR-Runtime/) | [The University of Illinois/NCSA](https://github.com/ROCm/ROCR-Runtime/blob/amd-staging/LICENSE.txt) |
| [ROCR-Runtime](https://github.com/ROCm/ROCR-Runtime/) | [The University of Illinois/NCSA](https://github.com/ROCm/ROCR-Runtime/blob/master/LICENSE.txt) |
| [rocSOLVER](https://github.com/ROCm/rocSOLVER/) | [BSD-2-Clause](https://github.com/ROCm/rocSOLVER/blob/develop/LICENSE.md) |
| [rocSPARSE](https://github.com/ROCm/rocSPARSE/) | [MIT](https://github.com/ROCm/rocSPARSE/blob/develop/LICENSE.md) |
| [rocThrust](https://github.com/ROCm/rocThrust/) | [Apache 2.0](https://github.com/ROCm/rocThrust/blob/develop/LICENSE) |
@@ -99,7 +99,7 @@ repositories to distinguish from open sourced packages.
The following additional terms and conditions apply to your use of ROCm technical documentation.
```
©2023 - 2025 Advanced Micro Devices, Inc. All rights reserved.
©2023 - 2024 Advanced Micro Devices, Inc. All rights reserved.
The information presented in this document is for informational purposes only
and may contain technical inaccuracies, omissions, and typographical errors. The

View File

@@ -1,120 +1,118 @@
ROCm Version,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.5, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
:ref:`Operating systems & kernels <OS-kernel-versions>`,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,,
,Ubuntu 22.04.5,Ubuntu 22.04.5,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2"
,,,,,,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
,"RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.4, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2"
,RHEL 8.10,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8"
,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4"
,,,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9
,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.10 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,Oracle Linux 8.9 [#mi300x-past-60]_,,,
,Debian 12 [#single-node-past-60]_,Debian 12 [#single-node-past-60]_,,,,,,,,,,,
,Azure Linux 3.0 [#mi300x-past-60]_,,,,,,,,,,,,
,.. _architecture-support-compatibility-matrix-past-60:,,,,,,,,,,,,
:doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3
,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2
,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA
,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3
,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2
,.. _gpu-support-compatibility-matrix-past-60:,,,,,,,,,,,,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100
,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030
,gfx942,gfx942,gfx942,gfx942 [#mi300_624-past-60]_,gfx942 [#mi300_622-past-60]_,gfx942 [#mi300_621-past-60]_,gfx942 [#mi300_620-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_611-past-60]_, gfx942 [#mi300_610-past-60]_, gfx942 [#mi300_602-past-60]_, gfx942 [#mi300_600-past-60]_
,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a
,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908
,,,,,,,,,,,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,,,
:doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
:doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1"
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.4.31,0.4.31,0.4.31,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1
,,,,,,,,,,,,,
THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix-past-60:,,,,,,,,,,,,
`UCC <https://github.com/ROCm/ucc>`_,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1
,,,,,,,,,,,,,
THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,,,
Thrust,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
CUB,2.3.2,2.3.2,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
,,,,,,,,,,,,,
KMD & USER SPACE [#kfd_support-past-60]_,.. _kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,,,
Tested user space versions,"6.3.x, 6.2.x, 6.1.x","6.3.x, 6.2.x, 6.1.x","6.3.x, 6.2.x, 6.1.x","6.3.x, 6.2.x, 6.1.x, 6.0.x","6.3.x, 6.2.x, 6.1.x, 6.0.x","6.3.x, 6.2.x, 6.1.x, 6.0.x","6.3.x, 6.2.x, 6.1.x, 6.0.x","6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x"
,,,,,,,,,,,,,
ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,
:doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0
:doc:`MIGraphX <amdmigraphx:index>`,2.11.0,2.11.0,2.11.0,2.10.0,2.10.0,2.10.0,2.10.0,2.9.0,2.9.0,2.9.0,2.9.0,2.8.0,2.8.0
:doc:`MIOpen <miopen:index>`,3.3.0,3.3.0,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`MIVisionX <mivisionx:index>`,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0
:doc:`rocAL <rocal:index>`,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0,2.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`rocDecode <rocdecode:index>`,0.8.0,0.8.0,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,N/A,N/A
:doc:`rocJPEG <rocjpeg:index>`,0.6.0,0.6.0,0.6.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`rocPyDecode <rocpydecode:index>`,0.2.0,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`RPP <rpp:index>`,1.9.1,1.9.1,1.9.1,1.8.0,1.8.0,1.8.0,1.8.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0
,,,,,,,,,,,,,
COMMUNICATION,.. _commlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,
:doc:`RCCL <rccl:index>`,2.21.5,2.21.5,2.21.5,2.20.5,2.20.5,2.20.5,2.20.5,2.18.6,2.18.6,2.18.6,2.18.6,2.18.3,2.18.3
,,,,,,,,,,,,,
MATH LIBS,.. _mathlibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,
`half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0
:doc:`hipBLAS <hipblas:index>`,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0
:doc:`hipBLASLt <hipblaslt:index>`,0.10.0,0.10.0,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.7.0,0.7.0,0.7.0,0.7.0,0.6.0,0.6.0
:doc:`hipFFT <hipfft:index>`,1.0.17,1.0.17,1.0.17,1.0.16,1.0.15,1.0.15,1.0.14,1.0.14,1.0.14,1.0.14,1.0.14,1.0.13,1.0.13
:doc:`hipfort <hipfort:index>`,0.5.1,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0
:doc:`hipRAND <hiprand:index>`,2.11.1,2.11.1,2.11.0,2.11.1,2.11.0,2.11.0,2.11.0,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16
:doc:`hipSOLVER <hipsolver:index>`,2.3.0,2.3.0,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.1,2.1.1,2.1.1,2.1.0,2.0.0,2.0.0
:doc:`hipSPARSE <hipsparse:index>`,3.1.2,3.1.2,3.1.2,3.1.1,3.1.1,3.1.1,3.1.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.2,0.2.2,0.2.2,0.2.1,0.2.1,0.2.1,0.2.1,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0
:doc:`rocALUTION <rocalution:index>`,3.2.1,3.2.1,3.2.1,3.2.1,3.2.0,3.2.0,3.2.0,3.1.1,3.1.1,3.1.1,3.1.1,3.0.3,3.0.3
:doc:`rocBLAS <rocblas:index>`,4.3.0,4.3.0,4.3.0,4.2.4,4.2.1,4.2.1,4.2.0,4.1.2,4.1.2,4.1.0,4.1.0,4.0.0,4.0.0
:doc:`rocFFT <rocfft:index>`,1.0.31,1.0.31,1.0.31,1.0.30,1.0.29,1.0.29,1.0.28,1.0.27,1.0.27,1.0.27,1.0.26,1.0.25,1.0.23
:doc:`rocRAND <rocrand:index>`,3.2.0,3.2.0,3.2.0,3.1.1,3.1.0,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,2.10.17
:doc:`rocSOLVER <rocsolver:index>`,3.27.0,3.27.0,3.27.0,3.26.2,3.26.0,3.26.0,3.26.0,3.25.0,3.25.0,3.25.0,3.25.0,3.24.0,3.24.0
:doc:`rocSPARSE <rocsparse:index>`,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.0.2,3.0.2
:doc:`rocWMMA <rocwmma:index>`,1.6.0,1.6.0,1.6.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0
:doc:`Tensile <tensile:src/index>`,4.42.0,4.42.0,4.42.0,4.41.0,4.41.0,4.41.0,4.41.0,4.40.0,4.40.0,4.40.0,4.40.0,4.39.0,4.39.0
,,,,,,,,,,,,,
PRIMITIVES,.. _primitivelibs-support-compatibility-matrix-past-60:,,,,,,,,,,,,
:doc:`hipCUB <hipcub:index>`,3.3.0,3.3.0,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`hipTensor <hiptensor:index>`,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0,1.3.0,1.3.0,1.2.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0
:doc:`rocPRIM <rocprim:index>`,3.3.0,3.3.0,3.3.0,3.2.2,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`rocThrust <rocthrust:index>`,3.3.0,3.3.0,3.3.0,3.1.1,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
,,,,,,,,,,,,,
SUPPORT LIBS,,,,,,,,,,,,,
`hipother <https://github.com/ROCm/hipother>`_,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`rocm-core <https://github.com/ROCm/rocm-core>`_,6.3.2,6.3.1,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0,6.1.5,6.1.2,6.1.1,6.1.0,6.0.2,6.0.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,N/A [#ROCT-rocr-past-60]_,20240607.5.7,20240607.5.7,20240607.4.05,20240607.1.4246,20240125.5.08,20240125.5.08,20240125.5.08,20240125.3.30,20231016.2.245,20231016.2.245
,,,,,,,,,,,,,
SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix-past-60:,,,,,,,,,,,,
:doc:`AMD SMI <amdsmi:index>`,24.7.1,24.7.1,24.7.1,24.6.3,24.6.3,24.6.3,24.6.2,24.5.1,24.5.1,24.5.1,24.4.1,23.4.2,23.4.2
:doc:`ROCm Data Center Tool <rdc:index>`,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.4.0,7.4.0,7.4.0,7.3.0,7.3.0,7.3.0,7.3.0,7.2.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.1.0,1.1.0,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60105,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000
,,,,,,,,,,,,,
PERFORMANCE TOOLS,,,,,,,,,,,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.0.0,3.0.0,3.0.0,2.0.1,2.0.1,2.0.1,2.0.1,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,0.1.1,0.1.0,0.1.0,1.11.2,1.11.2,1.11.2,1.11.2,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60302,2.0.60301,2.0.60300,2.0.60204,2.0.60202,2.0.60201,2.0.60200,2.0.60105,2.0.60102,2.0.60101,2.0.60100,2.0.60002,2.0.60000
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.5.0,0.5.0,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCTracer <roctracer:index>`,4.1.60302,4.1.60301,4.1.60300,4.1.60204,4.1.60202,4.1.60201,4.1.60200,4.1.60105,4.1.60102,4.1.60101,4.1.60100,4.1.60002,4.1.60000
,,,,,,,,,,,,,
DEVELOPMENT TOOLS,,,,,,,,,,,,,
:doc:`HIPIFY <hipify:index>`,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.14.0,0.14.0,0.14.0,0.13.0,0.13.0,0.13.0,0.13.0,0.12.0,0.12.0,0.12.0,0.12.0,0.11.0,0.11.0
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.0,0.77.0,0.77.0,0.76.0,0.76.0,0.76.0,0.76.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,15.2.0,15.2.0,15.2.0,14.2.0,14.2.0,14.2.0,14.2.0,14.1.0,14.1.0,14.1.0,14.1.0,13.2.0,13.2.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.3.0,0.3.0,0.3.0,0.3.0,N/A,N/A
:doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3
,,,,,,,,,,,,,
COMPILERS,.. _compilers-support-compatibility-matrix-past-60:,,,,,,,,,,,,
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
`Flang <https://github.com/ROCm/flang>`_,18.0.0.25012,18.0.0.24491,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`llvm-project <llvm-project:index>`,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.25012,18.0.0.24491,18.0.0.24491,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
,,,,,,,,,,,,,
RUNTIMES,.. _runtime-support-compatibility-matrix-past-60:,,,,,,,,,,,,
:doc:`AMD CLR <hip:understand/amd_clr>`,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
:doc:`HIP <hip:index>`,6.3.42134,6.3.42133,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.14.0,1.13.0,1.13.0,1.13.0,1.13.0,1.13.0,1.12.0,1.12.0
ROCm Version,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0, 6.1.5, 6.1.2, 6.1.1, 6.1.0, 6.0.2, 6.0.0
:ref:`Operating systems & kernels <OS-kernel-versions>`,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04","Ubuntu 24.04.1, 24.04",Ubuntu 24.04,,,,,,
,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.5, 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3","Ubuntu 22.04.4, 22.04.3, 22.04.2","Ubuntu 22.04.4, 22.04.3, 22.04.2"
,,,,,,"Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5","Ubuntu 20.04.6, 20.04.5"
,"RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4, 9.3","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2","RHEL 9.3, 9.2","RHEL 9.3, 9.2"
,RHEL 8.10,"RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.10, 8.9","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8","RHEL 8.9, 8.8"
,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4","SLES 15 SP5, SP4"
,,,,,,,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9,CentOS 7.9
,Oracle Linux 8.10 [#oracle89-past-60]_,Oracle Linux 8.9 [#oracle89-past-60]_,Oracle Linux 8.9 [#oracle89-past-60]_,Oracle Linux 8.9 [#oracle89-past-60]_,Oracle Linux 8.9 [#oracle89-past-60]_,Oracle Linux 8.9 [#oracle89-past-60]_,Oracle Linux 8.9 [#oracle89-past-60]_,Oracle Linux 8.9 [#oracle89-past-60]_,,,
,.. _architecture-support-compatibility-matrix-past-60:,,,,,,,,,,
:doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3,CDNA3
,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2,CDNA2
,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA,CDNA
,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3,RDNA3
,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2,RDNA2
,.. _gpu-support-compatibility-matrix-past-60:,,,,,,,,,,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100,gfx1100
,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030,gfx1030
,gfx942,gfx942 [#mi300_624-past-60]_,gfx942 [#mi300_622-past-60]_,gfx942 [#mi300_621-past-60]_,gfx942 [#mi300_620-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_612-past-60]_, gfx942 [#mi300_611-past-60]_, gfx942 [#mi300_610-past-60]_, gfx942 [#mi300_602-past-60]_, gfx942 [#mi300_600-past-60]_
,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a,gfx90a
,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908,gfx908
,,,,,,,,,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix-past-60:,,,,,,,,,,
:doc:`PyTorch <rocm-install-on-linux:install/3rd-party/pytorch-install>`,"2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13","2.1, 2.0, 1.13"
:doc:`TensorFlow <rocm-install-on-linux:install/3rd-party/tensorflow-install>`,"2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.15.0, 2.14.0, 2.13.1","2.14.0, 2.13.1, 2.12.1","2.14.0, 2.13.1, 2.12.1"
:doc:`JAX <rocm-install-on-linux:install/3rd-party/jax-install>`,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26,0.4.26
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.17.3,1.14.1,1.14.1
,,,,,,,,,,,
THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix-past-60:,,,,,,,,,,
`UCC <https://github.com/ROCm/ucc>`_,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.3.0,>=1.2.0,>=1.2.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.15.0,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1,>=1.14.1
,,,,,,,,,,,
THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix-past-60:,,,,,,,,,,
Thrust,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
CUB,2.3.2,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.1,2.0.1
,,,,,,,,,,,
KMD & USER SPACE [#kfd_support-past-60]_,.. _kfd-userspace-support-compatibility-matrix-past-60:,,,,,,,,,,
Tested user space versions,"6.3.x, 6.2.x, 6.1.x","6.3.x, 6.2.x, 6.1.x, 6.0.x","6.3.x, 6.2.x, 6.1.x, 6.0.x","6.3.x, 6.2.x, 6.1.x, 6.0.x","6.3.x, 6.2.x, 6.1.x, 6.0.x","6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x","6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x"
,,,,,,,,,,,
ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix-past-60:,,,,,,,,,,
:doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0,1.1.0
:doc:`MIGraphX <amdmigraphx:index>`,2.11.0,2.10.0,2.10.0,2.10.0,2.10.0,2.9.0,2.9.0,2.9.0,2.9.0,2.8.0,2.8.0
:doc:`MIOpen <miopen:index>`,3.3.0,3.2.0,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`MIVisionX <mivisionx:index>`,3.1.0,3.0.0,3.0.0,3.0.0,3.0.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0,2.5.0
:doc:`rocAL <rocal:index>`,2.1.0,2.0.0,2.0.0,2.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`rocDecode <rocdecode:index>`,0.8.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.6.0,0.5.0,0.5.0,N/A,N/A
:doc:`rocJPEG <rocjpeg:index>`,0.6.0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`rocPyDecode <rocpydecode:index>`,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`RPP <rpp:index>`,1.9.1,1.8.0,1.8.0,1.8.0,1.8.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0
,,,,,,,,,,,
COMMUNICATION,.. _commlibs-support-compatibility-matrix-past-60:,,,,,,,,,,
:doc:`RCCL <rccl:index>`,2.21.5,2.20.5,2.20.5,2.20.5,2.20.5,2.18.6,2.18.6,2.18.6,2.18.6,2.18.3,2.18.3
,,,,,,,,,,,
MATH LIBS,.. _mathlibs-support-compatibility-matrix-past-60:,,,,,,,,,,
`half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0,1.12.0
:doc:`hipBLAS <hipblas:index>`,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.0,2.1.0,2.1.0,2.1.0,2.0.0,2.0.0
:doc:`hipBLASLt <hipblaslt:index>`,0.10.0,0.8.0,0.8.0,0.8.0,0.8.0,0.7.0,0.7.0,0.7.0,0.7.0,0.6.0,0.6.0
:doc:`hipFFT <hipfft:index>`,1.0.17,1.0.16,1.0.15,1.0.15,1.0.14,1.0.14,1.0.14,1.0.14,1.0.14,1.0.13,1.0.13
:doc:`hipfort <hipfort:index>`,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0
:doc:`hipRAND <hiprand:index>`,2.11.0,2.11.1,2.11.0,2.11.0,2.11.0,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16,2.10.16
:doc:`hipSOLVER <hipsolver:index>`,2.3.0,2.2.0,2.2.0,2.2.0,2.2.0,2.1.1,2.1.1,2.1.1,2.1.0,2.0.0,2.0.0
:doc:`hipSPARSE <hipsparse:index>`,3.1.2,3.1.1,3.1.1,3.1.1,3.1.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.2,0.2.1,0.2.1,0.2.1,0.2.1,0.2.0,0.2.0,0.1.0,0.1.0,0.1.0,0.1.0
:doc:`rocALUTION <rocalution:index>`,3.2.1,3.2.1,3.2.0,3.2.0,3.2.0,3.1.1,3.1.1,3.1.1,3.1.1,3.0.3,3.0.3
:doc:`rocBLAS <rocblas:index>`,4.3.0,4.2.4,4.2.1,4.2.1,4.2.0,4.1.2,4.1.2,4.1.0,4.1.0,4.0.0,4.0.0
:doc:`rocFFT <rocfft:index>`,1.0.31,1.0.30,1.0.29,1.0.29,1.0.28,1.0.27,1.0.27,1.0.27,1.0.26,1.0.25,1.0.23
:doc:`rocRAND <rocrand:index>`,3.2.0,3.1.1,3.1.0,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,2.10.17
:doc:`rocSOLVER <rocsolver:index>`,3.27.0,3.26.2,3.26.0,3.26.0,3.26.0,3.25.0,3.25.0,3.25.0,3.25.0,3.24.0,3.24.0
:doc:`rocSPARSE <rocsparse:index>`,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.2,3.1.2,3.1.2,3.1.2,3.0.2,3.0.2
:doc:`rocWMMA <rocwmma:index>`,1.6.0,1.5.0,1.5.0,1.5.0,1.5.0,1.4.0,1.4.0,1.4.0,1.4.0,1.3.0,1.3.0
:doc:`Tensile <tensile:index>`,4.42.0,4.41.0,4.41.0,4.41.0,4.41.0,4.40.0,4.40.0,4.40.0,4.40.0,4.39.0,4.39.0
,,,,,,,,,,,
PRIMITIVES,.. _primitivelibs-support-compatibility-matrix-past-60:,,,,,,,,,,
:doc:`hipCUB <hipcub:index>`,3.3.0,3.2.1,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`hipTensor <hiptensor:index>`,1.4.0,1.3.0,1.3.0,1.3.0,1.3.0,1.2.0,1.2.0,1.2.0,1.2.0,1.1.0,1.1.0
:doc:`rocPRIM <rocprim:index>`,3.3.0,3.2.2,3.2.0,3.2.0,3.2.0,3.1.0,3.1.0,3.1.0,3.1.0,3.0.0,3.0.0
:doc:`rocThrust <rocthrust:index>`,3.3.0,3.1.1,3.1.0,3.1.0,3.0.1,3.0.1,3.0.1,3.0.1,3.0.1,3.0.0,3.0.0
,,,,,,,,,,,
SUPPORT LIBS,,,,,,,,,,,
`hipother <https://github.com/ROCm/hipother>`_,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`rocm-core <https://github.com/ROCm/rocm-core>`_,6.3.0,6.2.4,6.2.2,6.2.1,6.2.0,6.1.5,6.1.2,6.1.1,6.1.0,6.0.2,6.0.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr-past-60]_,20240607.5.7,20240607.5.7,20240607.4.05,20240607.1.4246,20240125.5.08,20240125.5.08,20240125.5.08,20240125.3.30,20231016.2.245,20231016.2.245
,,,,,,,,,,,
SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix-past-60:,,,,,,,,,,
:doc:`AMD SMI <amdsmi:index>`,24.7.1,24.6.3,24.6.3,24.6.3,24.6.2,24.5.1,24.5.1,24.5.1,24.4.1,23.4.2,23.4.2
:doc:`ROCm Data Center Tool <rdc:index>`,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0,0.3.0
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.4.0,7.3.0,7.3.0,7.3.0,7.3.0,7.2.0,7.2.0,7.0.0,7.0.0,6.0.2,6.0.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.1.0,1.0.60204,1.0.60202,1.0.60201,1.0.60200,1.0.60105,1.0.60102,1.0.60101,1.0.60100,1.0.60002,1.0.60000
,,,,,,,,,,,
PERFORMANCE TOOLS,,,,,,,,,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.0.0,2.0.1,2.0.1,2.0.1,2.0.1,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,0.1.0,1.11.2,1.11.2,1.11.2,1.11.2,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60300,2.0.60204,2.0.60202,2.0.60201,2.0.60200,2.0.60105,2.0.60102,2.0.60101,2.0.60100,2.0.60002,2.0.60000
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.5.0,0.4.0,0.4.0,0.4.0,0.4.0,N/A,N/A,N/A,N/A,N/A,N/A
:doc:`ROCTracer <roctracer:index>`,4.1.60300,4.1.60204,4.1.60202,4.1.60201,4.1.60200,4.1.60105,4.1.60102,4.1.60101,4.1.60100,4.1.60002,4.1.60000
,,,,,,,,,,,
DEVELOPMENT TOOLS,,,,,,,,,,,
:doc:`HIPIFY <hipify:index>`,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.14.0,0.13.0,0.13.0,0.13.0,0.13.0,0.12.0,0.12.0,0.12.0,0.12.0,0.11.0,0.11.0
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.0,0.76.0,0.76.0,0.76.0,0.76.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0,0.71.0
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,15.2.0,14.2.0,14.2.0,14.2.0,14.2.0,14.1.0,14.1.0,14.1.0,14.1.0,13.2.0,13.2.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.4.0,0.4.0,0.4.0,0.4.0,0.4.0,0.3.0,0.3.0,0.3.0,0.3.0,N/A,N/A
:doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3,2.0.3
,,,,,,,,,,,
COMPILERS,.. _compilers-support-compatibility-matrix-past-60:,,,,,,,,,,
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,N/A,N/A,N/A,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0,0.5.0
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.1.1,1.1.1,1.1.1,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0,1.0.0
`Flang <https://github.com/ROCm/flang>`_,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
:doc:`llvm-project <llvm-project:index>`,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.24455,18.0.0.24392,18.0.0.24355,18.0.0.24355,18.0.0.24232,17.0.0.24193,17.0.0.24193,17.0.0.24154,17.0.0.24103,17.0.0.24012,17.0.0.23483
,,,,,,,,,,,
RUNTIMES,.. _runtime-support-compatibility-matrix-past-60:,,,,,,,,,,
:doc:`AMD CLR <hip:understand/amd_clr>`,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
:doc:`HIP <hip:index>`,6.3.42131,6.2.41134,6.2.41134,6.2.41134,6.2.41133,6.1.40093,6.1.40093,6.1.40092,6.1.40091,6.1.32831,6.1.32830
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0,2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.14.0,1.14.0,1.14.0,1.14.0,1.13.0,1.13.0,1.13.0,1.13.0,1.13.0,1.12.0,1.12.0
1 ROCm Version 6.3.2 6.3.0 6.3.1 6.2.4 6.2.2 6.2.1 6.2.0 6.1.5 6.1.2 6.1.1 6.1.0 6.0.2 6.0.0
2 :ref:`Operating systems & kernels <OS-kernel-versions>` Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.2 Ubuntu 24.04.1, 24.04 Ubuntu 24.04.1, 24.04 Ubuntu 24.04.1, 24.04 Ubuntu 24.04
3 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4 Ubuntu 22.04.5, 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3 Ubuntu 22.04.4, 22.04.3, 22.04.2 Ubuntu 22.04.4, 22.04.3, 22.04.2
4 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5 Ubuntu 20.04.6, 20.04.5
5 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.5, 9.4 RHEL 9.4, 9.3 RHEL 9.4, 9.3 RHEL 9.4, 9.3 RHEL 9.4, 9.3 RHEL 9.4, 9.3, 9.2 RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2 RHEL 9.4, 9.3, 9.2 RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2 RHEL 9.4, 9.3, 9.2 RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2 RHEL 9.4, 9.3, 9.2 RHEL 9.4 [#red-hat94-past-60]_, 9.3, 9.2 RHEL 9.3, 9.2 RHEL 9.3, 9.2
6 RHEL 8.10 RHEL 8.10 RHEL 8.10 RHEL 8.10, 8.9 RHEL 8.10, 8.9 RHEL 8.10, 8.9 RHEL 8.10, 8.9 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8 RHEL 8.9, 8.8
7 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP6, SP5 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4 SLES 15 SP5, SP4
8 CentOS 7.9 CentOS 7.9 CentOS 7.9 CentOS 7.9 CentOS 7.9
9 Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.10 [#oracle89-past-60]_ Oracle Linux 8.10 [#mi300x-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#oracle89-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#oracle89-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#oracle89-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#oracle89-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#oracle89-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#oracle89-past-60]_ Oracle Linux 8.9 [#mi300x-past-60]_ Oracle Linux 8.9 [#oracle89-past-60]_
10 Debian 12 [#single-node-past-60]_ .. _architecture-support-compatibility-matrix-past-60: Debian 12 [#single-node-past-60]_
11 :doc:`Architecture <rocm-install-on-linux:reference/system-requirements>` Azure Linux 3.0 [#mi300x-past-60]_ CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3 CDNA3
12 .. _architecture-support-compatibility-matrix-past-60: CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2 CDNA2
13 :doc:`Architecture <rocm-install-on-linux:reference/system-requirements>` CDNA3 CDNA3 CDNA CDNA3 CDNA3 CDNA CDNA3 CDNA CDNA3 CDNA CDNA3 CDNA CDNA3 CDNA CDNA3 CDNA CDNA3 CDNA CDNA3 CDNA CDNA3 CDNA CDNA3 CDNA
14 CDNA2 CDNA2 RDNA3 CDNA2 CDNA2 RDNA3 CDNA2 RDNA3 CDNA2 RDNA3 CDNA2 RDNA3 CDNA2 RDNA3 CDNA2 RDNA3 CDNA2 RDNA3 CDNA2 RDNA3 CDNA2 RDNA3 CDNA2 RDNA3
15 CDNA CDNA RDNA2 CDNA CDNA RDNA2 CDNA RDNA2 CDNA RDNA2 CDNA RDNA2 CDNA RDNA2 CDNA RDNA2 CDNA RDNA2 CDNA RDNA2 CDNA RDNA2 CDNA RDNA2
16 RDNA3 RDNA3 .. _gpu-support-compatibility-matrix-past-60: RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3 RDNA3
17 :doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>` RDNA2 RDNA2 gfx1100 RDNA2 RDNA2 gfx1100 RDNA2 gfx1100 RDNA2 gfx1100 RDNA2 gfx1100 RDNA2 gfx1100 RDNA2 gfx1100 RDNA2 gfx1100 RDNA2 gfx1100 RDNA2 gfx1100 RDNA2 gfx1100
18 .. _gpu-support-compatibility-matrix-past-60: gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030 gfx1030
19 :doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>` gfx1100 gfx1100 gfx942 gfx1100 gfx1100 gfx942 [#mi300_624-past-60]_ gfx1100 gfx942 [#mi300_622-past-60]_ gfx1100 gfx942 [#mi300_621-past-60]_ gfx1100 gfx942 [#mi300_620-past-60]_ gfx1100 gfx942 [#mi300_612-past-60]_ gfx1100 gfx942 [#mi300_612-past-60]_ gfx1100 gfx942 [#mi300_611-past-60]_ gfx1100 gfx942 [#mi300_610-past-60]_ gfx1100 gfx942 [#mi300_602-past-60]_ gfx1100 gfx942 [#mi300_600-past-60]_
20 gfx1030 gfx1030 gfx90a gfx1030 gfx1030 gfx90a gfx1030 gfx90a gfx1030 gfx90a gfx1030 gfx90a gfx1030 gfx90a gfx1030 gfx90a gfx1030 gfx90a gfx1030 gfx90a gfx1030 gfx90a gfx1030 gfx90a
21 gfx942 gfx942 gfx908 gfx942 gfx942 [#mi300_624-past-60]_ gfx908 gfx942 [#mi300_622-past-60]_ gfx908 gfx942 [#mi300_621-past-60]_ gfx908 gfx942 [#mi300_620-past-60]_ gfx908 gfx942 [#mi300_612-past-60]_ gfx908 gfx942 [#mi300_612-past-60]_ gfx908 gfx942 [#mi300_611-past-60]_ gfx908 gfx942 [#mi300_610-past-60]_ gfx908 gfx942 [#mi300_602-past-60]_ gfx908 gfx942 [#mi300_600-past-60]_ gfx908
22 gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a gfx90a
23 FRAMEWORK SUPPORT gfx908 gfx908 .. _framework-support-compatibility-matrix-past-60: gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908 gfx908
24 :doc:`PyTorch <rocm-install-on-linux:install/3rd-party/pytorch-install>` 2.4, 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13 2.1, 2.0, 1.13
25 FRAMEWORK SUPPORT :doc:`TensorFlow <rocm-install-on-linux:install/3rd-party/tensorflow-install>` .. _framework-support-compatibility-matrix-past-60: 2.17.0, 2.16.2, 2.15.1 2.16.1, 2.15.1, 2.14.1 2.16.1, 2.15.1, 2.14.1 2.16.1, 2.15.1, 2.14.1 2.16.1, 2.15.1, 2.14.1 2.15.0, 2.14.0, 2.13.1 2.15.0, 2.14.0, 2.13.1 2.15.0, 2.14.0, 2.13.1 2.15.0, 2.14.0, 2.13.1 2.14.0, 2.13.1, 2.12.1 2.14.0, 2.13.1, 2.12.1
26 :doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>` :doc:`JAX <rocm-install-on-linux:install/3rd-party/jax-install>` 2.4, 2.3, 2.2, 1.13 2.4, 2.3, 2.2, 2.1, 2.0, 1.13 0.4.26 2.4, 2.3, 2.2, 1.13 2.3, 2.2, 2.1, 2.0, 1.13 0.4.26 2.3, 2.2, 2.1, 2.0, 1.13 0.4.26 2.3, 2.2, 2.1, 2.0, 1.13 0.4.26 2.3, 2.2, 2.1, 2.0, 1.13 0.4.26 2.1, 2.0, 1.13 0.4.26 2.1, 2.0, 1.13 0.4.26 2.1, 2.0, 1.13 0.4.26 2.1, 2.0, 1.13 0.4.26 2.1, 2.0, 1.13 0.4.26 2.1, 2.0, 1.13 0.4.26
27 :doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>` `ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_ 2.17.0, 2.16.2, 2.15.1 2.17.0, 2.16.2, 2.15.1 1.17.3 2.17.0, 2.16.2, 2.15.1 2.16.1, 2.15.1, 2.14.1 1.17.3 2.16.1, 2.15.1, 2.14.1 1.17.3 2.16.1, 2.15.1, 2.14.1 1.17.3 2.16.1, 2.15.1, 2.14.1 1.17.3 2.15.0, 2.14.0, 2.13.1 1.17.3 2.15.0, 2.14.0, 2.13.1 1.17.3 2.15.0, 2.14.0, 2.13.1 1.17.3 2.15.0, 2.14.0, 2.13.1 1.17.3 2.14.0, 2.13.1, 2.12.1 1.14.1 2.14.0, 2.13.1, 2.12.1 1.14.1
28 :doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>` 0.4.31 0.4.31 0.4.31 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26 0.4.26
29 `ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_ THIRD PARTY COMMS 1.17.3 1.17.3 .. _thirdpartycomms-support-compatibility-matrix-past-60: 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.17.3 1.14.1 1.14.1
30 `UCC <https://github.com/ROCm/ucc>`_ >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.2.0 >=1.2.0
31 THIRD PARTY COMMS `UCX <https://github.com/ROCm/ucx>`_ .. _thirdpartycomms-support-compatibility-matrix-past-60: >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1
32 `UCC <https://github.com/ROCm/ucc>`_ >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.3.0 >=1.2.0 >=1.2.0
33 `UCX <https://github.com/ROCm/ucx>`_ THIRD PARTY ALGORITHM >=1.15.0 >=1.15.0 .. _thirdpartyalgorithm-support-compatibility-matrix-past-60: >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.15.0 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1 >=1.14.1
34 Thrust 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
35 THIRD PARTY ALGORITHM CUB .. _thirdpartyalgorithm-support-compatibility-matrix-past-60: 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
36 Thrust 2.3.2 2.3.2 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
37 CUB KMD & USER SPACE [#kfd_support-past-60]_ 2.3.2 2.3.2 .. _kfd-userspace-support-compatibility-matrix-past-60: 2.3.2 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.1 2.0.1
38 Tested user space versions 6.3.x, 6.2.x, 6.1.x 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x
39 KMD & USER SPACE [#kfd_support-past-60]_ .. _kfd-userspace-support-compatibility-matrix-past-60:
40 Tested user space versions ML & COMPUTER VISION 6.3.x, 6.2.x, 6.1.x 6.3.x, 6.2.x, 6.1.x .. _mllibs-support-compatibility-matrix-past-60: 6.3.x, 6.2.x, 6.1.x 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.3.x, 6.2.x, 6.1.x, 6.0.x 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x 6.2.x, 6.1.x, 6.0.x, 5.7.x, 5.6.x
41 :doc:`Composable Kernel <composable_kernel:index>` 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0 1.1.0
42 ML & COMPUTER VISION :doc:`MIGraphX <amdmigraphx:index>` .. _mllibs-support-compatibility-matrix-past-60: 2.11.0 2.10.0 2.10.0 2.10.0 2.10.0 2.9.0 2.9.0 2.9.0 2.9.0 2.8.0 2.8.0
43 :doc:`Composable Kernel <composable_kernel:index>` :doc:`MIOpen <miopen:index>` 1.1.0 1.1.0 3.3.0 1.1.0 1.1.0 3.2.0 1.1.0 3.2.0 1.1.0 3.2.0 1.1.0 3.2.0 1.1.0 3.1.0 1.1.0 3.1.0 1.1.0 3.1.0 1.1.0 3.1.0 1.1.0 3.0.0 1.1.0 3.0.0
44 :doc:`MIGraphX <amdmigraphx:index>` :doc:`MIVisionX <mivisionx:index>` 2.11.0 2.11.0 3.1.0 2.11.0 2.10.0 3.0.0 2.10.0 3.0.0 2.10.0 3.0.0 2.10.0 3.0.0 2.9.0 2.5.0 2.9.0 2.5.0 2.9.0 2.5.0 2.9.0 2.5.0 2.8.0 2.5.0 2.8.0 2.5.0
45 :doc:`MIOpen <miopen:index>` :doc:`rocAL <rocal:index>` 3.3.0 3.3.0 2.1.0 3.3.0 3.2.0 2.0.0 3.2.0 2.0.0 3.2.0 2.0.0 3.2.0 1.0.0 3.1.0 1.0.0 3.1.0 1.0.0 3.1.0 1.0.0 3.1.0 1.0.0 3.0.0 1.0.0 3.0.0 1.0.0
46 :doc:`MIVisionX <mivisionx:index>` :doc:`rocDecode <rocdecode:index>` 3.1.0 3.1.0 0.8.0 3.1.0 3.0.0 0.6.0 3.0.0 0.6.0 3.0.0 0.6.0 3.0.0 0.6.0 2.5.0 0.6.0 2.5.0 0.6.0 2.5.0 0.5.0 2.5.0 0.5.0 2.5.0 N/A 2.5.0 N/A
47 :doc:`rocAL <rocal:index>` :doc:`rocJPEG <rocjpeg:index>` 2.1.0 2.1.0 0.6.0 2.1.0 2.0.0 N/A 2.0.0 N/A 2.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A 1.0.0 N/A
48 :doc:`rocDecode <rocdecode:index>` :doc:`rocPyDecode <rocpydecode:index>` 0.8.0 0.8.0 0.2.0 0.8.0 0.6.0 0.1.0 0.6.0 0.1.0 0.6.0 0.1.0 0.6.0 0.1.0 0.6.0 N/A 0.6.0 N/A 0.5.0 N/A 0.5.0 N/A N/A N/A
49 :doc:`rocJPEG <rocjpeg:index>` :doc:`RPP <rpp:index>` 0.6.0 0.6.0 1.9.1 0.6.0 N/A 1.8.0 N/A 1.8.0 N/A 1.8.0 N/A 1.8.0 N/A 1.5.0 N/A 1.5.0 N/A 1.5.0 N/A 1.5.0 N/A 1.4.0 N/A 1.4.0
50 :doc:`rocPyDecode <rocpydecode:index>` 0.2.0 0.2.0 0.2.0 0.1.0 0.1.0 0.1.0 0.1.0 N/A N/A N/A N/A N/A N/A
51 :doc:`RPP <rpp:index>` COMMUNICATION 1.9.1 1.9.1 .. _commlibs-support-compatibility-matrix-past-60: 1.9.1 1.8.0 1.8.0 1.8.0 1.8.0 1.5.0 1.5.0 1.5.0 1.5.0 1.4.0 1.4.0
52 :doc:`RCCL <rccl:index>` 2.21.5 2.20.5 2.20.5 2.20.5 2.20.5 2.18.6 2.18.6 2.18.6 2.18.6 2.18.3 2.18.3
53 COMMUNICATION .. _commlibs-support-compatibility-matrix-past-60:
54 :doc:`RCCL <rccl:index>` MATH LIBS 2.21.5 2.21.5 .. _mathlibs-support-compatibility-matrix-past-60: 2.21.5 2.20.5 2.20.5 2.20.5 2.20.5 2.18.6 2.18.6 2.18.6 2.18.6 2.18.3 2.18.3
55 `half <https://github.com/ROCm/half>`_ 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0 1.12.0
56 MATH LIBS :doc:`hipBLAS <hipblas:index>` .. _mathlibs-support-compatibility-matrix-past-60: 2.3.0 2.2.0 2.2.0 2.2.0 2.2.0 2.1.0 2.1.0 2.1.0 2.1.0 2.0.0 2.0.0
57 `half <https://github.com/ROCm/half>`_ :doc:`hipBLASLt <hipblaslt:index>` 1.12.0 1.12.0 0.10.0 1.12.0 1.12.0 0.8.0 1.12.0 0.8.0 1.12.0 0.8.0 1.12.0 0.8.0 1.12.0 0.7.0 1.12.0 0.7.0 1.12.0 0.7.0 1.12.0 0.7.0 1.12.0 0.6.0 1.12.0 0.6.0
58 :doc:`hipBLAS <hipblas:index>` :doc:`hipFFT <hipfft:index>` 2.3.0 2.3.0 1.0.17 2.3.0 2.2.0 1.0.16 2.2.0 1.0.15 2.2.0 1.0.15 2.2.0 1.0.14 2.1.0 1.0.14 2.1.0 1.0.14 2.1.0 1.0.14 2.1.0 1.0.14 2.0.0 1.0.13 2.0.0 1.0.13
59 :doc:`hipBLASLt <hipblaslt:index>` :doc:`hipfort <hipfort:index>` 0.10.0 0.10.0 0.5.0 0.10.0 0.8.0 0.4.0 0.8.0 0.4.0 0.8.0 0.4.0 0.8.0 0.4.0 0.7.0 0.4.0 0.7.0 0.4.0 0.7.0 0.4.0 0.7.0 0.4.0 0.6.0 0.4.0 0.6.0 0.4.0
60 :doc:`hipFFT <hipfft:index>` :doc:`hipRAND <hiprand:index>` 1.0.17 1.0.17 2.11.0 1.0.17 1.0.16 2.11.1 1.0.15 2.11.0 1.0.15 2.11.0 1.0.14 2.11.0 1.0.14 2.10.16 1.0.14 2.10.16 1.0.14 2.10.16 1.0.14 2.10.16 1.0.13 2.10.16 1.0.13 2.10.16
61 :doc:`hipfort <hipfort:index>` :doc:`hipSOLVER <hipsolver:index>` 0.5.1 0.5.0 2.3.0 0.5.0 0.4.0 2.2.0 0.4.0 2.2.0 0.4.0 2.2.0 0.4.0 2.2.0 0.4.0 2.1.1 0.4.0 2.1.1 0.4.0 2.1.1 0.4.0 2.1.0 0.4.0 2.0.0 0.4.0 2.0.0
62 :doc:`hipRAND <hiprand:index>` :doc:`hipSPARSE <hipsparse:index>` 2.11.1 2.11.0 3.1.2 2.11.1 2.11.1 3.1.1 2.11.0 3.1.1 2.11.0 3.1.1 2.11.0 3.1.1 2.10.16 3.0.1 2.10.16 3.0.1 2.10.16 3.0.1 2.10.16 3.0.1 2.10.16 3.0.0 2.10.16 3.0.0
63 :doc:`hipSOLVER <hipsolver:index>` :doc:`hipSPARSELt <hipsparselt:index>` 2.3.0 2.3.0 0.2.2 2.3.0 2.2.0 0.2.1 2.2.0 0.2.1 2.2.0 0.2.1 2.2.0 0.2.1 2.1.1 0.2.0 2.1.1 0.2.0 2.1.1 0.1.0 2.1.0 0.1.0 2.0.0 0.1.0 2.0.0 0.1.0
64 :doc:`hipSPARSE <hipsparse:index>` :doc:`rocALUTION <rocalution:index>` 3.1.2 3.1.2 3.2.1 3.1.2 3.1.1 3.2.1 3.1.1 3.2.0 3.1.1 3.2.0 3.1.1 3.2.0 3.0.1 3.1.1 3.0.1 3.1.1 3.0.1 3.1.1 3.0.1 3.1.1 3.0.0 3.0.3 3.0.0 3.0.3
65 :doc:`hipSPARSELt <hipsparselt:index>` :doc:`rocBLAS <rocblas:index>` 0.2.2 0.2.2 4.3.0 0.2.2 0.2.1 4.2.4 0.2.1 4.2.1 0.2.1 4.2.1 0.2.1 4.2.0 0.2.0 4.1.2 0.2.0 4.1.2 0.1.0 4.1.0 0.1.0 4.1.0 0.1.0 4.0.0 0.1.0 4.0.0
66 :doc:`rocALUTION <rocalution:index>` :doc:`rocFFT <rocfft:index>` 3.2.1 3.2.1 1.0.31 3.2.1 3.2.1 1.0.30 3.2.0 1.0.29 3.2.0 1.0.29 3.2.0 1.0.28 3.1.1 1.0.27 3.1.1 1.0.27 3.1.1 1.0.27 3.1.1 1.0.26 3.0.3 1.0.25 3.0.3 1.0.23
67 :doc:`rocBLAS <rocblas:index>` :doc:`rocRAND <rocrand:index>` 4.3.0 4.3.0 3.2.0 4.3.0 4.2.4 3.1.1 4.2.1 3.1.0 4.2.1 3.1.0 4.2.0 3.1.0 4.1.2 3.0.1 4.1.2 3.0.1 4.1.0 3.0.1 4.1.0 3.0.1 4.0.0 3.0.0 4.0.0 2.10.17
68 :doc:`rocFFT <rocfft:index>` :doc:`rocSOLVER <rocsolver:index>` 1.0.31 1.0.31 3.27.0 1.0.31 1.0.30 3.26.2 1.0.29 3.26.0 1.0.29 3.26.0 1.0.28 3.26.0 1.0.27 3.25.0 1.0.27 3.25.0 1.0.27 3.25.0 1.0.26 3.25.0 1.0.25 3.24.0 1.0.23 3.24.0
69 :doc:`rocRAND <rocrand:index>` :doc:`rocSPARSE <rocsparse:index>` 3.2.0 3.2.0 3.3.0 3.2.0 3.1.1 3.2.1 3.1.0 3.2.0 3.1.0 3.2.0 3.1.0 3.2.0 3.0.1 3.1.2 3.0.1 3.1.2 3.0.1 3.1.2 3.0.1 3.1.2 3.0.0 3.0.2 2.10.17 3.0.2
70 :doc:`rocSOLVER <rocsolver:index>` :doc:`rocWMMA <rocwmma:index>` 3.27.0 3.27.0 1.6.0 3.27.0 3.26.2 1.5.0 3.26.0 1.5.0 3.26.0 1.5.0 3.26.0 1.5.0 3.25.0 1.4.0 3.25.0 1.4.0 3.25.0 1.4.0 3.25.0 1.4.0 3.24.0 1.3.0 3.24.0 1.3.0
71 :doc:`rocSPARSE <rocsparse:index>` :doc:`Tensile <tensile:index>` 3.3.0 3.3.0 4.42.0 3.3.0 3.2.1 4.41.0 3.2.0 4.41.0 3.2.0 4.41.0 3.2.0 4.41.0 3.1.2 4.40.0 3.1.2 4.40.0 3.1.2 4.40.0 3.1.2 4.40.0 3.0.2 4.39.0 3.0.2 4.39.0
72 :doc:`rocWMMA <rocwmma:index>` 1.6.0 1.6.0 1.6.0 1.5.0 1.5.0 1.5.0 1.5.0 1.4.0 1.4.0 1.4.0 1.4.0 1.3.0 1.3.0
73 :doc:`Tensile <tensile:src/index>` PRIMITIVES 4.42.0 4.42.0 .. _primitivelibs-support-compatibility-matrix-past-60: 4.42.0 4.41.0 4.41.0 4.41.0 4.41.0 4.40.0 4.40.0 4.40.0 4.40.0 4.39.0 4.39.0
74 :doc:`hipCUB <hipcub:index>` 3.3.0 3.2.1 3.2.0 3.2.0 3.2.0 3.1.0 3.1.0 3.1.0 3.1.0 3.0.0 3.0.0
75 PRIMITIVES :doc:`hipTensor <hiptensor:index>` .. _primitivelibs-support-compatibility-matrix-past-60: 1.4.0 1.3.0 1.3.0 1.3.0 1.3.0 1.2.0 1.2.0 1.2.0 1.2.0 1.1.0 1.1.0
76 :doc:`hipCUB <hipcub:index>` :doc:`rocPRIM <rocprim:index>` 3.3.0 3.3.0 3.3.0 3.2.1 3.2.2 3.2.0 3.2.0 3.2.0 3.1.0 3.1.0 3.1.0 3.1.0 3.0.0 3.0.0
77 :doc:`hipTensor <hiptensor:index>` :doc:`rocThrust <rocthrust:index>` 1.4.0 1.4.0 3.3.0 1.4.0 1.3.0 3.1.1 1.3.0 3.1.0 1.3.0 3.1.0 1.3.0 3.0.1 1.2.0 3.0.1 1.2.0 3.0.1 1.2.0 3.0.1 1.2.0 3.0.1 1.1.0 3.0.0 1.1.0 3.0.0
78 :doc:`rocPRIM <rocprim:index>` 3.3.0 3.3.0 3.3.0 3.2.2 3.2.0 3.2.0 3.2.0 3.1.0 3.1.0 3.1.0 3.1.0 3.0.0 3.0.0
79 :doc:`rocThrust <rocthrust:index>` SUPPORT LIBS 3.3.0 3.3.0 3.3.0 3.1.1 3.1.0 3.1.0 3.0.1 3.0.1 3.0.1 3.0.1 3.0.1 3.0.0 3.0.0
80 `hipother <https://github.com/ROCm/hipother>`_ 6.3.42131 6.2.41134 6.2.41134 6.2.41134 6.2.41133 6.1.40093 6.1.40093 6.1.40092 6.1.40091 6.1.32831 6.1.32830
81 SUPPORT LIBS `rocm-core <https://github.com/ROCm/rocm-core>`_ 6.3.0 6.2.4 6.2.2 6.2.1 6.2.0 6.1.5 6.1.2 6.1.1 6.1.0 6.0.2 6.0.0
82 `hipother <https://github.com/ROCm/hipother>`_ `ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_ 6.3.42134 6.3.42131 N/A [#ROCT-rocr-past-60]_ 6.3.42133 6.2.41134 20240607.5.7 6.2.41134 20240607.5.7 6.2.41134 20240607.4.05 6.2.41133 20240607.1.4246 6.1.40093 20240125.5.08 6.1.40093 20240125.5.08 6.1.40092 20240125.5.08 6.1.40091 20240125.3.30 6.1.32831 20231016.2.245 6.1.32830 20231016.2.245
83 `rocm-core <https://github.com/ROCm/rocm-core>`_ 6.3.2 6.3.0 6.3.1 6.2.4 6.2.2 6.2.1 6.2.0 6.1.5 6.1.2 6.1.1 6.1.0 6.0.2 6.0.0
84 `ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_ SYSTEM MGMT TOOLS N/A [#ROCT-rocr-past-60]_ N/A [#ROCT-rocr-past-60]_ .. _tools-support-compatibility-matrix-past-60: N/A [#ROCT-rocr-past-60]_ 20240607.5.7 20240607.5.7 20240607.4.05 20240607.1.4246 20240125.5.08 20240125.5.08 20240125.5.08 20240125.3.30 20231016.2.245 20231016.2.245
85 :doc:`AMD SMI <amdsmi:index>` 24.7.1 24.6.3 24.6.3 24.6.3 24.6.2 24.5.1 24.5.1 24.5.1 24.4.1 23.4.2 23.4.2
86 SYSTEM MGMT TOOLS :doc:`ROCm Data Center Tool <rdc:index>` .. _tools-support-compatibility-matrix-past-60: 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0 0.3.0
87 :doc:`AMD SMI <amdsmi:index>` :doc:`rocminfo <rocminfo:index>` 24.7.1 24.7.1 1.0.0 24.7.1 24.6.3 1.0.0 24.6.3 1.0.0 24.6.3 1.0.0 24.6.2 1.0.0 24.5.1 1.0.0 24.5.1 1.0.0 24.5.1 1.0.0 24.4.1 1.0.0 23.4.2 1.0.0 23.4.2 1.0.0
88 :doc:`ROCm Data Center Tool <rdc:index>` :doc:`ROCm SMI <rocm_smi_lib:index>` 0.3.0 0.3.0 7.4.0 0.3.0 0.3.0 7.3.0 0.3.0 7.3.0 0.3.0 7.3.0 0.3.0 7.3.0 0.3.0 7.2.0 0.3.0 7.2.0 0.3.0 7.0.0 0.3.0 7.0.0 0.3.0 6.0.2 0.3.0 6.0.0
89 :doc:`rocminfo <rocminfo:index>` :doc:`ROCm Validation Suite <rocmvalidationsuite:index>` 1.0.0 1.0.0 1.1.0 1.0.0 1.0.0 1.0.60204 1.0.0 1.0.60202 1.0.0 1.0.60201 1.0.0 1.0.60200 1.0.0 1.0.60105 1.0.0 1.0.60102 1.0.0 1.0.60101 1.0.0 1.0.60100 1.0.0 1.0.60002 1.0.0 1.0.60000
90 :doc:`ROCm SMI <rocm_smi_lib:index>` 7.4.0 7.4.0 7.4.0 7.3.0 7.3.0 7.3.0 7.3.0 7.2.0 7.2.0 7.0.0 7.0.0 6.0.2 6.0.0
91 :doc:`ROCm Validation Suite <rocmvalidationsuite:index>` PERFORMANCE TOOLS 1.1.0 1.1.0 1.1.0 1.0.60204 1.0.60202 1.0.60201 1.0.60200 1.0.60105 1.0.60102 1.0.60101 1.0.60100 1.0.60002 1.0.60000
92 :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>` 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0 1.4.0
93 PERFORMANCE TOOLS :doc:`ROCm Compute Profiler <rocprofiler-compute:index>` 3.0.0 2.0.1 2.0.1 2.0.1 2.0.1 N/A N/A N/A N/A N/A N/A
94 :doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>` :doc:`ROCm Systems Profiler <rocprofiler-systems:index>` 1.4.0 1.4.0 0.1.0 1.4.0 1.4.0 1.11.2 1.4.0 1.11.2 1.4.0 1.11.2 1.4.0 1.11.2 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A 1.4.0 N/A
95 :doc:`ROCm Compute Profiler <rocprofiler-compute:index>` :doc:`ROCProfiler <rocprofiler:index>` 3.0.0 3.0.0 2.0.60300 3.0.0 2.0.1 2.0.60204 2.0.1 2.0.60202 2.0.1 2.0.60201 2.0.1 2.0.60200 N/A 2.0.60105 N/A 2.0.60102 N/A 2.0.60101 N/A 2.0.60100 N/A 2.0.60002 N/A 2.0.60000
96 :doc:`ROCm Systems Profiler <rocprofiler-systems:index>` :doc:`ROCprofiler-SDK <rocprofiler-sdk:index>` 0.1.1 0.1.0 0.5.0 0.1.0 1.11.2 0.4.0 1.11.2 0.4.0 1.11.2 0.4.0 1.11.2 0.4.0 N/A N/A N/A N/A N/A N/A
97 :doc:`ROCProfiler <rocprofiler:index>` :doc:`ROCTracer <roctracer:index>` 2.0.60302 2.0.60300 4.1.60300 2.0.60301 2.0.60204 4.1.60204 2.0.60202 4.1.60202 2.0.60201 4.1.60201 2.0.60200 4.1.60200 2.0.60105 4.1.60105 2.0.60102 4.1.60102 2.0.60101 4.1.60101 2.0.60100 4.1.60100 2.0.60002 4.1.60002 2.0.60000 4.1.60000
98 :doc:`ROCprofiler-SDK <rocprofiler-sdk:index>` 0.5.0 0.5.0 0.5.0 0.4.0 0.4.0 0.4.0 0.4.0 N/A N/A N/A N/A N/A N/A
99 :doc:`ROCTracer <roctracer:index>` DEVELOPMENT TOOLS 4.1.60302 4.1.60300 4.1.60301 4.1.60204 4.1.60202 4.1.60201 4.1.60200 4.1.60105 4.1.60102 4.1.60101 4.1.60100 4.1.60002 4.1.60000
100 :doc:`HIPIFY <hipify:index>` 18.0.0.24455 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
101 DEVELOPMENT TOOLS :doc:`ROCm CMake <rocmcmakebuildtools:index>` 0.14.0 0.13.0 0.13.0 0.13.0 0.13.0 0.12.0 0.12.0 0.12.0 0.12.0 0.11.0 0.11.0
102 :doc:`HIPIFY <hipify:index>` :doc:`ROCdbgapi <rocdbgapi:index>` 18.0.0.25012 18.0.0.24455 0.77.0 18.0.0.24491 18.0.0.24392 0.76.0 18.0.0.24355 0.76.0 18.0.0.24355 0.76.0 18.0.0.24232 0.76.0 17.0.0.24193 0.71.0 17.0.0.24193 0.71.0 17.0.0.24154 0.71.0 17.0.0.24103 0.71.0 17.0.0.24012 0.71.0 17.0.0.23483 0.71.0
103 :doc:`ROCm CMake <rocmcmakebuildtools:index>` :doc:`ROCm Debugger (ROCgdb) <rocgdb:index>` 0.14.0 0.14.0 15.2.0 0.14.0 0.13.0 14.2.0 0.13.0 14.2.0 0.13.0 14.2.0 0.13.0 14.2.0 0.12.0 14.1.0 0.12.0 14.1.0 0.12.0 14.1.0 0.12.0 14.1.0 0.11.0 13.2.0 0.11.0 13.2.0
104 :doc:`ROCdbgapi <rocdbgapi:index>` `rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_ 0.77.0 0.77.0 0.4.0 0.77.0 0.76.0 0.4.0 0.76.0 0.4.0 0.76.0 0.4.0 0.76.0 0.4.0 0.71.0 0.3.0 0.71.0 0.3.0 0.71.0 0.3.0 0.71.0 0.3.0 0.71.0 N/A 0.71.0 N/A
105 :doc:`ROCm Debugger (ROCgdb) <rocgdb:index>` :doc:`ROCr Debug Agent <rocr_debug_agent:index>` 15.2.0 15.2.0 2.0.3 15.2.0 14.2.0 2.0.3 14.2.0 2.0.3 14.2.0 2.0.3 14.2.0 2.0.3 14.1.0 2.0.3 14.1.0 2.0.3 14.1.0 2.0.3 14.1.0 2.0.3 13.2.0 2.0.3 13.2.0 2.0.3
106 `rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_ 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.4.0 0.3.0 0.3.0 0.3.0 0.3.0 N/A N/A
107 :doc:`ROCr Debug Agent <rocr_debug_agent:index>` COMPILERS 2.0.3 2.0.3 .. _compilers-support-compatibility-matrix-past-60: 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3 2.0.3
108 `clang-ocl <https://github.com/ROCm/clang-ocl>`_ N/A N/A N/A N/A N/A 0.5.0 0.5.0 0.5.0 0.5.0 0.5.0 0.5.0
109 COMPILERS :doc:`hipCC <hipcc:index>` .. _compilers-support-compatibility-matrix-past-60: 1.1.1 1.1.1 1.1.1 1.1.1 1.1.1 1.0.0 1.0.0 1.0.0 1.0.0 1.0.0 1.0.0
110 `clang-ocl <https://github.com/ROCm/clang-ocl>`_ `Flang <https://github.com/ROCm/flang>`_ N/A N/A 18.0.0.24455 N/A N/A 18.0.0.24392 N/A 18.0.0.24355 N/A 18.0.0.24355 N/A 18.0.0.24232 0.5.0 17.0.0.24193 0.5.0 17.0.0.24193 0.5.0 17.0.0.24154 0.5.0 17.0.0.24103 0.5.0 17.0.0.24012 0.5.0 17.0.0.23483
111 :doc:`hipCC <hipcc:index>` :doc:`llvm-project <llvm-project:index>` 1.1.1 1.1.1 18.0.0.24455 1.1.1 1.1.1 18.0.0.24392 1.1.1 18.0.0.24355 1.1.1 18.0.0.24355 1.1.1 18.0.0.24232 1.0.0 17.0.0.24193 1.0.0 17.0.0.24193 1.0.0 17.0.0.24154 1.0.0 17.0.0.24103 1.0.0 17.0.0.24012 1.0.0 17.0.0.23483
112 `Flang <https://github.com/ROCm/flang>`_ `OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_ 18.0.0.25012 18.0.0.24455 18.0.0.24491 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
113 :doc:`llvm-project <llvm-project:index>` 18.0.0.25012 18.0.0.24491 18.0.0.24491 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
114 `OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_ RUNTIMES 18.0.0.25012 18.0.0.24491 .. _runtime-support-compatibility-matrix-past-60: 18.0.0.24491 18.0.0.24392 18.0.0.24355 18.0.0.24355 18.0.0.24232 17.0.0.24193 17.0.0.24193 17.0.0.24154 17.0.0.24103 17.0.0.24012 17.0.0.23483
115 :doc:`AMD CLR <hip:understand/amd_clr>` 6.3.42131 6.2.41134 6.2.41134 6.2.41134 6.2.41133 6.1.40093 6.1.40093 6.1.40092 6.1.40091 6.1.32831 6.1.32830
116 RUNTIMES :doc:`HIP <hip:index>` .. _runtime-support-compatibility-matrix-past-60: 6.3.42131 6.2.41134 6.2.41134 6.2.41134 6.2.41133 6.1.40093 6.1.40093 6.1.40092 6.1.40091 6.1.32831 6.1.32830
117 :doc:`AMD CLR <hip:understand/amd_clr>` `OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_ 6.3.42134 6.3.42131 2.0.0 6.3.42133 6.2.41134 2.0.0 6.2.41134 2.0.0 6.2.41134 2.0.0 6.2.41133 2.0.0 6.1.40093 2.0.0 6.1.40093 2.0.0 6.1.40092 2.0.0 6.1.40091 2.0.0 6.1.32831 2.0.0 6.1.32830 2.0.0
118 :doc:`HIP <hip:index>` :doc:`ROCr Runtime <rocr-runtime:index>` 6.3.42134 6.3.42131 1.14.0 6.3.42133 6.2.41134 1.14.0 6.2.41134 1.14.0 6.2.41134 1.14.0 6.2.41133 1.13.0 6.1.40093 1.13.0 6.1.40093 1.13.0 6.1.40092 1.13.0 6.1.40091 1.13.0 6.1.32831 1.12.0 6.1.32830 1.12.0
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_ 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0 2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>` 1.14.0 1.14.0 1.14.0 1.14.0 1.14.0 1.14.0 1.13.0 1.13.0 1.13.0 1.13.0 1.13.0 1.12.0 1.12.0

View File

@@ -23,17 +23,17 @@ compatibility and system requirements.
.. container:: format-big-table
.. csv-table::
:header: "ROCm Version", "6.3.2", "6.3.1", "6.2.0"
:header: "ROCm Version", "6.3.0", "6.2.4", "6.1.0"
:stub-columns: 1
:ref:`Operating systems & kernels <OS-kernel-versions>`,Ubuntu 24.04.2,Ubuntu 24.04.2,Ubuntu 24.04
,Ubuntu 22.04.5,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4"
,"RHEL 9.5, 9.4","RHEL 9.5, 9.4","RHEL 9.4, 9.3"
,RHEL 8.10,RHEL 8.10,"RHEL 8.10, 8.9"
,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP6, SP5"
,Oracle Linux 8.10 [#mi300x]_,Oracle Linux 8.10 [#mi300x]_,Oracle Linux 8.9 [#mi300x]_
,Debian 12 [#single-node]_,Debian 12 [#single-node]_,
,Azure Linux 3.0 [#mi300x]_,,
:ref:`Operating systems & kernels <OS-kernel-versions>`,Ubuntu 24.04.2,"Ubuntu 24.04.1, 24.04",
,Ubuntu 22.04.5,"Ubuntu 22.04.5, 22.04.4","Ubuntu 22.04.4, 22.04.3"
,,,"Ubuntu 20.04.6, 20.04.5"
,"RHEL 9.5, 9.4","RHEL 9.4, 9.3","RHEL 9.4 [#red-hat94]_, 9.3, 9.2"
,"RHEL 8.10","RHEL 8.10, 8.9","RHEL 8.9, 8.8"
,"SLES 15 SP6, SP5","SLES 15 SP6, SP5","SLES 15 SP5, SP4"
,,,CentOS 7.9
,Oracle Linux 8.10 [#oracle89]_,Oracle Linux 8.9 [#oracle89]_,
,.. _architecture-support-compatibility-matrix:,,
:doc:`Architecture <rocm-install-on-linux:reference/system-requirements>`,CDNA3,CDNA3,CDNA3
,CDNA2,CDNA2,CDNA2
@@ -43,116 +43,115 @@ compatibility and system requirements.
,.. _gpu-support-compatibility-matrix:,,
:doc:`GPU / LLVM target <rocm-install-on-linux:reference/system-requirements>`,gfx1100,gfx1100,gfx1100
,gfx1030,gfx1030,gfx1030
,gfx942,gfx942,gfx942 [#mi300_620]_
,gfx942,gfx942 [#mi300_624]_, gfx942 [#mi300_610]_
,gfx90a,gfx90a,gfx90a
,gfx908,gfx908,gfx908
,,,
FRAMEWORK SUPPORT,.. _framework-support-compatibility-matrix:,,
:doc:`PyTorch <../compatibility/ml-compatibility/pytorch-compatibility>`,"2.4, 2.3, 2.2, 1.13","2.4, 2.3, 2.2, 1.13","2.3, 2.2, 2.1, 2.0, 1.13"
:doc:`TensorFlow <../compatibility/ml-compatibility/tensorflow-compatibility>`,"2.17.0, 2.16.2, 2.15.1","2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1"
:doc:`JAX <../compatibility/ml-compatibility/jax-compatibility>`,0.4.31,0.4.31,0.4.26
:doc:`PyTorch <rocm-install-on-linux:install/3rd-party/pytorch-install>`,"2.4, 2.3, 2.2, 2.1, 2.0, 1.13","2.3, 2.2, 2.1, 2.0, 1.13","2.1, 2.0, 1.13"
:doc:`TensorFlow <rocm-install-on-linux:install/3rd-party/tensorflow-install>`,"2.17.0, 2.16.2, 2.15.1","2.16.1, 2.15.1, 2.14.1","2.15.0, 2.14.0, 2.13.1"
:doc:`JAX <rocm-install-on-linux:install/3rd-party/jax-install>`,0.4.26,0.4.26,0.4.26
`ONNX Runtime <https://onnxruntime.ai/docs/build/eps.html#amd-migraphx>`_,1.17.3,1.17.3,1.17.3
,,,
THIRD PARTY COMMS,.. _thirdpartycomms-support-compatibility-matrix:,,
`UCC <https://github.com/ROCm/ucc>`_,>=1.3.0,>=1.3.0,>=1.3.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.15.0,>=1.15.0,>=1.15.0
`UCX <https://github.com/ROCm/ucx>`_,>=1.15.0,>=1.15.0,>=1.14.1
,,,
THIRD PARTY ALGORITHM,.. _thirdpartyalgorithm-support-compatibility-matrix:,,
Thrust,2.3.2,2.3.2,2.2.0
CUB,2.3.2,2.3.2,2.2.0
Thrust,2.3.2,2.2.0,2.1.0
CUB,2.3.2,2.2.0,2.1.0
,,,
KMD & USER SPACE [#kfd_support]_,.. _kfd-userspace-support-compatibility-matrix:,,
Tested user space versions,"6.3.x, 6.2.x, 6.1.x","6.3.x, 6.2.x, 6.1.x","6.3.x, 6.2.x, 6.1.x, 6.0.x"
Tested user space versions,"6.3.x, 6.2.x, 6.1.x","6.3.x, 6.2.x, 6.1.x, 6.0.x","6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.7.x"
,,,
ML & COMPUTER VISION,.. _mllibs-support-compatibility-matrix:,,
:doc:`Composable Kernel <composable_kernel:index>`,1.1.0,1.1.0,1.1.0
:doc:`MIGraphX <amdmigraphx:index>`,2.11.0,2.11.0,2.10.0
:doc:`MIOpen <miopen:index>`,3.3.0,3.3.0,3.2.0
:doc:`MIVisionX <mivisionx:index>`,3.1.0,3.1.0,3.0.0
:doc:`rocAL <rocal:index>`,2.1.0,2.1.0,1.0.0
:doc:`rocDecode <rocdecode:index>`,0.8.0,0.8.0,0.6.0
:doc:`rocJPEG <rocjpeg:index>`,0.6.0,0.6.0,N/A
:doc:`rocPyDecode <rocpydecode:index>`,0.2.0,0.2.0,0.1.0
:doc:`RPP <rpp:index>`,1.9.1,1.9.1,1.8.0
:doc:`MIGraphX <amdmigraphx:index>`,2.11.0,2.10.0,2.9.0
:doc:`MIOpen <miopen:index>`,3.3.0,3.2.0,3.1.0
:doc:`MIVisionX <mivisionx:index>`,3.1.0,3.0.0,2.5.0
:doc:`rocAL <rocal:index>`,2.1.0,2.0.0,1.0.0
:doc:`rocDecode <rocdecode:index>`,0.8.0,0.6.0,0.5.0
:doc:`rocJPEG <rocjpeg:index>`,0.6.0,N/A,N/A
:doc:`rocPyDecode <rocpydecode:index>`,0.2.0,0.1.0,N/A
:doc:`RPP <rpp:index>`,1.9.1,1.8.0,1.5.0
,,,
COMMUNICATION,.. _commlibs-support-compatibility-matrix:,,
:doc:`RCCL <rccl:index>`,2.21.5,2.21.5,2.20.5
:doc:`RCCL <rccl:index>`,2.21.5,2.20.5,2.18.6
,,,
MATH LIBS,.. _mathlibs-support-compatibility-matrix:,,
`half <https://github.com/ROCm/half>`_ ,1.12.0,1.12.0,1.12.0
:doc:`hipBLAS <hipblas:index>`,2.3.0,2.3.0,2.2.0
:doc:`hipBLASLt <hipblaslt:index>`,0.10.0,0.10.0,0.8.0
:doc:`hipFFT <hipfft:index>`,1.0.17,1.0.17,1.0.14
:doc:`hipfort <hipfort:index>`,0.5.1,0.5.0,0.4.0
:doc:`hipRAND <hiprand:index>`,2.11.1,2.11.1,2.11.0
:doc:`hipSOLVER <hipsolver:index>`,2.3.0,2.3.0,2.2.0
:doc:`hipSPARSE <hipsparse:index>`,3.1.2,3.1.2,3.1.1
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.2,0.2.2,0.2.1
:doc:`rocALUTION <rocalution:index>`,3.2.1,3.2.1,3.2.0
:doc:`rocBLAS <rocblas:index>`,4.3.0,4.3.0,4.2.0
:doc:`rocFFT <rocfft:index>`,1.0.31,1.0.31,1.0.28
:doc:`rocRAND <rocrand:index>`,3.2.0,3.2.0,3.1.0
:doc:`rocSOLVER <rocsolver:index>`,3.27.0,3.27.0,3.26.0
:doc:`rocSPARSE <rocsparse:index>`,3.3.0,3.3.0,3.2.0
:doc:`rocWMMA <rocwmma:index>`,1.6.0,1.6.0,1.5.0
:doc:`Tensile <tensile:src/index>`,4.42.0,4.42.0,4.41.0
:doc:`hipBLAS <hipblas:index>`,2.3.0,2.2.0,2.1.0
:doc:`hipBLASLt <hipblaslt:index>`,0.10.0,0.8.0,0.7.0
:doc:`hipFFT <hipfft:index>`,1.0.17,1.0.16,1.0.14
:doc:`hipfort <hipfort:index>`,0.5.0,0.4.0,0.4.0
:doc:`hipRAND <hiprand:index>`,2.11.0,2.11.1,2.10.16
:doc:`hipSOLVER <hipsolver:index>`,2.3.0,2.2.0,2.1.0
:doc:`hipSPARSE <hipsparse:index>`,3.1.2,3.1.1,3.0.1
:doc:`hipSPARSELt <hipsparselt:index>`,0.2.2,0.2.1,0.1.0
:doc:`rocALUTION <rocalution:index>`,3.2.1,3.2.1,3.1.1
:doc:`rocBLAS <rocblas:index>`,4.3.0,4.2.4,4.1.0
:doc:`rocFFT <rocfft:index>`,1.0.31,1.0.30,1.0.26
:doc:`rocRAND <rocrand:index>`,3.2.0,3.1.1,3.0.1
:doc:`rocSOLVER <rocsolver:index>`,3.27.0,3.26.2,3.25.0
:doc:`rocSPARSE <rocsparse:index>`,3.3.0,3.2.1,3.1.2
:doc:`rocWMMA <rocwmma:index>`,1.6.0,1.5.0,1.4.0
:doc:`Tensile <tensile:index>`,4.42.0,4.41.0,4.40.0
,,,
PRIMITIVES,.. _primitivelibs-support-compatibility-matrix:,,
:doc:`hipCUB <hipcub:index>`,3.3.0,3.3.0,3.2.0
:doc:`hipTensor <hiptensor:index>`,1.4.0,1.4.0,1.3.0
:doc:`rocPRIM <rocprim:index>`,3.3.0,3.3.0,3.2.0
:doc:`rocThrust <rocthrust:index>`,3.3.0,3.3.0,3.0.1
:doc:`hipCUB <hipcub:index>`,3.3.0,3.2.1,3.1.0
:doc:`hipTensor <hiptensor:index>`,1.4.0,1.3.0,1.2.0
:doc:`rocPRIM <rocprim:index>`,3.3.0,3.2.2,3.1.0
:doc:`rocThrust <rocthrust:index>`,3.3.0,3.1.1,3.0.1
,,,
SUPPORT LIBS,,,
`hipother <https://github.com/ROCm/hipother>`_,6.3.42134,6.3.42133,6.2.41133
`rocm-core <https://github.com/ROCm/rocm-core>`_,6.3.2,6.3.1,6.2.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr]_,N/A [#ROCT-rocr]_,20240607.1.4246
`hipother <https://github.com/ROCm/hipother>`_,6.3.42131,6.2.41134,6.1.40091
`rocm-core <https://github.com/ROCm/rocm-core>`_,6.3.0,6.2.4,6.1.0
`ROCT-Thunk-Interface <https://github.com/ROCm/ROCT-Thunk-Interface>`_,N/A [#ROCT-rocr]_,20240607.5.7,20240125.3.30
,,,
SYSTEM MGMT TOOLS,.. _tools-support-compatibility-matrix:,,
:doc:`AMD SMI <amdsmi:index>`,24.7.1,24.7.1,24.6.2
:doc:`AMD SMI <amdsmi:index>`,24.7.1,24.6.3,24.4.1
:doc:`ROCm Data Center Tool <rdc:index>`,0.3.0,0.3.0,0.3.0
:doc:`rocminfo <rocminfo:index>`,1.0.0,1.0.0,1.0.0
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.4.0,7.4.0,7.3.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.1.0,1.1.0,1.0.60200
:doc:`ROCm SMI <rocm_smi_lib:index>`,7.4.0,7.3.0,7.0.0
:doc:`ROCm Validation Suite <rocmvalidationsuite:index>`,1.1.0,1.0.60204,1.0.60100
,,,
PERFORMANCE TOOLS,,,
:doc:`ROCm Bandwidth Test <rocm_bandwidth_test:index>`,1.4.0,1.4.0,1.4.0
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.0.0,3.0.0,2.0.1
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,0.1.1,0.1.0,1.11.2
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60302,2.0.60301,2.0.60200
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.5.0,0.5.0,0.4.0
:doc:`ROCTracer <roctracer:index>`,4.1.60302,4.1.60301,4.1.60200
:doc:`ROCm Compute Profiler <rocprofiler-compute:index>`,3.0.0,2.0.1,N/A
:doc:`ROCm Systems Profiler <rocprofiler-systems:index>`,0.1.0,1.11.2,N/A
:doc:`ROCProfiler <rocprofiler:index>`,2.0.60300,2.0.60204,2.0.60100
:doc:`ROCprofiler-SDK <rocprofiler-sdk:index>`,0.5.0,0.4.0,N/A
:doc:`ROCTracer <roctracer:index>`,4.1.60300,4.1.60204,4.1.60100
,,,
DEVELOPMENT TOOLS,,,
:doc:`HIPIFY <hipify:index>`,18.0.0.25012,18.0.0.24491,18.0.0.24232
:doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.14.0,0.14.0,0.13.0
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.0,0.77.0,0.76.0
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,15.2.0,15.2.0,14.2.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.4.0,0.4.0,0.4.0
:doc:`HIPIFY <hipify:index>`,18.0.0.24455,18.0.0.24392,17.0.0.24103
:doc:`ROCm CMake <rocmcmakebuildtools:index>`,0.14.0,0.13.0,0.12.0
:doc:`ROCdbgapi <rocdbgapi:index>`,0.77.0,0.76.0,0.71.0
:doc:`ROCm Debugger (ROCgdb) <rocgdb:index>`,15.2.0,14.2.0,14.1.0
`rocprofiler-register <https://github.com/ROCm/rocprofiler-register>`_,0.4.0,0.4.0,0.3.0
:doc:`ROCr Debug Agent <rocr_debug_agent:index>`,2.0.3,2.0.3,2.0.3
,,,
COMPILERS,.. _compilers-support-compatibility-matrix:,..
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,N/A
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.1.1
`Flang <https://github.com/ROCm/flang>`_,18.0.0.25012,18.0.0.24491,18.0.0.24232
:doc:`llvm-project <llvm-project:index>`,18.0.0.25012,18.0.0.24491,18.0.0.24232
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.25012,18.0.0.24491,18.0.0.24232
COMPILERS,.. _compilers-support-compatibility-matrix:,,
`clang-ocl <https://github.com/ROCm/clang-ocl>`_,N/A,N/A,0.5.0
:doc:`hipCC <hipcc:index>`,1.1.1,1.1.1,1.0.0
`Flang <https://github.com/ROCm/flang>`_,18.0.0.24455,18.0.0.24392,17.0.0.24103
:doc:`llvm-project <llvm-project:index>`,18.0.0.24455,18.0.0.24392,17.0.0.24103
`OpenMP <https://github.com/ROCm/llvm-project/tree/amd-staging/openmp>`_,18.0.0.24455,18.0.0.24392,17.0.0.24103
,,,
RUNTIMES,.. _runtime-support-compatibility-matrix:,..
:doc:`AMD CLR <hip:understand/amd_clr>`,6.3.42134,6.3.42133,6.2.41133
:doc:`HIP <hip:index>`,6.3.42134,6.3.42133,6.2.41133
RUNTIMES,.. _runtime-support-compatibility-matrix:,,
:doc:`AMD CLR <hip:understand/amd_clr>`,6.3.42131,6.2.41134,6.1.40091
:doc:`HIP <hip:index>`,6.3.42131,6.2.41134,6.1.40091
`OpenCL Runtime <https://github.com/ROCm/clr/tree/develop/opencl>`_,2.0.0,2.0.0,2.0.0
:doc:`ROCr Runtime <rocr-runtime:index>`,1.14.0,1.14.0,1.13.0
.. rubric:: Footnotes
.. [#mi300x] Oracle Linux and Azure Linux are supported only on AMD Instinct MI300X.
.. [#single-node] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality.
.. [#mi300_620] **For ROCm 6.2.0** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
.. [#kfd_support] ROCm provides forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software for +/- 2 releases. These are the compatibility combinations that are currently supported. The tested user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html>`_.
.. [#ROCT-rocr] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.
.. [#red-hat94] **For ROCm 6.1** - RHEL 9.4 is supported only on AMD Instinct MI300A.
.. [#oracle89] Oracle Linux is supported only on AMD Instinct MI300X.
.. [#mi300_624] **For ROCm 6.2.4** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
.. [#mi300_610] **For ROCm 6.1.0** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4.
.. [#kfd_support] ROCm provides forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software for +/- 2 releases. These are the compatibility combinations that are currently supported.
.. [#ROCT-rocr] As of ROCm 6.3.0, the ROCT Thunk Interface is now included as part of the ROCr runtime package.
.. _OS-kernel-versions:
@@ -161,39 +160,44 @@ Operating systems and kernel versions
Use this lookup table to confirm which operating system and kernel versions are supported with ROCm.
.. csv-table::
.. csv-table::
:header: "OS", "Version", "Kernel"
:widths: 40, 20, 40
:stub-columns: 1
`Ubuntu <https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle>`_, 24.04.2, "6.8 GA, 6.11 HWE"
, 24.04.1, "6.8 GA"
, 24.04, "6.8 GA"
,,
`Ubuntu <https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle>`_, 22.04.5, "5.15 GA, 6.8 HWE"
, 22.04.4, "5.15 GA, 6.5 HWE"
, 22.04.3, "5.15 GA, 6.2 HWE"
,,
`Red Hat Enterprise Linux (RHEL 9) <https://access.redhat.com/articles/3078#RHEL9>`_, 9.5, 5.14.0
`Ubuntu <https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle>`_, 20.04.6, "5.15 HWE"
, 20.04.5, "5.15 HWE"
,,
`Red Hat Enterprise Linux (RHEL) <https://access.redhat.com/articles/3078#RHEL9>`_, 9.5, 5.14.0
,9.4, 5.14.0
,9.3, 5.14.0
,9.2, 5.14.0
,,
`Red Hat Enterprise Linux (RHEL 8) <https://access.redhat.com/articles/3078#RHEL8>`_, 8.10, 4.18.0
`Red Hat Enterprise Linux (RHEL) <https://access.redhat.com/articles/3078#RHEL8>`_, 8.10, 4.18.0
,8.9, 4.18.0
,8.8, 4.18.0
,,
`CentOS <https://access.redhat.com/articles/3078#RHEL7>`_, 7.9, 3.10
,,
`SUSE Linux Enterprise Server (SLES) <https://www.suse.com/support/kb/doc/?id=000019587#SLE15SP4>`_, 15 SP6, 6.4.0
,15 SP5, 5.14.21
,15 SP4, 5.14.21
,,
`Oracle Linux <https://blogs.oracle.com/scoter/post/oracle-linux-and-unbreakable-enterprise-kernel-uek-releases>`_, 8.10, 5.15.0
,8.9, 5.15.0
,,
`Debian <https://www.debian.org/download>`_,12, 6.1
,,
`Azure Linux <https://techcommunity.microsoft.com/blog/linuxandopensourceblog/azure-linux-3-0-now-in-preview-on-azure-kubernetes-service-v1-31/4287229>`_,3.0, 6.6
,,
..
Footnotes and ref anchors in below historical tables should be appended with "-past-60", to differentiate from the
Footnotes and ref anchors in below historical tables should be appended with "-past-60", to differentiate from the
footnote references in the above, latest, compatibility matrix. It also allows to easily find & replace.
An easy way to work is to download the historical.CSV file, and update open it in excel. Then when content is ready,
An easy way to work is to download the historical.CSV file, and update open it in excel. Then when content is ready,
delete the columns you don't need, to build the current compatibility matrix to use in above table. Find & replace all
instances of "-past-60" to make it ready for above table.
@@ -216,8 +220,8 @@ Expand for full historical view of:
.. rubric:: Footnotes
.. [#mi300x-past-60] Oracle Linux and Azure Linux are supported only on AMD Instinct MI300X.
.. [#single-node-past-60] Debian 12 is supported only on AMD Instinct MI300X for single-node functionality.
.. [#red-hat94-past-60] **For ROCm 6.1** - RHEL 9.4 is supported only on AMD Instinct MI300A.
.. [#oracle89-past-60] Oracle Linux is supported only on AMD Instinct MI300X.
.. [#mi300_624-past-60] **For ROCm 6.2.4** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
.. [#mi300_622-past-60] **For ROCm 6.2.2** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
.. [#mi300_621-past-60] **For ROCm 6.2.1** - MI300X (gfx942) is supported on listed operating systems *except* Ubuntu 22.04.5 [6.8 HWE] and Ubuntu 22.04.4 [6.5 HWE].
@@ -227,5 +231,5 @@ Expand for full historical view of:
.. [#mi300_610-past-60] **For ROCm 6.1.0** - MI300A (gfx942) is supported on Ubuntu 22.04.4, RHEL 9.4, RHEL 9.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.4.
.. [#mi300_602-past-60] **For ROCm 6.0.2** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
.. [#mi300_600-past-60] **For ROCm 6.0.0** - MI300A (gfx942) is supported on Ubuntu 22.04.3, RHEL 8.9, and SLES 15 SP5. MI300X (gfx942) is only supported on Ubuntu 22.04.3.
.. [#kfd_support-past-60] ROCm provides forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software for +/- 2 releases. The tested user space versions on this page were accurate as of the time of initial ROCm release. For the most up-to-date information, see the latest version of this information at `User and kernel-space support matrix <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/user-kernel-space-compat-matrix.html>`_.
.. [#ROCT-rocr-past-60] Starting from ROCm 6.3.0, the ROCT Thunk Interface is included as part of the ROCr runtime package.
.. [#kfd_support-past-60] ROCm provides forward and backward compatibility between the AMD Kernel-mode GPU Driver (KMD) and its user space software for +/- 2 releases. These are the compatibility combinations that are currently supported.
.. [#ROCT-rocr-past-60] As of ROCm 6.3.0, the ROCT Thunk Interface is now included as part of the ROCr runtime package.

View File

@@ -1,665 +0,0 @@
:orphan:
.. meta::
:description: JAX compatibility
:keywords: GPU, JAX compatibility
*******************************************************************************
JAX compatibility
*******************************************************************************
JAX provides a NumPy-like API, which combines automatic differentiation and the
Accelerated Linear Algebra (XLA) compiler to achieve high-performance machine
learning at scale.
JAX uses composable transformations of Python and NumPy through just-in-time (JIT) compilation,
automatic vectorization, and parallelization. To learn about JAX, including profiling and
optimizations, see the official `JAX documentation
<https://jax.readthedocs.io/en/latest/notebooks/quickstart.html>`_.
ROCm support for JAX is upstreamed and users can build the official source code with ROCm
support:
- ROCm JAX release:
- Offers AMD-validated and community :ref:`Docker images <jax-docker-compat>` with ROCm and JAX pre-installed.
- ROCm JAX repository: `ROCm/jax <https://github.com/ROCm/jax>`_
- See the :doc:`ROCm JAX installation guide <rocm-install-on-linux:install/3rd-party/jax-install>`
to get started.
- Official JAX release:
- Official JAX repository: `jax-ml/jax <https://github.com/jax-ml/jax>`_
- See the `AMD GPU (Linux) installation section
<https://jax.readthedocs.io/en/latest/installation.html#amd-gpu-linux>`_ in the JAX
documentation.
.. note::
AMD releases official `ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax>`_
quarterly alongside new ROCm releases. These images undergo full AMD testing.
`Community ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax-community>`_
follow upstream JAX releases and use the latest available ROCm version.
.. _jax-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax>`_
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories are validated for
`ROCm 6.3.1 <https://repo.radeon.com/rocm/apt/6.3.1/>`_. Click the |docker-icon|
icon to view the image on Docker Hub.
.. list-table:: JAX Docker image components
:header-rows: 1
* - Docker image
- JAX
- Linux
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax/rocm6.3.1-jax0.4.31-py3.12/images/sha256-085a0cd5207110922f1fca684933a9359c66d42db6c5aba4760ed5214fdabde0"><i class="fab fa-docker fa-lg"></i> rocm/jax</a>
- `0.4.31 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.31>`_
- Ubuntu 24.04
- `3.12.7 <https://www.python.org/downloads/release/python-3127/>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax/rocm6.3.1-jax0.4.31-py3.10/images/sha256-f88eddad8f47856d8640b694da4da347ffc1750d7363175ab7dc872e82b43324"><i class="fab fa-docker fa-lg"></i> rocm/jax</a>
- `0.4.31 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.31>`_
- Ubuntu 22.04
- `3.10.14 <https://www.python.org/downloads/release/python-31014/>`_
AMD publishes `Community ROCm JAX Docker images <https://hub.docker.com/r/rocm/jax-community>`_
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories are tested for `ROCm 6.2.4 <https://repo.radeon.com/rocm/apt/6.2.4/>`_.
.. list-table:: JAX community Docker image components
:header-rows: 1
* - Docker image
- JAX
- Linux
- Python
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.2.4-jax0.4.35-py3.12.7/images/sha256-a6032d89c07573b84c44e42c637bf9752b1b7cd2a222d39344e603d8f4c63beb?context=explore"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 22.04
- `3.12.7 <https://www.python.org/downloads/release/python-3127/>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.2.4-jax0.4.35-py3.11.10/images/sha256-d462f7e445545fba2f3b92234a21beaa52fe6c5f550faabcfdcd1bf53486d991?context=explore"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 22.04
- `3.11.10 <https://www.python.org/downloads/release/python-31110/>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/jax-community/rocm6.2.4-jax0.4.35-py3.10.15/images/sha256-6f2d4d0f529378d9572f0e8cfdcbc101d1e1d335bd626bb3336fff87814e9d60?context=explore"><i class="fab fa-docker fa-lg"></i> rocm/jax-community</a>
- `0.4.35 <https://github.com/ROCm/jax/releases/tag/rocm-jax-v0.4.35>`_
- Ubuntu 22.04
- `3.10.15 <https://www.python.org/downloads/release/python-31015/>`_
Critical ROCm libraries for JAX
================================================================================
The functionality of JAX with ROCm is determined by its underlying library
dependencies. These critical ROCm components affect the capabilities,
performance, and feature set available to developers.
.. list-table::
:header-rows: 1
* - ROCm library
- Version
- Purpose
- Used in
* - `hipBLAS <https://github.com/ROCm/hipBLAS>`_
- 2.3.0
- Provides GPU-accelerated Basic Linear Algebra Subprograms (BLAS) for
matrix and vector operations.
- Matrix multiplication in ``jax.numpy.matmul``, ``jax.lax.dot`` and
``jax.lax.dot_general``, operations like ``jax.numpy.dot``, which
involve vector and matrix computations and batch matrix multiplications
``jax.numpy.einsum`` with matrix-multiplication patterns algebra
operations.
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`_
- 0.10.0
- hipBLASLt is an extension of hipBLAS, providing additional
features like epilogues fused into the matrix multiplication kernel or
use of integer tensor cores.
- Matrix multiplication in ``jax.numpy.matmul`` or ``jax.lax.dot``, and
the XLA (Accelerated Linear Algebra) use hipBLASLt for optimized matrix
operations, mixed-precision support, and hardware-specific
optimizations.
* - `hipCUB <https://github.com/ROCm/hipCUB>`_
- 3.3.0
- Provides a C++ template library for parallel algorithms for reduction,
scan, sort and select.
- Reduction functions (``jax.numpy.sum``, ``jax.numpy.mean``,
``jax.numpy.prod``, ``jax.numpy.max`` and ``jax.numpy.min``), prefix sum
(``jax.numpy.cumsum``, ``jax.numpy.cumprod``) and sorting
(``jax.numpy.sort``, ``jax.numpy.argsort``).
* - `hipFFT <https://github.com/ROCm/hipFFT>`_
- 1.0.17
- Provides GPU-accelerated Fast Fourier Transform (FFT) operations.
- Used in functions like ``jax.numpy.fft``.
* - `hipRAND <https://github.com/ROCm/hipRAND>`_
- 2.11.0
- Provides fast random number generation for GPUs.
- The ``jax.random.uniform``, ``jax.random.normal``,
``jax.random.randint`` and ``jax.random.split``.
* - `hipSOLVER <https://github.com/ROCm/hipSOLVER>`_
- 2.3.0
- Provides GPU-accelerated solvers for linear systems, eigenvalues, and
singular value decompositions (SVD).
- Solving linear systems (``jax.numpy.linalg.solve``), matrix
factorizations, SVD (``jax.numpy.linalg.svd``) and eigenvalue problems
(``jax.numpy.linalg.eig``).
* - `hipSPARSE <https://github.com/ROCm/hipSPARSE>`_
- 3.1.2
- Accelerates operations on sparse matrices, such as sparse matrix-vector
or matrix-matrix products.
- Sparse matrix multiplication (``jax.numpy.matmul``), sparse
matrix-vector and matrix-matrix products
(``jax.experimental.sparse.dot``), sparse linear system solvers and
sparse data handling.
* - `hipSPARSELt <https://github.com/ROCm/hipSPARSELt>`_
- 0.2.2
- Accelerates operations on sparse matrices, such as sparse matrix-vector
or matrix-matrix products.
- Sparse matrix multiplication (``jax.numpy.matmul``), sparse
matrix-vector and matrix-matrix products
(``jax.experimental.sparse.dot``) and sparse linear system solvers.
* - `MIOpen <https://github.com/ROCm/MIOpen>`_
- 3.3.0
- Optimized for deep learning primitives such as convolutions, pooling,
normalization, and activation functions.
- Speeds up convolutional neural networks (CNNs), recurrent neural
networks (RNNs), and other layers. Used in operations like
``jax.nn.conv``, ``jax.nn.relu``, and ``jax.nn.batch_norm``.
* - `RCCL <https://github.com/ROCm/rccl>`_
- 2.21.5
- Optimized for multi-GPU communication for operations like all-reduce,
broadcast, and scatter.
- Distribute computations across multiple GPU with ``pmap`` and
``jax.distributed``. XLA automatically uses rccl when executing
operations across multiple GPUs on AMD hardware.
* - `rocThrust <https://github.com/ROCm/rocThrust>`_
- 3.3.0
- Provides a C++ template library for parallel algorithms like sorting,
reduction, and scanning.
- Reduction operations like ``jax.numpy.sum``, ``jax.pmap`` for
distributed training, which involves parallel reductions or
operations like ``jax.numpy.cumsum`` can use rocThrust.
Supported and unsupported features
===============================================================================
The following table maps GPU-accelerated JAX modules to their supported
ROCm and JAX versions.
.. list-table::
:header-rows: 1
* - Module
- Description
- Since JAX
- Since ROCm
* - ``jax.numpy``
- Implements the NumPy API, using the primitives in ``jax.lax``.
- 0.1.56
- 5.0.0
* - ``jax.scipy``
- Provides GPU-accelerated and differentiable implementations of many
functions from the SciPy library, leveraging JAX's transformations
(e.g., ``grad``, ``jit``, ``vmap``).
- 0.1.56
- 5.0.0
* - ``jax.lax``
- A library of primitives operations that underpins libraries such as
``jax.numpy.`` Transformation rules, such as Jacobian-vector product
(JVP) and batching rules, are typically defined as transformations on
``jax.lax`` primitives.
- 0.1.57
- 5.0.0
* - ``jax.random``
- Provides a number of routines for deterministic generation of sequences
of pseudorandom numbers.
- 0.1.58
- 5.0.0
* - ``jax.sharding``
- Allows to define partitioning and distributing arrays across multiple
devices.
- 0.3.20
- 5.1.0
* - ``jax.dlpack``
- For exchanging tensor data between JAX and other libraries that support the
DLPack standard.
- 0.1.57
- 5.0.0
* - ``jax.distributed``
- Enables the scaling of computations across multiple devices on a single
machine or across multiple machines.
- 0.1.74
- 5.0.0
* - ``jax.dtypes``
- Provides utilities for working with and managing data types in JAX
arrays and computations.
- 0.1.66
- 5.0.0
* - ``jax.image``
- Contains image manipulation functions like resize, scale and translation.
- 0.1.57
- 5.0.0
* - ``jax.nn``
- Contains common functions for neural network libraries.
- 0.1.56
- 5.0.0
* - ``jax.ops``
- Computes the minimum, maximum, sum or product within segments of an
array.
- 0.1.57
- 5.0.0
* - ``jax.profiler``
- Contains JAXs tracing and time profiling features.
- 0.1.57
- 5.0.0
* - ``jax.stages``
- Contains interfaces to stages of the compiled execution process.
- 0.3.4
- 5.0.0
* - ``jax.tree``
- Provides utilities for working with tree-like container data structures.
- 0.4.26
- 5.6.0
* - ``jax.tree_util``
- Provides utilities for working with nested data structures, or
``pytrees``.
- 0.1.65
- 5.0.0
* - ``jax.typing``
- Provides JAX-specific static type annotations.
- 0.3.18
- 5.1.0
* - ``jax.extend``
- Provides modules for access to JAX internal machinery module. The
``jax.extend`` module defines a library view of some of JAXs internal
components.
- 0.4.15
- 5.5.0
* - ``jax.example_libraries``
- Serves as a collection of example code and libraries that demonstrate
various capabilities of JAX.
- 0.1.74
- 5.0.0
* - ``jax.experimental``
- Namespace for experimental features and APIs that are in development or
are not yet fully stable for production use.
- 0.1.56
- 5.0.0
* - ``jax.lib``
- Set of internal tools and types for bridging between JAXs Python
frontend and its XLA backend.
- 0.4.6
- 5.3.0
* - ``jax_triton``
- Library that integrates the Triton deep learning compiler with JAX.
- jax_triton 0.2.0
- 6.2.4
jax.scipy module
-------------------------------------------------------------------------------
A SciPy-like API for scientific computing.
.. list-table::
:header-rows: 1
* - Module
- Since JAX
- Since ROCm
* - ``jax.scipy.cluster``
- 0.3.11
- 5.1.0
* - ``jax.scipy.fft``
- 0.1.71
- 5.0.0
* - ``jax.scipy.integrate``
- 0.4.15
- 5.5.0
* - ``jax.scipy.interpolate``
- 0.1.76
- 5.0.0
* - ``jax.scipy.linalg``
- 0.1.56
- 5.0.0
* - ``jax.scipy.ndimage``
- 0.1.56
- 5.0.0
* - ``jax.scipy.optimize``
- 0.1.57
- 5.0.0
* - ``jax.scipy.signal``
- 0.1.56
- 5.0.0
* - ``jax.scipy.spatial.transform``
- 0.4.12
- 5.4.0
* - ``jax.scipy.sparse.linalg``
- 0.1.56
- 5.0.0
* - ``jax.scipy.special``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats``
- 0.1.56
- 5.0.0
jax.scipy.stats module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. list-table::
:header-rows: 1
* - Module
- Since JAX
- Since ROCm
* - ``jax.scipy.stats.bernouli``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.beta``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.betabinom``
- 0.1.61
- 5.0.0
* - ``jax.scipy.stats.binom``
- 0.4.14
- 5.4.0
* - ``jax.scipy.stats.cauchy``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.chi2``
- 0.1.61
- 5.0.0
* - ``jax.scipy.stats.dirichlet``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.expon``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.gamma``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.gennorm``
- 0.3.15
- 5.2.0
* - ``jax.scipy.stats.geom``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.laplace``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.logistic``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.multinomial``
- 0.3.18
- 5.1.0
* - ``jax.scipy.stats.multivariate_normal``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.nbinom``
- 0.1.72
- 5.0.0
* - ``jax.scipy.stats.norm``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.pareto``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.poisson``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.t``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.truncnorm``
- 0.4.0
- 5.3.0
* - ``jax.scipy.stats.uniform``
- 0.1.56
- 5.0.0
* - ``jax.scipy.stats.vonmises``
- 0.4.2
- 5.3.0
* - ``jax.scipy.stats.wrapcauchy``
- 0.4.20
- 5.6.0
jax.extend module
-------------------------------------------------------------------------------
Modules for JAX extensions.
.. list-table::
:header-rows: 1
* - Module
- Since JAX
- Since ROCm
* - ``jax.extend.ffi``
- 0.4.30
- 6.0.0
* - ``jax.extend.linear_util``
- 0.4.17
- 5.6.0
* - ``jax.extend.mlir``
- 0.4.26
- 5.6.0
* - ``jax.extend.random``
- 0.4.15
- 5.5.0
jax.experimental module
-------------------------------------------------------------------------------
Experimental modules and APIs.
.. list-table::
:header-rows: 1
* - Module
- Since JAX
- Since ROCm
* - ``jax.experimental.checkify``
- 0.1.75
- 5.0.0
* - ``jax.experimental.compilation_cache.compilation_cache``
- 0.1.68
- 5.0.0
* - ``jax.experimental.custom_partitioning``
- 0.4.0
- 5.3.0
* - ``jax.experimental.jet``
- 0.1.56
- 5.0.0
* - ``jax.experimental.key_reuse``
- 0.4.26
- 5.6.0
* - ``jax.experimental.mesh_utils``
- 0.1.76
- 5.0.0
* - ``jax.experimental.multihost_utils``
- 0.3.2
- 5.0.0
* - ``jax.experimental.pallas``
- 0.4.15
- 5.5.0
* - ``jax.experimental.pjit``
- 0.1.61
- 5.0.0
* - ``jax.experimental.serialize_executable``
- 0.4.0
- 5.3.0
* - ``jax.experimental.shard_map``
- 0.4.3
- 5.3.0
* - ``jax.experimental.sparse``
- 0.1.75
- 5.0.0
.. list-table::
:header-rows: 1
* - API
- Since JAX
- Since ROCm
* - ``jax.experimental.enable_x64``
- 0.1.60
- 5.0.0
* - ``jax.experimental.disable_x64``
- 0.1.60
- 5.0.0
jax.experimental.pallas module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Module for Pallas, a JAX extension for custom kernels.
.. list-table::
:header-rows: 1
* - Module
- Since JAX
- Since ROCm
* - ``jax.experimental.pallas.mosaic_gpu``
- 0.4.31
- 6.1.3
* - ``jax.experimental.pallas.tpu``
- 0.4.15
- 5.5.0
* - ``jax.experimental.pallas.triton``
- 0.4.32
- 6.1.3
jax.experimental.sparse module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Experimental support for sparse matrix operations.
.. list-table::
:header-rows: 1
* - Module
- Since JAX
- Since ROCm
* - ``jax.experimental.sparse.linalg``
- 0.3.15
- 5.2.0
* - ``jax.experimental.sparse.sparsify``
- 0.3.25
- ❌
.. list-table::
:header-rows: 1
* - ``sparse`` data structure API
- Since JAX
- Since ROCm
* - ``jax.experimental.sparse.BCOO``
- 0.1.72
- 5.0.0
* - ``jax.experimental.sparse.BCSR``
- 0.3.20
- 5.1.0
* - ``jax.experimental.sparse.CSR``
- 0.1.75
- 5.0.0
* - ``jax.experimental.sparse.NM``
- 0.4.27
- 5.6.0
* - ``jax.experimental.sparse.COO``
- 0.1.75
- 5.0.0
Unsupported JAX features
------------------------
The following are GPU-accelerated JAX features not currently supported by
ROCm.
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since JAX
* - Mixed Precision with TF32
- Mixed precision with TF32 is used for matrix multiplications,
convolutions, and other linear algebra operations, particularly in
deep learning workloads like CNNs and transformers.
- 0.2.25
* - RNN support
- Currently only LSTM with double bias is supported with float32 input
and weight.
- 0.3.25
* - XLA int4 support
- 4-bit integer (int4) precision in the XLA compiler.
- 0.4.0
* - ``jax.experimental.sparsify``
- Converts a dense matrix to a sparse matrix representation.
- Experimental
Use cases and recommendations
================================================================================
* The `nanoGPT in JAX <https://rocm.blogs.amd.com/artificial-intelligence/nanoGPT-JAX/README.html>`_
blog explores the implementation and training of a Generative Pre-trained
Transformer (GPT) model in JAX, inspired by Andrej Karpathys PyTorch-based
nanoGPT. By comparing how essential GPT components—such as self-attention
mechanisms and optimizers—are realized in PyTorch and JAX, also highlight
JAXs unique features.
* The `Optimize GPT Training: Enabling Mixed Precision Training in JAX using
ROCm on AMD GPUs <https://rocm.blogs.amd.com/artificial-intelligence/jax-mixed-precision/README.html>`_
blog post provides a comprehensive guide on enhancing the training efficiency
of GPT models by implementing mixed precision techniques in JAX, specifically
tailored for AMD GPUs utilizing the ROCm platform.
* The `Supercharging JAX with Triton Kernels on AMD GPUs <https://rocm.blogs.amd.com/artificial-intelligence/jax-triton/README.html>`_
blog demonstrates how to develop a custom fused dropout-activation kernel for
matrices using Triton, integrate it with JAX, and benchmark its performance
using ROCm.
* The `Distributed fine-tuning with JAX on AMD GPUs <https://rocm.blogs.amd.com/artificial-intelligence/distributed-sft-jax/README.html>`_
outlines the process of fine-tuning a Bidirectional Encoder Representations
from Transformers (BERT)-based large language model (LLM) using JAX for a text
classification task. The blog post discuss techniques for parallelizing the
fine-tuning across multiple AMD GPUs and assess the model's performance on a
holdout dataset. During the fine-tuning, a BERT-base-cased transformer model
and the General Language Understanding Evaluation (GLUE) benchmark dataset was
used on a multi-GPU setup.
* The `MI300X workload optimization guide <https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/workload.html>`_
provides detailed guidance on optimizing workloads for the AMD Instinct MI300X
accelerator using ROCm. The page is aimed at helping users achieve optimal
performance for deep learning and other high-performance computing tasks on
the MI300X GPU.
For more use cases and recommendations, see `ROCm JAX blog posts <https://rocm.blogs.amd.com/blog/tag/jax.html>`_.

View File

@@ -1,924 +0,0 @@
:orphan:
.. meta::
:description: PyTorch compatibility
:keywords: GPU, PyTorch compatibility
********************************************************************************
PyTorch compatibility
********************************************************************************
`PyTorch <https://pytorch.org/>`_ is an open-source tensor library designed for
deep learning. PyTorch on ROCm provides mixed-precision and large-scale training
using `MIOpen <https://github.com/ROCm/MIOpen>`_ and
`RCCL <https://github.com/ROCm/rccl>`_ libraries.
ROCm support for PyTorch is upstreamed into the official PyTorch repository. Due
to independent compatibility considerations, this results in two distinct
release cycles for PyTorch on ROCm:
- ROCm PyTorch release:
- Provides the latest version of ROCm but doesn't immediately support the latest stable PyTorch
version.
- Offers :ref:`Docker images <pytorch-docker-compat>` with ROCm and PyTorch
pre-installed.
- ROCm PyTorch repository: `<https://github.com/ROCm/pytorch>`_
- See the :doc:`ROCm PyTorch installation guide <rocm-install-on-linux:install/3rd-party/pytorch-install>` to get started.
- Official PyTorch release:
- Provides the latest stable version of PyTorch but doesn't immediately support the latest ROCm version.
- Official PyTorch repository: `<https://github.com/pytorch/pytorch>`_
- See the `Nightly and latest stable version installation guide <https://pytorch.org/get-started/locally/>`_
or `Previous versions <https://pytorch.org/get-started/previous-versions/>`_ to get started.
The upstream PyTorch includes an automatic HIPification solution that automatically generates HIP
source code from the CUDA backend. This approach allows PyTorch to support ROCm without requiring
manual code modifications.
Development of ROCm is aligned with the stable release of PyTorch while upstream PyTorch testing uses
the stable release of ROCm to maintain consistency.
.. _pytorch-docker-compat:
Docker image compatibility
================================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `PyTorch images <https://hub.docker.com/r/rocm/pytorch>`_
with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories are validated for `ROCm 6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`_.
Click the |docker-icon| icon to view the image on Docker Hub.
.. list-table:: PyTorch Docker image components
:header-rows: 1
:class: docker-image-compatibility
* - Docker
- PyTorch
- Ubuntu
- Python
- Apex
- torchvision
- TensorBoard
- MAGMA
- UCX
- OMPI
- OFED
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu24.04_py3.12_pytorch_release_2.4.0/images/sha256-98ddf20333bd01ff749b8092b1190ee369a75d3b8c71c2fac80ffdcb1a98d529?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 24.04
- `3.12 <https://www.python.org/downloads/release/python-3128/>`_
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`_
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.7 <https://github.com/open-mpi/ompi/tree/v4.0.7>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.10_pytorch_release_2.4.0/images/sha256-402c9b4f1a6b5a81c634a1932b56cbe01abb699cfcc7463d226276997c6cf8ea?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31016/>`_
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`_
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.7 <https://github.com/open-mpi/ompi/tree/v4.0.7>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.9_pytorch_release_2.4.0/images/sha256-e0608b55d408c3bfe5c19fdd57a4ced3e0eb3a495b74c309980b60b156c526dd?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 22.04
- `3.9.18 <https://www.python.org/downloads/release/python-3918/>`_
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`_
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.7 <https://github.com/open-mpi/ompi/tree/v4.0.7>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.10_pytorch_release_2.3.0/images/sha256-652cf25263d05b1de548222970aeb76e60b12de101de66751264709c0d0ff9d8?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.3.0 <https://github.com/ROCm/pytorch/tree/release/2.3>`_
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31016/>`_
- `1.3.0 <https://github.com/ROCm/apex/tree/release/1.3.0>`_
- `0.18.0 <https://github.com/pytorch/vision/tree/v0.18.0>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.14.1 <https://github.com/openucx/ucx/tree/v1.14.1>`_
- `4.1.5 <https://github.com/open-mpi/ompi/tree/v4.1.5>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.10_pytorch_release_2.2.1/images/sha256-051976f26beab8f9aa65d999e3ad546c027b39240a0cc3ee81b114a9024f2912?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.2.1 <https://github.com/ROCm/pytorch/tree/release/2.2>`_
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31016/>`_
- `1.2.0 <https://github.com/ROCm/apex/tree/release/1.2.0>`_
- `0.17.1 <https://github.com/pytorch/vision/tree/v0.17.1>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.14.1 <https://github.com/openucx/ucx/tree/v1.14.1>`_
- `4.1.5 <https://github.com/open-mpi/ompi/tree/v4.1.5>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu20.04_py3.9_pytorch_release_2.2.1/images/sha256-88c839a364d109d3748c100385bfa100d28090d25118cc723fd0406390ab2f7e?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.2.1 <https://github.com/ROCm/pytorch/tree/release/2.2>`_
- 20.04
- `3.9.21 <https://www.python.org/downloads/release/python-3921/>`_
- `1.2.0 <https://github.com/ROCm/apex/tree/release/1.2.0>`_
- `0.17.1 <https://github.com/pytorch/vision/tree/v0.17.1>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.3 <https://github.com/open-mpi/ompi/tree/v4.0.3>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.9_pytorch_release_1.13.1/images/sha256-994424ed07a63113f79dd9aa72159124c00f5fbfe18127151e6658f7d0b6f821?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `1.13.1 <https://github.com/ROCm/pytorch/tree/release/1.13>`_
- 22.04
- `3.9.21 <https://www.python.org/downloads/release/python-3921/>`_
- `1.0.0 <https://github.com/ROCm/apex/tree/release/1.0.0>`_
- `0.14.0 <https://github.com/pytorch/vision/tree/v0.14.0>`_
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.14.1 <https://github.com/openucx/ucx/tree/v1.14.1>`_
- `4.1.5 <https://github.com/open-mpi/ompi/tree/v4.1.5>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu20.04_py3.9_pytorch_release_1.13.1/images/sha256-7b8139fe40a9aeb4bca3aecd15c22c1fa96e867d93479fa3a24fdeeeeafa1219?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `1.13.1 <https://github.com/ROCm/pytorch/tree/release/1.13>`_
- 20.04
- `3.9.21 <https://www.python.org/downloads/release/python-3921/>`_
- `1.0.0 <https://github.com/ROCm/apex/tree/release/1.0.0>`_
- `0.14.0 <https://github.com/pytorch/vision/tree/v0.14.0>`_
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.3 <https://github.com/open-mpi/ompi/tree/v4.0.3>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
Critical ROCm libraries for PyTorch
================================================================================
The functionality of PyTorch with ROCm is determined by its underlying library
dependencies. These critical ROCm components affect the capabilities,
performance, and feature set available to developers.
.. list-table::
:header-rows: 1
* - ROCm library
- Version
- Purpose
- Used in
* - `Composable Kernel <https://github.com/ROCm/composable_kernel>`_
- 1.1.0
- Enables faster execution of core operations like matrix multiplication
(GEMM), convolutions and transformations.
- Speeds up ``torch.permute``, ``torch.view``, ``torch.matmul``,
``torch.mm``, ``torch.bmm``, ``torch.nn.Conv2d``, ``torch.nn.Conv3d``
and ``torch.nn.MultiheadAttention``.
* - `hipBLAS <https://github.com/ROCm/hipBLAS>`_
- 2.3.0
- Provides GPU-accelerated Basic Linear Algebra Subprograms (BLAS) for
matrix and vector operations.
- Supports operations like matrix multiplication, matrix-vector products,
and tensor contractions. Utilized in both dense and batched linear
algebra operations.
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`_
- 0.10.0
- hipBLASLt is an extension of the hipBLAS library, providing additional
features like epilogues fused into the matrix multiplication kernel or
use of integer tensor cores.
- It accelerates operations like ``torch.matmul``, ``torch.mm``, and the
matrix multiplications used in convolutional and linear layers.
* - `hipCUB <https://github.com/ROCm/hipCUB>`_
- 3.3.0
- Provides a C++ template library for parallel algorithms for reduction,
scan, sort and select.
- Supports operations like ``torch.sum``, ``torch.cumsum``, ``torch.sort``
and ``torch.topk``. Operations on sparse tensors or tensors with
irregular shapes often involve scanning, sorting, and filtering, which
hipCUB handles efficiently.
* - `hipFFT <https://github.com/ROCm/hipFFT>`_
- 1.0.17
- Provides GPU-accelerated Fast Fourier Transform (FFT) operations.
- Used in functions like the ``torch.fft`` module.
* - `hipRAND <https://github.com/ROCm/hipRAND>`_
- 2.11.0
- Provides fast random number generation for GPUs.
- The ``torch.rand``, ``torch.randn`` and stochastic layers like
``torch.nn.Dropout``.
* - `hipSOLVER <https://github.com/ROCm/hipSOLVER>`_
- 2.3.0
- Provides GPU-accelerated solvers for linear systems, eigenvalues, and
singular value decompositions (SVD).
- Supports functions like ``torch.linalg.solve``,
``torch.linalg.eig``, and ``torch.linalg.svd``.
* - `hipSPARSE <https://github.com/ROCm/hipSPARSE>`_
- 3.1.2
- Accelerates operations on sparse matrices, such as sparse matrix-vector
or matrix-matrix products.
- Sparse tensor operations ``torch.sparse``.
* - `hipSPARSELt <https://github.com/ROCm/hipSPARSELt>`_
- 0.2.2
- Accelerates operations on sparse matrices, such as sparse matrix-vector
or matrix-matrix products.
- Sparse tensor operations ``torch.sparse``.
* - `hipTensor <https://github.com/ROCm/hipTensor>`_
- 1.4.0
- Optimizes for high-performance tensor operations, such as contractions.
- Accelerates tensor algebra, especially in deep learning and scientific
computing.
* - `MIOpen <https://github.com/ROCm/MIOpen>`_
- 3.3.0
- Optimizes deep learning primitives such as convolutions, pooling,
normalization, and activation functions.
- Speeds up convolutional neural networks (CNNs), recurrent neural
networks (RNNs), and other layers. Used in operations like
``torch.nn.Conv2d``, ``torch.nn.ReLU``, and ``torch.nn.LSTM``.
* - `MIGraphX <https://github.com/ROCm/AMDMIGraphX>`_
- 2.11.0
- Adds graph-level optimizations, ONNX models and mixed precision support
and enable Ahead-of-Time (AOT) Compilation.
- Speeds up inference models and executes ONNX models for
compatibility with other frameworks.
``torch.nn.Conv2d``, ``torch.nn.ReLU``, and ``torch.nn.LSTM``.
* - `MIVisionX <https://github.com/ROCm/MIVisionX>`_
- 3.1.0
- Optimizes acceleration for computer vision and AI workloads like
preprocessing, augmentation, and inferencing.
- Faster data preprocessing and augmentation pipelines for datasets like
ImageNet or COCO and easy to integrate into PyTorch's ``torch.utils.data``
and ``torchvision`` workflows.
* - `rocAL <https://github.com/ROCm/rocAL>`_
- 2.1.0
- Accelerates the data pipeline by offloading intensive preprocessing and
augmentation tasks. rocAL is part of MIVisionX.
- Easy to integrate into PyTorch's ``torch.utils.data`` and
``torchvision`` data load workloads.
* - `RCCL <https://github.com/ROCm/rccl>`_
- 2.21.5
- Optimizes for multi-GPU communication for operations like AllReduce and
Broadcast.
- Distributed data parallel training (``torch.nn.parallel.DistributedDataParallel``).
Handles communication in multi-GPU setups.
* - `rocDecode <https://github.com/ROCm/rocDecode>`_
- 0.8.0
- Provides hardware-accelerated data decoding capabilities, particularly
for image, video, and other dataset formats.
- Can be integrated in ``torch.utils.data``, ``torchvision.transforms``
and ``torch.distributed``.
* - `rocJPEG <https://github.com/ROCm/rocJPEG>`_
- 0.6.0
- Provides hardware-accelerated JPEG image decoding and encoding.
- GPU accelerated ``torchvision.io.decode_jpeg`` and
``torchvision.io.encode_jpeg`` and can be integrated in
``torch.utils.data`` and ``torchvision``.
* - `RPP <https://github.com/ROCm/RPP>`_
- 1.9.1
- Speeds up data augmentation, transformation, and other preprocessing steps.
- Easy to integrate into PyTorch's ``torch.utils.data`` and
``torchvision`` data load workloads.
* - `rocThrust <https://github.com/ROCm/rocThrust>`_
- 3.3.0
- Provides a C++ template library for parallel algorithms like sorting,
reduction, and scanning.
- Utilized in backend operations for tensor computations requiring
parallel processing.
* - `rocWMMA <https://github.com/ROCm/rocWMMA>`_
- 1.6.0
- Accelerates warp-level matrix-multiply and matrix-accumulate to speed up matrix
multiplication (GEMM) and accumulation operations with mixed precision
support.
- Linear layers (``torch.nn.Linear``), convolutional layers
(``torch.nn.Conv2d``), attention layers, general tensor operations that
involve matrix products, such as ``torch.matmul``, ``torch.bmm``, and
more.
Supported and unsupported features
================================================================================
The following section maps GPU-accelerated PyTorch features to their supported
ROCm and PyTorch versions.
torch
--------------------------------------------------------------------------------
`torch <https://pytorch.org/docs/stable/index.html>`_ is the central module of
PyTorch, providing data structures for multi-dimensional tensors and
implementing mathematical operations on them. It also includes utilities for
efficient serialization of tensors and arbitrary data types, along with various
other tools.
Tensor data types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The data type of a tensor is specified using the ``dtype`` attribute or argument, and PyTorch supports a wide range of data types for different use cases.
The following table lists `torch.Tensor <https://pytorch.org/docs/stable/tensors.html>`_'s single data types:
.. list-table::
:header-rows: 1
* - Data type
- Description
- Since PyTorch
- Since ROCm
* - ``torch.float8_e4m3fn``
- 8-bit floating point, e4m3
- 2.3
- 5.5
* - ``torch.float8_e5m2``
- 8-bit floating point, e5m2
- 2.3
- 5.5
* - ``torch.float16`` or ``torch.half``
- 16-bit floating point
- 0.1.6
- 2.0
* - ``torch.bfloat16``
- 16-bit floating point
- 1.6
- 2.6
* - ``torch.float32`` or ``torch.float``
- 32-bit floating point
- 0.1.12_2
- 2.0
* - ``torch.float64`` or ``torch.double``
- 64-bit floating point
- 0.1.12_2
- 2.0
* - ``torch.complex32`` or ``torch.chalf``
- PyTorch provides native support for 32-bit complex numbers
- 1.6
- 2.0
* - ``torch.complex64`` or ``torch.cfloat``
- PyTorch provides native support for 64-bit complex numbers
- 1.6
- 2.0
* - ``torch.complex128`` or ``torch.cdouble``
- PyTorch provides native support for 128-bit complex numbers
- 1.6
- 2.0
* - ``torch.uint8``
- 8-bit integer (unsigned)
- 0.1.12_2
- 2.0
* - ``torch.uint16``
- 16-bit integer (unsigned)
- 2.3
- Not natively supported
* - ``torch.uint32``
- 32-bit integer (unsigned)
- 2.3
- Not natively supported
* - ``torch.uint64``
- 32-bit integer (unsigned)
- 2.3
- Not natively supported
* - ``torch.int8``
- 8-bit integer (signed)
- 1.12
- 5.0
* - ``torch.int16`` or ``torch.short``
- 16-bit integer (signed)
- 0.1.12_2
- 2.0
* - ``torch.int32`` or ``torch.int``
- 32-bit integer (signed)
- 0.1.12_2
- 2.0
* - ``torch.int64`` or ``torch.long``
- 64-bit integer (signed)
- 0.1.12_2
- 2.0
* - ``torch.bool``
- Boolean
- 1.2
- 2.0
* - ``torch.quint8``
- Quantized 8-bit integer (unsigned)
- 1.8
- 5.0
* - ``torch.qint8``
- Quantized 8-bit integer (signed)
- 1.8
- 5.0
* - ``torch.qint32``
- Quantized 32-bit integer (signed)
- 1.8
- 5.0
* - ``torch.quint4x2``
- Quantized 4-bit integer (unsigned)
- 1.8
- 5.0
.. note::
Unsigned types aside from ``uint8`` are currently only have limited support in
eager mode (they primarily exist to assist usage with ``torch.compile``).
The :doc:`ROCm precision support page <rocm:reference/precision-support>`
collected the native HW support of different data types.
torch.cuda
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``torch.cuda`` in PyTorch is a module that provides utilities and functions for
managing and utilizing AMD and NVIDIA GPUs. It enables GPU-accelerated
computations, memory management, and efficient execution of tensor operations,
leveraging ROCm and CUDA as the underlying frameworks.
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since PyTorch
- Since ROCm
* - Device management
- Utilities for managing and interacting with GPUs.
- 0.4.0
- 3.8
* - Tensor operations on GPU
- Performs tensor operations such as addition and matrix multiplications on
the GPU.
- 0.4.0
- 3.8
* - Streams and events
- Streams allow overlapping computation and communication for optimized
performance. Events enable synchronization.
- 1.6.0
- 3.8
* - Memory management
- Functions to manage and inspect memory usage like
``torch.cuda.memory_allocated()``, ``torch.cuda.max_memory_allocated()``,
``torch.cuda.memory_reserved()`` and ``torch.cuda.empty_cache()``.
- 0.3.0
- 1.9.2
* - Running process lists of memory management
- Returns a human-readable printout of the running processes and their GPU
memory use for a given device with functions like
``torch.cuda.memory_stats()`` and ``torch.cuda.memory_summary()``.
- 1.8.0
- 4.0
* - Communication collectives
- Set of APIs that enable efficient communication between multiple GPUs,
allowing for distributed computing and data parallelism.
- 1.9.0
- 5.0
* - ``torch.cuda.CUDAGraph``
- Graphs capture sequences of GPU operations to minimize kernel launch
overhead and improve performance.
- 1.10.0
- 5.3
* - TunableOp
- A mechanism that allows certain operations to be more flexible and
optimized for performance. It enables automatic tuning of kernel
configurations and other settings to achieve the best possible
performance based on the specific hardware (GPU) and workload.
- 2.0
- 5.4
* - NVIDIA Tools Extension (NVTX)
- Integration with NVTX for profiling and debugging GPU performance using
NVIDIA's Nsight tools.
- 1.8.0
- ❌
* - Lazy loading NVRTC
- Delays JIT compilation with NVRTC until the code is explicitly needed.
- 1.13.0
- ❌
* - Jiterator (beta)
- Jiterator allows asynchronous data streaming into computation streams
during training loops.
- 1.13.0
- 5.2
.. Need to validate and extend.
torch.backends.cuda
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``torch.backends.cuda`` is a PyTorch module that provides configuration options
and flags to control the behavior of ROCm or CUDA operations. It is part of the
PyTorch backend configuration system, which allows users to fine-tune how
PyTorch interacts with the ROCm or CUDA environment.
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since PyTorch
- Since ROCm
* - ``cufft_plan_cache``
- Manages caching of GPU FFT plans to optimize repeated FFT computations.
- 1.7.0
- 5.0
* - ``matmul.allow_tf32``
- Enables or disables the use of TensorFloat-32 (TF32) precision for
faster matrix multiplications on GPUs with Tensor Cores.
- 1.10.0
- ❌
* - ``matmul.allow_fp16_reduced_precision_reduction``
- Reduced precision reductions (e.g., with fp16 accumulation type) are
allowed with fp16 GEMMs.
- 2.0
- ❌
* - ``matmul.allow_bf16_reduced_precision_reduction``
- Reduced precision reductions are allowed with bf16 GEMMs.
- 2.0
- ❌
* - ``enable_cudnn_sdp``
- Globally enables cuDNN SDPA's kernels within SDPA.
- 2.0
- ❌
* - ``enable_flash_sdp``
- Globally enables or disables FlashAttention for SDPA.
- 2.1
- ❌
* - ``enable_mem_efficient_sdp``
- Globally enables or disables Memory-Efficient Attention for SDPA.
- 2.1
- ❌
* - ``enable_math_sdp``
- Globally enables or disables the PyTorch C++ implementation within SDPA.
- 2.1
- ❌
.. Need to validate and extend.
torch.backends.cudnn
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Supported ``torch`` options include:
.. list-table::
:header-rows: 1
* - Option
- Description
- Since PyTorch
- Since ROCm
* - ``allow_tf32``
- TensorFloat-32 tensor cores may be used in cuDNN convolutions on NVIDIA
Ampere or newer GPUs.
- 1.12.0
- ❌
* - ``deterministic``
- A bool that, if True, causes cuDNN to only use deterministic
convolution algorithms.
- 1.12.0
- 6.0
Automatic mixed precision: torch.amp
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PyTorch that automates the process of using both 16-bit (half-precision,
float16) and 32-bit (single-precision, float32) floating-point types in model
training and inference.
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since PyTorch
- Since ROCm
* - Autocasting
- Instances of autocast serve as context managers or decorators that allow
regions of your script to run in mixed precision.
- 1.9
- 2.5
* - Gradient scaling
- To prevent underflow, “gradient scaling” multiplies the networks
loss(es) by a scale factor and invokes a backward pass on the scaled
loss(es). Gradients flowing backward through the network are then
scaled by the same factor. In other words, gradient values have a
larger magnitude, so they dont flush to zero.
- 1.9
- 2.5
* - CUDA op-specific behavior
- These ops always go through autocasting whether they are invoked as part
of a ``torch.nn.Module``, as a function, or as a ``torch.Tensor`` method. If
functions are exposed in multiple namespaces, they go through
autocasting regardless of the namespace.
- 1.9
- 2.5
Distributed library features
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The PyTorch distributed library includes a collective of parallelism modules, a
communications layer, and infrastructure for launching and debugging large
training jobs. See :ref:`rocm-for-ai-pytorch-distributed` for more information.
The Distributed Library feature in PyTorch provides tools and APIs for building
and running distributed machine learning workflows. It allows training models
across multiple processes, GPUs, or nodes in a cluster, enabling efficient use
of computational resources and scalability for large-scale tasks.
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since PyTorch
- Since ROCm
* - TensorPipe
- A point-to-point communication library integrated into
PyTorch for distributed training. It is designed to handle tensor data
transfers efficiently between different processes or devices, including
those on separate machines.
- 1.8
- 5.4
* - Gloo
- Designed for multi-machine and multi-GPU setups, enabling
efficient communication and synchronization between processes. Gloo is
one of the default backends for PyTorch's Distributed Data Parallel
(DDP) and RPC frameworks, alongside other backends like NCCL and MPI.
- 1.0
- 2.0
torch.compiler
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since PyTorch
- Since ROCm
* - ``torch.compiler`` (AOT Autograd)
- Autograd captures not only the user-level code, but also backpropagation,
which results in capturing the backwards pass “ahead-of-time”. This
enables acceleration of both forwards and backwards pass using
``TorchInductor``.
- 2.0
- 5.3
* - ``torch.compiler`` (TorchInductor)
- The default ``torch.compile`` deep learning compiler that generates fast
code for multiple accelerators and backends. You need to use a backend
compiler to make speedups through ``torch.compile`` possible. For AMD,
NVIDIA, and Intel GPUs, it leverages OpenAI Triton as the key building block.
- 2.0
- 5.3
torchaudio
--------------------------------------------------------------------------------
The `torchaudio <https://pytorch.org/audio/stable/index.html>`_ library provides
utilities for processing audio data in PyTorch, such as audio loading,
transformations, and feature extraction.
To ensure GPU-acceleration with ``torchaudio.transforms``, you need to move audio
data (waveform tensor) explicitly to GPU using ``.to('cuda')``.
The following ``torchaudio`` features are GPU-accelerated.
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since torchaudio version
- Since ROCm
* - ``torchaudio.transforms.Spectrogram``
- Generates spectrogram of an input waveform using STFT.
- 0.6.0
- 4.5
* - ``torchaudio.transforms.MelSpectrogram``
- Generates the mel-scale spectrogram of raw audio signals.
- 0.9.0
- 4.5
* - ``torchaudio.transforms.MFCC``
- Extract of MFCC features.
- 0.9.0
- 4.5
* - ``torchaudio.transforms.Resample``
- Resamples a signal from one frequency to another.
- 0.9.0
- 4.5
torchvision
--------------------------------------------------------------------------------
The `torchvision <https://pytorch.org/vision/stable/index.html>`_ library
provide datasets, model architectures, and common image transformations for
computer vision.
The following ``torchvision`` features are GPU-accelerated.
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since torchvision version
- Since ROCm
* - ``torchvision.transforms.functional``
- Provides GPU-compatible transformations for image preprocessing like
resize, normalize, rotate and crop.
- 0.2.0
- 4.0
* - ``torchvision.ops``
- GPU-accelerated operations for object detection and segmentation tasks.
``torchvision.ops.roi_align``, ``torchvision.ops.nms`` and
``box_convert``.
- 0.6.0
- 3.3
* - ``torchvision.models`` with ``.to('cuda')``
- ``torchvision`` provides several pre-trained models (ResNet, Faster
R-CNN, Mask R-CNN, ...) that can run on CUDA for faster inference and
training.
- 0.1.6
- 2.x
* - ``torchvision.io``
- Enables video decoding and frame extraction using GPU acceleration with NVIDIAs
NVDEC and nvJPEG (rocJPEG) on CUDA-enabled GPUs.
- 0.4.0
- 6.3
torchtext
--------------------------------------------------------------------------------
The `torchtext <https://pytorch.org/text/stable/index.html>`_ library provides
utilities for processing and working with text data in PyTorch, including
tokenization, vocabulary management, and text embeddings. torchtext supports
preprocessing pipelines and integration with PyTorch models, simplifying the
implementation of natural language processing (NLP) tasks.
To leverage GPU acceleration in torchtext, you need to move tensors
explicitly to the GPU using ``.to('cuda')``.
* torchtext does not implement its own kernels. ROCm support is enabled by linking against ROCm libraries.
* Only official release exists.
torchtune
--------------------------------------------------------------------------------
The `torchtune <https://pytorch.org/torchtune/stable/index.html>`_ library for
authoring, fine-tuning and experimenting with LLMs.
* Usage: It works out-of-the-box, enabling developers to fine-tune ROCm PyTorch solutions.
* Only official release exists.
torchserve
--------------------------------------------------------------------------------
The `torchserve <https://pytorch.org/torchserve/>`_ is a PyTorch domain library
for common sparsity and parallelism primitives needed for large-scale recommender
systems.
* torchtext does not implement its own kernels. ROCm support is enabled by linking against ROCm libraries.
* Only official release exists.
torchrec
--------------------------------------------------------------------------------
The `torchrec <https://pytorch.org/torchrec/>`_ is a PyTorch domain library for
common sparsity and parallelism primitives needed for large-scale recommender
systems.
* torchrec does not implement its own kernels. ROCm support is enabled by linking against ROCm libraries.
* Only official release exists.
Unsupported PyTorch features
----------------------------
The following are GPU-accelerated PyTorch features not currently supported by ROCm.
.. list-table::
:widths: 30, 60, 10
:header-rows: 1
* - Feature
- Description
- Since PyTorch
* - APEX batch norm
- Use APEX batch norm instead of PyTorch batch norm.
- 1.6.0
* - ``torch.backends.cuda`` / ``matmul.allow_tf32``
- A bool that controls whether TensorFloat-32 tensor cores may be used in
matrix multiplications.
- 1.7
* - ``torch.cuda`` / NVIDIA Tools Extension (NVTX)
- Integration with NVTX for profiling and debugging GPU performance using
NVIDIA's Nsight tools.
- 1.7.0
* - ``torch.cuda`` / Lazy loading NVRTC
- Delays JIT compilation with NVRTC until the code is explicitly needed.
- 1.8.0
* - ``torch-tensorrt``
- Integrate TensorRT library for optimizing and deploying PyTorch models.
ROCm does not have equialent library for TensorRT.
- 1.9.0
* - ``torch.backends`` / ``cudnn.allow_tf32``
- TensorFloat-32 tensor cores may be used in cuDNN convolutions.
- 1.10.0
* - ``torch.backends.cuda`` / ``matmul.allow_fp16_reduced_precision_reduction``
- Reduced precision reductions with fp16 accumulation type are
allowed with fp16 GEMMs.
- 2.0
* - ``torch.backends.cuda`` / ``matmul.allow_bf16_reduced_precision_reduction``
- Reduced precision reductions are allowed with bf16 GEMMs.
- 2.0
* - ``torch.nn.functional`` / ``scaled_dot_product_attention``
- Flash attention backend for SDPA to accelerate attention computation in
transformer-based models.
- 2.0
* - ``torch.backends.cuda`` / ``enable_cudnn_sdp``
- Globally enables cuDNN SDPA's kernels within SDPA.
- 2.0
* - ``torch.backends.cuda`` / ``enable_flash_sdp``
- Globally enables or disables FlashAttention for SDPA.
- 2.1
* - ``torch.backends.cuda`` / ``enable_mem_efficient_sdp``
- Globally enables or disables Memory-Efficient Attention for SDPA.
- 2.1
* - ``torch.backends.cuda`` / ``enable_math_sdp``
- Globally enables or disables the PyTorch C++ implementation within SDPA.
- 2.1
* - Dynamic parallelism
- PyTorch itself does not directly expose dynamic parallelism as a core
feature. Dynamic parallelism allow GPU threads to launch additional
threads which can be reached using custom operations via the
``torch.utils.cpp_extension`` module.
- Not a core feature
* - Unified memory support in PyTorch
- Unified Memory is not directly exposed in PyTorch's core API, it can be
utilized effectively through custom CUDA extensions or advanced
workflows.
- Not a core feature
Use cases and recommendations
================================================================================
* :doc:`Using ROCm for AI: training a model </how-to/rocm-for-ai/training/train-a-model>` provides
guidance on how to leverage the ROCm platform for training AI models. It covers the steps, tools, and best practices
for optimizing training workflows on AMD GPUs using PyTorch features.
* :doc:`Single-GPU fine-tuning and inference </how-to/rocm-for-ai/fine-tuning/single-gpu-fine-tuning-and-inference>`
describes and demonstrates how to use the ROCm platform for the fine-tuning and inference of
machine learning models, particularly large language models (LLMs), on systems with a single AMD
Instinct MI300X accelerator. This page provides a detailed guide for setting up, optimizing, and
executing fine-tuning and inference workflows in such environments.
* :doc:`Multi-GPU fine-tuning and inference optimization </how-to/rocm-for-ai/fine-tuning/multi-gpu-fine-tuning-and-inference>`
describes and demonstrates the fine-tuning and inference of machine learning models on systems
with multi MI300X accelerators.
* The :doc:`Instinct MI300X workload optimization guide </how-to/rocm-for-ai/inference-optimization/workload>` provides detailed
guidance on optimizing workloads for the AMD Instinct MI300X accelerator using ROCm. This guide is aimed at helping
users achieve optimal performance for deep learning and other high-performance computing tasks on the MI300X
accelerator.
* The :doc:`Inception with PyTorch documentation </conceptual/ai-pytorch-inception>`
describes how PyTorch integrates with ROCm for AI workloads It outlines the use of PyTorch on the ROCm platform and
focuses on how to efficiently leverage AMD GPU hardware for training and inference tasks in AI applications.
For more use cases and recommendations, see `ROCm PyTorch blog posts <https://rocm.blogs.amd.com/blog/tag/pytorch.html>`_.

View File

@@ -1,491 +0,0 @@
:orphan:
.. meta::
:description: TensorFlow compatibility
:keywords: GPU, TensorFlow compatibility
*******************************************************************************
TensorFlow compatibility
*******************************************************************************
`TensorFlow <https://www.tensorflow.org/>`_ is an open-source library for
solving machine learning, deep learning, and AI problems. It can solve many
problems across different sectors and industries but primarily focuses on
neural network training and inference. It is one of the most popular and
in-demand frameworks and is very active in open-source contribution and
development.
The `official TensorFlow repository <http://github.com/tensorflow/tensorflow>`_
includes full ROCm support. AMD maintains a TensorFlow `ROCm repository
<http://github.com/rocm/tensorflow-upstream>`_ in order to quickly add bug
fixes, updates, and support for the latest ROCM versions.
- ROCm TensorFlow release:
- Offers :ref:`Docker images <tensorflow-docker-compat>` with
ROCm and TensorFlow pre-installed.
- ROCm TensorFlow repository: `<https://github.com/ROCm/tensorflow-upstream>`_
- See the :doc:`ROCm TensorFlow installation guide <rocm-install-on-linux:install/3rd-party/tensorflow-install>`
to get started.
- Official TensorFlow release:
- Official TensorFlow repository: `<https://github.com/tensorflow/tensorflow>`_
- See the `TensorFlow API versions <https://www.tensorflow.org/versions>`_ list.
.. note::
The official TensorFlow documentation does not cover ROCm support. Use the
ROCm documentation for installation instructions for Tensorflow on ROCm.
See :doc:`rocm-install-on-linux:install/3rd-party/tensorflow-install`.
.. _tensorflow-docker-compat:
Docker image compatibility
===============================================================================
.. |docker-icon| raw:: html
<i class="fab fa-docker"></i>
AMD validates and publishes ready-made `TensorFlow images
<https://hub.docker.com/r/rocm/tensorflow>`_ with ROCm backends on
Docker Hub. The following Docker image tags and associated inventories are
validated for `ROCm 6.3.1 <https://repo.radeon.com/rocm/apt/6.3.1/>`_. Click
the |docker-icon| icon to view the image on Docker Hub.
.. list-table:: TensorFlow Docker image components
:header-rows: 1
* - Docker image
- TensorFlow
- Dev
- Python
- TensorBoard
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.3.1-py3.12-tf2.17.0-dev/images/sha256-804121ee4985718277ba7dcec53c57bdade130a1ef42f544b6c48090ad379c17"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.17.0 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3/tensorflow_rocm-2.17.0-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- dev
- `Python 3.12 <https://www.python.org/downloads/release/python-3124/>`_
- `TensorBoard 2.17.1 <https://github.com/tensorflow/tensorboard/tree/2.17.1>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.3.1-py3.10-tf2.17.0-dev/images/sha256-776837ffa945913f6c466bfe477810a11453d21d5b6afb200be1c36e48fbc08e"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.17.0 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3/tensorflow_rocm-2.17.0-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- dev
- `Python 3.10 <https://www.python.org/downloads/release/python-31012/>`_
- `TensorBoard 2.17.0 <https://github.com/tensorflow/tensorboard/tree/2.17.0>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.3.1-py3.12-tf2.16.2-dev/images/sha256-c793e1483e30809c3c28fc5d7805bedc033c73da224f839fff370717cb100944"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.16.2 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3/tensorflow_rocm-2.16.2-cp312-cp312-manylinux_2_28_x86_64.whl>`__
- dev
- `Python 3.12 <https://www.python.org/downloads/release/python-3124/>`_
- `TensorBoard 2.16.2 <https://github.com/tensorflow/tensorboard/tree/2.16.2>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.3.1-py3.10-tf2.16.0-dev/images/sha256-263e78414ae85d7bcd52a025a94131d0a279872a45ed632b9165336dfdcd4443"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.16.2 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3/tensorflow_rocm-2.16.2-cp310-cp310-manylinux_2_28_x86_64.whl>`__
- dev
- `Python 3.10 <https://www.python.org/downloads/release/python-31012/>`_
- `TensorBoard 2.16.2 <https://github.com/tensorflow/tensorboard/tree/2.16.2>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/tensorflow/rocm6.3.1-py3.10-tf2.15.0-dev/images/sha256-479046a8477ca701a9494a813ab17e8ab4f6baa54641e65dc8d07629f1e6a880"><i class="fab fa-docker fa-lg"></i> rocm/tensorflow</a>
- `tensorflow-rocm 2.15.1 <https://repo.radeon.com/rocm/manylinux/rocm-rel-6.3/tensorflow_rocm-2.15.1-cp310-cp310-manylinux_2_28_x86_64.whl>`_
- dev
- `Python 3.10 <https://www.python.org/downloads/release/python-31012/>`_
- `TensorBoard 2.15.2 <https://github.com/tensorflow/tensorboard/tree/2.15.2>`_
Critical ROCm libraries for TensorFlow
===============================================================================
TensorFlow depends on multiple components and the supported features of those
components can affect the TensorFlow ROCm supported feature set. The versions
in the following table refer to the first TensorFlow version where the ROCm
library was introduced as a dependency.
.. list-table::
:widths: 25, 10, 35, 30
:header-rows: 1
* - ROCm library
- Version
- Purpose
- Used in
* - `hipBLAS <https://github.com/ROCm/hipBLAS>`_
- 2.3.0
- Provides GPU-accelerated Basic Linear Algebra Subprograms (BLAS) for
matrix and vector operations.
- Accelerates operations like ``tf.matmul``, ``tf.linalg.matmul``, and
other matrix multiplications commonly used in neural network layers.
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`_
- 0.10.0
- Extends hipBLAS with additional optimizations like fused kernels and
integer tensor cores.
- Optimizes matrix multiplications and linear algebra operations used in
layers like dense, convolutional, and RNNs in TensorFlow.
* - `hipCUB <https://github.com/ROCm/hipCUB>`_
- 3.3.0
- Provides a C++ template library for parallel algorithms for reduction,
scan, sort and select.
- Supports operations like ``tf.reduce_sum``, ``tf.cumsum``, ``tf.sort``
and other tensor operations in TensorFlow, especially those involving
scanning, sorting, and filtering.
* - `hipFFT <https://github.com/ROCm/hipFFT>`_
- 1.0.17
- Accelerates Fast Fourier Transforms (FFT) for signal processing tasks.
- Used for operations like signal processing, image filtering, and
certain types of neural networks requiring FFT-based transformations.
* - `hipSOLVER <https://github.com/ROCm/hipSOLVER>`_
- 2.3.0
- Provides GPU-accelerated direct linear solvers for dense and sparse
systems.
- Optimizes linear algebra functions such as solving systems of linear
equations, often used in optimization and training tasks.
* - `hipSPARSE <https://github.com/ROCm/hipSPARSE>`_
- 3.1.2
- Optimizes sparse matrix operations for efficient computations on sparse
data.
- Accelerates sparse matrix operations in models with sparse weight
matrices or activations, commonly used in neural networks.
* - `MIOpen <https://github.com/ROCm/MIOpen>`_
- 3.3.0
- Provides optimized deep learning primitives such as convolutions,
pooling,
normalization, and activation functions.
- Speeds up convolutional neural networks (CNNs) and other layers. Used
in TensorFlow for layers like ``tf.nn.conv2d``, ``tf.nn.relu``, and
``tf.nn.lstm_cell``.
* - `RCCL <https://github.com/ROCm/rccl>`_
- 2.21.5
- Optimizes for multi-GPU communication for operations like AllReduce and
Broadcast.
- Distributed data parallel training (``tf.distribute.MirroredStrategy``).
Handles communication in multi-GPU setups.
* - `rocThrust <https://github.com/ROCm/rocThrust>`_
- 3.3.0
- Provides a C++ template library for parallel algorithms like sorting,
reduction, and scanning.
- Reduction operations like ``tf.reduce_sum``, ``tf.cumsum`` for computing
the cumulative sum of elements along a given axis or ``tf.unique`` to
finds unique elements in a tensor can use rocThrust.
Supported and unsupported features
===============================================================================
The following section maps supported data types and GPU-accelerated TensorFlow
features to their minimum supported ROCm and TensorFlow versions.
Data types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The data type of a tensor is specified using the ``dtype`` attribute or
argument, and TensorFlow supports a wide range of data types for different use
cases.
The basic, single data types of `tf.dtypes <https://www.tensorflow.org/api_docs/python/tf/dtypes>`_
are as follows:
.. list-table::
:header-rows: 1
* - Data type
- Description
- Since TensorFlow
- Since ROCm
* - ``bfloat16``
- 16-bit bfloat (brain floating point).
- 1.0.0
- 1.7
* - ``bool``
- Boolean.
- 1.0.0
- 1.7
* - ``complex128``
- 128-bit complex.
- 1.0.0
- 1.7
* - ``complex64``
- 64-bit complex.
- 1.0.0
- 1.7
* - ``double``
- 64-bit (double precision) floating-point.
- 1.0.0
- 1.7
* - ``float16``
- 16-bit (half precision) floating-point.
- 1.0.0
- 1.7
* - ``float32``
- 32-bit (single precision) floating-point.
- 1.0.0
- 1.7
* - ``float64``
- 64-bit (double precision) floating-point.
- 1.0.0
- 1.7
* - ``half``
- 16-bit (half precision) floating-point.
- 2.0.0
- 2.0
* - ``int16``
- Signed 16-bit integer.
- 1.0.0
- 1.7
* - ``int32``
- Signed 32-bit integer.
- 1.0.0
- 1.7
* - ``int64``
- Signed 64-bit integer.
- 1.0.0
- 1.7
* - ``int8``
- Signed 8-bit integer.
- 1.0.0
- 1.7
* - ``qint16``
- Signed quantized 16-bit integer.
- 1.0.0
- 1.7
* - ``qint32``
- Signed quantized 32-bit integer.
- 1.0.0
- 1.7
* - ``qint8``
- Signed quantized 8-bit integer.
- 1.0.0
- 1.7
* - ``quint16``
- Unsigned quantized 16-bit integer.
- 1.0.0
- 1.7
* - ``quint8``
- Unsigned quantized 8-bit integer.
- 1.0.0
- 1.7
* - ``resource``
- Handle to a mutable, dynamically allocated resource.
- 1.0.0
- 1.7
* - ``string``
- Variable-length string, represented as byte array.
- 1.0.0
- 1.7
* - ``uint16``
- Unsigned 16-bit (word) integer.
- 1.0.0
- 1.7
* - ``uint32``
- Unsigned 32-bit (dword) integer.
- 1.5.0
- 1.7
* - ``uint64``
- Unsigned 64-bit (qword) integer.
- 1.5.0
- 1.7
* - ``uint8``
- Unsigned 8-bit (byte) integer.
- 1.0.0
- 1.7
* - ``variant``
- Data of arbitrary type (known at runtime).
- 1.4.0
- 1.7
Features
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This table provides an overview of key features in TensorFlow and their
availability in ROCm.
.. list-table::
:header-rows: 1
* - Module
- Description
- Since TensorFlow
- Since ROCm
* - ``tf.linalg`` (Linear Algebra)
- Operations for matrix and tensor computations, such as
``tf.linalg.matmul`` (matrix multiplication), ``tf.linalg.inv``
(matrix inversion) and ``tf.linalg.cholesky`` (Cholesky decomposition).
These leverage GPUs for high-performance linear algebra operations.
- 1.4
- 1.8.2
* - ``tf.nn`` (Neural Network Operations)
- GPU-accelerated building blocks for deep learning models, such as 2D
convolutions with ``tf.nn.conv2d``, max pooling operations with
``tf.nn.max_pool``, activation functions like ``tf.nn.relu`` or softmax
for output layers with ``tf.nn.softmax``.
- 1.0
- 1.8.2
* - ``tf.image`` (Image Processing)
- GPU-accelerated functions for image preprocessing and augmentations,
such as resize images with ``tf.image.resize``, flip images horizontally
with ``tf.image.flip_left_right`` and adjust image brightness randomly
with ``tf.image.random_brightness``.
- 1.1
- 1.8.2
* - ``tf.keras`` (High-Level API)
- GPU acceleration for Keras layers and models, including dense layers
(``tf.keras.layers.Dense``), convolutional layers
(``tf.keras.layers.Conv2D``) and recurrent layers
(``tf.keras.layers.LSTM``).
- 1.4
- 1.8.2
* - ``tf.math`` (Mathematical Operations)
- GPU-accelerated mathematical operations, such as sum across dimensions
with ``tf.math.reduce_sum``, elementwise exponentiation with
``tf.math.exp`` and sigmoid activation (``tf.math.sigmoid``).
- 1.5
- 1.8.2
* - ``tf.signal`` (Signal Processing)
- Functions for spectral analysis and signal transformations.
- 1.13
- 2.1
* - ``tf.data`` (Data Input Pipeline)
- GPU-accelerated data preprocessing for efficient input pipelines,
Prefetching with ``tf.data.experimental.AUTOTUNE``. GPU-enabled
transformations like map and batch.
- 1.4
- 1.8.2
* - ``tf.distribute`` (Distributed Training)
- Enabling to scale computations across multiple devices on a single
machine or across multiple machines.
- 1.13
- 2.1
* - ``tf.random`` (Random Number Generation)
- GPU-accelerated random number generation
- 1.12
- 1.9.2
* - ``tf.TensorArray`` (Dynamic Array Operations)
- Enables dynamic tensor manipulation on GPUs.
- 1.0
- 1.8.2
* - ``tf.sparse`` (Sparse Tensor Operations)
- GPU-accelerated sparse matrix manipulations.
- 1.9
- 1.9.0
* - ``tf.experimental.numpy``
- GPU-accelerated NumPy-like API for numerical computations.
- 2.4
- 4.1.1
* - ``tf.RaggedTensor``
- Handling of variable-length sequences and ragged tensors with GPU
support.
- 1.13
- 2.1
* - ``tf.function`` with XLA (Accelerated Linear Algebra)
- Enable GPU-accelerated functions in optimization.
- 1.14
- 2.4
* - ``tf.quantization``
- Quantized operations for inference, accelerated on GPUs.
- 1.12
- 1.9.2
Distributed library features
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Enables developers to scale computations across multiple devices on a single machine or
across multiple machines.
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since TensorFlow
- Since ROCm
* - ``MultiWorkerMirroredStrategy``
- Synchronous training across multiple workers using mirrored variables.
- 2.0
- 3.0
* - ``MirroredStrategy``
- Synchronous training across multiple GPUs on one machine.
- 1.5
- 2.5
* - ``TPUStrategy``
- Efficiently trains models on Google TPUs.
- 1.9
- ❌
* - ``ParameterServerStrategy``
- Asynchronous training using parameter servers for variable management.
- 2.1
- 4.0
* - ``CentralStorageStrategy``
- Keeps variables on a single device and performs computation on multiple
devices.
- 2.3
- 4.1
* - ``CollectiveAllReduceStrategy``
- Synchronous training across multiple devices and hosts.
- 1.14
- 3.5
* - Distribution Strategies API
- High-level API to simplify distributed training configuration and
execution.
- 1.10
- 3.0
Unsupported TensorFlow features
===============================================================================
The following are GPU-accelerated TensorFlow features not currently supported by
ROCm.
.. list-table::
:header-rows: 1
* - Feature
- Description
- Since TensorFlow
* - Mixed Precision with TF32
- Mixed precision with TF32 is used for matrix multiplications,
convolutions, and other linear algebra operations, particularly in
deep learning workloads like CNNs and transformers.
- 2.4
* - ``tf.distribute.TPUStrategy``
- Efficiently trains models on Google TPUs.
- 1.9
Use cases and recommendations
===============================================================================
* The `Training a Neural Collaborative Filtering (NCF) Recommender on an AMD
GPU <https://rocm.blogs.amd.com/artificial-intelligence/ncf/README.html>`_
blog post discusses training an NCF recommender system using TensorFlow. It
explains how NCF improves traditional collaborative filtering methods by
leveraging neural networks to model non-linear user-item interactions. The
post outlines the implementation using the recommenders library, focusing on
the use of implicit data (for example, user interactions like viewing or
purchasing) and how it addresses challenges like the lack of negative values.
* The `Creating a PyTorch/TensorFlow code environment on AMD GPUs
<https://rocm.blogs.amd.com/software-tools-optimization/pytorch-tensorflow-env/README.html>`_
blog post provides instructions for creating a machine learning environment
for PyTorch and TensorFlow on AMD GPUs using ROCm. It covers steps like
installing the libraries, cloning code repositories, installing dependencies,
and troubleshooting potential issues with CUDA-based code. Additionally, it
explains how to HIPify code (port CUDA code to HIP) and manage Docker images
for a better experience on AMD GPUs. This guide aims to help data scientists
and ML practitioners adapt their code for AMD GPUs.
For more use cases and recommendations, see the `ROCm Tensorflow blog posts <https://rocm.blogs.amd.com/blog/tag/tensorflow.html>`_.

View File

@@ -1,916 +0,0 @@
.. meta::
:description: PyTorch compatibility
:keywords: GPU, PyTorch compatibility
********************************************************************************
PyTorch compatibility
********************************************************************************
`PyTorch <https://pytorch.org/>`_ is an open-source tensor library designed for
deep learning. PyTorch on ROCm provides mixed-precision and large-scale training
using `MIOpen <https://github.com/ROCm/MIOpen>`_ and
`RCCL <https://github.com/ROCm/rccl>`_ libraries.
ROCm support for PyTorch is upstreamed into the official PyTorch repository. Due to independent
compatibility considerations, this results in two distinct release cycles for PyTorch on ROCm:
- ROCm PyTorch release:
- Provides the latest version of ROCm but doesn't immediately support the latest stable PyTorch
version.
- Offers :ref:`Docker images <pytorch-docker-compat>` with ROCm and PyTorch
pre-installed.
- ROCm PyTorch repository: `<https://github.com/rocm/pytorch>`__
- See the :doc:`ROCm PyTorch installation guide <rocm-install-on-linux:install/3rd-party/pytorch-install>` to get started.
- Official PyTorch release:
- Provides the latest stable version of PyTorch but doesn't immediately support the latest ROCm version.
- Official PyTorch repository: `<https://github.com/pytorch/pytorch>`__
- See the `Nightly and latest stable version installation guide <https://pytorch.org/get-started/locally/>`_
or `Previous versions <https://pytorch.org/get-started/previous-versions/>`_ to get started.
The upstream PyTorch includes an automatic HIPification solution that automatically generates HIP
source code from the CUDA backend. This approach allows PyTorch to support ROCm without requiring
manual code modifications.
ROCm's development is aligned with the stable release of PyTorch while upstream PyTorch testing uses
the stable release of ROCm to maintain consistency.
.. _pytorch-docker-compat:
Docker image compatibility
================================================================================
AMD validates and publishes ready-made `PyTorch <https://hub.docker.com/r/rocm/pytorch>`_
images with ROCm backends on Docker Hub. The following Docker image tags and
associated inventories are validated for `ROCm 6.3.0 <https://repo.radeon.com/rocm/apt/6.3/>`_.
.. list-table:: PyTorch Docker image components
:header-rows: 1
:class: docker-image-compatibility
* - Docker
- PyTorch
- Ubuntu
- Python
- Apex
- torchvision
- TensorBoard
- MAGMA
- UCX
- OMPI
- OFED
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu24.04_py3.12_pytorch_release_2.4.0/images/sha256-98ddf20333bd01ff749b8092b1190ee369a75d3b8c71c2fac80ffdcb1a98d529?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 24.04
- `3.12 <https://www.python.org/downloads/release/python-3128/>`_
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`_
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.7 <https://github.com/open-mpi/ompi/tree/v4.0.7>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.10_pytorch_release_2.4.0/images/sha256-402c9b4f1a6b5a81c634a1932b56cbe01abb699cfcc7463d226276997c6cf8ea?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31016/>`_
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`_
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.7 <https://github.com/open-mpi/ompi/tree/v4.0.7>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.9_pytorch_release_2.4.0/images/sha256-e0608b55d408c3bfe5c19fdd57a4ced3e0eb3a495b74c309980b60b156c526dd?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.4.0 <https://github.com/ROCm/pytorch/tree/release/2.4>`_
- 22.04
- `3.9 <https://www.python.org/downloads/release/python-3918/>`_
- `1.4.0 <https://github.com/ROCm/apex/tree/release/1.4.0>`_
- `0.19.0 <https://github.com/pytorch/vision/tree/v0.19.0>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.7 <https://github.com/open-mpi/ompi/tree/v4.0.7>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.10_pytorch_release_2.3.0/images/sha256-652cf25263d05b1de548222970aeb76e60b12de101de66751264709c0d0ff9d8?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.3.0 <https://github.com/ROCm/pytorch/tree/release/2.3>`_
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31016/>`_
- `1.3.0 <https://github.com/ROCm/apex/tree/release/1.3.0>`_
- `0.18.0 <https://github.com/pytorch/vision/tree/v0.18.0>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.14.1 <https://github.com/openucx/ucx/tree/v1.14.1>`_
- `4.1.5 <https://github.com/open-mpi/ompi/tree/v4.1.5>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.10_pytorch_release_2.2.1/images/sha256-051976f26beab8f9aa65d999e3ad546c027b39240a0cc3ee81b114a9024f2912?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.2.1 <https://github.com/ROCm/pytorch/tree/release/2.2>`_
- 22.04
- `3.10 <https://www.python.org/downloads/release/python-31016/>`_
- `1.2.0 <https://github.com/ROCm/apex/tree/release/1.2.0>`_
- `0.17.1 <https://github.com/pytorch/vision/tree/v0.17.1>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.14.1 <https://github.com/openucx/ucx/tree/v1.14.1>`_
- `4.1.5 <https://github.com/open-mpi/ompi/tree/v4.1.5>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu20.04_py3.9_pytorch_release_2.2.1/images/sha256-88c839a364d109d3748c100385bfa100d28090d25118cc723fd0406390ab2f7e?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `2.2.1 <https://github.com/ROCm/pytorch/tree/release/2.2>`_
- 20.04
- `3.9 <https://www.python.org/downloads/release/python-3921/>`_
- `1.2.0 <https://github.com/ROCm/apex/tree/release/1.2.0>`_
- `0.17.1 <https://github.com/pytorch/vision/tree/v0.17.1>`_
- `2.13.0 <https://github.com/tensorflow/tensorboard/tree/2.13.0>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.3 <https://github.com/open-mpi/ompi/tree/v4.0.3>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu22.04_py3.9_pytorch_release_1.13.1/images/sha256-994424ed07a63113f79dd9aa72159124c00f5fbfe18127151e6658f7d0b6f821?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `1.13.1 <https://github.com/ROCm/pytorch/tree/release/1.13>`_
- 22.04
- `3.9 <https://www.python.org/downloads/release/python-3921/>`_
- `1.0.0 <https://github.com/ROCm/apex/tree/release/1.0.0>`_
- `0.14.0 <https://github.com/pytorch/vision/tree/v0.14.0>`_
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.14.1 <https://github.com/openucx/ucx/tree/v1.14.1>`_
- `4.1.5 <https://github.com/open-mpi/ompi/tree/v4.1.5>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
* - .. raw:: html
<a href="https://hub.docker.com/layers/rocm/pytorch/rocm6.3_ubuntu20.04_py3.9_pytorch_release_1.13.1/images/sha256-7b8139fe40a9aeb4bca3aecd15c22c1fa96e867d93479fa3a24fdeeeeafa1219?context=explore"><i class="fab fa-docker fa-lg"></i></a>
- `1.13.1 <https://github.com/ROCm/pytorch/tree/release/1.13>`_
- 20.04
- `3.9 <https://www.python.org/downloads/release/python-3921/>`_
- `1.0.0 <https://github.com/ROCm/apex/tree/release/1.0.0>`_
- `0.14.0 <https://github.com/pytorch/vision/tree/v0.14.0>`_
- `2.18.0 <https://github.com/tensorflow/tensorboard/tree/2.18>`_
- `master <https://bitbucket.org/icl/magma/src/master/>`_
- `1.10.0 <https://github.com/openucx/ucx/tree/v1.10.0>`_
- `4.0.3 <https://github.com/open-mpi/ompi/tree/v4.0.3>`_
- `5.3-1.0.5.0 <https://content.mellanox.com/ofed/MLNX_OFED-5.3-1.0.5.0/MLNX_OFED_LINUX-5.3-1.0.5.0-ubuntu20.04-x86_64.tgz>`_
Critical ROCm libraries for PyTorch
================================================================================
The functionality of PyTorch with ROCm is shaped by its underlying library
dependencies. These critical ROCm components affect the capabilities,
performance, and feature set available to developers.
.. list-table::
:header-rows: 1
* - ROCm library
- Version
- Purpose
- Used in
* - `Composable Kernel <https://github.com/ROCm/composable_kernel>`_
- 1.1.0
- Enables faster execution of core operations like matrix multiplication
(GEMM), convolutions and transformations.
- Speeds up ``torch.permute``, ``torch.view``, ``torch.matmul``,
``torch.mm``, ``torch.bmm``, ``torch.nn.Conv2d``, ``torch.nn.Conv3d``
and ``torch.nn.MultiheadAttention``.
* - `hipBLAS <https://github.com/ROCm/hipBLAS>`_
- 2.3.0
- Provides GPU-accelerated Basic Linear Algebra Subprograms (BLAS) for
matrix and vector operations.
- Supports operations like matrix multiplication, matrix-vector products,
and tensor contractions. Utilized in both dense and batched linear
algebra operations.
* - `hipBLASLt <https://github.com/ROCm/hipBLASLt>`_
- 0.10.0
- hipBLASLt is an extension of the hipBLAS library, providing additional
features like epilogues fused into the matrix multiplication kernel or
use of integer tensor cores.
- It accelerates operations like ``torch.matmul``, ``torch.mm``, and the
matrix multiplications used in convolutional and linear layers.
* - `hipCUB <https://github.com/ROCm/hipCUB>`_
- 3.3.0
- Provides a C++ template library for parallel algorithms for reduction,
scan, sort and select.
- Supports operations like ``torch.sum``, ``torch.cumsum``, ``torch.sort``
and ``torch.topk``. Operations on sparse tensors or tensors with
irregular shapes often involve scanning, sorting, and filtering, which
hipCUB handles efficiently.
* - `hipFFT <https://github.com/ROCm/hipFFT>`_
- 1.0.17
- Provides GPU-accelerated Fast Fourier Transform (FFT) operations.
- Used in functions like the ``torch.fft`` module.
* - `hipRAND <https://github.com/ROCm/hipRAND>`_
- 2.11.0
- Provides fast random number generation for GPUs.
- The ``torch.rand``, ``torch.randn`` and stochastic layers like
``torch.nn.Dropout``.
* - `hipSOLVER <https://github.com/ROCm/hipSOLVER>`_
- 2.3.0
- Provides GPU-accelerated solvers for linear systems, eigenvalues, and
singular value decompositions (SVD).
- Supports functions like ``torch.linalg.solve``,
``torch.linalg.eig``, and ``torch.linalg.svd``.
* - `hipSPARSE <https://github.com/ROCm/hipSPARSE>`_
- 3.1.2
- Accelerates operations on sparse matrices, such as sparse matrix-vector
or matrix-matrix products.
- Sparse tensor operations ``torch.sparse``.
* - `hipSPARSELt <https://github.com/ROCm/hipSPARSELt>`_
- 0.2.2
- Accelerates operations on sparse matrices, such as sparse matrix-vector
or matrix-matrix products.
- Sparse tensor operations ``torch.sparse``.
* - `hipTensor <https://github.com/ROCm/hipTensor>`_
- 1.4.0
- Optimizes for high-performance tensor operations, such as contractions.
- Accelerates tensor algebra, especially in deep learning and scientific
computing.
* - `MIOpen <https://github.com/ROCm/MIOpen>`_
- 3.3.0
- Optimizes deep learning primitives such as convolutions, pooling,
normalization, and activation functions.
- Speeds up convolutional neural networks (CNNs), recurrent neural
networks (RNNs), and other layers. Used in operations like
``torch.nn.Conv2d``, ``torch.nn.ReLU``, and ``torch.nn.LSTM``.
* - `MIGraphX <https://github.com/ROCm/AMDMIGraphX>`_
- 2.11.0
- Add graph-level optimizations, ONNX models and mixed precision support
and enable Ahead-of-Time (AOT) Compilation.
- Speeds up inference models and executes ONNX models for
compatibility with other frameworks.
``torch.nn.Conv2d``, ``torch.nn.ReLU``, and ``torch.nn.LSTM``.
* - `MIVisionX <https://github.com/ROCm/MIVisionX>`_
- 3.1.0
- Optimizes acceleration for computer vision and AI workloads like
preprocessing, augmentation, and inferencing.
- Faster data preprocessing and augmentation pipelines for datasets like
ImageNet or COCO and easy to integrate into PyTorch's ``torch.utils.data``
and ``torchvision`` workflows.
* - `rocAL <https://github.com/ROCm/rocAL>`_
- 2.1.0
- Accelerates the data pipeline by offloading intensive preprocessing and
augmentation tasks. rocAL is part of MIVisionX.
- Easy to integrate into PyTorch's ``torch.utils.data`` and
``torchvision`` data load workloads.
* - `RCCL <https://github.com/ROCm/rccl>`_
- 2.21.5
- Optimizes for multi-GPU communication for operations like AllReduce and
Broadcast.
- Distributed data parallel training (``torch.nn.parallel.DistributedDataParallel``).
Handles communication in multi-GPU setups.
* - `rocDecode <https://github.com/ROCm/rocDecode>`_
- 0.8.0
- Provide hardware-accelerated data decoding capabilities, particularly
for image, video, and other dataset formats.
- Can be integrated in ``torch.utils.data``, ``torchvision.transforms``
and ``torch.distributed``.
* - `rocJPEG <https://github.com/ROCm/rocJPEG>`_
- 0.6.0
- Provide hardware-accelerated JPEG image decoding and encoding.
- GPU accelerated ``torchvision.io.decode_jpeg`` and
``torchvision.io.encode_jpeg`` and can be integrated in
``torch.utils.data`` and ``torchvision``.
* - `RPP <https://github.com/ROCm/RPP>`_
- 1.9.1
- Speed up data augmentation, transformation, and other preprocessing step.
- Easy to integrate into PyTorch's ``torch.utils.data`` and
``torchvision`` data load workloads.
* - `rocThrust <https://github.com/ROCm/rocThrust>`_
- 3.3.0
- Provides a C++ template library for parallel algorithms like sorting,
reduction, and scanning.
- Utilized in backend operations for tensor computations requiring
parallel processing.
* - `rocWMMA <https://github.com/ROCm/rocWMMA>`_
- 1.6.0
- Accelerates warp-level matrix-multiply and matrix-accumulate to speed up matrix
multiplication (GEMM) and accumulation operations with mixed precision
support.
- Linear layers (``torch.nn.Linear``), convolutional layers
(``torch.nn.Conv2d``), attention layers, general tensor operations that
involve matrix products, such as ``torch.matmul``, ``torch.bmm``, and
more.
Supported and unsupported features
================================================================================
The following section maps GPU-accelerated PyTorch features to their supported
ROCm and PyTorch versions.
torch
--------------------------------------------------------------------------------
`torch <https://pytorch.org/docs/stable/index.html>`_ is the central module of
PyTorch, providing data structures for multi-dimensional tensors and
implementing mathematical operations on them. It also includes utilities for
efficient serialization of tensors and arbitrary data types, along with various
other tools.
Tensor data types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The data type of a tensor is specified using the ``dtype`` attribute or argument, and PyTorch supports a wide range of data types for different use cases.
The following table lists `torch.Tensor <https://pytorch.org/docs/stable/tensors.html>`_'s single data types:
.. list-table::
:header-rows: 1
* - Data type
- Description
- Since PyTorch
- Since ROCm
* - ``torch.float8_e4m3fn``
- 8-bit floating point, e4m3
- 2.3
- 5.5
* - ``torch.float8_e5m2``
- 8-bit floating point, e5m2
- 2.3
- 5.5
* - ``torch.float16`` or ``torch.half``
- 16-bit floating point
- 0.1.6
- 2.0
* - ``torch.bfloat16``
- 16-bit floating point
- 1.6
- 2.6
* - ``torch.float32`` or ``torch.float``
- 32-bit floating point
- 0.1.12_2
- 2.0
* - ``torch.float64`` or ``torch.double``
- 64-bit floating point
- 0.1.12_2
- 2.0
* - ``torch.complex32`` or ``torch.chalf``
- PyTorch provides native support for 32-bit complex numbers
- 1.6
- 2.0
* - ``torch.complex64`` or ``torch.cfloat``
- PyTorch provides native support for 64-bit complex numbers
- 1.6
- 2.0
* - ``torch.complex128`` or ``torch.cdouble``
- PyTorch provides native support for 128-bit complex numbers
- 1.6
- 2.0
* - ``torch.uint8``
- 8-bit integer (unsigned)
- 0.1.12_2
- 2.0
* - ``torch.uint16``
- 16-bit integer (unsigned)
- 2.3
- Not natively supported
* - ``torch.uint32``
- 32-bit integer (unsigned)
- 2.3
- Not natively supported
* - ``torch.uint64``
- 32-bit integer (unsigned)
- 2.3
- Not natively supported
* - ``torch.int8``
- 8-bit integer (signed)
- 1.12
- 5.0
* - ``torch.int16`` or ``torch.short``
- 16-bit integer (signed)
- 0.1.12_2
- 2.0
* - ``torch.int32`` or ``torch.int``
- 32-bit integer (signed)
- 0.1.12_2
- 2.0
* - ``torch.int64`` or ``torch.long``
- 64-bit integer (signed)
- 0.1.12_2
- 2.0
* - ``torch.bool``
- Boolean
- 1.2
- 2.0
* - ``torch.quint8``
- Quantized 8-bit integer (unsigned)
- 1.8
- 5.0
* - ``torch.qint8``
- Quantized 8-bit integer (signed)
- 1.8
- 5.0
* - ``torch.qint32``
- Quantized 32-bit integer (signed)
- 1.8
- 5.0
* - ``torch.quint4x2``
- Quantized 4-bit integer (unsigned)
- 1.8
- 5.0
.. note::
Unsigned types aside from ``uint8`` are currently only have limited support in
eager mode (they primarily exist to assist usage with ``torch.compile``).
The :doc:`ROCm precision support page <rocm:reference/precision-support>`
collected the native HW support of different data types.
torch.cuda
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``torch.cuda`` in PyTorch is a module that provides utilities and functions for
managing and utilizing AMD and NVIDIA GPUs. It enables GPU-accelerated
computations, memory management, and efficient execution of tensor operations,
leveraging ROCm and CUDA as the underlying frameworks.
.. list-table::
:header-rows: 1
* - Data type
- Description
- Since PyTorch
- Since ROCm
* - Device management
- Utilities for managing and interacting with GPUs.
- 0.4.0
- 3.8
* - Tensor operations on GPU
- Perform tensor operations such as addition and matrix multiplications on
the GPU.
- 0.4.0
- 3.8
* - Streams and events
- Streams allow overlapping computation and communication for optimized
performance, events enable synchronization.
- 1.6.0
- 3.8
* - Memory management
- Functions to manage and inspect memory usage like
``torch.cuda.memory_allocated()``, ``torch.cuda.max_memory_allocated()``,
``torch.cuda.memory_reserved()`` and ``torch.cuda.empty_cache()``.
- 0.3.0
- 1.9.2
* - Running process lists of memory management
- Return a human-readable printout of the running processes and their GPU
memory use for a given device with functions like
``torch.cuda.memory_stats()`` and ``torch.cuda.memory_summary()``.
- 1.8.0
- 4.0
* - Communication collectives
- A set of APIs that enable efficient communication between multiple GPUs,
allowing for distributed computing and data parallelism.
- 1.9.0
- 5.0
* - ``torch.cuda.CUDAGraph``
- Graphs capture sequences of GPU operations to minimize kernel launch
overhead and improve performance.
- 1.10.0
- 5.3
* - TunableOp
- A mechanism that allows certain operations to be more flexible and
optimized for performance. It enables automatic tuning of kernel
configurations and other settings to achieve the best possible
performance based on the specific hardware (GPU) and workload.
- 2.0
- 5.4
* - NVIDIA Tools Extension (NVTX)
- Integration with NVTX for profiling and debugging GPU performance using
NVIDIA's Nsight tools.
- 1.8.0
- ❌
* - Lazy loading NVRTC
- Delays JIT compilation with NVRTC until the code is explicitly needed.
- 1.13.0
- ❌
* - Jiterator (beta)
- Jiterator allows asynchronous data streaming into computation streams
during training loops.
- 1.13.0
- 5.2
.. Need to validate and extend.
torch.backends.cuda
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``torch.backends.cuda`` is a PyTorch module that provides configuration options
and flags to control the behavior of CUDA or ROCm operations. It is part of the
PyTorch backend configuration system, which allows users to fine-tune how
PyTorch interacts with the CUDA or ROCm environment.
.. list-table::
:header-rows: 1
* - Data type
- Description
- Since PyTorch
- Since ROCm
* - ``cufft_plan_cache``
- Manages caching of GPU FFT plans to optimize repeated FFT computations.
- 1.7.0
- 5.0
* - ``matmul.allow_tf32``
- Enables or disables the use of TensorFloat-32 (TF32) precision for
faster matrix multiplications on GPUs with Tensor Cores.
- 1.10.0
- ❌
* - ``matmul.allow_fp16_reduced_precision_reduction``
- Reduced precision reductions (e.g., with fp16 accumulation type) are
allowed with fp16 GEMMs.
- 2.0
- ❌
* - ``matmul.allow_bf16_reduced_precision_reduction``
- Reduced precision reductions are allowed with bf16 GEMMs.
- 2.0
- ❌
* - ``enable_cudnn_sdp``
- Globally enables cuDNN SDPA's kernels within SDPA.
- 2.0
- ❌
* - ``enable_flash_sdp``
- Globally enables or disables FlashAttention for SDPA.
- 2.1
- ❌
* - ``enable_mem_efficient_sdp``
- Globally enables or disables Memory-Efficient Attention for SDPA.
- 2.1
- ❌
* - ``enable_math_sdp``
- Globally enables or disables the PyTorch C++ implementation within SDPA.
- 2.1
- ❌
.. Need to validate and extend.
torch.backends.cudnn
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Supported ``torch`` options:
.. list-table::
:header-rows: 1
* - Data type
- Description
- Since PyTorch
- Since ROCm
* - ``allow_tf32``
- TensorFloat-32 tensor cores may be used in cuDNN convolutions on NVIDIA
Ampere or newer GPUs.
- 1.12.0
- ❌
* - ``deterministic``
- A bool that, if True, causes cuDNN to only use deterministic
convolution algorithms.
- 1.12.0
- 6.0
Automatic mixed precision: torch.amp
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PyTorch that automates the process of using both 16-bit (half-precision,
float16) and 32-bit (single-precision, float32) floating-point types in model
training and inference.
.. list-table::
:header-rows: 1
* - Data type
- Description
- Since PyTorch
- Since ROCm
* - Autocasting
- Instances of autocast serve as context managers or decorators that allow
regions of your script to run in mixed precision.
- 1.9
- 2.5
* - Gradient scaling
- To prevent underflow, “gradient scaling” multiplies the networks
loss(es) by a scale factor and invokes a backward pass on the scaled
loss(es). Gradients flowing backward through the network are then
scaled by the same factor. In other words, gradient values have a
larger magnitude, so they dont flush to zero.
- 1.9
- 2.5
* - CUDA op-specific behavior
- These ops always go through autocasting whether they are invoked as part
of a ``torch.nn.Module``, as a function, or as a ``torch.Tensor`` method. If
functions are exposed in multiple namespaces, they go through
autocasting regardless of the namespace.
- 1.9
- 2.5
Distributed library features
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The PyTorch distributed library includes a collective of parallelism modules, a
communications layer, and infrastructure for launching and debugging large
training jobs. See :ref:`rocm-for-ai-pytorch-distributed` for more information.
The Distributed Library feature in PyTorch provides tools and APIs for building
and running distributed machine learning workflows. It allows training models
across multiple processes, GPUs, or nodes in a cluster, enabling efficient use
of computational resources and scalability for large-scale tasks.
.. list-table::
:header-rows: 1
* - Features
- Description
- Since PyTorch
- Since ROCm
* - TensorPipe
- TensorPipe is a point-to-point communication library integrated into
PyTorch for distributed training. It is designed to handle tensor data
transfers efficiently between different processes or devices, including
those on separate machines.
- 1.8
- 5.4
* - Gloo
- Gloo is designed for multi-machine and multi-GPU setups, enabling
efficient communication and synchronization between processes. Gloo is
one of the default backends for PyTorch's Distributed Data Parallel
(DDP) and RPC frameworks, alongside other backends like NCCL and MPI.
- 1.0
- 2.0
torch.compiler
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. list-table::
:header-rows: 1
* - Features
- Description
- Since PyTorch
- Since ROCm
* - ``torch.compiler`` (AOT Autograd)
- Autograd captures not only the user-level code, but also backpropagation,
which results in capturing the backwards pass “ahead-of-time”. This
enables acceleration of both forwards and backwards pass using
``TorchInductor``.
- 2.0
- 5.3
* - ``torch.compiler`` (TorchInductor)
- The default ``torch.compile`` deep learning compiler that generates fast
code for multiple accelerators and backends. You need to use a backend
compiler to make speedups through ``torch.compile`` possible. For AMD,
NVIDIA, and Intel GPUs, it leverages OpenAI Triton as the key building block.
- 2.0
- 5.3
torchaudio
--------------------------------------------------------------------------------
The `torchaudio <https://pytorch.org/audio/stable/index.html>`_ library provides
utilities for processing audio data in PyTorch, such as audio loading,
transformations, and feature extraction.
To ensure GPU-acceleration with ``torchaudio.transforms``, you need to move audio
data (waveform tensor) explicitly to GPU using ``.to('cuda')``.
The following ``torchaudio`` features are GPU-accelerated.
.. list-table::
:header-rows: 1
* - Features
- Description
- Since torchaudio version
- Since ROCm
* - ``torchaudio.transforms.Spectrogram``
- Generate spectrogram of an input waveform using STFT.
- 0.6.0
- 4.5
* - ``torchaudio.transforms.MelSpectrogram``
- Generate the mel-scale spectrogram of raw audio signals.
- 0.9.0
- 4.5
* - ``torchaudio.transforms.MFCC``
- Extract of MFCC features.
- 0.9.0
- 4.5
* - ``torchaudio.transforms.Resample``
- Resample a signal from one frequency to another
- 0.9.0
- 4.5
torchvision
--------------------------------------------------------------------------------
The `torchvision <https://pytorch.org/vision/stable/index.html>`_ library
provide datasets, model architectures, and common image transformations for
computer vision.
The following ``torchvision`` features are GPU-accelerated.
.. list-table::
:header-rows: 1
* - Features
- Description
- Since torchvision version
- Since ROCm
* - ``torchvision.transforms.functional``
- Provides GPU-compatible transformations for image preprocessing like
resize, normalize, rotate and crop.
- 0.2.0
- 4.0
* - ``torchvision.ops``
- GPU-accelerated operations for object detection and segmentation tasks.
``torchvision.ops.roi_align``, ``torchvision.ops.nms`` and
``box_convert``.
- 0.6.0
- 3.3
* - ``torchvision.models`` with ``.to('cuda')``
- ``torchvision`` provides several pre-trained models (ResNet, Faster
R-CNN, Mask R-CNN, ...) that can run on CUDA for faster inference and
training.
- 0.1.6
- 2.x
* - ``torchvision.io``
- Video decoding and frame extraction using GPU acceleration with NVIDIAs
NVDEC and nvJPEG (rocJPEG) on CUDA-enabled GPUs.
- 0.4.0
- 6.3
torchtext
--------------------------------------------------------------------------------
The `torchtext <https://pytorch.org/text/stable/index.html>`_ library provides
utilities for processing and working with text data in PyTorch, including
tokenization, vocabulary management, and text embeddings. torchtext supports
preprocessing pipelines and integration with PyTorch models, simplifying the
implementation of natural language processing (NLP) tasks.
To leverage GPU acceleration in torchtext, you need to move tensors
explicitly to the GPU using ``.to('cuda')``.
* torchtext does not implement its own kernels. ROCm support is enabled by linking against ROCm libraries.
* Only official release exists.
torchtune
--------------------------------------------------------------------------------
The `torchtune <https://pytorch.org/torchtune/stable/index.html>`_ library for
authoring, fine-tuning and experimenting with LLMs.
* Usage: It works out-of-the-box, enabling developers to fine-tune ROCm PyTorch solutions.
* Only official release exists.
torchserve
--------------------------------------------------------------------------------
The `torchserve <https://pytorch.org/torchserve/>`_ is a PyTorch domain library
for common sparsity and parallelism primitives needed for large-scale recommender
systems.
* torchtext does not implement its own kernels. ROCm support is enabled by linking against ROCm libraries.
* Only official release exists.
torchrec
--------------------------------------------------------------------------------
The `torchrec <https://pytorch.org/torchrec/>`_ is a PyTorch domain library for
common sparsity and parallelism primitives needed for large-scale recommender
systems.
* torchrec does not implement its own kernels. ROCm support is enabled by linking against ROCm libraries.
* Only official release exists.
Unsupported PyTorch features
----------------------------
The following are GPU-accelerated PyTorch features not currently supported by ROCm.
.. list-table::
:widths: 30, 60, 10
:header-rows: 1
* - Data type
- Description
- Since PyTorch
* - APEX batch norm
- Use APEX batch norm instead of PyTorch batch norm.
- 1.6.0
* - ``torch.backends.cuda`` / ``matmul.allow_tf32``
- A bool that controls whether TensorFloat-32 tensor cores may be used in
matrix multiplications.
- 1.7
* - ``torch.cuda`` / NVIDIA Tools Extension (NVTX)
- Integration with NVTX for profiling and debugging GPU performance using
NVIDIA's Nsight tools.
- 1.7.0
* - ``torch.cuda`` / Lazy loading NVRTC
- Delays JIT compilation with NVRTC until the code is explicitly needed.
- 1.8.0
* - ``torch-tensorrt``
- Integrate TensorRT library for optimizing and deploying PyTorch models.
ROCm does not have equialent library for TensorRT.
- 1.9.0
* - ``torch.backends`` / ``cudnn.allow_tf32``
- TensorFloat-32 tensor cores may be used in cuDNN convolutions.
- 1.10.0
* - ``torch.backends.cuda`` / ``matmul.allow_fp16_reduced_precision_reduction``
- Reduced precision reductions with fp16 accumulation type are
allowed with fp16 GEMMs.
- 2.0
* - ``torch.backends.cuda`` / ``matmul.allow_bf16_reduced_precision_reduction``
- Reduced precision reductions are allowed with bf16 GEMMs.
- 2.0
* - ``torch.nn.functional`` / ``scaled_dot_product_attention``
- Flash attention backend for SDPA to accelerate attention computation in
transformer-based models.
- 2.0
* - ``torch.backends.cuda`` / ``enable_cudnn_sdp``
- Globally enables cuDNN SDPA's kernels within SDPA.
- 2.0
* - ``torch.backends.cuda`` / ``enable_flash_sdp``
- Globally enables or disables FlashAttention for SDPA.
- 2.1
* - ``torch.backends.cuda`` / ``enable_mem_efficient_sdp``
- Globally enables or disables Memory-Efficient Attention for SDPA.
- 2.1
* - ``torch.backends.cuda`` / ``enable_math_sdp``
- Globally enables or disables the PyTorch C++ implementation within SDPA.
- 2.1
* - Dynamic parallelism
- PyTorch itself does not directly expose dynamic parallelism as a core
feature. Dynamic parallelism allow GPU threads to launch additional
threads which can be reached using custom operations via the
``torch.utils.cpp_extension`` module.
- Not a core feature
* - Unified memory support in PyTorch
- Unified Memory is not directly exposed in PyTorch's core API, it can be
utilized effectively through custom CUDA extensions or advanced
workflows.
- Not a core feature
Use cases and recommendations
================================================================================
* :doc:`Using ROCm for AI: training a model </how-to/rocm-for-ai/train-a-model>` provides
guidance on how to leverage the ROCm platform for training AI models. It covers the steps, tools, and best practices
for optimizing training workflows on AMD GPUs using PyTorch features.
* :doc:`Single-GPU fine-tuning and inference </how-to/llm-fine-tuning-optimization/single-gpu-fine-tuning-and-inference>`
describes and demonstrates how to use the ROCm platform for the fine-tuning and inference of
machine learning models, particularly large language models (LLMs), on systems with a single AMD
Instinct MI300X accelerator. This page provides a detailed guide for setting up, optimizing, and
executing fine-tuning and inference workflows in such environments.
* :doc:`Multi-GPU fine-tuning and inference optimization </how-to/llm-fine-tuning-optimization/multi-gpu-fine-tuning-and-inference>`
describes and demonstrates the fine-tuning and inference of machine learning models on systems
with multi MI300X accelerators.
* The :doc:`Instinct MI300X workload optimization guide </how-to/tuning-guides/mi300x/workload>` provides detailed
guidance on optimizing workloads for the AMD Instinct MI300X accelerator using ROCm. This guide is aimed at helping
users achieve optimal performance for deep learning and other high-performance computing tasks on the MI300X
accelerator.
* The :doc:`Inception with PyTorch documentation </conceptual/ai-pytorch-inception>`
describes how PyTorch integrates with ROCm for AI workloads It outlines the use of PyTorch on the ROCm platform and
focuses on how to efficiently leverage AMD GPU hardware for training and inference tasks in AI applications.
For more use cases and recommendations, see `ROCm PyTorch blog posts <https://rocm.blogs.amd.com/blog/tag/pytorch.html>`_

View File

@@ -0,0 +1,156 @@
.. meta::
:description: How ROCm uses PCIe atomics
:keywords: PCIe, PCIe atomics, atomics, BAR memory, AMD, ROCm
*****************************************************************************
How ROCm uses PCIe atomics
*****************************************************************************
ROCm PCIe feature and overview of BAR memory
================================================================
ROCm is an extension of HSA platform architecture, so it shares the queuing model, memory model,
signaling and synchronization protocols. Platform atomics are integral to perform queuing and
signaling memory operations where there may be multiple-writers across CPU and GPU agents.
The full list of HSA system architecture platform requirements are here:
`HSA Sys Arch Features <http://hsafoundation.com/wp-content/uploads/2021/02/HSA-SysArch-1.2.pdf>`_.
AMD ROCm Software uses the new PCI Express 3.0 (Peripheral Component Interconnect Express [PCIe]
3.0) features for atomic read-modify-write transactions which extends inter-processor synchronization
mechanisms to IO to support the defined set of HSA capabilities needed for queuing and signaling
memory operations.
The new PCIe atomic operations operate as completers for ``CAS`` (Compare and Swap), ``FetchADD``,
``SWAP`` atomics. The atomic operations are initiated by the I/O device which support 32-bit, 64-bit and
128-bit operand which target address have to be naturally aligned to operation sizes.
For ROCm the Platform atomics are used in ROCm in the following ways:
* Update HSA queue's read_dispatch_id: 64 bit atomic add used by the command processor on the
GPU agent to update the packet ID it processed.
* Update HSA queue's write_dispatch_id: 64 bit atomic add used by the CPU and GPU agent to
support multi-writer queue insertions.
* Update HSA Signals -- 64bit atomic ops are used for CPU & GPU synchronization.
The PCIe 3.0 atomic operations feature allows atomic transactions to be requested by, routed through
and completed by PCIe components. Routing and completion does not require software support.
Component support for each is detectable via the Device Capabilities 2 (DevCap2) register. Upstream
bridges need to have atomic operations routing enabled or the atomic operations will fail even though
PCIe endpoint and PCIe I/O devices has the capability to atomic operations.
To do atomic operations routing capability between two or more Root Ports, each associated Root Port
must indicate that capability via the atomic operations routing supported bit in the DevCap2 register.
If your system has a PCIe Express Switch it needs to support atomic operations routing. Atomic
operations requests are permitted only if a component's ``DEVCTL2.ATOMICOP_REQUESTER_ENABLE``
field is set. These requests can only be serviced if the upstream components support atomic operation
completion and/or routing to a component which does. Atomic operations routing support=1, routing
is supported; atomic operations routing support=0, routing is not supported.
An atomic operation is a non-posted transaction supporting 32-bit and 64-bit address formats, there
must be a response for Completion containing the result of the operation. Errors associated with the
operation (uncorrectable error accessing the target location or carrying out the atomic operation) are
signaled to the requester by setting the Completion Status field in the completion descriptor, they are
set to to Completer Abort (CA) or Unsupported Request (UR).
To understand more about how PCIe atomic operations work, see
`PCIe atomics <https://pcisig.com/specifications/pciexpress/specifications/ECN_Atomic_Ops_080417.pdf>`_
`Linux Kernel Patch to pci_enable_atomic_request <https://patchwork.kernel.org/project/linux-pci/patch/1443110390-4080-1-git-send-email-jay@jcornwall.me/>`_
There are also a number of papers which talk about these new capabilities:
* `Atomic Read Modify Write Primitives by Intel <https://www.intel.es/content/dam/doc/white-paper/atomic-read-modify-write-primitives-i-o-devices-paper.pdf>`_
* `PCI express 3 Accelerator White paper by Intel <https://www.intel.sg/content/dam/doc/white-paper/pci-express3-accelerator-white-paper.pdf>`_
* `PCIe Generation 4 Base Specification includes atomic operations <https://astralvx.com/storage/2020/11/PCI_Express_Base_4.0_Rev0.3_February19-2014.pdf>`_
* `Xilinx PCIe Ultrascale White paper <https://docs.xilinx.com/v/u/8OZSA2V1b1LLU2rRCDVGQw>`_
Other I/O devices with PCIe atomics support:
* Mellanox ConnectX-5 InfiniBand Card
* Cray Aries Interconnect
* Xilinx 7 Series Devices
Future bus technology with richer I/O atomics operation Support
* GenZ
New PCIe Endpoints with support beyond AMD Ryzen and EPYC CPU; Intel Haswell or newer CPUs
with PCIe Generation 3.0 support.
* Mellanox Bluefield SOC
* Cavium Thunder X2
In ROCm, we also take advantage of PCIe ID based ordering technology for P2P when the GPU
originates two writes to two different targets:
* Write to another GPU memory
* Write to system memory to indicate transfer complete
They are routed off to different ends of the computer but we want to make sure the write to system
memory to indicate transfer complete occurs AFTER P2P write to GPU has complete.
BAR memory overview
----------------------------------------------------------------------------------------------------
On a Xeon E5 based system in the BIOS we can turn on above 4GB PCIe addressing, if so he need to set
memory-mapped input/output (MMIO) base address (MMIOH base) and range (MMIO high size) in the BIOS.
In the Supermicro system in the system bios you need to see the following
* Advanced->PCIe/PCI/PnP configuration-\> Above 4G Decoding = Enabled
* Advanced->PCIe/PCI/PnP Configuration-\>MMIOH Base = 512G
* Advanced->PCIe/PCI/PnP Configuration-\>MMIO High Size = 256G
When we support Large Bar Capability there is a Large Bar VBIOS which also disable the IO bar.
For GFX9 and Vega10 which have Physical Address up 44 bit and 48 bit Virtual address.
* BAR0-1 registers: 64bit, prefetchable, GPU memory. 8GB or 16GB depending on Vega10 SKU. Must
be placed < 2^44 to support P2P access from other Vega10.
* BAR2-3 registers: 64bit, prefetchable, Doorbell. Must be placed \< 2^44 to support P2P access from
other Vega10.
* BAR4 register: Optional, not a boot device.
* BAR5 register: 32bit, non-prefetchable, MMIO. Must be placed \< 4GB.
Here is how our base address register (BAR) works on GFX 8 GPUs with 40 bit Physical Address Limit ::
11:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Fiji [Radeon R9 FURY / NANO
Series] (rev c1)
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b35
Flags: bus master, fast devsel, latency 0, IRQ 119
Memory at bf40000000 (64-bit, prefetchable) [size=256M]
Memory at bf50000000 (64-bit, prefetchable) [size=2M]
I/O ports at 3000 [size=256]
Memory at c7400000 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at c7440000 [disabled] [size=128K]
Legend:
1 : GPU Frame Buffer BAR -- In this example it happens to be 256M, but typically this will be size of the
GPU memory (typically 4GB+). This BAR has to be placed \< 2^40 to allow peer-to-peer access from
other GFX8 AMD GPUs. For GFX9 (Vega GPU) the BAR has to be placed \< 2^44 to allow peer-to-peer
access from other GFX9 AMD GPUs.
2 : Doorbell BAR -- The size of the BAR is typically will be \< 10MB (currently fixed at 2MB) for this
generation GPUs. This BAR has to be placed \< 2^40 to allow peer-to-peer access from other current
generation AMD GPUs.
3 : IO BAR -- This is for legacy VGA and boot device support, but since this the GPUs in this project are
not VGA devices (headless), this is not a concern even if the SBIOS does not setup.
4 : MMIO BAR -- This is required for the AMD Driver SW to access the configuration registers. Since the
reminder of the BAR available is only 1 DWORD (32bit), this is placed \< 4GB. This is fixed at 256KB.
5 : Expansion ROM -- This is required for the AMD Driver SW to access the GPU video-bios. This is
currently fixed at 128KB.
For more information, you can review
`Overview of Changes to PCI Express 3.0 <https://www.mindshare.com/files/resources/PCIe%203-0.pdf>`_.

View File

@@ -53,7 +53,7 @@ The following sections contain case studies for the Inception V3 model.
### Inception V3 with PyTorch
Convolution Neural Networks are forms of artificial neural networks commonly used for image processing. One of the core layers of such a network is the convolutional layer, which convolves the input with a weight tensor and passes the result to the next layer. Inception V3 is an architectural development over the ImageNet competition-winning entry, AlexNet, using more profound and broader networks while attempting to meet computational and memory budgets.
Convolution Neural Networks are forms of artificial neural networks commonly used for image processing. One of the core layers of such a network is the convolutional layer, which convolves the input with a weight tensor and passes the result to the next layer. Inception V3[^inception_arch] is an architectural development over the ImageNet competition-winning entry, AlexNet, using more profound and broader networks while attempting to meet computational and memory budgets.
The implementation uses PyTorch as a framework. This case study utilizes [TorchVision](https://pytorch.org/vision/stable/index.html), a repository of popular datasets and model architectures, for obtaining the model. TorchVision also provides pre-trained weights as a starting point to develop new models or fine-tune the model for a new task.
@@ -162,7 +162,7 @@ Follow these steps:
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G rocm/pytorch:latest
```
2. Download an ImageNet database. For this example, the `tiny-imagenet-200`, a smaller ImageNet variant with 200 image classes and a training dataset with 100,000 images, was downsized to 64x64 color images.
2. Download an ImageNet database. For this example, the `tiny-imagenet-200`[^Stanford_deep_learning], a smaller ImageNet variant with 200 image classes and a training dataset with 100,000 images, was downsized to 64x64 color images.
```bash
wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
@@ -366,7 +366,7 @@ Follow these steps:
model.to(device)
```
13. Set the loss criteria. For this example, Cross Entropy Loss is used.
13. Set the loss criteria. For this example, Cross Entropy Loss[^cross_entropy] is used.
```py
criterion = torch.nn.CrossEntropyLoss()
@@ -586,7 +586,7 @@ Follow these steps:
import torch.optim as optim
```
10. Set the loss criteria. For this example, Cross Entropy Loss is used.
10. Set the loss criteria. For this example, Cross Entropy Loss[^cross_entropy] is used.
```py
criterion = nn.CrossEntropyLoss()

View File

@@ -1,9 +1,8 @@
---
myst:
html_meta:
"description lang=en": "Learn about the AMD Instinct MI100 series architecture."
"keywords": "Instinct, MI100, microarchitecture, AMD, ROCm"
---
<head>
<meta charset="UTF-8">
<meta name="description" content="AMD Instinct MI100 microarchitecture">
<meta name="keywords" content="Instinct, MI100, microarchitecture, AMD, ROCm">
</head>
# AMD Instinct™ MI100 microarchitecture

View File

@@ -1,9 +1,8 @@
---
myst:
html_meta:
"description lang=en": "Learn about the AMD Instinct MI250 series architecture."
"keywords": "Instinct, MI250, microarchitecture, AMD, ROCm"
---
<head>
<meta charset="UTF-8">
<meta name="description" content="AMD Instinct MI250 microarchitecture">
<meta name="keywords" content="Instinct, MI250, microarchitecture, AMD, ROCm">
</head>
# AMD Instinct™ MI250 microarchitecture

View File

@@ -615,6 +615,7 @@ The following table shows the hardware counters *by* all texture addressing unit
"``TA_FLAT_READ_WAVEFRONTS_sum``", "Sum of flat opcode reads processed"
"``TA_FLAT_WRITE_WAVEFRONTS_sum``", "Sum of flat opcode writes processed"
"``TA_FLAT_WAVEFRONTS_sum``", "Total number of flat opcode wavefronts processed"
"``TA_FLAT_READ_WAVEFRONTS_sum``", "Total number of flat opcode read wavefronts processed"
"``TA_FLAT_ATOMIC_WAVEFRONTS_sum``", "Total number of flat opcode atomic wavefronts processed"
"``TA_TOTAL_WAVEFRONTS_sum``", "Total number of wavefronts processed"

View File

@@ -1,10 +1,3 @@
---
myst:
html_meta:
"description lang=en": "Learn about the AMD Instinct MI300 series architecture."
"keywords": "Instinct, MI300X, MI300A, microarchitecture, AMD, ROCm"
---
# AMD Instinct™ MI300 series microarchitecture
The AMD Instinct MI300 series accelerators are based on the AMD CDNA 3

View File

@@ -0,0 +1,241 @@
<head>
<meta charset="UTF-8">
<meta name="description" content="GPU memory">
<meta name="keywords" content="GPU memory, VRAM, video random access memory, pageable
memory, pinned memory, managed memory, AMD, ROCm">
</head>
# GPU memory
For the HIP reference documentation, see:
* {doc}`hip:doxygen/html/group___memory`
* {doc}`hip:doxygen/html/group___memory_m`
Host memory exists on the host (e.g. CPU) of the machine in random access memory (RAM).
Device memory exists on the device (e.g. GPU) of the machine in video random access memory (VRAM).
Recent architectures use graphics double data rate (GDDR) synchronous dynamic random-access memory (SDRAM)such as GDDR6, or high-bandwidth memory (HBM) such as HBM2e.
## Memory allocation
Memory can be allocated in two ways: pageable memory, and pinned memory.
The following API calls with result in these allocations:
| API | Data location | Allocation |
|--------------------|---------------|------------|
| System allocated | Host | Pageable |
| `hipMallocManaged` | Host | Managed |
| `hipHostMalloc` | Host | Pinned |
| `hipMalloc` | Device | Pinned |
:::{tip}
`hipMalloc` and `hipFree` are blocking calls, however, HIP recently added non-blocking versions `hipMallocAsync` and `hipFreeAsync` which take in a stream as an additional argument.
:::
### Pageable memory
Pageable memory is usually gotten when calling `malloc` or `new` in a C++ application.
It is unique in that it exists on "pages" (blocks of memory), which can be migrated to other memory storage.
For example, migrating memory between CPU sockets on a motherboard, or a system that runs out of space in RAM and starts dumping pages of RAM into the swap partition of your hard drive.
### Pinned memory
Pinned memory (or page-locked memory, or non-pageable memory) is host memory that is mapped into the address space of all GPUs, meaning that the pointer can be used on both host and device.
Accessing host-resident pinned memory in device kernels is generally not recommended for performance, as it can force the data to traverse the host-device interconnect (e.g. PCIe), which is much slower than the on-device bandwidth (>40x on MI200).
Pinned host memory can be allocated with one of two types of coherence support:
:::{note}
In HIP, pinned memory allocations are coherent by default (`hipHostMallocDefault`).
There are additional pinned memory flags (e.g. `hipHostMallocMapped` and `hipHostMallocPortable`).
On MI200 these options do not impact performance.
<!-- TODO: link to programming_manual#memory-allocation-flags -->
For more information, see the section *memory allocation flags* in the HIP Programming Guide: {doc}`hip:how-to/programming_manual`.
:::
Much like how a process can be locked to a CPU core by setting affinity, a pinned memory allocator does this with the memory storage system.
On multi-socket systems it is important to ensure that pinned memory is located on the same socket as the owning process, or else each cache line will be moved through the CPU-CPU interconnect, thereby increasing latency and potentially decreasing bandwidth.
In practice, pinned memory is used to improve transfer times between host and device.
For transfer operations, such as `hipMemcpy` or `hipMemcpyAsync`, using pinned memory instead of pageable memory on host can lead to a ~3x improvement in bandwidth.
:::{tip}
If the application needs to move data back and forth between device and host (separate allocations), use pinned memory on the host side.
:::
### Managed memory
Managed memory refers to universally addressable, or unified memory available on the MI200 series of GPUs.
Much like pinned memory, managed memory shares a pointer between host and device and (by default) supports fine-grained coherence, however, managed memory can also automatically migrate pages between host and device.
The allocation will be managed by AMD GPU driver using the Linux HMM (Heterogeneous Memory Management) mechanism.
If heterogenous memory management (HMM) is not available, then `hipMallocManaged` will default back to using system memory and will act like pinned host memory.
Other managed memory API calls will have undefined behavior.
It is therefore recommended to check for managed memory capability with: `hipDeviceGetAttribute` and `hipDeviceAttributeManagedMemory`.
HIP supports additional calls that work with page migration:
* `hipMemAdvise`
* `hipMemPrefetchAsync`
:::{tip}
If the application needs to use data on both host and device regularly, does not want to deal with separate allocations, and is not worried about maxing out the VRAM on MI200 GPUs (64 GB per GCD), use managed memory.
:::
:::{tip}
If managed memory performance is poor, check to see if managed memory is supported on your system and if page migration (XNACK) is enabled.
:::
## Access behavior
Memory allocations for GPUs behave as follow:
| API | Data location | Host access | Device access |
|--------------------|---------------|--------------|----------------------|
| System allocated | Host | Local access | Unhandled page fault |
| `hipMallocManaged` | Host | Local access | Zero-copy |
| `hipHostMalloc` | Host | Local access | Zero-copy* |
| `hipMalloc` | Device | Zero-copy | Local access |
Zero-copy accesses happen over the Infinity Fabric interconnect or PCI-E lanes on discrete GPUs.
:::{note}
While `hipHostMalloc` allocated memory is accessible by a device, the host pointer must be converted to a device pointer with `hipHostGetDevicePointer`.
Memory allocated through standard system allocators such as `malloc`, can be accessed a device by registering the memory via `hipHostRegister`.
The device pointer to be used in kernels can be retrieved with `hipHostGetDevicePointer`.
Registered memory is treated like `hipHostMalloc` and will have similar performance.
On devices that support and have [](#xnack) enabled, such as the MI250X, `hipHostRegister` is not required as memory accesses are handled via automatic page migration.
:::
### XNACK
Normally, host and device memory are separate and data has to be transferred manually via `hipMemcpy`.
On a subset of GPUs, such as the MI200, there is an option to automatically migrate pages of memory between host and device.
This is important for managed memory, where the locality of the data is important for performance.
Depending on the system, page migration may be disabled by default in which case managed memory will act like pinned host memory and suffer degraded performance.
*XNACK* describes the GPUs ability to retry memory accesses that failed due a page fault (which normally would lead to a memory access error), and instead retrieve the missing page.
This also affects memory allocated by the system as indicated by the following table:
| API | Data location | Host after device access | Device after host access |
|--------------------|---------------|--------------------------|--------------------------|
| System allocated | Host | Migrate page to host | Migrate page to device |
| `hipMallocManaged` | Host | Migrate page to host | Migrate page to device |
| `hipHostMalloc` | Host | Local access | Zero-copy |
| `hipMalloc` | Device | Zero-copy | Local access |
To check if page migration is available on a platform, use `rocminfo`:
```sh
$ rocminfo | grep xnack
Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack-
```
Here, `xnack-` means that XNACK is available but is disabled by default.
Turning on XNACK by setting the environment variable `HSA_XNACK=1` and gives the expected result, `xnack+`:
```sh
$ HSA_XNACK=1 rocminfo | grep xnack
Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack+
```
`hipcc`by default will generate code that runs correctly with both XNACK enabled or disabled.
Setting the `--offload-arch=`-option with `xnack+` or `xnack-` forces code to be only run with XNACK enabled or disabled respectively.
```sh
# Compiled kernels will run regardless if XNACK is enabled or is disabled.
hipcc --offload-arch=gfx90a
# Compiled kernels will only be run if XNACK is enabled with XNACK=1.
hipcc --offload-arch=gfx90a:xnack+
# Compiled kernels will only be run if XNACK is disabled with XNACK=0.
hipcc --offload-arch=gfx90a:xnack-
```
:::{tip}
If you want to make use of page migration, use managed memory. While pageable memory will migrate correctly, it is not a portable solution and can have performance issues if the accessed data isn't page aligned.
:::
### Coherence
* *Coarse-grained coherence* means that memory is only considered up to date at kernel boundaries, which can be enforced through `hipDeviceSynchronize`, `hipStreamSynchronize`, or any blocking operation that acts on the null stream (e.g. `hipMemcpy`).
For example, cacheable memory is a type of coarse-grained memory where an up-to-date copy of the data can be stored elsewhere (e.g. in an L2 cache).
* *Fine-grained coherence* means the coherence is supported while a CPU/GPU kernel is running.
This can be useful if both host and device are operating on the same dataspace using system-scope atomic operations (e.g. updating an error code or flag to a buffer).
Fine-grained memory implies that up-to-date data may be made visible to others regardless of kernel boundaries as discussed above.
| API | Flag | Coherence |
|-------------------------|------------------------------|----------------|
| `hipHostMalloc` | `hipHostMallocDefault` | Fine-grained |
| `hipHostMalloc` | `hipHostMallocNonCoherent` | Coarse-grained |
| API | Flag | Coherence |
|-------------------------|------------------------------|----------------|
| `hipExtMallocWithFlags` | `hipDeviceMallocDefault` | Coarse-grained |
| `hipExtMallocWithFlags` | `hipDeviceMallocFinegrained` | Fine-grained |
| API | `hipMemAdvise` argument | Coherence |
|-------------------------|------------------------------|----------------|
| `hipMallocManaged` | | Fine-grained |
| `hipMallocManaged` | `hipMemAdviseSetCoarseGrain` | Coarse-grained |
| `malloc` | | Fine-grained |
| `malloc` | `hipMemAdviseSetCoarseGrain` | Coarse-grained |
:::{tip}
Try to design your algorithms to avoid host-device memory coherence (e.g. system scope atomics). While it can be a useful feature in very specific cases, it is not supported on all systems, and can negatively impact performance by introducing the host-device interconnect bottleneck.
:::
The availability of fine- and coarse-grained memory pools can be checked with `rocminfo`:
```sh
$ rocminfo
...
*******
Agent 1
*******
Name: AMD EPYC 7742 64-Core Processor
...
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
...
Pool 3
Segment: GLOBAL; FLAGS: COARSE GRAINED
...
*******
Agent 9
*******
Name: gfx90a
...
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
...
```
## System direct memory access
In most cases, the default behavior for HIP in transferring data from a pinned host allocation to device will run at the limit of the interconnect.
However, there are certain cases where the interconnect is not the bottleneck.
The primary way to transfer data onto and off of a GPU, such as the MI200, is to use the onboard System Direct Memory Access engine, which is used to feed blocks of memory to the off-device interconnect (either GPU-CPU or GPU-GPU).
Each GCD has a separate SDMA engine for host-to-device and device-to-host memory transfers.
Importantly, SDMA engines are separate from the computing infrastructure, meaning that memory transfers to and from a device will not impact kernel compute performance, though they do impact memory bandwidth to a limited extent.
The SDMA engines are mainly tuned for PCIe-4.0 x16, which means they are designed to operate at bandwidths up to 32 GB/s.
:::{note}
An important feature of the MI250X platform is the Infinity Fabric™ interconnect between host and device.
The Infinity Fabric interconnect supports improved performance over standard PCIe-4.0 (usually ~50% more bandwidth); however, since the SDMA engine does not run at this speed, it will not max out the bandwidth of the faster interconnect.
:::
The bandwidth limitation can be countered by bypassing the SDMA engine and replacing it with a type of copy kernel known as a "blit" kernel.
Blit kernels will use the compute units on the GPU, thereby consuming compute resources, which may not always be beneficial.
The easiest way to enable blit kernels is to set an environment variable `HSA_ENABLE_SDMA=0`, which will disable the SDMA engine.
On systems where the GPU uses a PCIe interconnect instead of an Infinity Fabric interconnect, blit kernels will not impact bandwidth, but will still consume compute resources.
The use of SDMA vs blit kernels also applies to MPI data transfers and GPU-GPU transfers.

View File

@@ -1,57 +0,0 @@
.. meta::
:description: How ROCm uses PCIe atomics
:keywords: PCIe, PCIe atomics, atomics, Atomic operations, AMD, ROCm
*****************************************************************************
How ROCm uses PCIe atomics
*****************************************************************************
AMD ROCm is an extension of the Heterogeneous System Architecture (HSA). To meet the requirements of an HSA-compliant system, ROCm supports queuing models, memory models, and signaling and synchronization protocols. ROCm can perform atomic Read-Modify-Write (RMW) transactions that extend inter-processor synchronization mechanisms to Input/Output (I/O) devices starting from Peripheral Component Interconnect Express 3.0 (PCIe™ 3.0). It supports the defined HSA capabilities for queuing and signaling memory operations. To learn more about the requirements of an HSA-compliant system, see the
`HSA Platform System Architecture Specification <http://hsafoundation.com/wp-content/uploads/2021/02/HSA-SysArch-1.2.pdf>`_.
ROCm uses platform atomics to perform memory operations like queuing, signaling, and synchronization across multiple CPU, GPU agents, and I/O devices. Platform atomics ensure that atomic operations run synchronously, without interruptions or conflicts, across multiple shared resources.
Platform atomics in ROCm
==============================
Platform atomics enable the set of atomic operations that perform RMW actions across multiple processors, devices, and memory locations so that they run synchronously without interruption. An atomic operation is a sequence of computing instructions run as a single, indivisible unit. These instructions are completed in their entirety without any interruptions. If the instructions can't be completed as a unit without interruption, none of the instructions are run. These operations support 32-bit and 64-bit address formats.
Some of the operations for which ROCm uses platform atomics are:
* Update the HSA queue's ``read_dispatch_id``. The command processor on the GPU agent uses a 64-bit atomic add operation. It updates the packet ID it processed.
* Update the HSA queue's ``write_dispatch_id``. The CPU and GPU agents use a 64-bit atomic add operation. It supports multi-writer queue insertions.
* Update HSA Signals. A 64-bit atomic operation is used for CPU & GPU synchronization.
PCIe for atomic operations
----------------------------
ROCm requires CPUs that support PCIe atomics. Similarly, all connected I/O devices should also support PCIe atomics for optimum compatibility. PCIe supports the ``CAS`` (Compare and Swap), ``FetchADD``, and ``SWAP`` atomic operations across multiple resources. These atomic operations are initiated by the I/O devices that support 32-bit, 64-bit, and 128-bit operands. Likewise, the target memory address where these atomic operations are performed should also be aligned to the size of the operand. This alignment ensures that the operations are performed efficiently and correctly without failure.
When an atomic operation is successful, the requester receives a response of completion along with the operation result. However, any errors associated with the operation are signaled to the requester by updating the Completion Status field. Issues accessing the target location or running the atomic operation are common errors. Depending upon the error, the Completion Status field is updated to Completer Abort (CA) or Unsupported Request (UR). The field is present in the Completion Descriptor.
To learn more about the industry standards and specifications of PCIe, see `PCI-SIG Specification <https://pcisig.com/specifications>`_.
To learn more about PCIe and its capabilities, consult the following white papers:
* `Atomic Read Modify Write Primitives by Intel <https://www.intel.es/content/dam/doc/white-paper/atomic-read-modify-write-primitives-i-o-devices-paper.pdf>`_
* `PCI Express 3 Accelerator White paper by Intel <https://www.intel.sg/content/dam/doc/white-paper/pci-express3-accelerator-white-paper.pdf>`_
* `PCIe Generation 4 Base Specification includes atomic operations <https://astralvx.com/storage/2020/11/PCI_Express_Base_4.0_Rev0.3_February19-2014.pdf>`_
* `Xilinx PCIe Ultrascale White paper <https://docs.xilinx.com/v/u/8OZSA2V1b1LLU2rRCDVGQw>`_
Working with PCIe 3.0 in ROCm
-------------------------------
Starting with PCIe 3.0, atomic operations can be requested, routed through, and completed by PCIe components. Routing and completion do not require software support. Component support for each can be identified by the Device Capabilities 2 (DevCap2) register. Upstream
bridges need to have atomic operations routing enabled. If not enabled, the atomic operations will fail even if the
PCIe endpoint and PCIe I/O devices can perform atomic operations.
If your system uses PCIe switches to connect and enable communication between multiple PCIe components, the switches must also support atomic operations routing.
To enable atomic operations routing between multiple root ports, each root port must support atomic operation routing. This capability can be identified from the atomic operations routing support bit in the DevCap2 register. If the bit has value of 1, routing is supported. Atomic operation requests are permitted only if a component's ``DEVCTL2.ATOMICOP_REQUESTER_ENABLE``
field is set. These requests can only be serviced if the upstream components also support atomic operation completion or if the requests can be routed to a component that supports atomic operation completion.
ROCm uses the PCIe-ID-based ordering technology for peer-to-peer (P2P) data transmission. PCIe-ID-based ordering technology is used when the GPU initiates multiple write operations to different memory locations.
For more information on changes implemented in PCIe 3.0, see `Overview of Changes to PCI Express 3.0 <https://www.mindshare.com/files/resources/PCIe%203-0.pdf>`_.

View File

@@ -29,52 +29,60 @@ if os.environ.get("READTHEDOCS", "") == "True":
# configurations for PDF output by Read the Docs
project = "ROCm Documentation"
author = "Advanced Micro Devices, Inc."
copyright = "Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved."
version = "6.3.2"
release = "6.3.2"
copyright = "Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved."
version = "6.3.0"
release = "6.3.0"
setting_all_article_info = True
all_article_info_os = ["linux", "windows"]
all_article_info_author = ""
# pages with specific settings
article_pages = [
{"file": "about/release-notes", "os": ["linux"], "date": "2025-01-28"},
{"file": "compatibility/compatibility-matrix", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/pytorch-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/tensorflow-compatibility", "os": ["linux"]},
{"file": "compatibility/ml-compatibility/jax-compatibility", "os": ["linux"]},
{"file": "about/release-notes", "os": ["linux", "windows"], "date": "2024-12-03"},
{"file": "how-to/deep-learning-rocm", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/train-a-model", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/prerequisite-system-validation", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/train-a-model/benchmark-docker/megatron-lm", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/train-a-model/benchmark-docker/pytorch-training", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/training/scale-model-training", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/fine-tuning/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/fine-tuning/overview", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/fine-tuning/fine-tuning-and-inference", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/fine-tuning/single-gpu-fine-tuning-and-inference", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/fine-tuning/multi-gpu-fine-tuning-and-inference", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/install", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/hugging-face-models", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/llm-inference-frameworks", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/vllm-benchmark", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference/deploy-your-model", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference-optimization/index", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference-optimization/model-quantization", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference-optimization/model-acceleration-libraries", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference-optimization/optimizing-with-composable-kernel", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference-optimization/optimizing-triton-kernel", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference-optimization/profiling-and-debugging", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/inference-optimization/workload", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/install", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/train-a-model", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/accelerate-training", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/deploy-your-model", "os": ["linux"]},
{"file": "how-to/rocm-for-ai/hugging-face-models", "os": ["linux"]},
{"file": "how-to/rocm-for-hpc/index", "os": ["linux"]},
{"file": "how-to/llm-fine-tuning-optimization/index", "os": ["linux"]},
{"file": "how-to/llm-fine-tuning-optimization/overview", "os": ["linux"]},
{
"file": "how-to/llm-fine-tuning-optimization/fine-tuning-and-inference",
"os": ["linux"],
},
{
"file": "how-to/llm-fine-tuning-optimization/single-gpu-fine-tuning-and-inference",
"os": ["linux"],
},
{
"file": "how-to/llm-fine-tuning-optimization/multi-gpu-fine-tuning-and-inference",
"os": ["linux"],
},
{
"file": "how-to/llm-fine-tuning-optimization/llm-inference-frameworks",
"os": ["linux"],
},
{
"file": "how-to/llm-fine-tuning-optimization/model-acceleration-libraries",
"os": ["linux"],
},
{"file": "how-to/llm-fine-tuning-optimization/model-quantization", "os": ["linux"]},
{
"file": "how-to/llm-fine-tuning-optimization/optimizing-with-composable-kernel",
"os": ["linux"],
},
{
"file": "how-to/llm-fine-tuning-optimization/optimizing-triton-kernel",
"os": ["linux"],
},
{
"file": "how-to/llm-fine-tuning-optimization/profiling-and-debugging",
"os": ["linux"],
},
{"file": "how-to/performance-validation/mi300x/vllm-benchmark", "os": ["linux"]},
{"file": "how-to/system-optimization/index", "os": ["linux"]},
{"file": "how-to/system-optimization/mi300x", "os": ["linux"]},
{"file": "how-to/system-optimization/mi200", "os": ["linux"]},
@@ -93,9 +101,6 @@ extensions = ["rocm_docs", "sphinx_reredirects", "sphinx_sitemap"]
external_projects_current_project = "rocm"
# Uncomment if facing rate limit exceed issue with local build
# external_projects_remote_repository = ""
html_baseurl = os.environ.get("READTHEDOCS_CANONICAL_URL", "https://rocm-stg.amd.com/")
html_context = {}
if os.environ.get("READTHEDOCS", "") == "True":

View File

@@ -1,150 +0,0 @@
<head>
<meta charset="UTF-8">
<meta name="description" content="Building ROCm documentation">
<meta name="keywords" content="documentation, Visual Studio Code, GitHub, command line,
AMD, ROCm">
</head>
# Building documentation
## GitHub
If you open a pull request and scroll down to the summary panel,
there is a commit status section. Next to the line
`docs/readthedocs.com:advanced-micro-devices-demo`, there is a `Details` link.
If you click this, it takes you to the Read the Docs build for your pull request.
![GitHub PR commit status](../data/contribute/commit-status.png)
If you don't see this line, click `Show all checks` to get an itemized view.
## Command line
You can build our documentation via the command line using Python.
See the `build.tools.python` setting in the [Read the Docs configuration file](https://github.com/ROCm/ROCm/blob/develop/.readthedocs.yaml) for the Python version used by Read the Docs to build documentation.
See the [Python requirements file](https://github.com/ROCm/ROCm/blob/develop/docs/sphinx/requirements.txt) for Python packages needed to build the documentation.
Use the Python Virtual Environment (`venv`) and run the following commands from the project root:
```sh
python3 -mvenv .venv
.venv/bin/python -m pip install -r docs/sphinx/requirements.txt
.venv/bin/python -m sphinx -T -E -b html -d _build/doctrees -D language=en docs _build/html
```
Navigate to `_build/html/index.html` and open this file in a web browser.
## Visual Studio Code
With the help of a few extensions, you can create a productive environment to author and test
documentation locally using Visual Studio (VS) Code. Follow these steps to configure VS Code:
1. Install the required extensions:
* Python: `(ms-python.python)`
* Live Server: `(ritwickdey.LiveServer)`
2. Add the following entries to `.vscode/settings.json`.
```json
{
"liveServer.settings.root": "/.vscode/build/html",
"liveServer.settings.wait": 1000,
"python.terminal.activateEnvInCurrentTerminal": true
}
```
* `liveServer.settings.root`: Sets the root of the output website for live previews. Must be changed
alongside the `tasks.json` command.
* `liveServer.settings.wait`: Tells the live server to wait with the update in order to give Sphinx time to
regenerate the site contents and not refresh before the build is complete.
* `python.terminal.activateEnvInCurrentTerminal`: Activates the automatic virtual environment, so you
can build the site from the integrated terminal.
3. Add the following tasks to `.vscode/tasks.json`.
```json
{
"version": "2.0.0",
"tasks": [
{
"label": "Build Docs",
"type": "process",
"windows": {
"command": "${workspaceFolder}/.venv/Scripts/python.exe"
},
"command": "${workspaceFolder}/.venv/bin/python3",
"args": [
"-m",
"sphinx",
"-j",
"auto",
"-T",
"-b",
"html",
"-d",
"${workspaceFolder}/.vscode/build/doctrees",
"-D",
"language=en",
"${workspaceFolder}/docs",
"${workspaceFolder}/.vscode/build/html"
],
"problemMatcher": [
{
"owner": "sphinx",
"fileLocation": "absolute",
"pattern": {
"regexp": "^(?:.*\\.{3}\\s+)?(\\/[^:]*|[a-zA-Z]:\\\\[^:]*):(\\d+):\\s+(WARNING|ERROR):\\s+(.*)$",
"file": 1,
"line": 2,
"severity": 3,
"message": 4
}
},
{
"owner": "sphinx",
"fileLocation": "absolute",
"pattern": {
"regexp": "^(?:.*\\.{3}\\s+)?(\\/[^:]*|[a-zA-Z]:\\\\[^:]*):{1,2}\\s+(WARNING|ERROR):\\s+(.*)$",
"file": 1,
"severity": 2,
"message": 3
}
}
],
"group": {
"kind": "build",
"isDefault": true
}
}
]
}
```
> Implementation detail: two problem matchers were needed to be defined,
> because VS Code doesn't tolerate some problem information being potentially
> absent. While a single regex could match all types of errors, if a capture
> group remains empty (the line number doesn't show up in all warning/error
> messages) but the `pattern` references said empty capture group, VS Code
> discards the message completely.
4. Configure the Python virtual environment (`venv`).
From the Command Palette, run `Python: Create Environment`. Select `venv` environment and
`docs/sphinx/requirements.txt`.
5. Build the docs.
Launch the default build task using one of the following options:
* A hotkey (the default is `Ctrl+Shift+B`)
* Issuing the `Tasks: Run Build Task` from the Command Palette
6. Open the live preview.
Navigate to the site output within VS Code: right-click on `.vscode/build/html/index.html` and
select `Open with Live Server`. The contents should update on every rebuild without having to
refresh the browser.

Some files were not shown because too many files have changed in this diff Show More